Lncident: A Tool for Rapid Identification of Long Noncoding RNAs Utilizing Sequence Intrinsic Composition and Open Reading Frame Information

Int J Genomics. 2016:2016:9185496. doi: 10.1155/2016/9185496. Epub 2016 Dec 27.

Abstract

A growing number of studies have demonstrated that long noncoding RNAs (lncRNAs) play critical roles in a diversity of biological processes and are associated with various types of disease. Rapidly distinguishing lncRNAs from messenger RNAs is a fundamental step toward uncovering the functions of lncRNAs. Here, we present a novel method for the rapid identification of lncRNAs, named Lncident (LncRNAs identification), which uses sequence intrinsic composition features and open reading frame information in a support vector machine model. Ten-fold cross-validation and ROC curves are used to evaluate the performance of Lncident. The main advantage of Lncident is high speed without loss of accuracy. Compared with existing popular tools, Lncident outperforms Coding-Potential Calculator, Coding-Potential Assessment Tool, Coding-Noncoding Index, and PLEK. Lncident is also much faster than Coding-Potential Calculator and Coding-Noncoding Index. Lncident performs particularly well on microorganisms, which makes it promising for the analysis of microorganisms. In addition, Lncident can be trained on users' own collected data. Furthermore, an R package and a web server have been developed to maximize convenience for users. The R package "Lncident" can be easily installed on multiple operating system platforms, as long as R is supported.
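
To illustrate the kind of input such a classifier might take, the following is a minimal Python sketch of two feature families mentioned above: sequence composition (here, k-mer frequencies) and open reading frame information (here, the longest ORF length and its coverage of the transcript). It is not Lncident's actual feature set or code. The choice of k = 3, the ATG-to-stop ORF definition, the forward-strand-only scan, and all function names are assumptions made for illustration.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def kmer_frequencies(seq, k=3):
    """Return relative frequencies of all 4**k k-mers over A/C/G/T (assumed k=3)."""
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def longest_orf_length(seq):
    """Length in nt of the longest ATG-to-stop ORF in the three forward frames."""
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)  # include the stop codon
                start = None
    return best


def feature_vector(seq, k=3):
    """Concatenate k-mer frequencies with ORF length and ORF coverage."""
    orf_len = longest_orf_length(seq)
    coverage = orf_len / len(seq) if seq else 0.0
    return kmer_frequencies(seq, k) + [float(orf_len), coverage]


if __name__ == "__main__":
    example = "CCATGAAATTTTAGGG"  # toy transcript, not real data
    print(len(feature_vector(example)), longest_orf_length(example))
```

Vectors of this kind, computed for labeled coding and noncoding transcripts, could then be passed to any support vector machine implementation and assessed with 10-fold cross-validation and ROC analysis, as the abstract describes.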