African Journal of Biotechnology

  • Abbreviation: Afr. J. Biotechnol.
  • Language: English
  • ISSN: 1684-5315
  • DOI: 10.5897/AJB
  • Start Year: 2002
  • Published Articles: 12487

Full Length Research Paper

A novel stepwise support vector machine (SVM) method based on optimal feature combination for predicting miRNA precursors

Limei Wang1#, Jin Li1#, Rongsheng Zhu1,2, Liangde Xu1, Ying He1, Ruijie Zhang1* and Shaoqi Rao1*
1Department of Bioinformatics and Computer, Harbin Medical University, Harbin 150081, China.
2College of Science, Northeast Agricultural University, Harbin 150030, China.
Email: [email protected], [email protected]

  •  Accepted: 24 October 2011
  •  Published: 23 November 2011

Abstract

MicroRNAs (miRNAs) are a class of non-coding RNAs produced from miRNA precursors (pre-miRNAs) with a stem-loop structure. Developing computational approaches for pre-miRNA identification remains a challenging task in which feature selection is of great importance. Here, we first extracted feature subsets from 124 sequence and secondary structure features using a hybrid of a genetic algorithm (GA) and a support vector machine (SVM). Next, based on the high-frequency features taken from these feature subsets, we proposed a novel stepwise SVM method to identify the optimal feature combinations. A cooperative effect was found among different features in our study. Finally, we obtained 10 feature combinations with a strong combined effect that showed high classification performance for predicting pre-miRNAs. In external validation, all 10 combinations accurately predicted over 13 of the 16 newly confirmed human pre-miRNAs in miRBase 14.0. The best combination correctly predicted 15 (93.75%), which significantly outperformed triplet-SVM (13, 81.25%) in predicting pre-miRNAs.
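The two-stage idea summarized above (collect high-frequency features from the GA-selected subsets, then grow a feature combination step by step) can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the feature names, the frequency threshold, the stopping rule and the `toy_score` function are all hypothetical, and `toy_score` merely stands in for the SVM classification performance that would be estimated on training data. The toy score is built so that two features are worth more together than either alone, mimicking a cooperative effect.

```python
from collections import Counter
from itertools import chain


def high_frequency_features(subsets, min_count):
    """Return features found in at least `min_count` subsets, most frequent first."""
    counts = Counter(chain.from_iterable(set(s) for s in subsets))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, n in ranked if n >= min_count]


def stepwise_select(candidates, score, min_gain=1e-9):
    """Greedy forward selection.

    At each step, add the candidate feature that gives the highest score
    when joined to the current combination; stop when no candidate
    improves the score by at least `min_gain`.
    """
    selected = []
    best = float("-inf")
    remaining = list(candidates)
    while remaining:
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials, key=lambda t: t[0])
        if top_score - best < min_gain:
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Hypothetical feature subsets, as if returned by repeated GA-SVM runs.
ga_subsets = [
    ["mfe", "gc_content", "loop_size"],
    ["mfe", "gc_content"],
    ["mfe", "loop_size"],
    ["gc_content", "stem_length"],
]


def toy_score(features):
    """Hypothetical stand-in for SVM performance (NOT a real classifier)."""
    base = {"mfe": 0.70, "gc_content": 0.60, "loop_size": 0.55}
    chosen = set(features)
    value = max((base.get(f, 0.5) for f in chosen), default=0.0)
    if {"mfe", "gc_content"} <= chosen:
        value += 0.15  # assumed cooperative effect between two features
    return value


if __name__ == "__main__":
    candidates = high_frequency_features(ga_subsets, min_count=2)
    combination, best_score = stepwise_select(candidates, toy_score)
    print(candidates, combination, round(best_score, 2))
```

In practice, `toy_score` would be replaced by a cross-validated SVM performance estimate, and the stopping rule would be tuned to the data.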

 

Key words: MicroRNA precursor, feature selection, genetic algorithm, support vector machine.

Abbreviations

miRNAs, MicroRNAs; pre-miRNAs, miRNA precursors; SVM, support vector machine; GA, genetic algorithm.