African Journal of Biotechnology

  • Abbreviation: Afr. J. Biotechnol.
  • Language: English
  • ISSN: 1684-5315
  • DOI: 10.5897/AJB
  • Start Year: 2002
  • Published Articles: 12487

Full Length Research Paper

A novel ensemble and composite approach for classifying proteins based on Chou’s pseudo amino acid composition

Jie Lin1, Yan Wang1* and Xu Xu1
Department of Information Management and Information System, College of Economics and Management, Tong Ji University, Shanghai 216000, China
Email: [email protected]

  •  Accepted: 11 November 2011
  •  Published: 23 November 2011

Abstract

Because the location of a protein provides clues about its function, protein classification is regarded as an important task in biological data mining. The success of the Human Genome Project, however, has led to an explosion in the number of known protein sequences, so there is a great need for computational methods that predict the locations of proteins quickly and reliably from their primary sequences. In this paper, we use a composite classifier system formed from a set of k-nearest neighbor (K-NN) classifiers, each defined in a different pseudo amino acid composition vector space, in which a protein is represented by its pseudo amino acid composition. The location of a query protein is determined from the choices made by these constituent individual classifiers. The results show that the composite classifier outperforms the single classifiers widely used in the biological literature. The composite classifier can therefore be employed as a robust method for predicting protein location in biological data mining.
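The composite scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made for illustration only: the constituent vector spaces differ in the number of sequence-order terms (`lam`), a single physicochemical property (the Kyte-Doolittle hydropathy scale) stands in for the properties used in Chou's formulation, the weight `w = 0.05` is a commonly used default, and the constituent K-NN decisions are combined by simple majority vote. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumption: one property only,
# chosen for illustration rather than taken from the paper).
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Standardize the property over the 20 residues (zero mean, unit variance).
_MU = mean(KD_HYDROPATHY.values())
_SD = pstdev(KD_HYDROPATHY.values())
H = {aa: (v - _MU) / _SD for aa, v in KD_HYDROPATHY.items()}


def pse_aac(seq, lam, w=0.05):
    """Return a (20 + lam)-dimensional pseudo amino acid composition vector.

    Assumes seq contains only the 20 standard residues and len(seq) > lam.
    """
    n = len(seq)
    if lam >= n:
        raise ValueError("lam must be smaller than the sequence length")
    counts = Counter(seq)
    freqs = [counts[aa] / n for aa in AMINO_ACIDS]
    # k-th tier sequence-order correlation factor: mean squared property
    # difference between residues that are k positions apart.
    thetas = [
        mean((H[seq[i]] - H[seq[i + k]]) ** 2 for i in range(n - k))
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


def knn_predict(train, query_vec, k):
    """Plain K-NN: majority label among the k nearest training vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query_vec))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def composite_predict(train_seqs, labels, query, lambdas=(1, 2, 3), k=1):
    """One K-NN classifier per vector space, combined by majority vote."""
    votes = []
    for lam in lambdas:
        train = [(pse_aac(s, lam), y) for s, y in zip(train_seqs, labels)]
        votes.append(knn_predict(train, pse_aac(query, lam), k))
    return Counter(votes).most_common(1)[0][0]


# Toy data with hypothetical labels, only to show the call pattern.
train_seqs = ["LLIVVAILLV", "IVLLAVIILA", "DEKRDEEKRD", "KRDEEDKKRE"]
labels = ["membrane", "membrane", "soluble", "soluble"]
print(composite_predict(train_seqs, labels, "VLLIAVLIVL"))
```

Each value in `lambdas` defines one vector space and therefore one constituent classifier. Other ways of varying the spaces, such as different physicochemical properties or weights, would fit the same structure.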

 

Key words: Composite classifier system, biological data mining, atomic classifiers, pseudo amino acid composition.