Full Length Research Paper
Abstract
About 99% of the positions in the human genome are invariant across individuals; only about 1% commonly vary, and these variant positions are associated with complex genetic diseases. Because of this variation, haplotype information has become increasingly important in analyzing fine-scale molecular genetics data. Haplotype assembly divides aligned single nucleotide polymorphism (SNP) fragments into two classes and thereby infers a pair of haplotypes from them; SNPs are the most frequent form of genetic difference used to study genetic diseases. Minimum error correction (MEC) is an important model for this problem, but it is effective only when the error rate of the fragments is low. MEC/GI, an extension of MEC, uses the related genotype information in addition to the SNP fragments and therefore yields a more accurate inference. Because the haplotyping problem is NP-hard, it may have no efficient algorithm for an exact solution. In this paper, we focus on designing serial and parallel multiple classifier systems, each built from two classifiers. A genetic algorithm and K-means are the two components of our approaches. Combining them helps to cover the weaknesses of each single classifier.
Key words: Multiple classifier systems, parallel classifiers, serial classifiers, haplotype, SNP fragments, genotype information, classification, reconstruction rate.
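To make the MEC objective named in the abstract concrete, the following minimal Python sketch computes the MEC score of a candidate haplotype pair and the two fragment classes it induces. The fragment encoding ('0' and '1' for the two alleles, '-' for an uncovered position), the function names, and the toy data are assumptions made here for illustration; this is not the implementation evaluated in the paper.

```python
def distance(fragment, haplotype):
    """Count mismatches between a fragment and a haplotype, skipping gaps ('-')."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != "-" and f != h)


def mec_score(fragments, h1, h2):
    """MEC score: each fragment is charged its distance to the closer haplotype."""
    return sum(min(distance(f, h1), distance(f, h2)) for f in fragments)


def partition(fragments, h1, h2):
    """Split fragments into two classes by nearest haplotype (ties go to h1)."""
    classes = ([], [])
    for f in fragments:
        index = 0 if distance(f, h1) <= distance(f, h2) else 1
        classes[index].append(f)
    return classes


# Hypothetical toy instance: five aligned SNP fragments over four SNP sites.
fragments = ["01-0", "0110", "10-1", "1001", "-011"]
h1, h2 = "0110", "1001"

print(mec_score(fragments, h1, h2))
print(partition(fragments, h1, h2))
```

A search procedure such as a genetic algorithm or K-means can use a score of this kind to compare candidate partitions; MEC/GI would additionally require the haplotype pair to be consistent with the known genotype, which this sketch does not model.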
Copyright © 2024 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0.