Gene sequence analysis is a key step in genomic research that helps to understand the genome of a species once it has been sequenced. It includes pair-wise, comparative and multiple sequence analysis. The Genome On-Line Database (GOLD) provides information about the number of completed, meta, incomplete and targeted genome projects. GOLD statistics show 2942 completed, 7687 incomplete, 340 meta and 440 targeted genome projects. The Support Vector Machine (SVM) is a widely used technique for analyzing gene expression or microarray data. In the present study, we performed inter- and intra-species comparative nucleic acid and protein sequence analysis of plant antifreeze proteins (AFPs) containing Leucine Rich Repeat (LRR) and Ice-recrystallization Inhibition (IRI) domains. The analysis provides an extensive understanding of their sequence characteristics and helps in their classification and in the production of transgenic constructs to improve agricultural yields. The AFPs were classified according to their sequence characteristics: AFPs from Daucus carota bearing only LRR domains were placed in Class I, while AFPs with both LRR and IRI domains from Triticum aestivum, Deschampsia antarctica, Lolium perenne and Hordeum vulgare were placed in Class II. Within Class II, entries with fewer than ten occurrences of IRI were placed in subgroup A, while those with more than ten occurrences of IRI were placed in subgroup B. Entries in A and B that have a single LRR pattern were then placed in groups A1 and B1, whereas those with more than one occurrence were placed in groups A2 and B2, respectively. The entries in B1 were further reclassified into groups C1 and C2 based on the conservation of LRR. LRR regions were found to be enriched with alpha helices and beta sheets, whereas IRI regions contain coils and sheets.
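The classification rules described above can be sketched as a small decision function. This is only an illustrative sketch, not the authors' implementation: the function name `classify_afp`, its parameters and the returned label strings are hypothetical. Two points are not specified in the text and are treated here as assumptions: an entry with exactly ten IRI occurrences is left without a subgroup, and C1 is taken to mean a conserved LRR while C2 means a non-conserved one.

```python
from typing import Optional


def classify_afp(lrr_count: int, iri_count: int,
                 lrr_conserved: Optional[bool] = None) -> str:
    """Assign a plant AFP entry to a group from its LRR and IRI counts.

    Hypothetical helper that mirrors the scheme in the text; the mapping
    of C1/C2 to conserved/non-conserved LRR is an assumption.
    """
    if lrr_count < 1:
        # The scheme only covers AFPs that carry at least one LRR pattern.
        return "unclassified"
    if iri_count == 0:
        # Only LRR domains present.
        return "Class I"
    if iri_count == 10:
        # The text defines "fewer than ten" and "more than ten" only.
        return "Class II (subgroup unspecified)"

    subgroup = "A" if iri_count < 10 else "B"
    group = subgroup + ("1" if lrr_count == 1 else "2")

    if group == "B1" and lrr_conserved is not None:
        # Assumption: C1 = conserved LRR, C2 = non-conserved LRR.
        return "Class II / B1 / " + ("C1" if lrr_conserved else "C2")
    return "Class II / " + group
```

For example, under these assumptions an entry with one LRR pattern and twelve IRI occurrences falls in B1, and is split further into C1 or C2 only when LRR conservation information is supplied.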
The reported classification scheme and proposed methodology will facilitate the identification, annotation and construction of synthetic plant AFPs in the near future. Ongoing efforts are directed towards the development of a comprehensive database integrated with a prediction server for identifying new classes of plant AFPs and for extensive analysis of their homology.
Key words: Antifreeze protein, leucine rich repeat, over-wintering plants, comparative sequence analysis, ice-recrystallization inhibition protein.
Copyright © 2021 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0