Full Length Research Paper
Abstract
The polyadenylation of messenger RNA (mRNA) is an essential step in eukaryotic gene expression. With the advent of deep sequencing, a considerable number of alternative poly(A) sites have been found in coding sequences and introns, yet these unconventional poly(A) sites and their signals have received little study. To study the signals of mRNA polyadenylation, we established a poly(A) signal pattern recognition model that selects and analyzes nucleotide patterns in poly(A) site-related regions from large-scale sequence data generated by Sanger and next-generation sequencing technologies. The model integrates pattern and assembly analysis pipelines with several visualization methods, and can be applied to various species. Applied to poly(A) pattern recognition in three species (rice, Arabidopsis and Chlamydomonas reinhardtii), the model was able to select effective poly(A) signal patterns for poly(A) sites and alternative poly(A) sites, to compare poly(A) signals across species and across regions, and to improve the accuracy of poly(A) site recognition to a considerable extent.
Key words: Polyadenylation signal, pattern recognition, alternative polyadenylation.
Copyright © 2024 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0