UniDPlot: A software to detect weak similarities between two DNA sequences

Marc Girondot; Jean-Yves Sire

doi:10.5897/JBSA.9000028

Journal of
Bioinformatics and Sequence Analysis

Abbreviation: J. Bioinform. Seq. Anal.
Language: English
ISSN: 2141-2464
DOI: 10.5897/JBSA
Start Year: 2009
Published Articles: 49

Full Length Research Paper


Marc Girondot1,2* and Jean-Yves Sire3

1 Laboratoire d’Écologie, Systématique et Évolution, UMR 8079 Centre National de la Recherche Scientifique, Université Paris Sud et ENGREF, 91405 Orsay cedex 05, France.
2 Département de Systématique et Evolution, Muséum National d’Histoire Naturelle de Paris, 25 rue Cuvier, 75005 Paris, France.
3 Université Pierre et Marie Curie-Paris 6, UMR 7138 "Systématique, Adaptation, Evolution", 7 quai St-Bernard, 75005 Paris, France.
Email: [email protected]

Article Number - BD4E9352879
Vol. 2(5), pp. 69-74, October 2010
https://doi.org/10.5897/JBSA.9000028

Accepted: 21 June 2010
Published: 31 October 2010

Copyright © 2024 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0.

Abstract

Searching for DNA sequence similarity is a crucial step in many evolutionary analyses, and several bioinformatic tools are available for this task. The Basic Local Alignment Search Tool (BLAST) is the most commonly used algorithm and is highly efficient. However, it often fails to identify sequences showing very weak similarity. An alternative is the Dot Plot, but this graphical method is not suitable for analysing large sequences (e.g., hundreds of kilobases), which is increasingly required in the context of genome sequencing programs. As an alternative to the classical Dot Plot method, we designed UniDPlot, which searches for weak similarity either between two large sequences (e.g., genome regions) or between one large sequence and a short one (e.g., exons). The UniDPlot methodology contracts the output of the Dot Plot similarity matrix along the length of the larger sequence, while defining statistical limits of significance using a bootstrap procedure. To illustrate the efficiency of this method, we used UniDPlot to search for the fate of the gene that encodes the major enamel protein, amelogenin, in chicken. Although we showed that amelogenin was invalidated through a pseudogenization process, we recovered its entire sequence in the chicken genome. Using UniDPlot, we identified a pseudogene that was not detected by classical methods. UniDPlot can be used to search for missing genes or motifs of various sizes in different genomic contexts.
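The idea of contracting a Dot Plot matrix along the larger sequence and bounding it with a resampling threshold can be illustrated with a minimal Python sketch. This is not the UniDPlot implementation: the function names, the identity-count window score, the use of the column maximum as the contracted value, and the resampling scheme (bases of the short sequence drawn with replacement) are all assumptions chosen for illustration.

```python
import random


def window_score(a, b):
    """Count identical positions between two equal-length windows."""
    return sum(x == y for x, y in zip(a, b))


def unid_profile(long_seq, short_seq, window=10):
    """Contract the dot-plot matrix to one value per window of long_seq.

    For each window start in long_seq, keep the best identity score
    against any window of short_seq (assumed contraction: the maximum).
    """
    short_windows = [short_seq[j:j + window]
                     for j in range(len(short_seq) - window + 1)]
    profile = []
    for i in range(len(long_seq) - window + 1):
        w = long_seq[i:i + window]
        profile.append(max(window_score(w, s) for s in short_windows))
    return profile


def bootstrap_threshold(long_seq, short_seq, window=10,
                        replicates=100, quantile=0.95, seed=0):
    """Estimate a significance limit from resampled short sequences.

    Assumed scheme: resample the bases of short_seq with replacement,
    recompute the profile, and record its maximum; the threshold is the
    requested quantile of those maxima.
    """
    rng = random.Random(seed)
    maxima = []
    for _ in range(replicates):
        resampled = "".join(rng.choices(short_seq, k=len(short_seq)))
        maxima.append(max(unid_profile(long_seq, resampled, window)))
    maxima.sort()
    return maxima[min(len(maxima) - 1, int(quantile * len(maxima)))]


if __name__ == "__main__":
    # Toy data (hypothetical): a 10-base motif embedded in a poly-A background.
    motif = "GATTACAGCT"
    long_seq = "A" * 30 + motif + "A" * 30
    profile = unid_profile(long_seq, motif, window=10)
    limit = bootstrap_threshold(long_seq, motif, window=10)
    hits = [i for i, s in enumerate(profile) if s > limit]
    print("threshold:", limit, "positions above threshold:", hits)
```

Positions of the large sequence whose contracted score exceeds the resampling threshold would be the candidates for weak similarity. The window size, number of replicates, and quantile are placeholder values, not settings taken from the article.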

Key words: DNA sequence similarity, UniDimensional plot (UniDPlot) software, genome.

