ISSN 1684–5315 © 2003 Academic Journals

DNA sequencing is the deciphering of hereditary information. It is an indispensable prerequisite for many biotechnical applications and technologies, and the continual acquisition of genomic information is very important. This opens the door not only for further research and a better understanding of the architectural plan of life, but also for future clinical diagnosis based on the genetic data of individuals. Bioinformatics can be broadly defined as the creation and development of advanced information and computational techniques for problems in biology. More narrowly, bioinformatics is the set of computing techniques used to manage and extract useful information from the DNA/RNA/protein sequence data being generated (at high volumes) by automated techniques (e.g., DNA sequencers, DNA microarrays) and stored in large public databases (e.g., GenBank, Protein Data Bank). Certain methods for analyzing genetic/protein data have been found to be extremely computationally intensive, providing motivation for the use of powerful computers. The advent of the Internet and the World Wide Web (WWW) has substantially increased the information and computational resources available to experimental biologists. This review describes the on-line resources currently available for protein and nucleic acid sequence alignment.
Key words: Sequence alignment, DNA, protein, ClustalW, FASTA.
African Journal of Biotechnology Vol. 2 (12), pp. 714-718, December 2003


INTRODUCTION
Bioinformatics is the application of information technology to store, organize and analyze the vast amount of biological data that is available in the form of sequences and structures of proteins (the building blocks of organisms) and nucleic acids (the information carriers). The biological information of nucleic acids is available as sequences, while the data of proteins are available as sequences and structures. Sequences are represented in a single dimension, whereas a structure contains the three-dimensional data of a sequence. Sequence alignment is by far the most common task in bioinformatics. Procedures relying on sequence comparison are diverse and range from database searches (Altschul et al., 1990) to secondary structure prediction (Rost et al., 1994). Sequences, whether protein or DNA, usually come in families. Sequences in a family have diverged from each other in their primary sequence during evolution, having separated either by a duplication in the genome or by speciation, which gives rise to corresponding sequences in related organisms. In either case they normally retain a similar function. If a set of sequences belonging to the same family is already available, one can perform a database search for more members using pairwise alignments with one of the known family members as the query sequence (e.g., BLAST). However, pairwise alignments with any one of the members may not find sequences that are only distantly related to those already known. An alternative approach is to use statistical features of the whole set of sequences in the search. Such features can be captured by a multiple sequence alignment. This review summarizes the publicly available on-line tools listed below in the context of the analysis procedure for sequence alignment, and gives an overview of the most versatile and efficient websites.
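
The idea of capturing the statistical features of a whole family in a multiple sequence alignment can be illustrated with a per-column residue frequency table. The following Python sketch is only an illustration under simple assumptions: the function names, the toy alignment and the scoring rule (summing column frequencies) are hypothetical and do not correspond to any particular published tool.

```python
from collections import Counter


def column_profile(alignment, gap="-"):
    """Return one dict per column mapping residue -> relative frequency.

    `alignment` is a list of equal-length, already aligned sequences.
    Gap characters are ignored when frequencies are computed.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        residues = [row[col] for row in alignment if row[col] != gap]
        counts = Counter(residues)
        total = len(residues)
        # An all-gap column yields an empty dict.
        profile.append({r: c / total for r, c in counts.items()} if total else {})
    return profile


def score_against_profile(profile, candidate):
    """Toy score (an assumption of this sketch): sum of the column
    frequencies of the candidate's residues, with no gaps allowed."""
    if len(candidate) != len(profile):
        raise ValueError("candidate must have one residue per profile column")
    return sum(col.get(res, 0.0) for col, res in zip(profile, candidate))


# Hypothetical toy family of three aligned DNA sequences.
family = ["ACGT", "ACGA", "AC-T"]
profile = column_profile(family)
print(profile)
print(score_against_profile(profile, "ACGT"))
print(score_against_profile(profile, "TTTT"))
```

A candidate that agrees with the conserved columns receives a higher score than one that matches only a variable column, which is the intuition behind searching with features of the whole family rather than with a single member.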

SEQUENCE ALIGNMENT METHODS
Alignment provides a powerful tool to compare related sequences. The alignment of two residues could reflect a common evolutionary origin, or could represent common structural and/or catalytic roles that do not always reflect an evolutionary process. Deletions, insertions and single-residue substitutions are generally emphasized by alignments. Deletions or insertions are represented by null characters added to one of the sequences, which are aligned with letters in the other sequence (Rehm, 2001). There are various forms of sequence alignment. Alignments can be made between sequences of the same type (for example, between the primary structures of proteins) or between sequences of different types (for example, alignment of a DNA sequence to a protein sequence, or of a protein to a three-dimensional structure). Pairwise alignment involves only two sequences, whereas multiple sequence alignment involves more than two sequences (although the term sometimes encompasses pairwise alignment as well). Global alignment aligns whole sequences, whereas local alignment aligns only parts of sequences.
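
As a small illustration of how null characters mark insertions and deletions, the following Python sketch counts identical positions, substitutions and gaps in an existing pairwise alignment. The function name and the use of "-" as the null character are assumptions made for this example only.

```python
def alignment_summary(aligned_a, aligned_b, gap="-"):
    """Count column types in a pairwise alignment given as two gapped strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    summary = {
        "identical": 0,
        "substituted": 0,
        "gaps_in_first": 0,
        "gaps_in_second": 0,
    }
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # a column of two null characters carries no information
        if x == gap:
            summary["gaps_in_first"] += 1   # insertion relative to the first sequence
        elif y == gap:
            summary["gaps_in_second"] += 1  # deletion relative to the first sequence
        elif x == y:
            summary["identical"] += 1
        else:
            summary["substituted"] += 1
    return summary


# Hypothetical example: one substitution (A/C) and one gap in the second sequence.
print(alignment_summary("GATTACA", "GCT-ACA"))
```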
Database searches to extract homologous sequences are at the heart of sequence analysis; hence a variety of methods have been developed and implemented in widely available packages or as network servers. In general, there is a trade-off between the speed and the sensitivity of the algorithms. The quick word search program FASTA (Pearson and Lipman, 1988) and the more recent and even faster BLAST (Altschul et al., 1990) are now the workhorses of database searching.
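
The word search idea, and the trade-off between speed and sensitivity, can be sketched in a few lines of Python. This is a deliberately simplified illustration of exact word (k-mer) matching, not the actual FASTA or BLAST implementation; the function names, the word length k and the toy sequences are assumptions of this example.

```python
from collections import defaultdict


def build_word_index(sequence, k):
    """Map every k-letter word in `sequence` to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        index[sequence[i:i + k]].append(i)
    return index


def find_seed_hits(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where identical k-letter words occur."""
    index = build_word_index(subject, k)
    hits = []
    for q in range(len(query) - k + 1):
        for s in index.get(query[q:q + k], []):
            hits.append((q, s))
    return hits


def best_diagonal(hits):
    """Return the offset (subject_pos - query_pos) shared by most hits, or None."""
    counts = defaultdict(int)
    for q, s in hits:
        counts[s - q] += 1
    if not counts:
        return None
    return max(counts, key=counts.get)


subject = "CCGATTACAGG"
print(find_seed_hits("GATTACA", subject, k=3))
print(best_diagonal(find_seed_hits("GATTACA", subject, k=3)))
# A query with one substitution still produces short-word hits,
# but no hits at all once the word length is increased.
print(find_seed_hits("GATCACA", subject, k=3))
print(find_seed_hits("GATCACA", subject, k=5))
```

Longer words give fewer candidate regions to examine, which is faster, but a single substitution can remove every hit, which is the loss of sensitivity referred to above.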
The immense number of nucleotide and protein sequences that can be accessed through public databases on the Internet is an invaluable resource to scientists working in the fields of molecular biology, protein chemistry and molecular diagnostics. Alignment servers allow investigators to cut and paste their sequences into forms on their Web sites and set various parameters, such as penalty values associated with the insertion of gaps into the sequences, to optimize the overall alignment (Gaskell, 2000). Sequence alignment search tools may be divided into two groups, described below.

PAIRWISE SEQUENCE ALIGNMENT (PSA)
Pairwise and multiple alignment continue to be among the most active areas of bioinformatics research.
Pairwise sequence alignment is the problem of finding the best match between two given DNA or protein sequences. In such a match, there is a penalty for opening or extending gaps in either sequence and for nucleotides or amino acids that differ. The best match is the one with the minimum sum of such penalties. Pairwise comparison tools directly compare two sequences, either nucleic acid or peptide. They are the starting point for all kinds of sequence analysis and are very useful for verifying sequence data, cloning projects, PCR analysis, and many other tasks.
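
The minimum-penalty formulation above can be sketched with a standard dynamic programming scheme that uses separate gap opening and gap extension penalties. The Python code below is a minimal illustration, not the method of any particular server: the function name and the penalty values are assumptions, a gap of length L is assumed to cost gap_open + L * gap_extend, and only the total penalty is returned, not the alignment itself.

```python
import math


def affine_gap_alignment_cost(a, b, mismatch=1, gap_open=2, gap_extend=1):
    """Minimum total penalty of a global alignment of `a` and `b`.

    Three tables are kept, one per possible last column of the alignment:
    M: a[i-1] aligned to b[j-1]; X: a[i-1] aligned to a gap;
    Y: a gap aligned to b[j-1].
    """
    n, m = len(a), len(b)
    inf = math.inf
    M = [[inf] * (m + 1) for _ in range(n + 1)]
    X = [[inf] * (m + 1) for _ in range(n + 1)]
    Y = [[inf] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend  # a[:i] aligned entirely to gaps
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend  # b[:j] aligned entirely to gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = sub + min(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = min(
                M[i - 1][j] + gap_open + gap_extend,  # open a new gap
                X[i - 1][j] + gap_extend,             # extend the current gap
                Y[i - 1][j] + gap_open + gap_extend,  # switch gap to the other sequence
            )
            Y[i][j] = min(
                M[i][j - 1] + gap_open + gap_extend,
                Y[i][j - 1] + gap_extend,
                X[i][j - 1] + gap_open + gap_extend,
            )
    return min(M[n][m], X[n][m], Y[n][m])


print(affine_gap_alignment_cost("GATTACA", "GATACA"))  # one gap of length 1
print(affine_gap_alignment_cost("AAAA", "AA"))         # one gap of length 2
print(affine_gap_alignment_cost("ACGT", "AGGT"))       # one substitution
```

With these assumed penalties, a single gap of length two is cheaper than two separate gaps of length one, which is the practical effect of distinguishing gap opening from gap extension.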
Most sequence alignment methods seek to optimize the criterion of similarity. There are two modes of assessing this similarity, local and global. Local methods try to determine whether subsegments of one sequence (A) are present in another (B). These methods have their greatest utility in database searching and retrieval (e.g., BLAST; Altschul et al., 1990). Although they may be of utility in detecting sequences with a certain degree of similarity that may or may not be homologous, phylogenetic analysis assumes that the sequences being compared are orthologous. Global methods make comparisons over the entire lengths of the sequences; in other words, each element of sequence A is compared with each element of sequence B. Global comparison is the principal method of alignment for phylogenetic analysis. Several pairwise sequence alignment programs are available on the Web (Table 1).
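
The local mode described above, in which a subsegment of one sequence is located in another, can be sketched with a simple dynamic programming routine of the Smith-Waterman type. The Python code below is only an illustration: the function name, the similarity scores (match, mismatch, linear gap penalty) and the toy sequences are assumptions of this example and are not taken from BLAST or any other program cited here.

```python
def local_alignment(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    n, m = len(a), len(b)
    # H[i][j] is the best score of a local alignment ending at a[i-1], b[j-1];
    # it never drops below zero, so poor flanking regions are simply ignored.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


# Hypothetical sequences that share only the internal segment GATTACA.
print(local_alignment("CCCGATTACACCC", "TTGATTACATT"))
```

A global method applied to the same pair would be forced to align the unrelated flanking residues as well, whereas the local routine reports only the shared segment.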

MULTIPLE SEQUENCE ALIGNMENT (MSA)
Comparison of multiple sequences can reveal gene functions that are not evident from simple sequence homologies. As a result of genome sequencing projects, new sequences are often found to be similar to several uncharacterized sequences, defining whole families of novel genes with no informative BLAST or FASTA similarities. However, such a family enables the application of efficient alternative similarity search methods. Software packages are available that derive profiles from a multiple sequence alignment. Profiles incorporate position-specific scoring information that is used in database searches. Most new profile software is based on statistical hidden Markov models (HMMs). Much more comprehensive reviews of the literature on profile HMM methods are available elsewhere (Eddy, 1996; Baldi and Brunak, 1998; Durbin et al., 1998; Krogh, 1998). There are numerous web-based resources for multiple sequence alignment (Table 2).

OUTLOOK
The growth in output of DNA sequence data has run in parallel with the well-known exponential growth of computing power, as well as with the advent and exploitation of the Internet and the World Wide Web (WWW). These circumstances have encouraged the development of biological databases and of nucleotide and protein analysis tools, so that a vast array of tools and databases is now available. Although I have focused on a number of sequence alignment tools available as services over the Web, it is important to realize that the results they generate can be used in other analytical tools, such as those designed for molecular phylogenetic or protein molecular modeling studies. These tools are also very useful for verifying sequence data, cloning projects, PCR analysis, and many other tasks. Sequence alignment plays a central role in bioinformatics research, whether or not this is explicitly recognized.

8-BCM Search Launcher: Pairwise Sequence Alignment. http://searchlauncher.bcm.tmc.edu/seqsearch/alignment.html (Smith et al., 1996).
9-PipMaker: computes alignments of similar regions in two DNA sequences. The resulting alignments are summarized with a "percent identity plot", or "pip" for short. All pairwise alignments with the first sequence are computed and then returned as interleaved pips. http://bio.cse.psu.edu/pipmaker/

Table 2. Multiple sequence alignment links.