Bioinformatics tools for development of fast and cost effective simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers from expressed sequence tags (ESTs)

The development of modern molecular biology techniques has generated a huge amount of gene sequence information through expressed sequence tag (EST) sequencing projects on a large number of plant species. This has opened a new era in crop molecular breeding with the identification and/or development of a new class of useful DNA markers called genic molecular markers (GMMs). These markers represent the functional component of the genome, in contrast to random DNA markers (RMMs). Many recent studies have demonstrated that GMMs may be superior to RMMs for use in marker assisted selection, comparative mapping and exploration of functional genetic diversity in germplasm adapted to different environments. Identification of DNA sequences that can be used as markers therefore remains fundamental to the development of GMMs. Bioinformatics approaches are particularly useful here, making marker development much faster and cheaper. A number of computer programs have already been implemented to identify molecular markers from sequence data. A review of current bioinformatics tools for the development of genic molecular markers is, therefore, crucial at this stage. This mini-review provides an overview of the different bioinformatics tools available and their use in marker development, with particular reference to SNP and SSR markers.


INTRODUCTION
Most agriculturally important traits, such as yield, quality and tolerance and/or resistance to biotic and abiotic stress, are polygenic in nature and are often termed 'quantitative traits'. The regions within genomes that contain genes associated with a particular quantitative trait are known as 'quantitative trait loci' (QTLs) (Collard et al., 2005). Genetic markers are specific loci in the chromosomes of an organism that are associated with a trait and can be used as tools for marker assisted selection (MAS) in plant breeding. Marker assisted breeding is more efficient, reliable and cost effective than conventional plant breeding (Collard et al., 2005). Genetic marker systems can be broadly classified into three types: (i) morphological markers, (ii) biochemical markers and (iii) molecular (DNA) markers (Winter and Kahl, 1995). Phenotypic traits that can be visually characterized, such as leaf colour, seed shape and size, and flower colour, are termed morphological markers (Winter and Kahl, 1995). Isozymes are the most common biochemical markers used in plant breeding. The major disadvantages of morphological and biochemical markers are that they are limited in number and, in some cases, influenced by environmental factors (Varshney et al., 2005). Moreover, their expression may be restricted to specific developmental stages or tissues. Biochemical markers are superior to morphological markers in that they are generally independent of environmental growth conditions (Varshney et al., 2005). The third and most advanced form of genetic markers are molecular markers, which reveal DNA sequence variations called 'polymorphisms' (Collard et al., 2005). Polymorphic markers are described as dominant or co-dominant according to whether they can discriminate between homozygotes and heterozygotes (Collard et al., 2005). Molecular markers are broadly classified into three classes based on the method of their detection: (i) hybridization based markers such as restriction fragment length polymorphisms (RFLP), (ii) PCR based markers such as random amplification of polymorphic DNA (RAPD), amplified fragment length polymorphisms (AFLP) and microsatellites or simple sequence repeats (SSR), and (iii) sequence based markers such as single nucleotide polymorphisms (SNP) (Gupta and Rustgi, 2004).
With the recent advancement of functional genomics, gene discovery projects such as genome sequencing and EST generation and analysis have resulted in the accumulation of an enormous amount of sequence data from complete or partial genes (Varshney et al., 2005). ESTs are short DNA sequences corresponding to a fragment of a complementary DNA (cDNA) molecule, and thus to a gene that may be expressed in a cell at a given time. ESTs are currently used as a fast and efficient method of profiling genes expressed in various tissues, cell types or developmental stages (Adams et al., 1991). These sequences are mainly stored in three interconnected databases: (i) GenBank at the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov/), (ii) the European Molecular Biology Laboratory nucleotide sequence database (EMBL, http://www.ebi.ac.uk/embl/) and (iii) the DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp/). In addition, many specialized databases have recently been set up for specific research areas or species, for example The Arabidopsis Information Resource (TAIR, http://www.arabidopsis.org/). These nucleotide sequences have become a valuable and cheap source for developing molecular markers, opening a new chapter in marker development: genic molecular markers (GMMs), which are developed directly from coding sequences such as ESTs or fully characterized genes (Anderson and Lubberstedt, 2003).
Identifying which sequences can be used as markers is thus fundamental to the development of GMMs. Bioinformatics approaches are very useful for this purpose, making marker development much faster and cheaper (Anderson and Lubberstedt, 2003). A number of software programs have been implemented for the identification of molecular markers from sequence data. SSR and SNP markers are abundant in genomic sequences as well as in ESTs, and can be detected automatically with programs and pipelines developed for mining these markers from public sequences (Ching et al., 2002). An understanding of the different tools and bioinformatics techniques for marker identification and/or development will enable plant breeders and researchers in other relevant disciplines to work together towards the common goal of increasing the efficiency of global food production.

SNP MARKER IDENTIFICATION AND DEVELOPMENT
A single nucleotide polymorphism (SNP) is a DNA sequence variation that occurs when a single nucleotide (A, T, C or G) in the genome differs between members of a species (or between paired chromosomes in an individual) (Ching et al., 2002). There are three categories of SNPs: transitions (C/T or G/A), transversions (C/G, A/T, C/A or T/G) and small insertions/deletions (indels). SNPs at any particular site could in principle be bi-, tri- or tetra-allelic; however, tri- and tetra-allelic SNPs are rare, and in practice SNPs are generally biallelic (Doveri et al., 2008). SNPs may occur in the coding, non-coding and intergenic regions of the genome, thus enabling the discovery of genes on the basis of differences in their nucleotide sequences. In recent years, many research papers have reported SNPs as excellent markers for association mapping of polygenic traits with the highest map resolution (Botstein and Risch, 2003; Brookes, 1999; Bhattramakki et al., 2002). SNPs are also reported to be the most frequent type of variation found in DNA (Brookes, 1999; Cho et al., 1999) and, together with insertions/deletions, they form the basis of most differences between alleles. In Arabidopsis, over 37,000 SNPs have been identified through the comparison of two accessions (Jander et al., 2002). Ching et al. (2002) reported a frequency of one SNP per 31 bp in non-coding regions and one SNP per 124 bp in coding regions across 18 maize genes assayed in 36 inbred lines. EST collections have been used to detect and describe SNPs in maize (Zea mays L.) (Ching et al., 2002) and soybean (Glycine max L.) (Zhu et al., 2003).
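The distinction between transitions and transversions can be illustrated with a short Python sketch. The function name `classify_substitution` is hypothetical and is not taken from any of the tools reviewed here; indels are deliberately left out, since they are not single-base substitutions.

```python
# Purines and pyrimidines: a transition stays within one class,
# a transversion crosses from one class to the other.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("alleles must be one of A, C, G or T")
    if ref == alt:
        raise ValueError("alleles are identical, so there is no polymorphism")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

For example, `classify_substitution("C", "T")` falls in the transition class listed above, while `classify_substitution("A", "T")` is a transversion.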
Strategies for developing new SNP markers can be broadly classified into two categories: wet lab (experimental) methods and computational (bioinformatics) methods. Experimental SNP discovery is expensive and time consuming (Schlotterer, 2004; Useche et al., 2001). In addition, the infrastructure it requires may be unavailable to laboratories in underdeveloped and developing countries. In contrast, a computational approach that discovers potential SNPs from publicly available sequences makes the development of SNP markers rapid and less expensive. For computational SNP discovery, two important points should be considered. First, the program should be able to distinguish allelic variation from sequence variation between paralogous sequences (Marth et al., 1999; Le Dantec et al., 2004; Batley et al., 2003). Second, the program should be able to recognize sequencing errors, which are usually caused by poor quality sequences, especially in EST data (Picoult-Newberg et al., 1999; Garg et al., 1999; Batley et al., 2003; Matukumalli et al., 2006).
Mining SNPs from EST sequences is an attractive method for marker development in plants for which genome sequences are not yet available. The steps involved in SNP discovery from EST sequences are clustering, sequence assembly and SNP detection (Batley et al., 2003), and several bioinformatics programs are available to handle each of these steps. A number of methods used to identify SNPs in aligned sequence data rely on sequence trace file analysis to filter out sequence errors by their dubious trace quality (Marth et al., 1999). The major drawback of this approach is that the required trace files are rarely available for large sequence datasets collected from a variety of sources. Where trace files are unavailable, two complementary approaches have been adopted to differentiate between sequence errors and true polymorphisms: (i) assessing the redundancy of the polymorphism in an alignment, and (ii) assessing the co-segregation of SNPs to define a haplotype. The most important limitation of ESTs for SNP marker development is that EST data provide very limited polymorphism (Matukumalli et al., 2006). Other factors such as alternative splicing, reverse transcription errors and RNA editing also interfere with predictions, even when sequence quality scores are included. Nevertheless, SNP discovery from EST sequences has been successfully implemented for maize (Rafalski, 2002) and pine (Le Dantec et al., 2004) by constructing software data analysis pipelines. The choice of the optimal tool for SNP identification and/or discovery thus depends largely on the nature of the input sequences. Pipelines that have been developed to detect SNPs in sequences automatically are listed in Table 1.

Tools requiring trace files
In the late 1990s, efforts were made to develop computer programs that automate base calling (Phred), sequence assembly (Phrap) and sequence assembly editing (Consed) for the analysis of fluorescence-based sequencing results. Nickerson et al. (1997) introduced a program called 'PolyPhred' that automatically detects heterozygous single nucleotide substitutions in fluorescence-based sequencing of PCR products. When sequences containing known variants were analysed with this program, an accuracy of approximately 99% was found. PolyPhred is widely used because it can detect heterozygous bases from two alleles within an individual (Matukumalli et al., 2006). This was one of the major developments in the automated detection of SNPs. Another tool that requires sequence trace files is PolyBayes, which uses a Bayesian statistical model to find differences within assembled sequences based on the depth of coverage, the base quality values and the expected rate of polymorphic sites in the region (Marth et al., 1999).
Another software tool for automated identification of SNPs and mutations in fluorescence-based re-sequencing reads is SNPdetector (Zhang et al., 2005). This tool was designed to model the process of human visual inspection, with very low false positive and false negative rates. Its authors report superior performance of SNPdetector in SNP and mutation analysis, based on comparison of its results with those derived from human inspection, PolyPhred and independent genotype assays in three large-scale investigations (Zhang et al., 2005). SNPdetector runs on Unix/Linux platforms and is publicly available (http://lpg.nci.nih.gov). Other user friendly, freely available software tools for inspecting SNP based genetic variation are novoSNP (Weckx et al., 2005) and InSNP (Manaster et al., 2005). The authors of both tools report that they perform better than PolyPhred and PolyBayes.
An improved version of novoSNP (Weckx et al., 2005) was released as novoSNP3 (Rijk et al., 2007). In addition to discovering SNPs and indel polymorphisms in sequence trace files, it can be used to create databases containing annotated reference sequences, add and align trace data, keep track of the validation status of variants, annotate variants, and produce reports on validated variants and genotypes. novoSNP is available from http://www.molgen.ua.ac.be/bioinfo/novosnp in versions for MS Windows and Linux. The software tool SNP-PHAGE (SNP discovery Pipeline with additional features for identification of common haplotypes within a sequence tagged site (Haplotype Analysis) and GenBank (-dbSNP) submissions) was applied to sequence traces from diverse soybean genotypes to discover over 10,000 SNPs (Matukumalli et al., 2006). The package is available as open source at http://bfgl.anri.barc.usda.gov/ML/snp-phage/.
SNP-PHAGE uses PolyBayes (Marth et al., 1999) and PolyPhred (Nickerson et al., 1997) for analysis, and stores and edits polymorphism information in a relational database through a user friendly web interface.

Tools detecting SNPs without trace files
The autoSNP program was developed to detect SNPs and indels in EST sequences (Batley et al., 2003). It uses d2cluster (Burke et al., 1999) and cap3 (Huang and Madan, 1999) to cluster and align EST sequences, and uses redundancy to differentiate between candidate SNPs and sequence errors. Candidate polymorphisms are those that occur in multiple reads within an alignment. For each polymorphism, autoSNP calculates two measures of confidence in the validity of the SNP. The frequency of occurrence of a polymorphism at a particular locus provides the primary measure of confidence that the SNP represents a true polymorphism, and is referred to as the SNP redundancy score. The co-segregation of multiple SNPs within an alignment to define a haplotype provides a second measure of confidence in SNP validity, and is referred to as the co-segregation score.
QualitySNP (http://www.bioinformatics.nl/tools/snpweb/) was reported to be an efficient tool for SNP detection, storage and retrieval in diploid as well as polyploid species, and runs on Linux or UNIX systems (Tang et al., 2006). It uses a haplotype-based strategy to detect reliable synonymous and non-synonymous SNPs from public EST data without requiring trace/quality files or genomic sequence data. Haplotypes in this context represent the different alleles of a gene in a dataset. The haplotype reconstruction is based on a mathematical algorithm. QualitySNP uses three filters for the identification of reliable SNPs. Filter 1 screens for all potential SNPs and identifies variation between or within genotypes. Filter 2 is the core filter, which uses the haplotype-based strategy to detect reliable SNPs; clusters with potential paralogs, as well as false SNPs caused by sequencing errors, are identified at this stage. Filter 3 screens SNPs by calculating a confidence score based on sequence redundancy and quality. Non-synonymous SNPs are subsequently identified by detecting open reading frames in consensus sequences (contigs) with SNPs. The pipeline includes a data storage and retrieval system for haplotypes, SNPs and alignments. QualitySNP's versatility was demonstrated by the identification of SNPs in EST datasets from potato, chicken and humans (Tang et al., 2006).
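The redundancy idea behind tools such as autoSNP can be sketched in a few lines of Python. This is a simplified illustration and not the autoSNP implementation: the function name `candidate_snps`, the `min_redundancy` parameter and the choice to report the count of the second most frequent allele as the redundancy value are all assumptions made for the example. The input is assumed to be a set of already aligned reads of equal length, with `-` marking gaps.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_redundancy=2):
    """Return (column, allele_counts, redundancy) for each candidate SNP.

    A column is reported only when at least two different bases are each
    seen in `min_redundancy` or more reads, so that a variant present in
    a single read (a likely sequencing error) is ignored.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    candidates = []
    for column in range(length):
        # Count only unambiguous bases; gaps and N are skipped.
        counts = Counter(
            read[column].upper()
            for read in aligned_reads
            if read[column].upper() in "ACGT"
        )
        supported = {base: n for base, n in counts.items() if n >= min_redundancy}
        if len(supported) >= 2:
            # Assumed redundancy measure: support for the minor allele.
            redundancy = sorted(supported.values(), reverse=True)[1]
            candidates.append((column, supported, redundancy))
    return candidates
```

With five aligned reads in which two carry an A and three carry a G at the same column, that column is reported, whereas a column where a variant base appears in only one read is not. A co-segregation score would additionally check whether the same reads share alleles across several candidate columns; that step is not shown here.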
HaploSNPer (http://www.bioinformatics.nl/tools/haplosnper/) is a flexible web-based SNP discovery and allele detection tool based on QualitySNP (Tang et al., 2008). It detects SNPs and alleles in user-specified input sequences from both diploid and polyploid species. It includes BLAST for finding homologous sequences in public EST databases, CAP3 or PHRAP for aligning them, and QualitySNP for discovering reliable allelic sequences and SNPs. HaploSNPer also provides a user friendly interface for the visualization of SNPs and alleles. Singhal et al. (2011) used HaploSNPer and found 40,589 reliable SNPs in the Sorghum bicolor genome. Although HaploSNPer supports SNP detection, allele detection and haplotype reconstruction, it does not extend the analysis to diversity, linkage disequilibrium or haplotype network studies. A web-based tool that fulfils this need, SNiPlay (Dereeper et al., 2011), appeared in 2011 and is expected to assist biologists in extracting and analyzing polymorphism data in a simple and robust way. SNiPlay (http://sniplay.cirad.fr/) is reported to be a user-friendly and integrative web-based tool dedicated to polymorphism discovery and analysis. It integrates a pipeline, freely accessible through the internet, that combines existing software with new tools to detect SNPs and to compute different types of statistical indices and graphical layouts for SNP data. It can detect SNPs and indels from standard sequence alignments, genotyping data or Sanger sequencing. Furthermore, the pipeline allows the use of external data (such as phenotype, geographic origin, taxa, stratification) to define groups and compare statistical indices. It also integrates a database storing polymorphisms, genotyping data and grapevine sequences released by public and private projects, which allows the user to retrieve SNPs using various filters (such as genomic position, missing data, polymorphism type, allele frequency). It can also be used to compare SNP patterns between populations (Dereeper et al., 2011).
SNPServer (Savage et al., 2005) is a real time implementation of the autoSNP method, accessed via a web server. It provides a web interface to the autoSNP software for sequence input, comparison and assembly, and permits rapid discovery of SNPs. SNPServer (http://hornbill.cspp.latrobe.edu.au/snpdiscovery.html) uses BLAST to identify related sequences and CAP3 to cluster and align them. The alignments are then passed to the SNP discovery software autoSNP.
All of the above mentioned tools were developed to discover single nucleotide polymorphisms (SNPs) derived from re-sequencing. Whether an identified SNP is indeed novel or is already contained in dbSNP has remained an open and sometimes confusing question. Chang et al. (2009) introduced 'Seq-SNPing' (http://bio.kuas.edu.tw/Seq-SNPing), freely available Java-based software for SNP discovery, SNP ID identification, and editing and visualization of sequence alignments. According to its authors, it is easy to use, fast, and provides an accurate method for searching and organizing SNP IDs from multiple sequence inputs, thereby greatly facilitating genetic studies.
The software tools described above were designed around the needs of their developers. InSNP is a Windows based package and can be helpful for users not familiar with Linux. SNPdetector scripts work only on Unix/Linux platforms and use the Smith-Waterman algorithm for aligning reads, as well as a modified version of the NQS method (Altshuler et al., 2000) for detecting homozygous SNPs among different individuals. SNPdetector also requires a minimum threshold of 30% for secondary peak intensity when detecting heterozygous SNPs. novoSNP works on Windows as well as Unix/Linux based platforms. It uses BLAST (Altschul et al., 1990) for aligning sequence reads and a series of filters to reduce false positives. The package is configured to work with a database, which makes polymorphism discovery and data storage convenient. Other polymorphism discovery software, such as autoSNP, relies on redundancy and co-segregation of markers within a sequence alignment and is useful when trace data are not available.
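The secondary peak threshold mentioned above for heterozygote detection can be illustrated with a toy Python function. It is a sketch only: the source does not say how SNPdetector measures the threshold, so the code assumes that it is the ratio of the second highest to the highest peak intensity at a trace position, and the function name `call_genotype` and the input format (a dictionary of per-base peak intensities) are hypothetical.

```python
def call_genotype(peaks, min_secondary_ratio=0.30):
    """Call the base(s) at one trace position from per-base peak intensities.

    Returns a one-element tuple for a homozygous call and a sorted
    two-element tuple when the secondary peak reaches the threshold.
    """
    if not peaks:
        raise ValueError("at least one peak intensity is required")
    ranked = sorted(peaks.items(), key=lambda item: item[1], reverse=True)
    primary_base, primary = ranked[0]
    if primary <= 0:
        raise ValueError("the primary peak intensity must be positive")
    if len(ranked) > 1:
        secondary_base, secondary = ranked[1]
        # Assumption: the threshold is relative to the primary peak.
        if secondary / primary >= min_secondary_ratio:
            return tuple(sorted((primary_base, secondary_base)))
    return (primary_base,)
```

Under these assumptions, a position with intensities A = 1000 and G = 400 would be called heterozygous A/G, while A = 1000 and G = 200 would be called homozygous A.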

SSR MARKER IDENTIFICATION AND DEVELOPMENT
Microsatellites or SSRs are short tandem repeats of 1-6 nucleotides that occur with high frequency throughout the genomes of many organisms (Weber, 1990). Their polymorphism consists of variation in the number of repeats, which has been suggested to result from polymerase slippage (Kruglyak et al., 1998). SSRs have been reported to be superior to other molecular markers because (i) multiple SSR alleles may be detected at a single locus using a simple PCR based screen, (ii) SSRs are evenly distributed all over the genome, (iii) they are co-dominant, (iv) very small quantities of DNA are required for screening and (v) analysis may be semi-automated (Varshney et al., 2005). Due to these features, SSRs have become valuable genetic markers for linkage mapping, QTL mapping, association mapping and diversity analysis (Jones et al., 1997; Powell et al., 1996; Varshney et al., 2005). Conventional methods for developing SSRs, which involve construction of genomic libraries and subsequent screening of the clones for SSR motifs, are laborious, time consuming and expensive (Powell et al., 1996). With the recent establishment of EST sequencing projects in several plant species, a wealth of DNA sequence information has been generated and deposited in public databases (Rudd, 2003). Sequence data for many fully characterized genes and full length cDNA clones have also been generated for some plant species (Varshney et al., 2005). Genic SSRs have certain notable advantages over genomic SSRs: they (i) are quickly obtained by electronic sorting, (ii) represent functional regions of the genome and (iii) are more transferable between related species (Gao et al., 2003; Cordeiro et al., 2001; Decroocq et al., 2003; Yu et al., 2004; Varshney et al., 2005; Tang et al., 2006). The presence of SSRs in expressed regions of genomes suggests that they may have a role in gene expression or function. For example, the waxy gene in rice has been found to contain a poly(CT) microsatellite in the 5'-untranslated region (UTR) whose length polymorphism is associated with amylose content (Ayres et al., 1997). In general, approximately 5% of plant ESTs contain SSRs with a minimum length of 20 nucleotides (Varshney et al., 2005; Kantety et al., 2002; Ghislain et al., 2004; Poncet et al., 2006). In silico approaches for screening sequences for SSRs have thus become an efficient and inexpensive alternative for plant species. The software tools that have been developed to detect SSRs are listed in Table 2.
Sputnik is a C language program that searches DNA sequence files in FASTA format for microsatellite repeats. It uses a recursive algorithm to search for repeated patterns of nucleotides of length between 2 and 5 (Abajian, 1994) and finds perfect, compound and imperfect repeats. The output is a file of SSRs in tabular format. Unix, Linux and Windows versions of Sputnik are available from http://espressosoftware.com/pages/sputnik.jsp and http://cbi.labri.fr/outils/Pise/sputnik.html. Sputnik has been applied to SSR identification in many species, including Arabidopsis and barley (Cardle et al., 2000). Tandem Repeats Finder (TRF) (Benson, 1999) (http://tandem.bu.edu/trf/trf.html) can find very large SSR repeats, up to a length of 2000 bp. It reports SSRs using a set of statistical tests based on four distributions that depend on the pattern length, the matching probability, the indel probability and the tuple size. TRF finds perfect, imperfect and compound SSRs, and is available for Linux. TRF has been used for SSR identification in cowpea (Chen et al., 2007). The Simple Sequence Repeat Identification Tool (SSRIT) (http://www.gramene.org/db/searches/ssrtool, Temnykh et al., 2001) uses a Perl script to find perfect SSR repeats (2 to 10 bp in length) within a sequence. Kantety et al. (2001) used SSRIT to mine SSRs in ESTs from barley, maize, rice, sorghum and wheat, and Singh et al. (2011) used it to mine SSRs in the wheat rust Puccinia sp. Another SSR identification tool is TROLL (Tandem Repeat Occurrence Locator, Castelo et al., 2002), which builds a keyword tree and matches it against the sequence with a technique adapted from bibliographic searches, based on the Aho-Corasick algorithm. One of the major disadvantages of TROLL is that it cannot handle very large sequences or process large batches of sequences, as the tree takes up large amounts of memory.
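A search for perfect SSRs of the kind performed by script-based tools such as SSRIT can be approximated with a regular expression. The Python sketch below is an illustration and not the SSRIT code: the function name `find_perfect_ssrs` and the `min_repeats` default are assumptions, and the 2 to 10 bp range quoted above is taken here to refer to the length of the repeated motif.

```python
import re


def find_perfect_ssrs(sequence, min_unit=2, max_unit=10, min_repeats=4):
    """Return (motif, repeat_count, start, end) for each perfect SSR.

    Coordinates are 0-based and `end` is exclusive.
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    sequence = sequence.upper()
    # Capture the shortest possible motif (lazy quantifier), then require
    # it to be followed by at least (min_repeats - 1) exact copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        # Skip motifs that are themselves repeats of a shorter unit,
        # e.g. "AA", which is really a mononucleotide run.
        if (motif + motif).find(motif, 1) != len(motif):
            continue
        repeats = (match.end() - match.start()) // len(motif)
        hits.append((motif, repeats, match.start(), match.end()))
    return hits
```

Because the expression only accepts exact copies of the motif, imperfect and interrupted repeats are not reported; detecting those requires the alignment-based or statistical approaches used by tools such as Sputnik and TRF.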
The microsatellite (MISA) tool (http://pgrc.ipkgatersleben.de/misa/) identifies perfect, compound and interrupted SSRs. It requires a set of sequences in FASTA format and a parameter file that defines the unit size and minimum repeat number of each SSR. The output includes a file containing a table of the repeats found, and a summary file. MISA can also design PCR amplification primers on either side of an SSR. The tool is written in Perl and is therefore platform independent, but it requires an installation of Primer3 for the primer search (Thiel et al., 2003). MISA has been applied to SSR identification in coffee (Aggarwal et al., 2007), barley (Thiel et al., 2003; Kota et al., 2001), wheat (Yu et al., 2004), rye (Khlestkina et al., 2004) and peanut (Liang et al., 2009). Another SSR search tool, 'Repeat Finder' (Volfovsky et al., 2001) (http://www.cbcb.umd.edu/software/RepeatFinder/), finds SSRs in four steps. The first step finds all exact repeats using Repeat Match or REPuter. The second step merges repeats into repeat classes, and the third step merges all other repeats that match those already merged into the same classes. Finally, step four matches all repeats and classes against each other in a non-exact manner using BLAST. The input is a genome or set of sequences, and the output is a file containing the repeat classes and the number of merged repeats found in each class. Repeat Finder can find repeats of any length. It finds perfect, imperfect and compound repeats and runs on Unix or Linux. It has been used to identify SSRs in peanut (Jayashree et al., 2005).
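Compound SSRs, which several of the tools above report, can be thought of as neighbouring SSRs that lie close together on the same sequence. The Python sketch below groups SSR hits on that basis. It is an illustration under stated assumptions and not the MISA implementation: the function name `group_compound_ssrs`, the `(motif, repeats, start, end)` tuple layout and the `max_gap` value of 100 bases are all hypothetical choices made for the example.

```python
def group_compound_ssrs(ssrs, max_gap=100):
    """Group SSR hits on one sequence into candidate compound SSRs.

    `ssrs` is a list of (motif, repeats, start, end) tuples with 0-based
    start and exclusive end. Hits separated by at most `max_gap` bases
    are placed in the same group; a group of one is a simple SSR.
    """
    groups = []
    for ssr in sorted(ssrs, key=lambda hit: hit[2]):
        start = ssr[2]
        previous_end = groups[-1][-1][3] if groups else None
        if previous_end is not None and start - previous_end <= max_gap:
            groups[-1].append(ssr)   # close enough: extend the current group
        else:
            groups.append([ssr])     # too far away: start a new group
    return groups
```

Groups with more than one member would be reported as compound SSRs, and the distance threshold is the kind of setting that would normally be exposed to the user in a parameter file.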
SSRPrimer combines Sputnik and the PCR primer design software Primer3 to find SSRs and associated amplification primers (Robinson et al., 2004; Jewell et al., 2006). It takes multiple sequences in FASTA format as input and produces lists of SSRs and associated PCR primers in tabular format. SSRPrimer has been applied to a wide range of species including shrimp (Perez et al., 2005), citrus (Chen et al., 2006), mint (Lindqvist et al., 2006), strawberry (Keniry et al., 2006), Brassica (Batley et al., 2007; Burgess et al., 2006; Hopkins et al., 2007; Ling et al., 2007), Sclerotinia (Winton et al., 2007) and Eragrostis curvula (Cervigni et al., 2008). Maia et al. (2008) introduced SSR Locator (http://www.ufpel.edu.br/), a tool that integrates SSR discovery with primer design and PCR simulation. SSR Locator detects SSR and minisatellite motifs between 1 and 10 bp, designs primers for each locus found, simulates amplification of fragments with different primer pairs from a given set of FASTA files, produces global alignments between amplicons generated by the same primer pair, and estimates alignment scores and identities between amplicons, thus generating information on primer specificity and redundancy. Victoria et al. (2011) used SSR Locator to study the pattern of EST derived microsatellite markers in model plants.
None of the SSR identification tools described above can identify polymorphic SSRs. The only tool capable of identifying polymorphic SSRs from DNA sequence data is SSRPoly (http://acpfg.imb.uq.edu.au/ssrpoly.php). The input is a file of sequences in FASTA format. SSRPoly consists of a set of Perl scripts and MySQL tables that can be implemented on UNIX, Linux and Windows platforms (Tang et al., 2008).

CONCLUSION
Recent advances in the role of bioinformatics in genic molecular marker development will help molecular biologists address many evolutionary, ecological and taxonomic research questions. The development of bioinformatics tools will improve marker identification while reducing its cost, and will therefore help plant breeders to include more diverse species and a greater variety of traits. Bioinformatics tools have been developed to mine sequence data for markers and to present these in a biologist friendly manner. SNP and SSR markers have many uses in plant genetics, such as the detection of alleles associated with disease, genome mapping, association studies, genetic diversity analysis and inference of population history. With the help of these tools, molecular plant breeders will be able to develop novel markers and use them in diverse applications. The availability of large amounts of sequence data makes it economical to develop SSR and SNP markers from them. EST-derived SSRs and SNPs are gene specific and thus functional molecular markers. The computational tools described here for the identification of SNPs and SSRs in sequence data, as well as for the design of PCR amplification primers, will help plant breeders new to molecular breeding and marker assisted selection to adopt SSR and SNP markers for solving crop breeding problems.

Table 1. Tools for single nucleotide polymorphism identification.

Table 2. Bioinformatics tools for microsatellite identification.