Genetic diversity of a sweet potato collection from Northeastern Brazil

The sweet potato, Ipomoea batatas (L.) Lam., originated in tropical America. Its production is very important in Sergipe State (Brazil), and to explore its potential for local agriculture, Embrapa Coastal Tablelands created a collection of 52 accessions located in Umbaúba City. Some accessions came from the Embrapa Vegetables germplasm bank and others from local farmers of Sergipe. Here, we provide the first data on the genetic diversity and structure of this sweet potato germplasm bank (SPGB) using random amplified polymorphic DNA (RAPD) markers. RAPD data were used to determine genetic variability via a model-based Bayesian procedure (STRUCTURE) and analysis of molecular variance (AMOVA). In addition, the Shannon index, genetic diversity and Jaccard coefficients were estimated. RAPD was efficient for analyzing genetic diversity, identifying groups and measuring the genetic distance between accessions. The markers showed that the collection had a high level of polymorphism. UPGMA clustering separated the genotypes into three groups, and STRUCTURE identified two reconstructed populations.


INTRODUCTION
Sweet potato [Ipomoea batatas (L.) Lam] has storage roots and belongs to the Convolvulaceae family (Zhang et al., 1998). The principal producers are China, Nigeria, United Republic of Tanzania, Uganda and Indonesia (FAOSTAT, 2013).
In Brazil, sweet potato ranks seventh in production among temporary crops, and the country is the principal producer in Latin America, with approximately 500 tons per year (Cavalcante et al., 2009; IBGE, 2010). Sweet potatoes are consumed for their nutrients (Yoshimoto, 2001), including beta-carotene, which helps prevent vitamin A deficiency in many developing countries (Wang et al., 2010). It is generally accepted that sweet potato is of American origin (Zhang et al., 1998). Some species of the genus Ipomoea, section Batatas, occur in Brazil, which has regions with variation in the crop (Ritschel and Huamán, 2002). Studies with genetic markers have proved important for the assessment and conservation of genetic variation in sweet potato (Veasey et al., 2008).
Some cultivars, although genetically identical, have different names depending on the area of cultivation, while different cultivars often share the same name (Daros et al., 2002; Santos et al., 2011). Despite the high genetic variability, changes in consumer habits and the lack of research on the crop have contributed to the loss of important genotypes. It is therefore extremely important to maintain sweet potato accessions in germplasm banks (GBs) and to evaluate them subsequently in different growing regions (Neiva et al., 2011). Because of its high genetic variability, sweet potato can be selected for numerous purposes (Gonçalves et al., 2011; Neiva et al., 2011), including human consumption, animal feed and ethanol production (Gonçalves et al., 2012).
The primary importance of GBs is that they provide genetic variability for breeding programs and help to reconcile the conservation of agricultural biodiversity with sustainable development. Wild relatives and crop landraces are important resources for the development of modern cultivars.
The analysis of population structure provides insight into how diversity is partitioned within a species. This may help to define subpopulations of germplasm with high frequencies of particular alleles and allow researchers to explore the relationship between phenotype and genotype in the materials. It also indicates how specific combinations of genes and alleles interact in varieties, and allows breeders to compare the phenotypic effects of genes or chromosome segments that were inherited from a common ancestor and selected in various combinations (Garris et al., 2003).
Cultivars and breeding lines of sweet potato and nearly 26,000 accessions of other Ipomoea species are maintained at 83 gene banks in the world (Rao et al., 1994). In Brazil, three GBs of sweet potato are referenced in the literature: 1) Embrapa-CNPH, created in the 1980s with 324 accessions and located in Brasilia City (Ritschel and Huamán, 2002); 2) Universidade Federal dos Vales do Jequitinhonha e Mucuri (UFVJM), situated in Diamantina City (Minas Gerais State), with 65 accessions (Neiva et al., 2011); and 3) Universidade Federal do Tocantins (Tocantins State), with 20 accessions, created in the 1990s (Martins et al., 2012). We used the Embrapa Coastal Tablelands collection, located in Umbaúba City (Sergipe State), the first in the Northeastern region.
The collection holds 52 accessions from the Embrapa Vegetables germplasm bank and from local farmers of Sergipe State. Correct identification and characterization of variability in GBs and collections make it possible to identify duplicates, estimate genetic relationships among accessions and quantify genetic diversity in the collection for future breeding programs. Morphological traits and biochemical markers have been employed in sweet potato germplasm studies (Ritschel and Huamán, 2002). However, these markers are subject to developmental and environmental variation. Molecular markers are tools used to detect variability at the DNA level. Among them, the RAPD (random amplified polymorphic DNA) technique is the most accessible (Oliveira et al., 2007). Combined with statistical methods, RAPD has been effective as a first approach for detecting population diversity in various types of specimens (Carvalho et al., 2013). Besides its low cost and speed, the technique requires only a small amount of DNA and no prior knowledge of the genome (Goulão et al., 2001). Genetic characterization, together with the optimization and ordering of groups, complements classification and prevents erroneous inferences in the allocation of materials to a particular subgroup of genotypes (Costa et al., 2011).
In this study, we describe the first sweet potato GB in Northeastern Brazil (Sergipe State) and report the molecular characterization of its accessions using RAPD markers.

Plant material and DNA extraction
DNA was isolated from young leaves as described by Doyle (1991). Fifty-two (52) genotypes were used (Table 1), obtained from the Brazilian Agricultural Research Corporation (Embrapa Vegetables) and from local farmers.

Electrophoresis and visualization
Fragments were visualized on a 1.5% agarose gel (1X TBE: 89 mM Tris, 89 mM boric acid, 2.5 mM EDTA, pH 8.3) in a horizontal electrophoresis system (Sunrise, Gibco BRL), run at a constant voltage of 100 V for 90 min. The gel was stained with ethidium bromide solution (5 mg/ml) for 15 min. RAPD amplification products were visualized under ultraviolet light using a Gel Doc L-Pix image system (Loccus Biotecnologia, Brazil).

Data analysis
RAPD markers were scored as a binary matrix. A bootstrap procedure was applied to calculate the variance of the genetic distances obtained from the markers, based on 5,000 bootstrap random draws using the DBOOT software (Coelho, 2000). Polymorphic information content (PIC) was calculated according to Ghislain et al. (1999) and the marker index (MI) was determined as described in Zhao et al. (2007). A data matrix of the RAPD scores was generated, and similarity coefficients were calculated using Jaccard's index and its arithmetic complement. A dendrogram was constructed using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) cluster algorithm. To determine the robustness of the dendrogram, the data were bootstrapped with 5,000 replicates using FreeTree software (Hampl et al., 2001), and the tree was displayed with TreeView (Page, 1996).
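As an illustration of the clustering step described above, the following is a minimal sketch of Jaccard similarity and average-linkage (UPGMA) clustering on a binary band matrix. It is not the FreeTree implementation used in the study; the function names and the three toy band profiles are hypothetical, and the treatment of two profiles without any bands (similarity set to 0) is an assumption.

```python
from itertools import combinations


def jaccard_similarity(a, b):
    """Jaccard similarity between two binary band profiles (1 = band present)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: two profiles with no bands at all are given similarity 0.
    return shared / either if either else 0.0


def upgma(names, profiles):
    """Average-linkage (UPGMA) clustering on Jaccard distances (1 - similarity).

    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (label, size)
    dist = {
        frozenset((i, j)): 1.0 - jaccard_similarity(profiles[i], profiles[j])
        for i, j in combinations(range(len(names)), 2)
    }
    merges = []
    next_id = len(names)
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)  # closest pair of clusters
        i, j = sorted(pair)
        d_ij = dist[pair]
        (label_i, n_i), (label_j, n_j) = clusters[i], clusters[j]
        # Size-weighted average of the distances to every remaining cluster.
        new_dist = {
            k: (n_i * dist[frozenset((i, k))] + n_j * dist[frozenset((j, k))]) / (n_i + n_j)
            for k in clusters
            if k not in (i, j)
        }
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        del clusters[i], clusters[j]
        for k, d in new_dist.items():
            dist[frozenset((next_id, k))] = d
        clusters[next_id] = (f"({label_i},{label_j})", n_i + n_j)
        merges.append((label_i, label_j, round(d_ij, 3)))
        next_id += 1
    return merges


# Hypothetical band profiles for three accessions scored at five markers.
names = ["A", "B", "C"]
profiles = [[1, 1, 0, 1, 0], [1, 1, 0, 0, 0], [0, 0, 1, 0, 1]]
merges = upgma(names, profiles)
```

In this toy case, accessions A and B share two of three bands and are joined first, while C shares no band with either and joins last at a distance of 1.0.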
Inferences regarding genetic structure within sweet potato genotypes were made using STRUCTURE version 2.2 (Pritchard et al., 2000; Falush et al., 2007). K was estimated as the number of reconstructed panmictic populations (RPPs) of individuals, using values ranging from 1 to 10 and assuming that the sampled genotypes were from anonymous plants of unknown origin (usepopinfo and popflag set to 0). We set up runs with a 20,000-iteration burn-in period and a Markov chain Monte Carlo (MCMC) run of 20,000 iterations, with five repetitions. STRUCTURE estimates the most likely number of clusters (K) by calculating the log probability of the data for each value of K. We assessed the best K value supported by the data according to Evanno et al. (2005).
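The ΔK criterion of Evanno et al. (2005) can be sketched as follows. This is a minimal illustration, not the procedure run in the study: it assumes the common simplification of taking the second-order difference of the mean log probability across replicate runs and dividing it by the standard deviation of the replicates at K. The function name and the log-probability values are hypothetical.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp):
    """Return {K: delta K} from a mapping K -> list of Ln P(D) replicate values.

    delta K = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), with L the mean over runs.
    K values without both neighbours, or with zero spread, are skipped.
    """
    means = {k: mean(values) for k, values in lnp.items()}
    delta = {}
    for k in sorted(lnp):
        if k - 1 not in means or k + 1 not in means:
            continue
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        spread = stdev(lnp[k])
        if spread > 0:
            delta[k] = second_diff / spread
    return delta


# Hypothetical Ln P(D) values for three replicate runs per K (not study data).
lnp_runs = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-890, -895, -885],
    4: [-885, -880, -890],
}
delta_k = evanno_delta_k(lnp_runs)
best_k = max(delta_k, key=delta_k.get)
```

With these illustrative values, the log probability rises sharply from K = 1 to K = 2 and then flattens, so ΔK peaks at K = 2. Note that ΔK is undefined for the smallest and largest K tested.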

RESULTS
The nine primers generated a total of 50 fragments (100% polymorphic). The primers with the highest number of fragments were IDT17 and K20 (10 and 8, respectively). The PIC value ranged from 0.10 to 0.34 and the MI from 0.49 to 2.67 (Table 2). There is an inverse relationship between the number of fragments analyzed and the coefficient of variation (CV) (Figure 1): the CV clearly decreases as the number of fragments increases.
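The relationship between marker number and CV can be illustrated with a small resampling sketch. This is not the DBOOT algorithm: it assumes that markers are drawn with replacement, that the statistic of interest is the mean pairwise Jaccard distance, and that the CV is the standard deviation of that statistic across draws divided by its mean. The function names and the randomly generated band matrix are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, stdev


def mean_jaccard_distance(profiles, marker_idx):
    """Mean pairwise Jaccard distance using only the selected marker columns."""
    dists = []
    for a, b in combinations(profiles, 2):
        shared = sum(1 for m in marker_idx if a[m] and b[m])
        either = sum(1 for m in marker_idx if a[m] or b[m])
        if either:  # pairs with no band at the sampled markers are skipped
            dists.append(1.0 - shared / either)
    return mean(dists) if dists else 0.0


def bootstrap_cv(profiles, n_markers, n_draws=200, seed=0):
    """CV (%) of the mean distance over bootstrap samples of n_markers markers."""
    rng = random.Random(seed)
    total = len(profiles[0])
    stats = []
    for _ in range(n_draws):
        sample = [rng.randrange(total) for _ in range(n_markers)]  # with replacement
        stats.append(mean_jaccard_distance(profiles, sample))
    return 100.0 * stdev(stats) / mean(stats)


# Hypothetical random band matrix: 12 accessions scored at 50 markers.
data_rng = random.Random(1)
toy_profiles = [[data_rng.randint(0, 1) for _ in range(50)] for _ in range(12)]
cv_by_n = {n: bootstrap_cv(toy_profiles, n) for n in (5, 20, 40)}
```

Plotting cv_by_n against the number of markers gives a curve of the kind summarized in Figure 1, with the CV falling as more fragments are included.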

Genetic similarity
Genotypes were clustered by UPGMA using the Jaccard similarity (JS) estimated from the binary data of 52 genotypes (Figure 2A). The distribution of genotypes in the clusters showed the separation of the different groups, as well as the high divergence of some genotypes.
Genotypes 1213, 1190 and 33P10 were the most divergent in the SPGB (Sweet Potato Germplasm Bank), with a similarity of 0.00 to 58% of the sweet potato accessions. The main cultivars were divided among subgroups. The first group, separated at a JS of 0.2, contained the varieties 'Olho Roxo' and 'Beauregard' as well as 'Italiana' and 'Laranjeira'. 'Rainha' separated at a JS of 0.3 and remained isolated.
'Ciganinha' and 'Roxinha' fell into groups with a larger number of genotypes. The genotypes are associated with cultivars: 'Beauregard', created in the US in 1980 (Cervantes-Flores et al., 2002), was connected with genotypes 1207 and 1209.
[Figure 2, caption partially recovered: STRUCTURE (Pritchard et al., 2000) applied to 52 sweet potato accessions from the collection belonging to Embrapa Coastal Tablelands (Umbaúba City, Sergipe State, Brazil, 2012).]
STRUCTURE was used to determine the genetic structure among the sweet potato genotypes. This clustering approach assigns individuals to RPPs based on genotype. The program estimates the most likely number of clusters (K) by calculating the log probability of the data for each value of K, and the ΔK statistic described by Evanno et al. (2005) was used, as recommended by Barnaud et al. (2008) and Santos et al. (2011). The best K for representing the RPP genotypes was K = 2 (RPP1 and RPP2) (Figure 2B). The first RPP (RPP1) included 24 genotypes, of which 22 had a probability of membership (qI) > 80%, including four cultivars (Beauregard, Olho Roxo, Italiana and Laranjeira). Two genotypes were assigned with a qI < 80% (P7 and P2). The second RPP (RPP2) included 29 genotypes, 27 of which had a qI > 80%, including three cultivars (Roxinha, Ciganinha and Rainha), and two genotypes with a qI < 80% (1204 and 1232). The SPGB as a whole showed a mean genetic diversity similar to that of the RPPs. The Shannon index was 0.39 for RPP1 and higher for RPP2 (0.44). The genetic diversity was 0.25 for RPP1 and 0.29 for RPP2 (Table 3). When an AMOVA was performed with the 52 genotypes grouped by RPPs, genetic differentiation between RPPs accounted for 17% of the variation (Table 4).
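The text does not state which estimators were used for the Shannon index (I) and genetic diversity (H). One common choice for dominant markers such as RAPD, assumed here purely for illustration, is to estimate the null-allele frequency at each locus as the square root of the band-absence frequency (Hardy-Weinberg equilibrium assumed), and then to average Nei's gene diversity and the Shannon index over loci. The function names and the toy band matrix are hypothetical.

```python
import math


def locus_diversity(column):
    """Return (H, I) for one dominant marker scored 1 (band) or 0 (no band).

    Assumption: Hardy-Weinberg equilibrium, so the null-allele frequency q
    is the square root of the frequency of band absence.
    """
    q = math.sqrt(column.count(0) / len(column))
    p = 1.0 - q
    h = 1.0 - p * p - q * q                                   # Nei's gene diversity
    i = -sum(f * math.log(f) for f in (p, q) if f > 0)        # Shannon index
    return h, i


def mean_diversity(profiles):
    """Average H and I over all marker columns of a binary band matrix."""
    per_locus = [locus_diversity(list(col)) for col in zip(*profiles)]
    mean_h = sum(h for h, _ in per_locus) / len(per_locus)
    mean_i = sum(i for _, i in per_locus) / len(per_locus)
    return mean_h, mean_i


# Hypothetical matrix: four individuals (rows) scored at two markers (columns).
toy_matrix = [[1, 1], [1, 1], [1, 1], [0, 1]]
mean_h, mean_i = mean_diversity(toy_matrix)
```

In the toy matrix, the first marker is absent in one of four individuals (q = 0.5, giving H = 0.5 and I = ln 2), and the second is monomorphic (H = I = 0), so the means are 0.25 and about 0.347. Applying the same function separately to the rows assigned to each RPP would give per-group values comparable in kind to those in Table 3.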

DISCUSSION
The polymorphism found was high (100%), and similar values were found in other studies (Costa et al., 2011). In one study on sweet potato, a total of 150 fragments were scored using 18 RAPD primers, of which 145 were polymorphic (Moulin et al., 2012); in another, 15 primers produced 86 fragments, which were 100% polymorphic (Zhang et al., 1998). Many studies have reported that DNA fingerprinting techniques are better than phenotypic descriptors for discriminating between related genotypes and for analysis of genetic similarity (Spooner et al., 2005; Solis et al., 2007). According to Moura et al. (2005), there is a point beyond which increasing the number of fragments does not significantly increase experimental accuracy and does not justify the extra labor. From 45 fragments onward, the coefficient of variation stabilizes at values below 10%. This suggests that the number of fragments used in this study (50) is adequate for the analysis of diversity. In our results, sweet potato landraces exhibited high variability, with the Jaccard similarity index varying from 0.0 to 0.83. The high level of diversity found in accessions of sweet potato may be associated with spontaneous mutations, since sweet potato presents a high frequency of somatic mutations, as well as with selection and geographical and environmental factors, which make the species an important genetic resource (He et al., 2006; Love et al., 1978). Oliveira et al. (2000) also observed high genetic divergence among 51 clones of sweet potato originating from various Brazilian regions. Other studies using amplified fragment length polymorphism (AFLP) markers also detected greater variability within groups (79.8%) than between two groups (20.2%), one with 14 genotypes from New Ireland Island and another with 117 genotypes from New Guinea Island, collected in 26 farm plots in four provinces of Papua New Guinea (Fajardo et al., 2002).
Genetic diversity in the SPGB, as indicated by I (0.42) and H (0.27), can be considered low, so new accessions need to be introduced to increase diversity and make these resources more likely to be used. It has been suggested that, as sweet potato accessions spread from one country to another, some became known by new names, which can explain the high similarity of some accessions (McGregor et al., 2001). The average genetic diversity (H) obtained is in agreement with those found for other species like Solanum tuberosum (0.49). In contrast, sweet potato from some regions of the globe shows very high genetic diversity, such as Mesoamerica (0.71), Venezuela-Colombia (0.70) and Peru-Ecuador (0.52). This reflects the richness and evenness of the Latin American sweet potato gene pool and center of genetic diversity.
Assessing diversity is also important for the construction of a 'core collection' (Zhang et al., 1998); in our case, it provides the first preliminary analysis of the first germplasm bank of Northeastern Brazil and a basis for the creation of a 'core collection'. Genotype 1223 has a commercial production 155% higher than the national average and 162% higher than the average of Sergipe State (Nunes et al., 2009). Genetically, three accessions in the SPGB are associated (1192, 1219 and 1226). According to Nunes et al. (2009), genotypes 1228, 1225 and 1226 have a commercial production similar to that of 'Italiana'; however, in our study, these genotypes are not interrelated. Knowledge of the genetic variability of sweet potato obtained with RAPD markers may contribute to the development of strategies to guide conservation and management programs. We recommend creating a strategic plan for the genotypes with good commercial production characteristics and for the formation of new commercial varieties. Furthermore, it is important to introduce new sweet potato accessions to increase genetic variability in the germplasm bank.

CONCLUSION
The genetic variation and genetic relationships among genotypes were efficiently determined using RAPD markers. The discrimination of sweet potatoes from Sergipe State (Brazil) and the identification of the most genetically divergent genotypes may contribute to breeding programs and commercial exploitation.