Genetic variability of indigenous cowpea genotypes as determined using inter-simple sequence repeats markers

Cowpea [Vigna unguiculata (L.) Walp.] is cultivated widely by small farmers in the semiarid region of Northeastern Brazil for subsistence purposes, especially to complement the family income. However, owing to the limited availability of water in this region, there is an urgent need for novel highly productive drought-tolerant cultivars. The aim of the present study was to establish the genetic variability of 14 cowpea populations (60 indigenous genotypes from 13 microregions of Rio Grande do Norte and 4 domesticated cultivars produced by Embrapa) using inter-simple sequence repeats (ISSR) markers. The set of 13 selected primers generated a total of 257 loci, 247 (96.11%) of which were polymorphic, with sizes ranging between 200 and 2000 bp. Genetic similarities between accessions were estimated from Jaccard coefficients and genetic relationships were determined from the dendrogram constructed using the unweighted pair group method with arithmetic average (UPGMA) technique. Bayesian statistics coupled with the Markov chain Monte Carlo technique were applied to determine population structure, while the genetic variability was established by analysis of molecular variance. UPGMA analysis allowed the separation of the genotypes into three groups, but no relationship between the genetic and geographical distances was observed. The fixation index was considered intermediate (FST = 0.0818), the average heterozygosity was low (HS = 0.39) and the coefficient of endogamy was high (f = 92.6%). The results show the presence of genetic diversity among the studied populations and reveal that such variability could be attributed mainly to intra-population variability (91.82%).


INTRODUCTION
The climatic factors that most influence the production of beans are temperature and degree of precipitation (Barteko et al., 2010). In the semiarid region of the State of Rio Grande do Norte in Northeastern Brazil, water availability is very limited owing to scarce and irregular rainfall and high prevailing temperatures. Hence, there is an urgent need for highly productive drought-tolerant cultivars to be made available to family farmers in this area. In this context, Vigna unguiculata (L.) Walp. (Fabaceae), commonly known as cowpea, represents an excellent target because of its high genetic variability and its versatility in terms of cropping possibilities and application of the final product (Freire Filho et al., 2005). Furthermore, the species constitutes one of the main sources of protein for the populations of Northern and Northeastern Brazil, and provides an alternative source of income for small family farmers who have little or no access to advanced production technologies.
An understanding of the genetic variability within a species, as assessed by the proximity and diversity between genotypes, is an essential prerequisite in plant improvement. Such knowledge not only permits the identification of divergent and complementary genotypes that can be used as progenitors in a breeding program, but also increases the chance of selecting elite genotypes in segregating generations (Cruz and Regazzi, 2006). Molecular markers are valuable tools in the investigation of genetic variability within a plant collection or among wild species. Amongst available markers, inter-simple sequence repeats (ISSR; Zietkiewicz et al., 1994) have been widely used by virtue of the repeatability and reproducibility of the banding patterns obtained and the simplicity of the techniques involved. A number of reports have been published recently on the application of ISSR in the determination of genetic diversity in cowpea (Silva et al., 2009; Ghalmi et al., 2010; Santos et al., 2013).
Since small farmers in the Northeast of Brazil mainly grow indigenous cultivars of cowpea, it is expected that the genetic variability within this species would have been preserved. This valuable resource could be exploited in the development of novel cultivars with enhanced traits by transferring desirable characters from their indigenous counterparts. On this basis, the present study aimed to establish, with the aid of ISSR markers, the genetic variability of indigenous genotypes collected in Rio Grande do Norte and of commercial genotypes produced by the Empresa Brasileira de Pesquisa Agropecuária (Embrapa).

Plant material
A total of 64 cowpea genotypes were analyzed, 60 of which were indigenous genotypes collected in 13 of the 19 microregions of Rio Grande do Norte (Figure 1) and four were the domesticated

ISSR fingerprinting
DNA was extracted from leaf material using Invisorb® Spin Plant Mini Kits (Stratec Molecular, Berlin, Germany) following the recommendations of the manufacturer. Quantification of the extracted DNA was carried out by electrophoresis on 0.8% agarose gel in Tris-borate-EDTA (0.5× TBE) buffer, staining with GelRed™ (40×; Biotium, Hayward, CA, USA), and comparison with coanalyzed diluted λ DNA standards (50 and 100 ng). The quality and quantity of genomic DNA samples were verified spectrophotometrically using a NanoDrop 2000 spectrophotometer (ThermoFisher Scientific, Waltham, MA, USA). DNA samples were diluted to 7.0 ng/μL and stored at -20°C until required for ISSR analysis. In order to select appropriate primers for ISSR reactions, genomic DNA from five cowpea genotypes was initially amplified using 40 primers obtained from the University of British Columbia, Vancouver, Canada. The 13 primers that generated the largest numbers of amplified loci with the best band resolution and the highest levels of polymorphism (Table 1) were chosen for the PCR amplifications of DNA from the 64 cowpea accessions. Assays were performed according to the method described by Silva et al. (2009). The reaction mixture employed in PCR amplifications contained 1.0× buffer [20 mM Tris-HCl (pH 8.0), 0.1 mM EDTA, 1 mM DTT, 50% glycerol (v/v)], 2.0 mM MgCl2, 0.8 mM dNTPs, 0.8 µM primer, 1 U Invitrogen Taq DNA polymerase (Life Technologies Corporation, São Paulo, Brazil), 0.5 μL DNA template (7.0 ng/μL) and ultrapure distilled water to a final volume of 10 μL. Amplifications were carried out in a Veriti 96 Well Thermal Cycler (Applied Biosystems, Foster City, CA, USA) under the following conditions: initial denaturation at 94°C for 4 min, 40 cycles each comprising denaturation at 94°C for 1 min, annealing at a temperature that varied according to the melting temperature of the primer (Table 1) for 1 min, extension at 72°C for 1 min, and final extension at 72°C for 5 min.
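The quantities in the protocol above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the 35 ng/μL extract concentration and the 50 μL dilution volume are hypothetical values chosen for the example, and the programme time ignores ramping between temperature steps.

```python
# Values taken from the protocol described above.
TEMPLATE_CONC_NG_PER_UL = 7.0   # diluted DNA working stock
TEMPLATE_VOLUME_UL = 0.5        # template added to each reaction
REACTION_VOLUME_UL = 10.0       # final reaction volume


def dilution_volumes(stock_conc, target_conc, final_volume):
    """Solve C1*V1 = C2*V2 and return (stock volume, diluent volume)."""
    if not 0 < target_conc <= stock_conc:
        raise ValueError("target must be positive and not exceed the stock")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


def programme_minutes(initial=4, cycles=40, steps_per_cycle=(1, 1, 1), final=5):
    """Nominal hold time of the thermal programme, ignoring ramping."""
    return initial + cycles * sum(steps_per_cycle) + final


template_ng = TEMPLATE_CONC_NG_PER_UL * TEMPLATE_VOLUME_UL   # mass per reaction
template_ng_per_ul = template_ng / REACTION_VOLUME_UL        # concentration in the mix
total_minutes = programme_minutes()                          # 4 + 40 * 3 + 5

# Hypothetical extract at 35 ng/uL diluted to the working stock in 50 uL.
stock_ul, water_ul = dilution_volumes(35.0, TEMPLATE_CONC_NG_PER_UL, 50.0)

print(f"{template_ng:.2f} ng template per reaction ({template_ng_per_ul:.2f} ng/uL)")
print(f"nominal programme time: {total_minutes} min")
print(f"dilution: {stock_ul:.1f} uL extract + {water_ul:.1f} uL water")
```

Under these assumptions each 10 μL reaction receives 3.5 ng of template and the programme holds for a nominal 129 min.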
The resulting amplicons were separated by electrophoresis on 1.5% agarose gel in 0.5× TBE buffer for 4 h at 110 V, stained with GelRed (40×), visualized under a UV transilluminator and subsequently photographed. The sizes of the amplicons were estimated by comparison with Invitrogen 100 bp and 1 kb DNA ladders.

Cluster analysis
Analysis of the band pattern generated by each of the 13 primers allowed the construction of a binary matrix in which "1" indicated the presence of a band and "0" the absence. Genetic similarities between cowpea genotypes were estimated from the Jaccard similarity coefficients (sgij), calculated according to Rohlf (1992), and a dendrogram was constructed using the unweighted pair group method with arithmetic average (UPGMA) clustering technique. Cophenetic correlation coefficients (r) were determined from the similarity matrix and the dendrogram, and the bootstrap confidence index was calculated based on 1000 permutations. Cluster analysis was performed with the aid of the software PAST version 1.34 (Hammer et al., 2001).
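As a minimal sketch of the similarity step, the Jaccard coefficient for two binary band profiles is the number of bands present in both divided by the number of bands present in at least one. The genotype names and band scores below are invented for illustration and are not data from this study; the analysis itself was run in PAST.

```python
from itertools import combinations


def jaccard_similarity(bands_i, bands_j):
    """Jaccard coefficient for two equal-length 0/1 band profiles."""
    if len(bands_i) != len(bands_j):
        raise ValueError("profiles must score the same loci")
    shared = sum(1 for x, y in zip(bands_i, bands_j) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(bands_i, bands_j) if x == 1 or y == 1)
    if either == 0:
        raise ValueError("no bands scored in either profile")
    return shared / either


def similarity_matrix(profiles):
    """Return {(name_a, name_b): similarity} for every pair of genotypes."""
    return {
        (a, b): jaccard_similarity(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }


# Hypothetical band scores: 1 = band present, 0 = band absent.
profiles = {
    "G1": [1, 1, 0, 1, 0, 1],
    "G2": [1, 0, 0, 1, 1, 1],
    "G3": [0, 1, 1, 0, 0, 0],
}

sims = similarity_matrix(profiles)
for pair, value in sims.items():
    print(pair, round(value, 3))
```

Shared absences (0/0) do not enter the coefficient, which suits dominant markers such as ISSR, where the absence of a band in two genotypes is weak evidence of similarity.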

Analysis of population structure
The presence of population structure in the samples was identified using the program STRUCTURE version 2.3.4 (Pritchard et al., 2000), which generates a posterior distribution based on a Bayesian model and Markov chain Monte Carlo (MCMC) simulation. Application of this approach with an admixture model enabled the proportion of the genome deriving from another population to be assessed for each genotype in the absence of a priori information. Each analysis considered a fixed number of 300,000 MCMC iterations with a "burn-in" of 50,000 iterations, and three runs were performed for each K value. The most probable number for K in relation to the proposed population structure was determined from the values ΔK (Evanno et al., 2005).
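The ΔK criterion can be sketched as follows. This is an illustrative implementation, not the procedure of any particular tool: it assumes the common formulation in which ΔK is the absolute second-order difference of the mean ln P(D) across runs, divided by the standard deviation of ln P(D) at that K. The log-likelihood values are invented and are not results from this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both K-1 and K+1 available.

    lnp_by_k maps each K to the list of ln P(D) values from replicate runs.
    """
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # the second difference needs both neighbours
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
        )
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            raise ValueError(f"zero variance among runs at K={k}")
        result[k] = second_diff / spread
    return result


# Hypothetical ln P(D) values, three replicate runs per K.
lnp = {
    1: [-9000.0, -9001.0, -8999.0],
    2: [-8200.0, -8202.0, -8198.0],
    3: [-8000.0, -8010.0, -7990.0],
    4: [-7950.0, -7970.0, -7930.0],
    5: [-7940.0, -7900.0, -7980.0],
}

dk = delta_k(lnp)
best_k = max(dk, key=dk.get)
print(dk, "most probable K:", best_k)
```

ΔK cannot be evaluated at the smallest or largest K tested, because the second difference needs a neighbour on each side.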
Analysis of molecular variance (AMOVA), performed using ARLEQUIN software version 3.1 (Excoffier et al., 1992), was employed to estimate the components of variance attributed to differences between: (i) the 14 populations (13 relating to indigenous genotypes and one comprising the four commercial genotypes), (ii) sites within the microregions, and (iii) individuals within the sites. The significances of the variance components were obtained using 1000 permutations. The magnitude of genetic differentiation between populations was expressed by the fixation index (FST). The average heterozygosity (HS) and the coefficient of inbreeding (f; an analogue of the coefficient FIS for dominant markers) were estimated using the Bayesian approach implemented in HICKORY software version 1.1 (Holsinger et al., 2002).

Genetic diversity
The set of 13 primers selected for ISSR analysis of the 64 genotypes amplified 257 loci of which 247 (96.11%) were polymorphic. The average number of loci per primer was 19.77 and the sizes of the fragments ranged between 200 and 2000 bp. The loci generated by primers UBC 810, UBC 822, UBC 827, UBC 828 and UBC 834 were all polymorphic (100% polymorphism), while the remaining primers (except for UBC 826) presented loci with ≥ 90% polymorphism (Table 1). The efficiency of the selected ISSR primers is exemplified by the electrophoretic profile generated by primer UBC 834 (Figure 2). The average HS value, which is considered to be a measure of genetic diversity, was 0.39 (95% confidence interval 0.37 to 0.40) with minimal variation (standard deviation of 0.0001).
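The polymorphism figures above follow directly from the binary band matrix. In the sketch below a locus counts as polymorphic when the band is present in some genotypes and absent in others. The small matrix is invented for illustration; only the totals 257, 247 and 13 come from the text.

```python
def locus_is_polymorphic(column):
    """True when a band is present in some genotypes and absent in others."""
    return 0 < sum(column) < len(column)


def percent_polymorphic(matrix):
    """matrix: one row of 0/1 scores per genotype, one column per locus."""
    columns = list(zip(*matrix))
    polymorphic = sum(1 for col in columns if locus_is_polymorphic(col))
    return 100.0 * polymorphic / len(columns)


# Hypothetical matrix: 4 genotypes scored at 5 loci.
toy_matrix = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 0, 1],
    [1, 0, 0, 1, 1],
]
toy_percent = percent_polymorphic(toy_matrix)  # loci 2, 3 and 4 vary

# Totals reported in the text.
TOTAL_LOCI, POLYMORPHIC_LOCI, PRIMERS = 257, 247, 13
reported_percent = round(100.0 * POLYMORPHIC_LOCI / TOTAL_LOCI, 2)
loci_per_primer = round(TOTAL_LOCI / PRIMERS, 2)
polymorphic_per_primer = POLYMORPHIC_LOCI / PRIMERS

print(toy_percent, reported_percent, loci_per_primer, polymorphic_per_primer)
```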

Genetic similarity
Genetic relationships between the 64 cowpea genotypes were established from the Jaccard coefficients, the values of which ranged between 0.203 and 0.796. The dendrogram shown in Figure 3 was constructed from the 257 amplified loci using the UPGMA method. A cut-off point of 0.40 was established from the average coefficient of similarity between the genotypes, and this enabled the 14 cowpea populations to be discriminated into three genotypic groups. The excellent performance of cluster analysis based on ISSR markers was verified by the high value of the cophenetic correlation coefficient (r = 0.92).
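The clustering steps described above (UPGMA, a cut-off at the mean similarity, and the cophenetic correlation) can be sketched in plain Python. This is a didactic implementation under stated assumptions, not the routine used by PAST, and the four genotypes with their similarities are invented for illustration.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def upgma(names, sim):
    """Merge the two most similar clusters until one remains.

    sim maps frozenset({a, b}) to a similarity; the similarity between two
    clusters is the mean over all between-cluster pairs (average linkage).
    """
    def between(c1, c2):
        return mean(sim[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2), key=lambda p: between(*p))
        merges.append((c1, c2, between(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


def cut_tree(names, merges, cutoff):
    """Groups obtained by keeping only merges at or above the cut-off."""
    group_of = {n: (n,) for n in names}
    for c1, c2, level in merges:
        if level >= cutoff:
            for n in c1 + c2:
                group_of[n] = c1 + c2
    return sorted(set(group_of.values()))


def cophenetic_correlation(sim, merges):
    """Pearson r between observed similarities and dendrogram join levels."""
    coph = {frozenset((a, b)): lvl for c1, c2, lvl in merges for a in c1 for b in c2}
    pairs = list(sim)
    xs, ys = [sim[p] for p in pairs], [coph[p] for p in pairs]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


# Hypothetical Jaccard similarities among four genotypes.
names = ["A", "B", "C", "D"]
sim = {frozenset(p): s for p, s in [
    (("A", "B"), 0.80), (("A", "C"), 0.30), (("A", "D"), 0.20),
    (("B", "C"), 0.35), (("B", "D"), 0.25), (("C", "D"), 0.60),
]}

merges = upgma(names, sim)
groups = cut_tree(names, merges, cutoff=mean(sim.values()))
r = cophenetic_correlation(sim, merges)
print(groups, round(r, 3))
```

Using the mean similarity as the cut-off mirrors the rule stated in the text. A cophenetic correlation close to 1 indicates that the dendrogram distorts the original similarity matrix very little.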
Group I comprised 42 genotypes representing populations from all 13 microregions of Rio Grande do Norte, together with the four commercial genotypes from Embrapa. Group II was composed of eight genotypes from the microregions Pau dos Ferros, Chapada do Apodi, Mossoró, Borborema Potiguar and Agreste Potiguar, located at the extremes of the State. Group III included 10 genotypes from the geographically close microregions of Angicos, Borborema Potiguar, Agreste Potiguar and Litoral Sul. Consequently, the strongest association between genetic similarity and geographic distance was observed in this group.
Pairwise comparison of populations revealed that genotypes from Serra de São Miguel (39-1) and Baixa Verde (42-1) presented the highest coefficient of genetic similarity (0.796) even though the distance between sampling sites was 380 km. The lowest coefficient of genetic similarity (0.203) was observed between the genotypes from Agreste Potiguar (1-1) and Pau dos Ferros (50-1) and, in this case, the distance between sampling sites was 441 km.

Population structure
Application of the Bayesian MCMC strategy using the program STRUCTURE confirmed the existence of admixture between the studied genotypes, thus corroborating the results from cluster analysis. However, the Bayesian MCMC approach indicated that the genotypes could be organized into two main groups (K = 2) (Figures 4 and 5) rather than three as suggested by cluster analysis. It should be noted, however, that in all of the analyses, K = 3 was identified as the second best value for this parameter (Figure 5), and that the cut-off point in the dendrogram represents a somewhat tenuous limit since a small dislocation to the left would divide the samples into just two groups (Figure 3).
Bayesian MCMC analysis revealed that five of the 64 studied genotypes had ancestors from the other group, as indicated by individuals presenting different coloured bars in Figure 4. Furthermore, results of the analysis allowed relationships between individual genotypes to be identified. Thus, all genotypes from Serra de São Miguel, Vale do Açú, Macau, Baixa Verde and Macaíba were grouped together with the commercial genotypes, as in the cluster analysis.
The results of AMOVA indicated that the total variability among the genotypes studied was explained mainly by intra-population differences (91.82%), while interpopulation differences were very low (FST = 0.0818) (Table 2). The coefficient of endogamy determined using the Bayesian model was high (f = 92.6%), while the average heterozygosity value was low (HS = 0.39).

DISCUSSION
According to Grativol et al. (2011), the efficiency of a molecular marker can be assessed by the amount of polymorphism detected. The efficiencies of all of the ISSR primers employed in the present study, with the exception of UBC 822, had been previously established by Muthusamy et al. (2008) in an investigation of the genetic diversity of Vigna umbellata (rice bean). In that study, four of the primers generated 100% polymorphic bands and the overall polymorphism was 61.79%.
Additionally, Silva et al. (2009) assessed the genetic variability of 46 short-cycle erect cowpea genotypes using eight of these ISSR primers and reported that, of the 62 loci generated, 49 (79.04%) were polymorphic with an average of 6.12 polymorphic bands per primer. In the present study, an average of 19 polymorphic bands were generated per primer, implying that the data obtained using the ISSR primers could be employed with a high degree of confidence in explaining the genetic diversity of the cowpea genotypes.
The high degree of polymorphism (96.11%) detected in the present study can be attributed mainly to the 60 indigenous genotypes, since they had been subjected only to natural selection. During the domestication of a plant species, the frequencies of alleles relating to desirable characteristics are augmented until they are fixed in the offspring. The implementation of such strong selective pressure generally results in a considerable loss of genetic diversity (Wang et al., 1999). Selection is one of the main tools employed by plant breeders irrespective of the breeding method, but this strategy depends on the availability of populations presenting the genetic variability that is normally present in indigenous varieties (Bespalhok et al., 2007a). In our study, the 60 indigenous cowpea genotypes analyzed showed an average heterozygosity (HS) of 0.39, a value that would justify the development of a breeding program. The genetic richness of the indigenous genotypes may be exploited not only for genetic studies but also for the purpose of crossing divergent progenitors in order to maximize heterosis and augment the possibility of creating commercial cultivars with novel traits.
Most of the genotypes (46/64), including the domesticated cultivars, could be combined into a single group (group I) after cluster analysis, and this result may raise doubts regarding the genetic distance between individuals. However, it should be emphasized that these genotypes originated from small subsistence farms where seed interchange and, consequently, crossbreeding within a crop is common. Nonetheless, it was possible to observe the existence of two well-defined subgroups (Figure 3), one of which included 16 genotypes from 23-1 to 46-1, while the other included 30 genotypes from 48-1 to the commercial genotype Marataoã. This division was confirmed through analysis of resampling data (bootstrap index = 91%).
The results presented herein reveal that the indigenous genotypes from Rio Grande do Norte and the domesticated cultivars produced by Embrapa are closely related and suggest that crossing between these populations may lead to the transfer of desirable characters. In this context, Soares (2012) has recently identified an indigenous cultivar, AM-63-3-Lizão Carioquinha or Lizão, originating from the Macau microregion that is highly resistant to the seed beetle Callosobruchus maculatus (Fabr. 1775) (Coleoptera: Chrysomelidae).
Group II contained genotypes sampled in distant microregions, such as 20-1 from the municipality of Itaú (Pau dos Ferros) and 12-1 from the municipality of Passa e Fica (Agreste Potiguar), locations that are 400 km apart. Paradoxically, this group also contained the genotypes 20-1 and 21-1 collected in the municipalities of Itaú and Severino Melo, respectively, which are located 13 km apart in the Pau dos Ferros microregion. These results confirm that the local farmers cultivate seeds from various origins in the same geographical area. Interestingly, genotype 20-1 has been screened by Soares (2012) and shown to exhibit moderate resistance to the seed beetle. Considering that the genetic similarity between the indigenous genotype 20-1 and the domesticated cultivars is less than 30%, crossing would likely result in a novel cowpea cultivar with enhanced resistance traits.
Previous studies have shown that genetic and geographical distances are not always correlated. For example, Oliveira et al. (2003) evaluated the genetic divergence among 16 cowpea genotypes originating from Brazil and Nigeria, and observed that genotypes from the same geographical origin exhibited wide genetic distances.
Analogous results were reported by Bezerra (1997) and Vidal et al. (2006), but these authors justified their findings by emphasizing that the sampling site is not necessarily the site of origin of the plant. Indeed, the misconception among some researchers that geographical distance between cultivated species is an indicator of genetic divergence has been the focus of some criticism since, in many cases, a relationship between genetic diversity and geographical distance cannot be verified (Cruz, 1990; Cruz and Carneiro, 2003).
In order to better understand the genetic distances between cowpea genotypes, we employed the program STRUCTURE, which has been used in various studies to verify the groupings of bean genotypes generated by dendrograms (Blair et al., 2007; Burle et al., 2010; Silva, 2011). The results of such analyses may have value in the identification of progenitors that offer the highest likelihood of selecting elite genotypes in segregating generations within future breeding programs (Cruz and Regazzi, 2001).
In contrast to the cluster analysis, the optimum K value generated by the STRUCTURE program was two, suggesting the formation of only two groups. However, the genotypes incorporated into groups II and III in the dendrogram formed a single group (represented by orange bars in Figure 4) according to Bayesian MCMC analysis. In reality, groups II and III are very close genetically and a slight shift of the cut-off point on the dendrogram would eliminate the division between them. In this sense, the results of cluster and Bayesian MCMC analyses are compatible and consistent with one another.
The high levels of genetic dissimilarity between some of the indigenous genotypes and the domesticated cultivars suggest that crossings would probably produce improved cowpea varieties with advantageous traits such as superior quality and yield of seed with better resistance to biotic and abiotic stress. Considering the similarity matrix presented herein, it is possible to suggest crossings between 1-1 (Agreste Potiguar) and 50-1 (Pau dos Ferros), 21-1 (Pau dos Ferros) and Guariba, Marataoã and 21-1 (Pau dos Ferros), Nova Era and 4-1 (Agreste Potiguar), 24-1 (Serra de São Miguel) and 10-1 (Angicos), and 20-1 and Marataoã, the Jaccard similarity coefficients of which were 0.203, 0.221, 0.227, 0.239, 0.240 and 0.230, respectively.
Intra-population diversity was responsible for most (91.82%) of the genetic differentiation among the 14 populations of cowpea studied. One explanation of this finding is that small farmers in Northeastern Brazil plant different varieties of seeds in the same location and, consequently, there may have been a mixture of seeds in a single collection. Thus, some individuals considered to be from one population may have been from different populations.
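The crossing suggestions given above can be ordered by divergence with a short script. The pairs and coefficients are those quoted in the text; the ranking rule (lowest Jaccard similarity first) and the conversion to dissimilarity as 1 minus similarity are illustrative choices rather than a procedure stated by the authors.

```python
# (parent 1, parent 2, Jaccard similarity) as listed in the text.
suggested_crosses = [
    ("1-1", "50-1", 0.203),
    ("21-1", "Guariba", 0.221),
    ("Marataoã", "21-1", 0.227),
    ("Nova Era", "4-1", 0.239),
    ("24-1", "10-1", 0.240),
    ("20-1", "Marataoã", 0.230),
]

# Most divergent pair first: lower similarity means higher dissimilarity.
ranked = sorted(suggested_crosses, key=lambda cross: cross[2])

for parent_a, parent_b, similarity in ranked:
    dissimilarity = 1.0 - similarity
    print(f"{parent_a} x {parent_b}: similarity {similarity:.3f}, "
          f"dissimilarity {dissimilarity:.3f}")
```

All six pairs share less than 25% of their scored bands, so the differences in rank among them are small.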
Although the estimated genetic diversity between populations was very low (8.18%), the value of FST may have been underestimated because some populations comprised only a small number of individuals. Moreover, since V. unguiculata is predominantly autogamous and cleistogamous, that is, pollination of the stigma occurs before the opening of the flower bud or anthesis (Bespalhok et al., 2007b), the rate of natural crossing is very low and generally less than 1% (Ehlers and Hall, 1997). According to Carvalho (2009), the distribution patterns of genetic variability among populations are correlated with the mode of reproduction.
Nevertheless, greater divergence between the studied populations was expected since inter- and intrapopulation genetic diversities of 71.50 and 28.50%, respectively, had been reported by Ghalmi et al. (2010) following an ISSR analysis of indigenous cowpea populations in Algeria. However, although the theoretical value of FST varies from 0 (indicating no divergence) to 1 (indicating the fixation of alternative alleles in different subpopulations), the maximum value observed is normally much smaller than 1 (Hartl and Clark, 2010). According to Wright (1978), an FST value ranging between 0.05 and 0.15 indicates the existence of moderate genetic differentiation, as in the case of the cowpea populations presently evaluated. Moreover, statistical analysis showed that the FST value was significant and that genetic divergence existed among the study populations.
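The link between the variance partition and the fixation index can be made explicit with a short sketch. It assumes that the among-population share of the total molecular variance is read directly as FST, which is consistent with the values reported here (8.18% and FST = 0.0818). The qualitative bands follow Wright's commonly quoted guidelines, and the value derived for the Algerian populations is only an implied figure computed from the percentages cited above, not one reported by Ghalmi et al. (2010).

```python
def fst_from_variance_shares(among_percent, within_percent):
    """Among-population fraction of the total variance."""
    total = among_percent + within_percent
    if total <= 0:
        raise ValueError("variance shares must sum to a positive value")
    return among_percent / total


def wright_category(fst):
    """Qualitative reading of FST following Wright's guideline bands."""
    if not 0.0 <= fst <= 1.0:
        raise ValueError("FST must lie between 0 and 1")
    if fst < 0.05:
        return "little"
    if fst < 0.15:
        return "moderate"
    if fst < 0.25:
        return "great"
    return "very great"


# Present study: 8.18% among populations, 91.82% within populations.
fst_present = fst_from_variance_shares(8.18, 91.82)

# Implied value for the percentages cited from Ghalmi et al. (2010).
fst_algeria = fst_from_variance_shares(71.50, 28.50)

print(round(fst_present, 4), wright_category(fst_present))
print(round(fst_algeria, 4), wright_category(fst_algeria))
```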
The high coefficient of endogamy (f = 92.6%) established in the present study likely reflected the autogamous nature of the species, while the low average heterozygosity value (HS = 0.39) was probably due to self-pollination of cowpea accessions resulting in descendants with identical ancestral alleles in a single locus (autozygous) (Hartl and Clark, 2010).
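The relationship between self-pollination and a high inbreeding coefficient can be illustrated with the standard mixed-mating model, in which the equilibrium inbreeding coefficient is F = (1 - t)/(1 + t) for an outcrossing rate t. This model was not part of the analysis reported here (f was estimated with HICKORY), so the sketch is only a plausibility check under the assumptions of equilibrium and a constant outcrossing rate.

```python
def equilibrium_inbreeding(outcrossing_rate):
    """Equilibrium F under mixed mating: F = (1 - t) / (1 + t)."""
    if not 0.0 <= outcrossing_rate <= 1.0:
        raise ValueError("outcrossing rate must lie between 0 and 1")
    return (1.0 - outcrossing_rate) / (1.0 + outcrossing_rate)


def implied_outcrossing(inbreeding_coefficient):
    """Invert the relation above: t = (1 - F) / (1 + F)."""
    if not 0.0 <= inbreeding_coefficient <= 1.0:
        raise ValueError("inbreeding coefficient must lie between 0 and 1")
    return (1.0 - inbreeding_coefficient) / (1.0 + inbreeding_coefficient)


# Outcrossing rate of 1%, the upper bound quoted in the text for cowpea.
f_at_one_percent = equilibrium_inbreeding(0.01)

# Outcrossing rate that would correspond to the estimated f = 0.926.
t_implied = implied_outcrossing(0.926)

print(round(f_at_one_percent, 3), round(t_implied, 4))
```

Under this simple model an outcrossing rate of 1% corresponds to F of about 0.98, and f = 0.926 corresponds to an outcrossing rate of about 4%, so an inbreeding coefficient above 0.9 is what a predominantly self-pollinating species would be expected to show.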

Conclusions
The ISSR markers used in this study were very efficient in revealing the genetic diversity and population structure of the indigenous cowpea genotypes collected in Rio Grande do Norte and of the cultivars produced by Embrapa. Most of the diversity was attributed to intrapopulation variability and no association between genetic and geographical distances was observed. The best combinations of genotypes for future breeding programs were 1-1 (Agreste Potiguar) and 50-1 (Pau dos Ferros), 21-1 (Pau dos Ferros) and Guariba, Marataoã and 21-1 (Pau dos Ferros), Nova Era and 4-1 (Agreste Potiguar), 24-1 (Serra de São Miguel) and 10-1 (Angicos), and 20-1 and Marataoã. The genetic divergence between the 14 cowpea populations, expressed by the FST coefficient, was considered moderate, although it was lower than expected, possibly owing to the cultivation of mixed seeds from different sources in the small farms of Rio Grande do Norte where sampling was performed.