Agro-morphological diversity in yam genotypes from Recôncavo of Bahia, Brazil

1 Federal University Recôncavo da Bahia (UFRB), Rua Rui Barbosa 710, CEP: 44380-000, Cruz das Almas, BA, Brazil. 2 Polytechnic Institute of Kwanza Sul, Rua 12 de Novembro, Centro, Kwanza Sul, Angola. 3 Brazilian Agricultural Research Corporation (EMBRAPA Mandioca e Fruticultura), Embrapa, Rua Embrapa S/N CP 007, CEP: 44380-000, Cruz das Almas, BA, Brazil. 4 University of Milan, DiSAA, Via Celoria, 2 20133 Milan, Italy.


FAO estimates show that yam was harvested from approximately 7.6 million hectares in 2014, generating 68 million tons of tubers (FAOSTAT, 2014). Globally, Nigeria ranks first, with a production of 45 million tons in 2014 (FAOSTAT, 2014).
The importance of this species is related to the production of rhizophores with high nutritional and energetic value, making it a staple food that is consumed across all classes of Brazilian society (Mesquita, 2001; Santos et al., 1998; Santos, 1996). A large part of the production is destined for the domestic market and another part for export, mainly to Europe (Santos and Macedo, 2002; Santos et al., 2007b).
An estimated 150 to 200 species of Dioscorea occur in Brazil; it is the only edible genus of the Dioscoreaceae family present in all regions of the country. In the producing regions, D. cayennensis and D. rotundata are the most widely used species, followed by D. alata. However, most species are still poorly studied (Pedralli, 2002). In Bahia, D. rotundata and D. cayennensis (Boca funda) occupy more than 90% of the cultivated area, followed by D. alata (São Tomé and Jibóia) and, sporadically, Dioscorea trifida (yam or cará mimoso) and Dioscorea bulbifera (liver yam) (Carvalho et al., 2009). The largest acreage of yam in this state lies in the Recôncavo region, notably in the municipalities of Maragogipe, São Felipe, Cruz das Almas and São Félix, where the crop has significant socioeconomic importance as a promising alternative for small and medium producers (Mesquita, 2001).
Notwithstanding the socio-economic importance of the yam crop, its expansion is still extremely limited, mainly due to the scarcity of technical and scientific information that would support sustainability and increase productivity (Da Silva Dantas et al., 2013; Siqueira et al., 2011). Although yam is a species with great plasticity, able to adapt from humid tropical to temperate climates without frost and drought (Pereira et al., 2003), the diversity of yam genotypes needs to be studied in order to provide the information required to develop technologies and basic knowledge to support and further explore this crop, assist breeding programs and conserve the species in the Recôncavo region of Bahia, Brazil (Carvalho et al., 2009).
The diversity of this species can be estimated using genetic (Arnau et al., 2017; Loko et al., 2016) and phenotypic information (Dansi et al., 2013; Sheikh and Kumar, 2017). Using phenotypic traits, the diversity can be assessed by morphological characterization and subsequent cluster analysis (Asare et al., 2016). Cluster analysis can be carried out separately for each type of variable, since the distance measure depends on the type of variable used. Cruz (2013) provided procedures to estimate dissimilarity measures for quantitative traits, which can be analysed using Euclidean or Mahalanobis (1936) distances; for binary data, using for example the Jaccard (1908) or Nei and Li (1979) coefficients; and for multicategorical variables, to which the Cole-Rodgers et al. (1997) coefficient is applied.
The joint analysis of different types of variables can provide a better indication of the variability existing in germplasm banks and genotype groups. However, few studies have used this methodology to quantify the diversity of the yam crop, owing to limited knowledge of the statistical techniques involved and the lack of freely available computer programs for such an analysis. Under these circumstances, the Modified Location Model (MLM) procedure (Franco et al., 1998) is fundamental for quantifying the variability using quantitative and qualitative data simultaneously. This procedure is characterized by the Ward grouping method (Ward Junior, 1963), which defines groups based on the Gower similarity matrix (Gower, 1971), and by the vector of means of the quantitative variables estimated by MLM, regardless of the value of the qualitative variables. It has been used in different crops for various purposes (Barbé et al., 2010; Cabral et al., 2010; Gonçalves et al., 2009; Pestana et al., 2011; Sudré et al., 2010).
This study aims to quantify the genetic diversity (using the morpho-agronomic traits) of yam genotypes from the Recôncavo region of Bahia, using both quantitative and qualitative data based on the Ward-MLM procedure.
On each property in the respective commercial production areas, the yam plantation system used individual staking with rods. In these areas, plants were chosen randomly and marked with red ribbons for better visibility in the field, where evaluations were performed fortnightly. The evaluated morphological traits relate to the plant's subterranean part (production) and the part above ground: rhizophore length in cm (RL); rhizophore width in cm (RWD); rhizophore weight in kg (RW) (Figure 1); rhizophore shape (RS: 1 = long, 2 = irregular) and skin color (SC: 1 = brown, 2 = yellow). The sample measurements of the quantitative morphological descriptors were carried out in accordance with the International Plant Genetic Resources Institute and the International Institute of Tropical Agriculture (IPGRI, 1997).
For the quantitative descriptors, descriptive statistics were calculated, comprising the minimum and maximum values, mean, standard deviation and variation coefficient, using SAS (SAS Institute, 2011). The frequency percentages of the classes for each morphological descriptor and the entropy level of the traits, using the Renyi entropy coefficient (Renyi, 1961), were also calculated. Entropy is a measure of the frequency distribution P = (p1, p2, ..., ps) of the n genotypes, where pi = fi/n; p1 + p2 + ... + ps = 1; n = f1 + f2 + ... + fs; and f1, f2, ..., fs are the counts of each of the s classes in the descriptor considered. The values of the entropy level (H') were classified as low (H' < 0.50), moderate (H' = 0.50-0.75) and high (H' ≥ 0.75) (Jamago, 2003).
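As an illustration of the entropy calculation, the sketch below assumes the Shannon-type form H' = -Σ pi ln(pi), which is often applied to descriptor class frequencies. The exact formula is not reproduced in this text, so that form, the function names `entropy_level` and `classify_entropy`, and the class counts are assumptions made for illustration and are not the study's data.

```python
import math


def entropy_level(counts):
    """Entropy H' = -sum(p_i * ln(p_i)) over class counts (assumed Shannon-type form)."""
    n = sum(counts)
    if n == 0:
        raise ValueError("at least one observation is required")
    # Empty classes are skipped: p * ln(p) tends to 0 as p tends to 0.
    return -sum((f / n) * math.log(f / n) for f in counts if f > 0)


def classify_entropy(h):
    """Apply the thresholds stated in the text (Jamago, 2003)."""
    if h < 0.50:
        return "low"
    if h < 0.75:
        return "moderate"
    return "high"


# Hypothetical counts for a two-class descriptor scored on 209 genotypes.
example_counts = [171, 38]
h = entropy_level(example_counts)
print(round(h, 2), classify_entropy(h))
```

Under this assumed form, a two-class descriptor reaches its maximum entropy, ln 2 (about 0.69), when both classes are equally frequent, so values well below that indicate that most genotypes fall into a single class.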
Quantitative and qualitative variables were analyzed simultaneously using the Ward-MLM procedure for the composition of groups of accessions through the CLUSTER and IML procedures of SAS (SAS Institute, 2011). For Ward's clustering method, the distance matrix was obtained by the Gower algorithm (Gower, 1971). The definition of the optimal number of groups was carried out according to the pseudo-F and pseudo-t² criteria combined with the likelihood profile associated with the likelihood ratio test (SAS Institute, 2011).
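The distance step can be sketched as follows. This is a minimal pure-Python illustration of the Gower (1971) dissimilarity for mixed data, not the SAS code used in the study: quantitative traits contribute range-scaled absolute differences, qualitative traits contribute a simple 0/1 mismatch, and all traits are weighted equally. The function name `gower_distance_matrix` and the four example genotypes are hypothetical.

```python
def gower_distance_matrix(quant, qual):
    """Gower dissimilarity for mixed data.

    quant: list of rows with quantitative traits (e.g. RL, RWD, RW)
    qual:  list of rows with qualitative class codes (e.g. RS, SC)
    """
    n = len(quant)
    p, q = len(quant[0]), len(qual[0])
    # Range of each quantitative trait, used to scale differences to [0, 1].
    ranges = []
    for k in range(p):
        column = [row[k] for row in quant]
        ranges.append((max(column) - min(column)) or 1.0)  # guard constant traits
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sum(abs(quant[i][k] - quant[j][k]) / ranges[k] for k in range(p))
            d += sum(1.0 for k in range(q) if qual[i][k] != qual[j][k])
            dist[i][j] = dist[j][i] = d / (p + q)  # equal weight per trait
    return dist


# Hypothetical genotypes: RL (cm), RWD, RW (kg) and class codes for RS, SC.
quant = [[30.0, 80.0, 1.2], [32.0, 85.0, 1.4], [60.0, 150.0, 9.5], [18.0, 65.0, 0.8]]
qual = [[1, 1], [1, 1], [2, 2], [1, 2]]
dist = gower_distance_matrix(quant, qual)
for row in dist:
    print([round(value, 3) for value in row])
```

The resulting symmetric matrix is the kind of input a hierarchical routine such as Ward's method would then cluster; that step, and the MLM refinement, are outside this sketch.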
The graph of the difference between groups and the correlation of the variables with the canonical variable was established using the CANDISC procedure of SAS (SAS Institute, 2011).The distance proposed by Matusita (1955), adapted by Krzanowski (1983) and later by Franco et al. (1998), for the distribution of variables was used to determine the dissimilarity among the groups.

RESULTS AND DISCUSSION
The rhizophore weight (RW) was the descriptor with the highest variation, with a coefficient of 86.9% (Table 2 and Figure 1). The range of variation was from 0.11 to 9.88 kg, with an average value of 1.46 kg. The average weight of marketable rhizophores is of great importance for decision-making by the farmer: depending on the market and on changes in trade prices, the heaviest rhizophores reach prices 20 to 30% higher than average rhizophores and 80% higher than small rhizophores, which can be a strategy to achieve greater profitability (Pereira et al., 2003).
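For readers who want to relate the variation coefficient to the other summary statistics, the sketch below assumes the usual definition, CV = 100 × standard deviation / mean, with the sample standard deviation. The function name and the list of weights are hypothetical; only the mean (1.46 kg) and the coefficient (86.9%) come from the text.

```python
import statistics


def coefficient_of_variation(values):
    """Sample coefficient of variation in percent: 100 * sd / mean."""
    mean = statistics.mean(values)
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical rhizophore weights in kg (not the study's data).
weights = [0.4, 0.9, 1.1, 1.5, 3.4]
print(round(coefficient_of_variation(weights), 1))

# Standard deviation implied by the reported mean (1.46 kg) and CV (86.9%),
# under the assumed definition above.
implied_sd = 0.869 * 1.46
print(round(implied_sd, 2))
```

Under that definition, the reported values imply a standard deviation of roughly 1.27 kg, which is close to the mean itself and consistent with a strongly right-skewed weight distribution.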
The rhizophore length (RL) ranged from 5.0 to 67.0 cm, with an average of 33.0 cm and a variation coefficient of 37.1%. The rhizophore width (RWD) ranged from 31.1 to 155.0 cm, with an average of 80.7 cm and a variation coefficient of 35.3% (Table 2). The wide range of these descriptors among the evaluated yam genotypes may suggest the existence of a broad genetic variability that can be used in breeding programs of the species. In general, considering all analyzed descriptors, breeders must know the genetic inheritance of such traits, as well as the edaphoclimatic factors peculiar to each micro-region and the crop management practices that influence the phenotypic plasticity of yam genotypes. It is important not only to define superior agronomic traits, but also to stabilize them in order to provide producers with local materials that are highly productive and agronomically effective.

Table 2. Descriptive statistics for the quantitative descriptors rhizophore length in cm (RL); rhizophore width in cm (RWD); rhizophore weight in kg (RW) and frequency percentages for the classes of the qualitative descriptors rhizophore shape (RS) and skin color (SC).
The entropy level can be used to quantify the variability present in qualitative descriptors by observing the relative frequencies of the classes for each evaluated descriptor. Low entropy values are associated with descriptors that have few phenotypic classes, whereas high values are associated with descriptors that have a large number of classes, which reveals genetic variability among the studied accessions (Vieira et al., 2007). In this study, the values were found to be 0.47 and 0.22, indicating a high concentration of genotypes in only one class within each evaluated qualitative descriptor. A possible explanation is that the only source of yam genotypes is the Batatan region, close to Cruz das Almas (Moreira et al., 2007), together with the limited number of qualitative descriptors used in this study.
An important aspect in the interpretation of cluster analysis is determining the number of groups that best describes the real structure of the analyzed data. Based on the likelihood function, the largest increase occurred in the formation of four groups, with a value of 68.29 (Table 3 and Figure 2). According to Gonçalves et al. (2009) and Barbé et al. (2010), analysis of the likelihood function can define more precise criteria for the formation of groups, resulting in less subjective groups. Padilla et al. (2005), evaluating the diversity of 120 populations of Brassica rapa subsp. rapa L., found that the largest increase in the likelihood function was achieved when five groups were considered. In turn, Barbé et al. (2010), evaluating the genetic diversity of bean lines, verified the formation of three groups with an increment of 14.66 in the log-likelihood function. Thus, the number of groups can vary depending on the studied species, the number of genotypes and the number and type of descriptors (Gonçalves et al., 2009).
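The selection rule described above can be sketched as follows, assuming that the increment for k groups is the difference between the log-likelihood for k groups and that for k - 1 groups. The function name `best_group_count` and the log-likelihood values are hypothetical and do not reproduce Table 3.

```python
def best_group_count(loglik_by_groups):
    """Return (number of groups, increment) for the largest log-likelihood increase."""
    counts = sorted(loglik_by_groups)
    # Increment for k groups = loglik(k) - loglik(previous number of groups).
    increments = {
        k: loglik_by_groups[k] - loglik_by_groups[prev]
        for prev, k in zip(counts, counts[1:])
    }
    best = max(increments, key=increments.get)
    return best, increments[best]


# Hypothetical log-likelihood values for 1 to 5 groups (not the study's data).
loglik = {1: -900.0, 2: -870.0, 3: -850.0, 4: -790.0, 5: -780.0}
groups, gain = best_group_count(loglik)
print(groups, gain)
```

In practice this criterion would be read together with the pseudo-F and pseudo-t² statistics rather than on its own.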
Of the four groups (G1, G2, G3 and G4) formed by the Ward-MLM procedure, group 4 stood out in relation to the averages of the quantitative descriptors. The group means for rhizophore length varied between 17.4 cm (group 3) and 35.1 cm (group 4). For rhizophore width, the variation was from 64.7 cm (group 3) to 114.1 cm (group 2). Rhizophore weight ranged from 1.19 kg (group 1) to 9.66 kg (group 4) (Table 4 and Figure 3). The productivity of tradable rhizophores is the main purpose of the commercial exploitation of yam, and rhizophore weight is an important factor in the consumer market. According to Santos (1996), rhizophores weighing between 0.70 and 1.50 kg are intended for the US market, those between 1.60 and 2.00 kg are exported to France and those between 2.10 and 3.00 kg are destined for other European markets, while those weighing more than 3 kg are not of the export type and achieve lower prices. Thus, the factors that determine rhizophore size must be identified in order to prioritize rhizophores suitable for export, which will lead to better remuneration for the producer. Most likely, some of the determining factors in the production of heavier rhizophores are the type and amount of fertilizer used in crop management, as well as the use of irrigation.
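The weight classes attributed to Santos (1996) can be written as a small lookup, shown here only to make the class boundaries explicit. The function name `market_class` and the "unclassified" label for weights that fall outside or between the stated intervals are assumptions of this sketch.

```python
def market_class(weight_kg):
    """Map a rhizophore weight (kg) to the market classes quoted from Santos (1996)."""
    if 0.70 <= weight_kg <= 1.50:
        return "US market"
    if 1.60 <= weight_kg <= 2.00:
        return "France"
    if 2.10 <= weight_kg <= 3.00:
        return "other European markets"
    if weight_kg > 3.00:
        return "not export type"
    # Below 0.70 kg, or in the gaps between the stated intervals.
    return "unclassified"


# Group means for rhizophore weight reported in the text (groups 1 and 4).
for label, weight in (("G1", 1.19), ("G4", 9.66)):
    print(label, weight, market_class(weight))
```

Applied to the group means quoted above, the group 1 average falls in the class intended for the US market, while the group 4 average exceeds the 3 kg limit for export-type rhizophores.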
The existing variation in the other traits (Table 4 and Figure 3) may be related to genotype × environment interaction. In taro, a wide variation due to differences in locations, growing seasons and management practices is observed (Pereira et al., 2003). Regarding yam genotypes, a likely explanation for the detected variation lies in the origin of the accessions in different regions and in the management employed by traditional farmers, who introduce or exchange materials within and between communities (Moreira et al., 2007), thereby generating varying levels of variability.
For the canonical variate analysis, we found that the first two canonical variables explained 94.26% of the total variability among the genotypes (Figure 4). This value indicates that the graphical representation of the first two canonical variables was appropriate for displaying the genetic relationship between groups as well as between genotypes within the same group.
Rhizophore weight had the highest correlation with the first canonical variable, followed by rhizophore width and rhizophore length (Table 4). The discriminatory power of the Ward-MLM method in the formation of groups is noticeable. From the results, the criteria used for the separation of the groups considering the canonical variables were apparently associated with the origin of the genotypes, as they are from two regions with different soil and climatic conditions. The distances between the formed groups corroborate the graphical representation of the canonical variables; groups 1 and 3 were the closest, with a distance of 2.60, while group 4 was the most distant, with 95.47 (Table 5). This fact can be explained by differences in the genetic material (seed rhizophores) planted by each producer. Furthermore, vegetatively propagated plants are, in general, highly heterozygous, preserving allelic diversity at the individual level (IPGRI, 1997). The great similarity detected among the other groups may be due to their common origin. Moreira et al. (2007) observed that in those regions of the Recôncavo of Bahia where yam is cultivated, most seed rhizophores come from a single region called Batatan, which covers the municipality of Maragogipe.

Conclusions
In this work, the diversity of 209 yam genotypes from the commercial production areas of the Recôncavo region in the state of Bahia, Brazil, was assessed. Considering the traits studied, the rhizophore weight (RW) was the descriptor with the highest variation (0.11 to 9.88 kg). The entropy values reported in this study indicated a high concentration of genotypes in only one class within each evaluated qualitative descriptor. The Ward-MLM algorithm was efficient for forming four homogeneous groups using morphological data. Group 4 stood out in relation to the averages of the quantitative descriptors, and the canonical variate analysis showed that the first two canonical variables explained 94.3% of the total variability among the genotypes. Thus, the results obtained in this work will be useful for selecting genotypes from different groups for breeding purposes.

Figure 1. Agronomic traits evaluated in 209 yam genotypes from the commercial production areas in the municipalities of São Felipe and Cruz das Almas in the Recôncavo region of Bahia, Brazil. A: rhizophore width in cm (RWD); B: rhizophore length in cm (RL); C: rhizophore weight in kg (RW).

Figure 2. Graph of the logarithmic likelihood function (log-likelihood) with respect to the number of groups.

Figure 3. Box plots of the minimum and maximum values, median, 25th and 75th percentiles and outliers for the descriptors length, width and weight of 209 yam genotypes in four groups formed by the Ward-MLM algorithm.

Figure 4. Boxplot referring to the first two canonical variables with four groups (G1-G4) formed by the Ward-MLM algorithm considering 209 yam genotypes.

Table 1. Summary of 209 yam genotypes from the commercial production areas in the municipalities of São Felipe and Cruz das Almas used in the study.

Table 3. Number of groups formed by the Ward-MLM method based on the logarithmic likelihood function (log-likelihood) and its increase.
*The highest increment value

Table 4. Average of quantitative descriptors for each of the four groups formed by the Ward-MLM algorithm and canonical correlation coefficients for the first two canonical variables.