Phenotypic diversity in physic nut (Jatropha curcas L.) in vivo germplasm bank for superior parent selection

Jatropha curcas is an interesting alternative for biodiesel production due to the high oil content of its seeds, its ability to grow in a wide range of climate and soil conditions, and its low production cost. However, the species is still undergoing domestication and there are no defined cultivars. Therefore, it is extremely important to understand the genetic diversity of the species in order to select and characterize promising genotypes with which to initiate breeding programs. The objective of this study was to evaluate the phenotypic diversity of physic nut using multivariate analysis, in order to select the most divergent and superior genotypes for future breeding programs. Eleven agronomic characters were evaluated in 165 J. curcas genotypes from the in vivo germplasm bank: plant height, stem diameter, number of primary branches, fruit length, width, weight and shape, seed length, width and weight, and oil content. The data were analyzed by principal component analysis (PCA) and by cluster analysis using the Ward and k-means methods. Fruit shape, the only qualitative character, was excluded from the multivariate analysis. The PCA resulted in four principal components (PCs), which explained 71.62% of the total variance. The characters selected in PC1 were seed weight, fruit width, fruit length and fruit weight. Twenty-two promising genotypes were highlighted, with potential to be exploited in breeding programs. Cluster analysis by the Ward and k-means methods generated nine groups influenced by all analyzed characters, of which five had advantageous characters. Regarding fruit shape, 13 genotypes had an ellipsoid lanceolate shape and the others had an ellipsoid spherical shape. Multivariate analyses allowed genotype characterization and proved to be good strategies for selection in genetic breeding programs.
 
   
 
 Key words: Agronomic characters, Jatropha curcas, oleaginous, multivariate analysis.

Physic nut (Jatropha curcas L.) is a species that originated in Central America; however, it can develop adequately in many tropical and subtropical countries (Laviola and Dias, 2008). The Jatropha genus is entomophilous (Saturnino et al., 2005), which increases the probability of genetic variability within the species. The physic nut plant ranges from 2 to 5 m in height, can live up to 35 years, and produces fruits with three seeds, each 20 mm long, 11 mm wide and 9 mm thick. The seeds are oblong and black. Its young leaves are reddish and become dark green at maturity; they are petiolate and alternate, with three to five lobes. Its flowers are small and greenish-yellow (Drummond et al., 1984).
Because of the global energy crisis and the environmental and climatic impacts caused by high fossil energy consumption, the demand for renewable, less polluting fuels is increasing. In this context, physic nut is an alternative source with several advantages: the high oil content of its seeds, between 22 and 42% (Sunil et al., 2008; Achten et al., 2008); its low production cost; the fact that it is a non-edible crop; and its compliance with environmental demands, since its oil contains only insignificant amounts of sulfur. Physic nut can yield between 500 and 3000 kg ha⁻¹ depending on genotype and environment (Wani et al., 2016), producing about 2000 kg ha⁻¹ of oil in the fourth year of cultivation, depending on the spacing (Laviola and Dias, 2008). However, its cultivation depends on domestication in order to achieve higher productivity and production uniformity (Fairless, 2007). This potential can be surpassed with breeding programs along with improvements in the production system.
Since there are no defined cultivars or descriptors for physic nut, exploring and characterizing its genetic diversity is of great importance for genetic breeding programs of the species (Achten et al., 2010). Therefore, studies of genetic diversity among genotypes from around the world have been carried out at the molecular level (Pazeto et al., 2015; Pecina-Quintero et al., 2014; Sinha et al., 2016) as well as at the phenotypic level (Freitas et al., 2015; Oliveira et al., 2016; Priyanka et al., 2015), showing that there is large genetic variability, which makes physic nut a promising species for domestication and breeding.
Multivariate techniques are important tools for predicting genetic diversity, classifying germplasm, ordering accession variability and analyzing genetic relationships between characteristics and existing genetic material (Iqbal et al., 2008). Among these techniques, principal component analysis (PCA) and cluster analysis stand out (Cruz et al., 2004; Gonçalves et al., 2008). Principal component analysis is used to identify the characters that carry the most information for germplasm characterization, as well as those that contribute least to the total variation available (Cruz et al., 2004). Cluster analysis aims to gather the sample units into groups by some classification criterion, so that there is homogeneity within each group and heterogeneity among groups (Neto and Moita, 1998). PCA was applied to physic nut by Singh et al. (2016) to distinguish parental accessions for plant improvement, by Nietsche et al. (2015) to evaluate the variability in reproductive traits and by Tripathi et al. (2015) to study the genetic diversity of Indian accessions. Different methods of cluster analysis have also been widely used to study the genetic diversity of the species (Noor Camellia et al., 2012; Silva Junqueira et al., 2016; Reddi et al., 2016).
Given the importance of characterizing physic nut genotypes for domestication and the need to obtain superior genotypes for future use in breeding programs, the present study aimed to evaluate phenotypic diversity in order to select the most divergent and superior genotypes from a physic nut germplasm bank using multivariate analysis strategies.

MATERIALS AND METHODS
We evaluated 165 physic nut plants from 50 accessions from four Brazilian states: Paraíba, Pernambuco, Tocantins and São Paulo (Table 1). The genotypes belong to the in vivo Germplasm Bank of the College of Agricultural and Veterinary Sciences of the Universidade Estadual Paulista (UNESP), Jaboticabal, SP, located in the experimental area of the Plant Production Department. Harvesting and evaluation of agronomic characters took place from December 2014 to May 2015.
To define the characters evaluated, some descriptors used for castor bean (Ricinus communis L.) were adopted (Milani, 2008), since the two species belong to the same family. Eleven agronomic traits were evaluated: plant height (m), measured from the plant collar to the apex; stem diameter (cm), measured with a digital caliper (ZAAS brand) 10 cm above ground level; number of primary branches, counted for each genotype; fruit length (cm), measured on 10 fruits per genotype with a digital caliper; fruit width (cm), measured on 10 fruits per genotype with a digital caliper; fruit weight (g), obtained by weighing 10 fruits per genotype on a digital scale; fruit shape, evaluated on 10 fruits per plant and classified as ellipsoid spherical or ellipsoid lanceolate, according to Laviola et al. (2011); seed length (cm), measured on 10 seeds per genotype with a digital caliper; seed width (cm), measured on 10 seeds per genotype with a digital caliper; seed weight (g), obtained by weighing 10 seeds per genotype on a digital scale; and oil content (%), extracted with a Soxhlet extractor, a method in which the oil is leached from the material through contact with a solvent in a series of cycles, according to the AOCS official method (AOCS, 2003). The extraction was performed in duplicate for each genotype and the average was then calculated for each plant, with results presented in g/100 g.
All fruits were harvested at random from each plant when they were brown, in multiple harvests over the period mentioned above. For the agronomic trait analyses, the seeds and fruits evaluated were randomly chosen from each genotype. For fruit length, width and weight, as well as seed oil content, the mean values for each genotype were considered.
Principal component analysis and cluster analysis were performed using Statistica software, version 10 (StatSoft, 2010), for all agronomic traits except fruit shape, since it is a qualitative trait. Data were standardized to mean zero and variance one for all variables analyzed. Hierarchical clustering was performed by the Ward method, and dissimilarity estimates were generated using the Euclidean distance.
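The standardization and Ward clustering steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Statistica procedure used in the study: the six-plant data set and its three trait columns are invented, the use of the sample standard deviation (n - 1 denominator) is an assumption, and the function names `standardize`, `sq_euclid` and `ward` are hypothetical.

```python
from statistics import mean, stdev

def standardize(rows):
    """Z-score each column so it has mean 0 and (sample) variance 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]  # n - 1 denominator (assumption)
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def sq_euclid(a, b):
    """Squared Euclidean distance between two trait vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward(points, k):
    """Merge clusters until k remain, always choosing the pair whose union
    gives the smallest increase in within-cluster sum of squares (Ward)."""
    clusters = [[i] for i in range(len(points))]

    def centroid(members):
        return [mean(col) for col in zip(*(points[i] for i in members))]

    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                # Ward merge cost: n_a * n_b / (n_a + n_b) * ||c_a - c_b||^2
                cost = (len(a) * len(b) / (len(a) + len(b))
                        * sq_euclid(centroid(a), centroid(b)))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Invented data: plant height (m), seed weight (g), oil content (%).
data = [
    [2.1, 0.70, 58.0],
    [2.0, 0.72, 59.0],
    [2.2, 0.69, 57.5],
    [3.4, 0.55, 52.0],
    [3.5, 0.53, 51.5],
    [3.3, 0.56, 52.5],
]

z = standardize(data)
groups = ward(z, 2)
print(groups)
```

Standardizing first keeps traits measured on large scales (such as oil content in %) from dominating the Euclidean distance over traits measured on small scales (such as seed weight in g).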

RESULTS AND DISCUSSION
The first four principal components explained 71.62% of the variance contained in the original ten variables, and the first principal component (PC1) retained 28.71% of the original variance (Figure 1). The characters that explained the variance retained in PC1 were the production components: seed weight and fruit weight, width and length (Table 2). The second principal component (PC2) retained 20.47% of the variance, explained by different plant characters such as plant height, seed length and oil content (Table 2). The third principal component (PC3) retained 12.00% of the variance, which was explained by the plant proportions: stem diameter and number of primary branches (Table 2). The fourth principal component (PC4) retained 10.42% of the variance, with contributions from fruit length and seed width. Loadings with an absolute value greater than 0.5 were considered relevant (Table 2). The 165 genotypes were distributed along the axes of the principal components. The closer two genotypes are to each other, the more similar they are, while genotypes that lie further from the axes of the principal components are the most discrepant. The two-dimensional plane formed by PC1 (28.71%) and PC2 (20.47%) retained 49.18% of the original variance. Figure 1 shows that genotypes 58, 59, 62, 63, 64, 65, 66, 70, 71, 73, 75 and 80 are located to the left of PC2, indicating negative correlations, and are differentiated by seed weight, fruit width, length and weight, plant height and oil content.
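The variance bookkeeping and the 0.5 loading cutoff in this paragraph can be checked with a short Python sketch. The four percentages are those reported in the text; the loadings dictionary is invented for illustration only (it is not Table 2), and `relevant_traits` is a hypothetical helper name.

```python
from itertools import accumulate

# Percent of variance retained by PC1..PC4, as reported in the text.
explained = [28.71, 20.47, 12.00, 10.42]

# Running total of retained variance, rounded to two decimals.
cumulative = [round(v, 2) for v in accumulate(explained)]
print(cumulative)

def relevant_traits(loadings, threshold=0.5):
    """Return the traits whose absolute loading exceeds the threshold."""
    return [trait for trait, value in loadings.items() if abs(value) > threshold]

# Invented loadings for a single component (NOT the values in Table 2).
example_loadings = {
    "plant height": 0.62,
    "seed length": 0.58,
    "oil content": -0.71,
    "stem diameter": 0.21,
}
print(relevant_traits(example_loadings))
```

The first two components sum to the 49.18% stated for the PC1-PC2 plane. The four rounded percentages sum to 71.60%, slightly below the reported 71.62%, a difference that rounding of the individual values could plausibly account for.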
Seed weight and oil content were also of great importance in the principal component analysis of Malaysian physic nut (Shabanimofrad et al., 2013) and castor accessions (Anjani, 2010). In their study of physic nut accessions, Reis et al. (2015) reported that oil production per plant had a coefficient of variation of 60.00%, indicating that this is a character with high phenotypic variability. On the other hand, seed-related characteristics presented the lowest coefficients of variation, with values below 10%. Evaluating seed morphological characters of wild accessions is considered the first step in ascertaining the genetic variability of a population. Large seeds, for example, may be favoured because they generate larger and more vigorous seedlings with better chances of survival than small seeds; small seeds, in turn, may have a selection advantage thanks to their wider and more effective dispersal (Eriksson, 1999).
Variables that have the same sign act in the same direction; that is, when one variable increases, so does the other. Variables with opposite signs act in opposite directions: when the value of one increases, the other decreases. Thus, according to the correlations shown in Table 2, in PC1 and PC3 the variables with the highest discriminatory power act in the same direction. In PC2, plant height and seed length act in the same direction, but in the opposite direction to oil content. In PC4, seed width and fruit length act in opposite directions. Carvalho (2010) reported that the number of primary branches contributed strongly to the third principal component in physic nut and could be discarded according to the criteria of Cruz et al. (2004) and Pereira et al. (2003), a result similar to that of the present study. Likewise, seed width could also be discarded, since it showed a correlation only in the fourth principal component. However, considering the economic potential of the understory crop in the initial years of establishment, plant height and number of branches are considered important for selection indices when the objective is to incorporate physic nut into an agroforestry system in which a balanced trade-off can be made on yield (Rao et al., 2008).
Based on the agronomic trait values of each genotype selected by PCA (Table 3), genotypes 63 and 53 were selected only for their high seed oil content, genotypes 75 and 15 for their high fruit width, genotype 69 for the greatest number of primary branches and high stem diameter, genotypes 64, 66 and 59 for their high seed and fruit weights, and genotype 73 for higher seed weight and fruit length and weight. Genotypes 71, 65, 62 and 117 had higher seed weight, fruit width and fruit weight; genotype 57 had high seed weight; genotype 6 had higher seed weight and fruit weight; and genotype 58 had higher fruit width, length and weight. Genotype 33 was selected for its high stem diameter. Genotypes 17 and 128 were selected for their high plant height and larger stem diameter, genotype 52 had high seed width and genotype 74 had the longest fruit. Laviola et al. (2011) found that stem diameter and plant height contributed 12 and 11%, respectively, to the genetic diversity of physic nut accessions. Considering that physic nut is a bushy plant that can reach up to 5 m in height (Saturnino et al., 2005) and that its harvest is mainly performed manually, selecting smaller genotypes will improve the harvesting process. In addition, for commercial purposes, genotypes with taller plants, low oil content and low productivity are not feasible. Genotypes 70 and 80 were considered the most promising, with potential to be used in genetic breeding programs, as they presented higher seed weight, fruit weight and width, and oil content, together with lower plant height. In a study of physic nut seeds using genotypes from Suriname, Ethiopia, Nigeria, Brazil and China, Vaknim et al. (2011) found that oil content varied between 39 and 62%, and Aguilera-Cauich et al. (2015) found an average oil content of 50.52% in American physic nut accessions, whereas in the present work the observed values ranged from 50.81 to 62.89%.
Table 3. Physic nut genotype averages selected by principal component analysis for 10 agronomic traits: plant height (PH), stem diameter (SD), number of primary branches (PB), seed weight (SWt), seed width (SWd), seed length (SL), fruit width (FWd), fruit length (FL), fruit weight (FWt) and oil content (OC).
Cluster analysis was performed using the Ward method, which generated the dendrogram shown in Figure 4. The dendrogram allowed the formation of nine groups from a cutoff level at which abrupt changes were observed, as recommended by Cruz et al. (2004). Cluster analysis by the non-hierarchical k-means method (Figure 5) allowed the characterization of the nine groups formed according to the dendrogram. Five of the nine groups had above-average oil content and below-average plant height; that is, they contain genotypes with good results to be considered in genetic breeding programs. Group 1 was considered the best group, as it presented the highest values for seed weight, fruit width, length and weight, and oil content, together with low plant height. Group 2 presented the largest stem diameter and the highest number of primary branches, besides high oil content and low plant height. Groups 3, 5 and 6 were classified as having the worst performance, as they had low oil content and high plant height, and were therefore not considered relevant. Group 4 is composed of genotypes with high oil content but a low number of primary branches. Group 7, despite presenting low values for all traits related to seed and fruit, had high oil content. Group 8 was also considered good due to its low plant height and high oil content. In contrast, Group 9 was not considered interesting, since it was characterized by low values for traits related to fruit and seed and by low oil content. According to the groups formed by Ward's dendrogram, hybridizations between groups 1 × 4, 1 × 7, 1 × 8, 2 × 4, 2 × 7, 2 × 8, 4 × 7 and 4 × 8 can be recommended because of the distance, and hence the heterogeneity, between these groups. These groups also contain genotypes with attractive characteristics for the physic nut production system, such as low plant height and high oil content. It is noteworthy that, within groups, genotypes belonging to different accessions had similar values for some characters. Rao et al. (2008) and Spinelli et al. (2010) observed a positive correlation between the number of branches and plant height, and with productivity as well. This is very important information that can be used to facilitate the selection of promising genotypes for this crop, which still has a lot of genetic variability to be exploited.
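The screening rule used to characterize the groups (oil content above average, plant height below average) can be expressed as a short Python sketch. The standardized group means below are invented: only the signs for oil content and plant height follow the description of groups 1 to 8 in the text, the height value for group 9 is a pure assumption, and the variable names are hypothetical.

```python
from itertools import combinations

# Invented standardized group means (0 = overall average of all genotypes).
group_means = {
    1: {"oil": 0.9, "height": -0.6},
    2: {"oil": 0.5, "height": -0.8},
    3: {"oil": -0.7, "height": 0.9},
    4: {"oil": 0.6, "height": -0.3},
    5: {"oil": -0.8, "height": 0.7},
    6: {"oil": -0.5, "height": 1.1},
    7: {"oil": 0.4, "height": -0.2},
    8: {"oil": 0.7, "height": -0.9},
    9: {"oil": -0.6, "height": 0.1},  # height sign not stated in the text
}

# Keep groups with above-average oil content and below-average plant height.
promising = [
    group for group, means in group_means.items()
    if means["oil"] > 0 and means["height"] < 0
]

# Every possible cross between two promising groups.
candidate_crosses = list(combinations(promising, 2))

print(promising)
print(candidate_crosses)
```

With these signs the rule keeps groups 1, 2, 4, 7 and 8, which yields ten possible pairs. The text recommends eight of them (1 × 2 and 7 × 8 are not listed), presumably because the distance between groups in the dendrogram was also taken into account.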

Cluster analysis indicated that there is variability within the accessions, as genotypes belonging to the same accession fall into different groups and their characteristics do not always resemble each other. In a study of the phenotypic diversity of physic nut, Aguilera-Cauich et al. (2015) found variability within accessions and concluded that there is greater diversity among the American physic nut accessions evaluated than in reports on diversity for India and Malaysia. Likewise, Trebbi et al. (2015) found increased genetic variability and heterozygosity in physic nut accessions from Mexico and Guatemala. These results can be explained by the fact that Central America is the center of origin of the species. Genetic variability in a physic nut population was also found by Brasileiro et al. (2013), together with high heritability estimates, which made it possible to obtain genetic gains for growth and production traits. Reinforcing this information, Abreu et al. (2009) obtained high heritability coefficients for plant height, first leaf height, stem diameter and number of leaves in physic nut accessions, due to the wide genetic variability among accessions. The higher the heritability of a characteristic, the better the prediction of genetic value from individual performance and the faster the response to selection for this trait (Oliveira et al., 2007).
In a study using multivariate analysis to select resistant peanut genotypes, Pitta et al. (2010) concluded that the Ward and k-means clustering methods were efficient and complementary to principal component analysis, as was also observed in this study. Because of the variability within accessions, the dendrogram did not reveal a pattern related to geographic region, which is explained by the fact that each group brings together different accessions. Similar results with physic nut were obtained by Tripathi et al. (2015), who used the k-means method to group accessions from different parts of India; Jun-ling et al. (2010), who studied 38 accessions from different regions of China and Indonesia by the UPGMA method; and Kaushik et al. (2007), who analyzed accessions from India by non-hierarchical Euclidean cluster analysis, concluding that geographical diversity need not necessarily be related to genetic diversity.
For fruit shape, genotypes 29, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75 and 77 presented an ellipsoid lanceolate shape, and most of them were in Group 3. All the other 152 genotypes had an ellipsoid spherical fruit shape. These results corroborate those of Laviola et al. (2011), in which, among 195 physic nut accessions evaluated, 190 had ellipsoid spherical fruits, four had ellipsoid lanceolate fruits and one had ellipsoid ovoid fruits. It can be concluded, then, that fruit shape is a qualitative trait that contributes little to the variance among accessions.

CONCLUSIONS
The study showed that there is genetic variability among the physic nut accessions evaluated for the traits assessed, and the results provide very important information to be exploited in a genetic breeding program. Multivariate analyses allowed genotype characterization and also indicated which genotypes differ from each other, allowing crosses to be targeted. All agronomic traits allowed genotype discrimination and characterization.

Figure 4. Hierarchical cluster analysis dendrogram using the Euclidean distance and the link between the groups by Ward method for agronomic traits: Plant height, stem diameter, number of primary branches, seed weight, seed width, seed length, fruit width, fruit length, fruit weight and oil content.

Table 1. Physic nut origin, identification and number of plants used in this study.