Variability evaluation of castor seeds (Ricinus communis) by multivariate analysis of local accessions from Mexico

1 Departamento de Ingeniería Bioquímica, Instituto Tecnológico de Celaya, C.P. 38010, Celaya, Guanajuato, México.
2 Departamento de Ingeniería Química, Instituto Tecnológico de Celaya, C.P. 38010, Celaya, Guanajuato, México.
3 Departamento de Procesos Alimentarios, Universidad Tecnológica del Suroeste de Guanajuato, C.P. 38400, Valle de Santiago, Guanajuato, México.
4 Departamento de Tecnología de Información y Comunicación, Universidad Tecnológica del Suroeste de Guanajuato, C.P. 38400, Valle de Santiago, Guanajuato, México.
5 Campo Experimental Bajío, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, C.P. 38010, Celaya, Guanajuato, México.
6 Departamento de Telemática, Universidad Politécnica de Juventino Rosas, C.P. 38253, Santa Cruz de Juventino Rosas, Guanajuato, México.


INTRODUCTION
The castor plant belongs to the family Euphorbiaceae and grows in tropical and semi-tropical regions (Severino et al., 2012). Given its great adaptive capacity, it can currently be found practically all over Mexico (Rodríguez and Zamarripa, 2012); however, its cultivation and commercial use have not been developed to the same extent as in Brazil, Nigeria, India and China (Salimon et al., 2010). Castor oil has a wide range of uses: it is a raw material in the synthesis of high added-value products such as oleoresins, polymers and synthetic fibres, as well as chemical products such as undecylenic acid and hydroxystearic acid (Ogunniyi, 2006; Perdomo et al., 2013; Van der Steen et al., 2011). The main characteristic of castor oil is its high content of ricinoleic acid ((R)-12-hydroxy-cis-9-octadecenoic acid, C₁₈H₃₄O₃), whose hydroxyl group and unsaturation allow it to take part in reactions that yield products with a high added value (Yusuf et al., 2015).
In Mexico, during the 1960s, over 12,000 ha were grown, whereas by the year 2000 this had fallen to only 1,800 ha, and today there is practically no agricultural production of this plant (FAOSTAT, 2017); Mexico currently imports castor oil. Government authorities promote the production of castor seeds (SAGARPA, 2009); however, the value chain for the utilization of the seed and castor oil has not been consolidated, and there is a need to substantially increase the cultivated area, together with agricultural productivity and the planning of the processing industry and marketing channels.
Deeper knowledge of local castor plant varieties, and the ability to identify those with high oil production potential, would make their cultivation attractive. Furthermore, studying the variability of the seeds via their descriptors would enable the most suitable accessions to be chosen for cultivation and industrial use (Lima et al., 2014). In the last decade, the seed has been grown only marginally in Mexico, leading to a preference for the introduction of foreign varieties, which has raised farming costs. Nevertheless, there is a diversity of local varieties of the castor plant that have been little studied yet offer interesting potential and are better adapted to the region. This underlines the importance of evaluating these varieties for the quality of their oil, which could be competitive with that of commercial varieties.
In this work, a variability study of 18 local accessions of castor seeds collected in Mexico was carried out using 22 descriptors of agronomic yield, proximate composition of the seed and quality of the oil. The study used descriptive statistics and multivariate analysis, implementing clustering algorithms to identify and select the seeds by their attributes and yields.

MATERIALS AND METHODS
The castor seeds used in this study came from a national collection of 120 local accessions assembled by INIFAP-Bajío (Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Unidad Bajío) in 2010. The accessions considered in this study came from Guanajuato, a state located in the centre of Mexico, and were selected for their high agricultural yield and oil content. These seeds are safeguarded and preserved in the INIFAP-Bajío facilities.
The seeds were grown on plots located in the Campo Experimental de INIFAP-Bajío (20°34′47″ N and 100°49′14″ W) under the same agro-climatic conditions and agricultural technologies (Hernández-Martinez et al., 2013). The commercial varieties k75B and k93B were used as controls, along with seeds sown and harvested from these varieties, known as k75G and k93G. The seeds were stored in hermetically sealed plastic vessels at a temperature of between 4 and 5 °C for their subsequent study (Valero and Díaz, 2014).

Fatty acid profiles
Oil from the seeds was extracted with hexane using a Soxhlet apparatus and transesterified to convert the fatty acids into methyl esters (FAMEs; Kirk et al., 2004). The FAMEs were dissolved in heptane and analyzed by gas chromatography (Perkin Elmer; GC-Clarus 500). An AT-Wax capillary column, 30 m × 0.25 mm × 0.5 µm (Alltech Heliflex®), was used. The operating temperatures were 230, 230 and 250 °C for the injector, oven and FID detector, respectively. Chromatographic-grade N₂ was used as the carrier gas at a pressure of 14 psig with a 15:1 split; the injected sample volume was 1 µL. Analytical-standard methyl heptadecanoate (Sigma) was used as the internal standard.

Chemical analysis of the oils
The following values were determined: acidity (Horwitz, 2002), peroxide (Crowe and White, 2001), iodine (Canesin et al., 2014) and saponification (Canesin et al., 2014). The values were determined in triplicate on the oils extracted from the castor seeds considered in this study.

Statistical analysis
The experimental data for the 22 selected castor seed accessions were analyzed by descriptive statistics: mean, standard deviation, maximum and minimum values, range and coefficient of variation. The matrix of correlation coefficients between the descriptors studied was also constructed. The Tukey test (α = 0.5) was carried out to determine the similarity between the means of the different descriptors of the seed accessions studied. In order to study variability, multivariate analysis was used, implementing clustering algorithms.

*Corresponding author. E-mail: msacosta@utsoe.edu.mx.
Author(s) agree that this article remain permanently open access under the terms of the Creative Commons Attribution License 4.0 International License
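The descriptive statistics listed under Statistical analysis can be illustrated with a minimal sketch. It assumes the sample (n - 1) standard deviation and defines the coefficient of variation as 100 × SD / mean; the source does not state which estimator was used. The function name `describe` and the weights are hypothetical and are not data from Table 1.

```python
import statistics


def describe(values):
    """Summarize one descriptor across accessions.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return {
        "mean": mean,
        "sd": sd,
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        # Coefficient of variation, expressed as a percentage of the mean.
        "cv_percent": 100.0 * sd / mean,
    }


# Hypothetical 100-seed weights (g) for five accessions.
weights = [21.0, 35.0, 48.0, 60.0, 91.0]
summary = describe(weights)
print(summary)
```

Because the CV is dimensionless, it allows descriptors measured in different units (kg per hectare, percent, grams) to be compared on one scale, which is how it is used in Figure 1.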

RESULTS
Table 1 shows the characterization of the different seed accessions studied, indicating the mean, standard deviation, maximum and minimum values, range and coefficient of variation of each of the descriptors measured experimentally. The results of the Tukey test of means are indicated by superscripts; the same letters indicate means with no statistically significant difference (α = 0.5). The variables volatile solids, acidity value and iodine value show similar means in all the accessions. Figure 1 shows the coefficients of variation (CV) of each of the descriptors studied. As can be seen, the amount of arachidonic acid (Oil7) and the ash content in the oil (Cea) are the variables that exhibit the greatest variability among the accessions. The agronomic yields of the seed (Re) and the oil (Ra) are also important sources of variability. The variables that describe the proximate composition of the seed and the chemical properties of the oils show lower CVs. One of the variables of greatest interest is the seed yield of the castor plant (Re). The average world yield is 1.1 t ha⁻¹; under favorable conditions, yields of up to 4-5 t ha⁻¹ have been reported (Scholz and da Silva, 2008). In the results obtained in this work, the values reported in the technical datasheets of the hybrid seeds are the ). However, in agricultural testing in Mexico these were reduced (K75G: 2,200 kg ha⁻¹ and K93G: 2,350 kg ha⁻¹).
Local accessions with attractive yields, such as A3 with 2,402 kg ha⁻¹, were found. In a collection of castor seeds in the United States, the weight of 100 seeds (P) was found to range between 10.1 and 73.3 g, with an average of 28.3 g (Wang et al., 2010). A collection of seeds from Chiapas State, southeastern Mexico, had a P of between 7 and 123.9 g with an average of 48.72 g (Goytia-Jiménez et al., 2011). In this study, P varied between 21.13 and 91.65 g with an average of 48.55 g. Seed oil contents (PA) of 40-55% have been reported in landrace collections (Wang et al., 2010), 54.47% in selected seeds (Chen et al., 2016) and 45.5-52.1% in hybrid seeds (Alexopoulou et al., 2015). The accessions studied in this work showed a content range of 40.51-55.53% and an average of 47.47%. Local variety A553 has the highest content.
Among the wild varieties, A7 and A8 showed the highest acidity value (IA), expressed as ricinoleic acid (1.46%), while variety A5 showed the lowest (1.06%). Commercial seed K93B exhibited an index greater than 1.7%. Wesołowski and Erecińska (1998) reported an IA of 1.59% (as oleic acid) in rapeseed oil. The oils of local varieties A1, A12, A2, A3, A553, A559 and A6, as well as hybrids k75G, k93B and k93G, have a peroxide value (IP) of around 10 meq O₂ kg⁻¹, indicating greater chemical stability of the oil against the oxidation reactions that give rise to rancidity.
A14 and A16 exhibited the highest iodine values (IY): 110.46 and 113.78 g I₂/100 g, respectively. Vegetable oils generally have an IY between 30 and 60 g I₂/100 g (Meshram et al., 2013). Akpan et al. (2006) and Yusuf et al. (2015) reported IY values between 81 and 91 g I₂/100 g, similar to the results presented in this work. ). Variety k93B, being a commercial variety, had been harvested for a longer time than k93G; however, the data presented in this work do not coincide with those reported by Wesołowski and Erecińska (1998), who reported values of 173 and 191.3 mg KOH g⁻¹, the first for castor oil and the second for mangrove oil, whereas lower values were found in this work.
Table 2 shows the correlation coefficients among the variables that describe the seeds studied. It can be observed that the majority of the coefficients differ from unity, indicating that the variables are mainly independent of each other. The seed yield (Re) and oil yield (ra) have a correlation coefficient of 0.860; if all the seeds had the same oil content (PA), this coefficient would approach unity. It is for this reason that the oil content (PA) has a low correlation with Re and ra. Coefficients with low variation imply problems in the description of the variability of the castor plant accessions as a function of their original descriptors.
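The argument about Re, ra and PA can be checked numerically. The sketch below assumes that oil yield is the product of seed yield and oil content (ra = Re × PA / 100), which the text implies but does not state as a formula. The function names and all values are hypothetical, not taken from Table 1 or Table 2.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def oil_yield(seed_yield, oil_percent):
    """Assumed relation: oil yield = seed yield x oil content / 100."""
    return [s * p / 100.0 for s, p in zip(seed_yield, oil_percent)]


# Hypothetical seed yields (kg per hectare) for four accessions.
seed_yield = [1200.0, 1800.0, 2400.0, 3000.0]

# Case 1: every accession has the same oil content.
r_constant = pearson(seed_yield, oil_yield(seed_yield, [47.0] * 4))

# Case 2: oil content differs between accessions.
r_variable = pearson(seed_yield, oil_yield(seed_yield, [55.0, 41.0, 52.0, 43.0]))

print(round(r_constant, 3), round(r_variable, 3))
```

With constant oil content the two yields are proportional, so the coefficient is 1 up to rounding; once oil content varies between accessions the coefficient drops below 1, which is the behaviour described for the reported value of 0.860.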

Clustering analysis
Clustering is based on a dissimilarity analysis of the objects that make up a population, seeking to group objects with more homogeneous properties. Dissimilarity is measured mainly by the difference in the attributes of the objects, in this case the Euclidean distance. Figure 2 shows, in order, the moduli of the distance vectors for each accession. The variables were normalized with respect to their maximum value, in percentage units. The components of each distance vector were calculated as the difference between the measured value and the average of each variable studied. It can be clearly observed that the commercially grown hybrid varieties k93G and k75G lie at a considerable distance from the rest of the seeds. The majority of the local accessions are found between polygons 40 and 80. Some local seeds and the commercial varieties k75B and k93B are found between polygons 80 and 100.
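The distance calculation described for Figure 2 can be sketched as follows: each descriptor is scaled to a percentage of its maximum, the mean of each scaled descriptor is subtracted, and the Euclidean norm of the resulting vector is taken. The function name, accession labels and values are hypothetical.

```python
import math


def distances_from_mean(data):
    """data: dict mapping accession label -> list of descriptor values.

    Returns a dict mapping label -> Euclidean distance to the mean accession,
    after scaling every descriptor to a percentage of its maximum.
    """
    labels = list(data)
    n_vars = len(data[labels[0]])
    col_max = [max(data[k][j] for k in labels) for j in range(n_vars)]
    scaled = {k: [100.0 * data[k][j] / col_max[j] for j in range(n_vars)]
              for k in labels}
    means = [sum(scaled[k][j] for k in labels) / len(labels)
             for j in range(n_vars)]
    return {k: math.sqrt(sum((scaled[k][j] - means[j]) ** 2
                             for j in range(n_vars)))
            for k in labels}


# Hypothetical accessions described by seed yield (kg per hectare) and oil content (%).
example = {
    "X1": [1000.0, 40.0],
    "X2": [2000.0, 50.0],
    "X3": [1500.0, 45.0],
}
dist = distances_from_mean(example)
for label in sorted(dist, key=dist.get):
    print(label, round(dist[label], 2))
```

Scaling by the maximum keeps a descriptor with large absolute values, such as yield in kg per hectare, from dominating the distance over descriptors expressed in percent.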
The results of the clustering algorithm using the UPGMA (Unweighted Pair Group Method with Arithmetic Mean) methodology are shown as a dendrogram in Figure 3. In this method, the distance vectors are formed from the differences in descriptors between two accessions; the closest pairs of accessions are joined in a cluster with average properties. In the next stage of clustering, the new distances are calculated and new clusters are created. It is interesting to observe that the original hybrid seeds and the cultivated hybrids are grouped in two different clusters, separate from the local seeds. The seeds with the greatest differences are the local varieties A1 and A553, which form single-member clusters. The local variety A553, associated with the great majority of the local varieties, has average attributes similar to those of the set of seeds in this category.
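The agglomeration steps described above can be sketched in plain Python. This is an illustrative implementation of average-linkage (UPGMA) clustering on Euclidean distances, not the software used by the authors, which the source does not name. The function names, labels and coordinates are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma(points):
    """points: dict mapping label -> list of normalized descriptor values.

    Returns the merge history as (members_a, members_b, distance) tuples.
    """
    clusters = [(label,) for label in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average linkage: mean of all member-to-member distances.
            d = sum(euclidean(points[i], points[j]) for i in a for j in b)
            d /= len(a) * len(b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((a, b, d))
        # Replace the two closest clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical accessions: three similar local seeds and one distant hybrid.
pts = {
    "L1": [0.0, 0.0],
    "L2": [1.0, 0.0],
    "L3": [0.0, 2.0],
    "H1": [10.0, 10.0],
}
history = upgma(pts)
for members_a, members_b, d in history:
    print(members_a, members_b, round(d, 3))
```

The merge history is the information a dendrogram draws: accessions that join late, at a large distance, appear as isolated branches. For real data sets, SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"` performs the same kind of clustering more efficiently.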
The seed accessions show high dispersion in their descriptors: the average CV was 32.45% and the correlation coefficients among variables were, in general, low. PCA found that 11 principal components are necessary to represent 95% of the total variation. For the variability study of the seeds, components PC1 and PC2 were used, representing 33.4% and 13.6% of the total variation, respectively. Table 3 shows the components of each eigenvector. These coefficients are used to transform the values of the characteristics of each seed into a single value for each principal component. In this way, each seed is represented as a function of components PC1 and PC2 in Figure 4; this graph, known as a biplot, also shows, using dotted lines, the correlation of the original variables with each principal component.
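The transformation from descriptors to PC1 and PC2 scores can be sketched as below. The sketch works on the correlation matrix, as the title of Table 3 indicates, but it uses power iteration with deflation purely for illustration; the source does not say which algorithm or software was used. All function names and data are hypothetical.

```python
import math


def standardize(col):
    """Centre a descriptor and scale it to unit sample standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [(v - mean) / sd for v in col]


def correlation_matrix(z):
    """Correlation matrix of already standardized descriptor columns."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z]
            for zi in z]


def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and unit eigenvector by power iteration."""
    p = len(matrix)
    # Asymmetric start so it is unlikely to be orthogonal to the target vector.
    vec = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        new = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    value = sum(vec[i] * matrix[i][j] * vec[j]
                for i in range(p) for j in range(p))
    return value, vec


def first_two_components(columns):
    """columns: one list per descriptor, one entry per accession."""
    z = [standardize(c) for c in columns]
    work = correlation_matrix(z)
    p = len(work)
    pairs = []
    for _ in range(2):
        value, vec = leading_eigenpair(work)
        pairs.append((value, vec))
        # Deflation: remove the component just found before the next search.
        work = [[work[i][j] - value * vec[i] * vec[j] for j in range(p)]
                for i in range(p)]
    # The trace of a correlation matrix equals the number of descriptors.
    explained = [value / p for value, _ in pairs]
    n = len(columns[0])
    scores = [[sum(z[j][k] * vec[j] for j in range(p)) for _, vec in pairs]
              for k in range(n)]
    return explained, pairs, scores


# Hypothetical descriptors for five accessions: seed yield, oil yield, oil content.
cols = [
    [1200.0, 1500.0, 1900.0, 2400.0, 3100.0],
    [600.0, 700.0, 950.0, 1100.0, 1400.0],
    [50.0, 47.0, 50.0, 46.0, 45.0],
]
explained, pairs, scores = first_two_components(cols)
print([round(e, 3) for e in explained])
```

Each row of `scores` gives the (PC1, PC2) coordinates of one accession, which is what a biplot such as Figure 4 displays. The sign of each eigenvector is arbitrary, so the axes may appear mirrored relative to another implementation.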
As can be seen, most of the correlation lines lie around the positive axis of PC1. The variables with the highest positive correlation with PC1 are Re, ra and oil1-oil7. Along PC1, the seed accessions are ordered from lowest to highest, left to right, principally in accordance with their agronomic yield of oil and seed. In contrast, PC2 exhibits a high positive correlation with PA and a high negative correlation with CH. It can be seen in Figure 4 that the accessions with the greatest seed and oil yield are k93B and k75B, whereas the local accession A10 has the lowest. Moreover, accessions A13 and A553 produce seeds with the highest oil content, and hybrids k75G and k93G, commercially grown in Mexico, show the lowest.
A cluster analysis of the seed accessions, based on their representation as a function of PC1 and PC2, was carried out. The resulting clusters are represented in Figure 4 as ellipses. As Figures 3 and 4 show, both multivariate grouping methods were able to discriminate and separate the accessions of commercial hybrid seeds from the local accessions. In both methods, the local seeds group together in different heterogeneous groups. In particular, A1 and A553 were unique and distant from the other accessions using UPGMA; these accessions may be desirable parents for cross-breeding in the castor plant. In the dendrogram, A5 had been identified as representing the average properties of a group of local accessions. In the PCA, A5 is again found close to the centroid of the cluster in the second quadrant of the PC1-PC2 plane. Accessions A6, A7, A8, A9, A10, A11 and A12 can again be observed in this cluster. Both grouping techniques show that the accessions of castor plant seeds exhibit high dispersion in their descriptors.

CONCLUSIONS
In this article, 22 accessions (18 local seeds from a national collection in Mexico and 4 hybrid seeds) of Ricinus communis were evaluated from three aspects: agronomic yield, seed composition and oil quality, using 22 descriptors. In the descriptive statistical analysis, high variability was found; the variation coefficients of the descriptors ranged from 3.05 to 99.34%, with an average of 32.45%. This explains why the grouping techniques form dissimilar sets of seeds. The two clustering algorithms used were able to identify and separate the commercial hybrid accessions from the local wild varieties. With both clustering methodologies, it was possible to identify 5 groups; the hybrid varieties were located in 2 identical groups. UPGMA identified two notable accessions of local seeds, A1 and A553, whose properties situate them as extreme opposites among all the seeds studied.
Castor seeds were described and classified by PCA in two principal components: PC1, correlated with the crop yields of oil and seed, and PC2, correlated with the oil content and seed composition. The hybrid seeds exhibit high crop yields but lower oil content. A group of local accessions with medium agronomic yields and high seed oil content (1944 kg ha⁻¹ and 48.27%) was identified: A1, A3, A4, A13, A15 and A553.
This methodology can be extended to a greater number of castor seed accessions to obtain groups with a higher number of individuals and more defined, homogeneous characteristics. This analysis technique can also be extended to correlate seed characteristics with their agricultural performance, such as germination, adaptation to climatic conditions and resistance to pathogens.

Figure 1. Variation coefficients of the descriptors of the castor plant seed accessions studied.

Figure 2. Euclidean distance with respect to the averages of the normalized descriptors for the accessions of the castor plant.

Figure 3. Dendrogram obtained from clustering analysis using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) based on Euclidean distances.

Figure 4. Distribution and grouping of local and commercial hybrid seeds of Ricinus communis as a function of the principal components of their descriptors.

Table 1. Average of the variables evaluated in the eighteen local seeds of the castor plant in Mexico and the commercial hybrid varieties.

Table 2. Correlation coefficients among the descriptors of the seeds and oil of the castor plant.

Table 3. Principal components analysis of the correlation coefficient matrix.