Identification of superior soybean lines by assessing genetic parameters and path analysis of grain yield components

This study estimated genetic parameters, identified traits directly and indirectly correlated with grain yield through path analysis, and selected superior lines to maximize genetic gain. Two experiments were conducted in the 2012/2013 season in randomized blocks with three replications, with 23 lines (experiment I) and 44 lines (experiment II). For all traits analyzed in both experiments, the lines differed significantly by the F test (P < 0.05). Estimates of genetic parameters identified plant height at maturity and 100-seed weight as the traits most favorable to selection in both experiments, with heritability values above 0.5. The genotypic correlations and the path analysis indicated that plant height at physiological maturity (PHM) had the most favorable direct effect on grain yield (GY) in both experiments. Indirect selection for GY via PHM is therefore considered effective. The superior lines identified were 10, 15, 17, 20, 32, 41, 48, and 57.


INTRODUCTION
Soybean breeding programs in Brazil aim mainly at increasing oil and protein content and grain yield in order to ensure competitiveness in the world market (Batista et al., 2015). This has been achieved through continuous genetic improvement, which has contributed to the development of high-yielding soybean cultivars adapted to the different environments of Brazil (Lima et al., 2008). However, continued breeding progress and the resulting genetic gain in soybean depend on genetic variability and on the use of genetic parameters to make selection more efficient (Hamawaki et al., 2012). Knowledge about genetic parameters such as heritability, genetic gain and genetic correlations also aids in decision making and thus in choosing the most appropriate strategy for improving yield-related traits (Hamawaki et al., 2012; Leite et al., 2016).
*Corresponding author. E-mail: bh.val@hotmail.com.
Author(s) agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License.
Grain yield is a complex trait that results from the expression and interaction of many factors (El-Mohsen et al., 2013; Silva et al., 2015). Knowledge of the degree of this interaction through genetic correlation studies helps to identify traits that can be used for direct and indirect selection for optimum genetic gains (Cruz et al., 2004; Silva et al., 2015). The direct and indirect effects underlying the correlations among traits can be estimated by the path analysis method developed by Wright (1921). It helps to identify the traits that contribute most to the final value of the main trait (Akram et al., 2011; Alcântara Neto et al., 2011).
The objective of this study was to estimate genetic parameters and to identify traits directly and indirectly correlated with grain yield in order to select superior, high-yielding lines.

MATERIALS AND METHODS
Two experiments were conducted in the 2012/2013 season at the Teaching, Research and Extension Farm (TREF), Universidade Estadual Paulista "Julio de Mesquita Filho", Jaboticabal Campus, located in the north of the State of Sao Paulo, at 21° 15' South latitude and 48° 18' West longitude, with an approximate altitude of 595 m. The soil of the experimental area is classified as clayey Red Eutroferric Latosol. The climate is subtropical with a hot and humid summer and a dry winter, with an average annual temperature of 22.2°C (Köppen, 1948).
The first experiment evaluated 23 F7-F8 lines and the second experiment evaluated 44 F6-F7 lines. The division into two experiments was necessary because the two groups were in different generations, although all lines derive from the cross PI200456 × MGBR46 (Conquista). The evaluation used a randomized complete block design with three replications. Each experimental plot consisted of four rows 5 m long, spaced 0.5 m apart. The useful area comprised the two central rows, disregarding 0.5 m from each end, totaling 4 m².
Sowing was done mechanically on November 27, 2012 for the first experiment and on November 29, 2012 for the second experiment, with a seeding density of 15 plants m⁻¹. At planting, 350 kg ha⁻¹ of the formula 00-30-15 were applied, and the experimental plots were maintained throughout the crop cycle with strict control of pests, diseases and weeds, as recommended for the soybean crop (Embrapa, 2013).
All phenotypic data were collected as the mean of six plants when the crop was at the R8 development stage, when approximately 95% of the pods are physiologically mature (Fehr and Caviness, 1977). The following traits were evaluated: height of the first pod (HFP): measured from the base of the stem to the first pod (cm); plant height at physiological maturity (PHM): measured from the base of the stem to the apex of the main stem (cm); number of days to physiological maturity (NDM); lodging (Lg): based on a visual score ranging from 1 (all plants erect) to 5 (all plants lodged); agronomic value (AV): a visual score ranging from 1 (plants with poor agronomic characteristics) to 5 (plants with optimal agronomic characteristics), in which the assigned score represents a set of visual traits (plant architecture, quantity of filled pods, plant vigor and health, premature threshing of pods, lodging and leaf retention at maturity); number of branches (NB): mean number of branches of six plants; number of pods (NP): obtained by counting the pods of each plant and averaging over the plants sampled in the useful area; 100-seed weight (HSW): mean of three samples of 100 seeds; oil content (OC): percentage of oil in the grain, obtained with a Bruker NIR instrument, Tango model; grain yield (GY): obtained by harvesting the plants of the useful area of the plot, which were threshed, with the grain weight corrected to 13% moisture and subsequently converted into kg ha⁻¹.
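The yield conversion described above can be sketched as follows. The moisture correction shown is the usual dry-matter-preserving formula, which is an assumption here because the text does not state the exact formula used; the function name and the plot weight of 1200 g are hypothetical values chosen only for illustration.

```python
def yield_kg_per_ha(plot_grain_g, moisture_pct, plot_area_m2, target_moisture_pct=13.0):
    """Convert plot grain weight (g) at a measured moisture to kg/ha at the target moisture.

    Assumed correction (not stated in the text): keep dry matter constant,
    w_target = w * (100 - m) / (100 - m_target).
    """
    corrected_g = plot_grain_g * (100.0 - moisture_pct) / (100.0 - target_moisture_pct)
    # g per plot -> kg per plot -> kg per hectare (10,000 m2 per hectare)
    return corrected_g / 1000.0 * 10000.0 / plot_area_m2


# Useful area from the plot description: 2 central rows x 0.5 m spacing
# x (5 m row length - 0.5 m discarded at each end) = 4 m2
useful_area_m2 = 2 * 0.5 * (5.0 - 2 * 0.5)

# Hypothetical plot: 1200 g of grain already at 13% moisture
example_yield = yield_kg_per_ha(1200.0, 13.0, useful_area_m2)
print(useful_area_m2, example_yield)
```

With grain already at 13% moisture the correction factor is 1, so the example only exercises the area scaling.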

Data analysis
Data were analyzed by analysis of variance using the MIXED procedure of SAS® 9.3 (2011). Data were transformed where necessary.
Heritability coefficients and genetic correlations were estimated (Falconer, 1987), and path analysis was then performed. GY was chosen as the main variable in order to partition its correlations with the other traits into direct and indirect effects (Wright, 1921).
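As a minimal sketch of how path analysis partitions correlations, the direct effects can be obtained by solving the normal equations R p = r, where R holds the correlations among explanatory traits and r their correlations with the main variable. The correlation values below are hypothetical and do not come from Tables 3 to 5, and the helper names are illustrative; this is not the routine implemented in the Genes software.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def path_analysis(r_xx, r_xy):
    """Return direct effects, indirect effects and R2 for one main variable."""
    direct = solve_linear(r_xx, r_xy)
    n = len(direct)
    # indirect[i][j]: effect of trait i on the main variable via trait j
    indirect = [[r_xx[i][j] * direct[j] if i != j else 0.0 for j in range(n)]
                for i in range(n)]
    r_squared = sum(p * r for p, r in zip(direct, r_xy))
    return direct, indirect, r_squared


# Hypothetical correlations among three explanatory traits and with the main variable
r_xx = [[1.0, 0.3, 0.2],
        [0.3, 1.0, 0.4],
        [0.2, 0.4, 1.0]]
r_xy = [0.5, 0.6, 0.3]
direct, indirect, r_squared = path_analysis(r_xx, r_xy)
print(direct, r_squared)
```

For each trait, the direct effect plus the sum of its indirect effects reproduces its total correlation with the main variable, which is the identity used when interpreting path analysis results.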
Then, the selection index based on the sum of "ranks" (Mulamba and Mock, 1978) was applied. The selection intensity was 30%, and selection was based on the traits GY, AV and OC. Path analysis and the selection index were processed in the software Genes (Cruz, 2013).
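The rank-sum index can be illustrated with a short sketch: each line is ranked within each trait, the ranks are summed (optionally with economic weights), and the lines with the lowest totals are selected. The ten lines and their values below are hypothetical, all three traits are assumed to be selected for higher values, and ties are not given any special treatment.

```python
def rank_sum_index(lines, traits, weights=None):
    """Sum of ranks per line; rank 1 is the best (highest) value of a trait."""
    weights = weights or {t: 1.0 for t in traits}
    totals = {name: 0.0 for name in lines}
    for t in traits:
        ordered = sorted(lines, key=lambda n: lines[n][t], reverse=True)
        for rank, name in enumerate(ordered, start=1):
            totals[name] += weights[t] * rank
    return totals


def select_top(totals, intensity=0.30):
    """Keep the fraction of lines with the lowest rank sums."""
    n_keep = max(1, round(len(totals) * intensity))
    return sorted(totals, key=lambda n: totals[n])[:n_keep]


# Hypothetical data: GY in kg/ha, AV as a 1-5 score, OC in %
lines = {
    "A": {"GY": 3500, "AV": 4.0, "OC": 20.1},
    "B": {"GY": 3400, "AV": 4.5, "OC": 19.5},
    "C": {"GY": 3300, "AV": 3.5, "OC": 21.0},
    "D": {"GY": 3200, "AV": 5.0, "OC": 18.9},
    "E": {"GY": 3100, "AV": 3.0, "OC": 20.5},
    "F": {"GY": 3000, "AV": 2.5, "OC": 19.0},
    "G": {"GY": 2900, "AV": 3.8, "OC": 18.5},
    "H": {"GY": 2800, "AV": 2.0, "OC": 20.8},
    "I": {"GY": 2700, "AV": 1.5, "OC": 18.0},
    "J": {"GY": 2600, "AV": 2.8, "OC": 19.8},
}
totals = rank_sum_index(lines, ["GY", "AV", "OC"])
selected = select_top(totals, intensity=0.30)
print(selected)
```

In this toy data set, line A is best for GY but only third or fourth for the other traits, yet it still has the lowest rank sum, which shows how the index balances several traits at once.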

RESULTS AND DISCUSSION
For all traits analyzed in both experiments, there was a significant difference between genotypes by the F test at 5% probability. The experimental coefficients of variation ranged from 2.18% (OC) (Table 1, experiment I) to 32.09% (GY) (Table 2, experiment II).
The ratio between the genetic coefficient of variation (gCV) and the environmental coefficient of variation (eCV) was above 1 for HFP, PHM, NDM, AV, NB, NP, HSW, OC and GY (Table 1) and for PHM and HSW (Table 2), indicating a favorable situation for obtaining gains with selection on these traits. The broad-sense heritability coefficient was 22.11% for Lg (Table 1, experiment I), a low value indicating little expected gain; the trait does not present a favorable situation for selection, as the gCV/eCV ratio was below 1. This will not cause major problems, since the population mean of the trait is 1.25, very close to the ideal value of 1. The same is true for Lg in the second experiment (Table 2).
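To make the link between variance components, the gCV/eCV ratio and heritability concrete, the sketch below computes them from a genotypic variance, a residual variance and a trait mean. The formulas are the common textbook definitions (heritability on a line-mean basis with three replications) and are an assumption here, since the text does not spell out the estimators; the input values are hypothetical.

```python
import math


def genetic_parameters(var_g, var_e, mean, reps=3):
    """Coefficients of variation (%), their ratio and mean-basis heritability.

    Assumed definitions: gCV = 100 * sqrt(var_g) / mean,
    eCV = 100 * sqrt(var_e) / mean, h2 = var_g / (var_g + var_e / reps).
    """
    cv_g = 100.0 * math.sqrt(var_g) / mean
    cv_e = 100.0 * math.sqrt(var_e) / mean
    h2 = var_g / (var_g + var_e / reps)
    return {"gCV": cv_g, "eCV": cv_e, "ratio": cv_g / cv_e, "h2": h2}


# Hypothetical variance components for a plant height trait (cm2) with mean 80 cm
params = genetic_parameters(var_g=40.0, var_e=30.0, mean=80.0)
print(params)
```

Under these definitions a ratio above 1 means the genotypic variance exceeds the residual variance, which is why it is read as a favorable situation for selection.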
The heritability coefficients for the traits HFP, PHM, NDM, AV, NB, NP, HSW, OC and GY in experiment I (Table 1) showed medium to high values, a favorable situation for selection, as significant gains are possible in these traits.
For the second experiment, the heritability coefficients were low for all traits except PHM and HSW, which showed higher values. These two were also the only traits that presented a favorable situation for selection. In experiment II, gains with selection therefore tend to be low, since eight of the ten traits evaluated showed low heritability and an unfavorable situation for selection (Table 2). Leite et al. (2015) obtained heritability estimates of 70.60% for HFP and 79.29% for PHM, which allowed significant estimated gains with selection for both traits. Muniz et al. (2007) found heritability values ranging from 0 to 53% for PHM and from 0 to 19% for GY. These values are very close to those found in this study; those authors obtained low estimated gains with selection for the same traits, ranging from 0 to 6.87% for PHM and from 0.57 to 5.76% for GY, indicating that the situation is not favorable to selection for all populations for the two traits. Leite et al. (2016) found heritability values of 85.00, 81.00, 74.00 and 53.00% for PHM, HFP, NP and GY, respectively. These values differ from those observed in this study. Heritability estimates are intrinsic to the study population and may vary widely or hardly at all from one population to another. Moreover, the parameter can vary widely within the same population according to the method used for its estimation. In this work, the authors used restricted maximum likelihood, which is the most appropriate method for unbalanced data.
It appears that the estimates of genetic correlations of GY with HFP, PHM, NDM and OC in experiment I (Table 3) indicate a significant degree of association and are in the same direction, corroborating the results of Akram et al. (2011) and Nogueira et al. (2012).This indicates that selecting higher values for these traits may result in indirect gain in productivity.
The trait AV presented a genotypic correlation of low magnitude with GY, in the same direction. Correlations of medium to high magnitude also occurred between other traits of agronomic interest, such as AV × HFP (medium correlation, same direction), AV × PHM (high correlation, same direction), NB × AV (medium correlation, opposite direction) and NP × AV (medium correlation, opposite direction) (experiment I).
In experiment II, the correlations of GY with HFP, PHM, and NDM were of low magnitude and same direction and the correlation GY × Lg was of low magnitude and opposite direction, meaning that grain productivity gains may result in reduced lodging.The highest correlation coefficients for the second experiment occurred for PHM × AV, NB × NP, and HFP × AV, all in the same direction.
The low and medium correlations obtained in this study, in the same direction in both experiments, corroborate those found by Leite et al. (2015), who obtained a genotypic correlation of medium magnitude in the same direction for GY × PHM in soybeans. A genetic correlation between GY and PHM in the opposite direction can be of interest when the population mean of PHM is high (close to 100 cm). This is because selecting the most productive plants would also select shorter plants, facilitating mechanized harvesting and favoring a more upright stand, as in many cases PHM and Lg are correlated in the same direction.
In order to identify and quantify the direct and indirect effects that the other traits have on grain yield, we partitioned the genotypic correlation coefficients into direct and indirect effects through path analysis (Wright, 1921). The direct and indirect effects of the explanatory traits on grain yield are shown in Table 4 for the 23 soybean lines (experiment I) and in Table 5 for the 44 soybean lines (experiment II).
The coefficient of determination showed that 63.03% of grain yield (experiment I) is explained by the effect of the traits analyzed based on genotypic correlation matrix, while for the second experiment this figure was only 33.10%.
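The two coefficients of determination reported above can be turned into the residual effect commonly reported alongside path analysis, computed as the square root of 1 − R². The residual effect is not reported in the text, so this is only an illustrative calculation under that standard definition.

```python
import math


def residual_effect(r_squared):
    """Effect of the residual variable in path analysis: sqrt(1 - R2)."""
    return math.sqrt(1.0 - r_squared)


# Coefficients of determination reported for the two experiments
r2 = {"experiment I": 0.6303, "experiment II": 0.3310}
residual = {name: residual_effect(value) for name, value in r2.items()}
for name, value in residual.items():
    print(f"{name}: R2 = {r2[name]:.4f}, residual effect = {value:.3f}")
```

The larger residual effect in experiment II reflects the fact that the evaluated traits explain only about a third of the variation in grain yield there.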
As shown in Table 4 (experiment I), the traits that most influenced GY were PHM, NDM and NP, which had the highest direct effects based on the genotypic correlations. These traits may therefore be suitable and important in soybean breeding, given their strong influence on the final determination of GY. Alcântara Neto et al. (2011) and Nogueira et al. (2012) observed a positive correlation between GY and NP and a direct effect of number of pods on GY, and stated that the number of pods is an option in soybean breeding for indirect selection for grain yield.
The overall correlation between HFP and GY was positive and higher than the direct effect, indicating in this case that the correlation is caused by the indirect effects exerted by the other traits. In experiment I, the trait PHM showed a high correlation with GY (0.58); the direct effect of PHM on GY was equal to the overall correlation, indicating that the total correlation explains the true relationship between the traits and that direct selection on PHM will be effective. Akram et al. (2011) stated that PHM can be regarded as an alternative for indirect selection for GY because it exerts great influence on the final value of grain yield.
The trait NDM in the first experiment showed an overall correlation of 0.57 and a direct effect on GY of 0.41. As the direct effect of NDM on GY is high and close to the overall correlation, we can say that the correlation explains the true relationship between the traits, and in this case direct selection on NDM will also be effective. This result agrees with that obtained by Akram et al. (2011), who reported a positive direct effect of NDM on GY.
Table 6. Averages of the original population (X0) and selected population (Xs), heritability (h² %) and estimates of gains with the selection (GS%) obtained for ten traits by the sum index of "ranks" of Mulamba and Mock (1978).
For the trait AV, the overall correlation was 0.24 and the direct effect on GY was 0.05, indicating that in this case the indirect effects are the cause of the correlation. This was expected, as AV is a visual score assigned by the researcher, taking several other traits into account. Rigon et al. (2012) found overall correlation and negative direct effects on GY for NDM, Lg and PHM, in which the correlation explained the true relationship between the traits. In this case, direct selection on these traits may be effective for indirect gains in GY; the authors found a positive direct effect only for the trait HSW.
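The interpretation used for PHM, NDM and AV in experiment I rests on a simple identity: the total correlation equals the direct effect plus the sum of the indirect effects. A small sketch using the values quoted in the text shows how much of each correlation is left to the indirect paths (the variable names are illustrative).

```python
# (total genotypic correlation with GY, direct effect on GY), experiment I, as quoted in the text
reported = {
    "PHM": (0.58, 0.58),
    "NDM": (0.57, 0.41),
    "AV": (0.24, 0.05),
}


def total_indirect(total_r, direct):
    """Sum of indirect effects implied by total correlation = direct + indirect."""
    return total_r - direct


summary = {trait: round(total_indirect(r, p), 2) for trait, (r, p) in reported.items()}
for trait, value in summary.items():
    print(f"{trait}: indirect effects sum to {value:.2f}")
```

For AV most of the correlation comes through other traits, whereas for PHM nothing is left to the indirect paths, which matches the reading given in the text.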
In Table 5 (experiment II), the trait PHM showed an overall correlation of 0.37 and the most favorable direct effect on GY (0.48), indicating that the correlation explains the true relationship and that direct selection on the trait is effective. No trait presented a pronounced indirect effect. Perini et al. (2012) found, for cultivars with indeterminate growth habit, an overall correlation between HSW and grain yield of -0.785 and a direct effect on grain yield of -0.020, indicating that in this case the correlation does not explain the true relationship and that direct selection on the trait is not effective, which can be confirmed by the indirect effect via number of grains per plant of -0.764. For the trait number of grains per plant, the authors found overall correlations very close to the direct effect on grain yield, which indicates the possibility of using this trait for indirect gains in grain yield.
Regarding the trait AV in the second experiment, the correlation between AV and GY is due to the indirect effect via PHM. The trait Lg showed an overall correlation with GY very close to its direct effect on GY, indicating that the correlation between the two traits explains the true relationship and that direct selection on the trait is effective.
The trait PHM can be used in selection indexes, selecting for higher values, in order to obtain indirect gains in grain yield; the trait Lg can be used as well, but selecting for lower values.
In analyzing the results of the two experiments, it can be seen that the correlations, as well as their direct and indirect effects, are intrinsic to the population under study and cannot be extrapolated to other populations. In view of this, each population (experiment) needs to be studied so that an effective selection strategy can be adopted to maximize the gains according to the purpose of the program.
Regarding the analysis of genetic gains, the index based on the sum of "ranks" of Mulamba and Mock (1978) allowed a total gain of 111.80% distributed over the ten traits in experiment I. The biggest gain was for GY (32.15%), followed by OC and HFP, with 19.40 and 15.06%, respectively. The trait NDM was the one with the lowest gain (0.65%) (Table 6).
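A hedged sketch of how a percentage gain with selection can be computed from the quantities listed in the caption of Table 6 (X0, Xs and h²) is given below. It assumes the usual expression GS% = 100 × (Xs − X0) × h² / X0, which the text does not state explicitly; the numbers are hypothetical and are not taken from Table 6.

```python
def gain_with_selection_pct(x0, xs, h2):
    """Expected gain as a percentage of the original mean.

    Assumed expression: GS% = 100 * (Xs - X0) * h2 / X0, with h2 as a proportion.
    """
    selection_differential = xs - x0
    return 100.0 * selection_differential * h2 / x0


# Hypothetical trait: original mean 3000 kg/ha, selected mean 3600 kg/ha, h2 = 0.60
gs_gy = gain_with_selection_pct(x0=3000.0, xs=3600.0, h2=0.60)

# A total gain over several traits is the sum of the per-trait percentages
hypothetical_gains = {"GY": gs_gy, "OC": 2.5, "AV": 4.0}
total_gain = sum(hypothetical_gains.values())
print(gs_gy, total_gain)
```

Under this expression a trait with a large selection differential but low heritability still contributes little to the total gain.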
Table 7 shows that the index had good applicability in the selection of superior lines, as the direct selection on GY also selected the same lines.
The use of selection indexes allows efficient simultaneous selection of a group of agronomic traits (Cruz, 2013), as they provide gains distributed over all evaluated traits with higher total gains, without significant loss in the main trait (Rezende et al., 2014). The lines that stood out in the trait GY for experiment I were 10, 15, 17 and 20.
For experiment II (Table 8), lower total gains were observed (58.57%) compared with experiment I, which can be explained by the high environmental variances (Table 2), which have an indirect relationship with heritability. The gain for the trait GY was much lower than in the first experiment. The traits PHM, GY and HSW showed the highest gains, which were 15.46, 9.86 and 9.34%, respectively. The lowest gain was observed for NDM (0.19%), and losses of 1 and 0.91% were observed for NB and NP, respectively.
Table 7. Averages of ten traits of the lines selected by the sum index of "ranks" of Mulamba and Mock (1978).
As in the first experiment, the index also had good applicability in the selection of superior lines in experiment II, differing from direct selection on GY only in the last three least productive lines. Lines 41, 57, 32 and 48 showed the highest values for GY in experiment II (Table 9). Costa et al. (2004) observed gains of 46.67% for grain yield using the Mulamba and Mock (1978) index with economic weight 1 and GY, AV and PHM as the main traits. The authors concluded that the Mulamba and Mock index was better suited to the conditions of their work, with greater progress in various situations.
In the selection of superior progenies in popcorn populations, some authors observed higher and more appropriate gains using the Mulamba and Mock (1978) index for expansion capacity and grain yield, with respective values of 7.16 and 10.00% in the third cycle of recurrent selection (Santos et al., 2007); 10.55 and 8.50% in the fourth cycle (Amaral Junior et al., 2010); and 6.01 and 8.53% in the fifth cycle of recurrent selection (Rangel et al., 2011).

Conclusions
Estimates of genetic parameters helped to identify that the traits PHM, HSW, OC and GY in experiment I and PHM and HSW in experiment II are ideal for selection for high grain yield.
Genotypic correlations and the path analysis showed that PHM had the most favorable direct effect on grain yield in both experiments.

Table 1. Mean squares and genetic parameters for Experiment I.

Table 2. Mean squares and genetic parameters for Experiment II.
**Significant at 5% probability by the F test. ¹Transformation Y^0.5. ***Values presented without data transformation. HFP, Insertion height at first pod; PHM, plant height at maturity; NDM, number of days to maturity; Lg, lodging; AV, agronomic value; NB, number of branches; NP, number of pods; HSW, 100-seed weight; OC, oil content; GY, grain yield.

Table 3. Estimates of genotypic correlation coefficients in Experiment I (above the diagonal) and Experiment II (below the diagonal) between ten traits evaluated in advanced soybean lines.

Table 4. Unfolding of genotypic correlations in direct effect components (diagonally) and indirect effect components involving the main dependent trait (GY) and explanatory independent traits (HFP, PHM, NDM, Lg, AV, NB, NP, HSW and OC) for Experiment I.
HFP, Insertion height at first pod; PHM, plant height at maturity; NDM, number of days to maturity; Lg, lodging; AV, agronomic value; NB, number of branches; NP, number of pods; HSW, 100-seed weight; OC, oil content; GY, grain yield.

Table 5. Unfolding of genotypic correlations in direct effect components (diagonally) and indirect effect components involving the main dependent trait (GY) and explanatory independent traits (HFP, PHM, NDM, Lg, AV, NB, NP, HSW and OC) for Experiment II.
HFP, Insertion height at first pod; PHM, plant height at maturity; NDM, number of days to maturity; Lg, lodging; AV, agronomic value; NB, number of branches; NP, number of pods; HSW, 100-seed weight; OC, oil content; GY, grain yield.
in 23 soybean lines (experiment I).