Assessment of genetic variability, diversity, and identification of promising lines in linseed germplasm for harnessing genetic gain in central plain of the Indian subcontinent

Linseed (Linum usitatissimum L.), cultivated in more than 60 countries, is grown worldwide for fiber and oilseed. This study assessed the genetic variability in linseed germplasm and identified promising lines for use as parents in linseed hybridization programs. A total of 82 germplasm accessions and a national check were evaluated in a randomized complete block design (RCBD) for genetic variability in 11 agronomic traits. Considerable variation was observed for all the studied traits, as also reflected in the principal component analysis (PCA). Single plant yield, number of seeds per capsule, and number of capsules per plant were found to be suitable targets for linseed improvement through selection in central India. A few high-yielding accessions, such as RL-10129 and Padmini, showed maximum diversity with the popular variety T-397 and can be used in hybridization programs. Similarly, we identified a few potential accessions, such as NDL-2013-03, EC-41741, Ruchi, EC-704, and RL-10129, to be used as parents in breeding programs.


INTRODUCTION
Linseed or flax (Linum usitatissimum L.; 2n=30, genome size ~370 Mb) is the only species of agricultural significance within the Linaceae family, which comprises 14 genera and 200 species (Wang et al., 2012; Diederichsen and Richards, 2003). This self-pollinated crop has been cultivated for centuries, primarily for its seed oil (linseed), its stem fibers (flax), or both (Zohary, 1999). Linseed provides raw materials for food, medicine, and textiles, and has been of great importance to human civilization and development for more than 8,000 years (Van and Bakker, 1975). Crop evolutionists consider linseed domestication to have occurred in the Old World, in a region that includes modern-day Egypt, Palestine, Israel, Iraq, Syria, Turkey, Iran, Lebanon, Cyprus, and Jordan (Fu, 2011), from where the crop spread to Switzerland and Germany in Europe about 5,000 years ago (Barber, 1991). Evidence suggests that the crop was also grown in China and India at least 5,000 years ago (Cullis, 2007).
Linseed is valued for its health benefits, which are attributed primarily to its high content of dietary fiber (20-25%) and omega-3 alpha-linolenic acid (45-65%) (Green and Marshall, 1981; Rabetafika et al., 2011). The excellent drying properties of the oil also make it suitable for use as an industrial product (Cullis, 2007). In addition, flax fiber is a valuable raw material for textiles, threads, and wrapping materials; the straw is used to manufacture cigarette papers and currency notes, and the woody component serves as a biomass energy source (Rowland, 1998). Various farms also include linseed in animal feed to increase the linoleic acid content of eggs and meat (Simmons et al., 2011).
Globally, Kazakhstan, Canada, Russia, China, India, USA, Ethiopia, France, and the UK are the primary linseed-producing nations. Kazakhstan is the world's largest linseed producer (0.93 Mtonnes), followed by Canada, while India (0.17 Mtonnes) ranks 5th (FAOSTAT, 2018). In India, linseed is grown predominantly as an industrial oilseed crop under marginal and rainfed conditions, covering an area of 0.32 million ha with a production of 0.174 million tons, compared to 3.26 million ha worldwide producing 3.182 million tons. India's productivity is considered very poor at 543.8 kg/ha, compared to the world average yield of 975.1 kg/ha and the average yields of the top producers of this crop: the UK (1720 kg/ha), USA (1516.8 kg/ha), Canada (1497 kg/ha), China (1308.6 kg/ha), and Kazakhstan (866.9 kg/ha) (FAOSTAT, 2018). This yield disparity may be due to low yield potential, a lack of optimum agro-technological practices, or a combination of both, as well as to the limited availability of improved varieties suited to varied agro-climatic conditions (Singh et al., 2016).
Developing high-yielding varieties is therefore the top priority for addressing the low yield levels, and improvement in any crop depends on access to a wide range of genetic diversity. The development of a new variety depends primarily on selecting diverse populations with a broad genetic base. Identifying promising genotypes is useful throughout the breeding process, from the choice of initial parent lines to the final release of a variety. Modern linseed improvement has, however, lagged behind that of other oilseed crops, such as soybean and Brassica oilseeds. The introduction of new germplasm is needed to broaden the genetic base and rejuvenate the breeding stocks.
Yield, a complex polygenic trait, is influenced by a large number of factors. The assessment of genetic variability among linseed accessions would provide a better resource and direction for germplasm utilization in linseed genetic improvement. The present study was, therefore, conducted to evaluate the variability present in linseed germplasm for the central part of the Indian subcontinent, the patterns of interrelationship between different traits, and important selection parameters.

Plant materials and phenotyping
The experimental materials consisted of a collection of 82 linseed accessions. We laid out the field experiment during the post-rainy season of 2015-16 at the Seed Breeding Farm, Department of Plant Breeding and Genetics, College of Agriculture, Jabalpur, M.P., India. The All India Coordinated Research Project (AICRP) on linseed, based at the Regional Agricultural Research Station, Sagar, M.P., India, provided the seed materials for this study. We planned the experiment in a randomized complete block design (RCBD) with 2 replications; each plot consisted of 2 rows, 2.5 m long and spaced 30 cm apart, with a plant-to-plant distance of 10 cm. All the recommended agronomic practices were strictly followed to raise a healthy crop. We collected data on 11 agronomic traits, viz., days to flowering (DF), days to maturity (DM), plant height (PH), no. of primary branches (NPB), no. of secondary branches (NSB), no. of capsules per plant (NCP), no. of seeds per capsule (NSC), 1000-seed weight (TSW), biological yield per plant (BM), harvest index (HI), and seed yield per plant (SYP). Five competitive plants were selected randomly from each entry for recording observations.

Statistical analysis
For agronomic traits, best linear unbiased predictors (BLUPs) were obtained, and the range and mean were calculated based on the BLUPs. Phenotypic correlations were estimated in GenStat 15 to determine trait associations. Path analysis was performed to estimate the direct effects of the traits on grain yield using R Version 3.5.3 (R Project for Statistical Computing, http://www.rproject.org/) (R Core Team, 2018). To avoid multicollinearity issues, the independent trait biological yield per plant (BM) was excluded from the path analysis. Based on the agronomic traits, a Euclidean dissimilarity matrix was constructed using the R package cluster (Patterson and Thompson, 1971); thereafter, the accessions were clustered following Ward's method. The most diverse accession pairs were identified from the Euclidean distance matrix for potential use as parents in linseed crossing programs.
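As an illustration of the path analysis step, direct effects can be obtained by solving the system R p = r, where R holds the correlations among the predictor traits and r their correlations with seed yield. The sketch below is a minimal Python version and is not the R code used in the study, which is not reproduced here. The helper names (`solve_linear_system`, `path_coefficients`) are hypothetical, and the three-trait example uses only the pairwise correlations for NPB, NSB, and NCP quoted in the correlation results, so its output is illustrative and does not correspond to the path coefficients of the study.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def path_coefficients(r_xx, r_xy):
    """Return direct effects, indirect effects, and the residual effect."""
    n = len(r_xy)
    direct = solve_linear_system(r_xx, r_xy)
    # Indirect effect of trait i through trait j is r_ij * direct_j.
    indirect = [
        [r_xx[i][j] * direct[j] if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    explained = sum(p * r for p, r in zip(direct, r_xy))
    residual = math.sqrt(1.0 - explained)
    return direct, indirect, residual


# Illustrative subset: NPB, NSB, NCP as predictors of SYP.
R_XX = [
    [1.00, 0.40, 0.44],
    [0.40, 1.00, 0.68],
    [0.44, 0.68, 1.00],
]
R_XY = [0.48, 0.59, 0.79]

direct, indirect, residual = path_coefficients(R_XX, R_XY)
for name, p, row in zip(["NPB", "NSB", "NCP"], direct, indirect):
    print(f"{name}: direct={p:.3f}, total indirect={sum(row):.3f}")
print(f"residual effect={residual:.3f}")
```

A convenient check on any such decomposition is that the direct effect of a trait plus its indirect effects sums back to its correlation with yield.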

Variance components, genetic parameters, and trait variability
The REML analysis showed significant genotypic variance (σ²g) among the linseed germplasm for all 11 agronomic traits, indicating considerable variability among the accessions. The phenotypic coefficient of variation (PCV) was higher than the corresponding genotypic coefficient of variation (GCV) for all the traits. Eight traits showed large phenotypic and genotypic variations, with PCV and GCV values greater than 10% (Table 1). Three traits had PCV exceeding 30%, four ranged from 20 to 30%, one had a PCV of 11.17%, and three had values below 10%. SYP and NCP had the largest PCV and GCV values (34.80 and 29.95% for SYP; 31.95 and 29.65% for NCP). Broad-sense heritability was high for all the traits studied (68.26-98.78%). Genetic advance as a percentage of the mean was highest for NCP (56.67%), followed by SYP (53.11%), BM (48.71%), and NSB (46.99%).
Considerable variation in flowering time was observed in the germplasm (42-69 days). Four genotypes flowered earlier than the popular check variety T-397, which flowered in 51 days. Considerable variation was also found among the germplasm for PH (45.94-82.91). Similarly, the number of capsules per plant was much higher in some accessions (up to 66 capsules per plant, e.g., Rashmi) than in T-397 (33 capsules per plant).
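The genetic parameters reported in this section follow from the variance components. The study does not spell out its formulas, so the sketch below assumes the conventional definitions: GCV and PCV are the genotypic and phenotypic standard deviations expressed as a percentage of the trait mean, broad-sense heritability is σ²g/σ²p, and genetic advance is k·H²·σp with k = 2.06 (5% selection intensity). The function name and the example inputs are hypothetical; the inputs are scaled to a trait mean of 100 so that they reproduce the PCV and GCV reported for NCP.

```python
import math

K_5_PERCENT = 2.06  # assumed selection differential at 5% selection intensity


def genetic_parameters(trait_mean, var_g, var_p, k=K_5_PERCENT):
    """Compute GCV, PCV, broad-sense heritability, and genetic advance."""
    gcv = 100.0 * math.sqrt(var_g) / trait_mean
    pcv = 100.0 * math.sqrt(var_p) / trait_mean
    h2 = var_g / var_p                      # broad-sense heritability (0-1)
    ga = k * h2 * math.sqrt(var_p)          # genetic advance in trait units
    gam = 100.0 * ga / trait_mean           # genetic advance as % of mean
    return {"GCV": gcv, "PCV": pcv, "H2": 100.0 * h2, "GA": ga, "GAM": gam}


# Hypothetical inputs: with a mean of 100, the variances are simply the
# squares of the GCV (29.65%) and PCV (31.95%) reported for NCP.
ncp = genetic_parameters(trait_mean=100.0, var_g=29.65 ** 2, var_p=31.95 ** 2)
for key, value in ncp.items():
    print(f"{key}: {value:.2f}")
```

Under these assumptions the example gives a genetic advance of about 56.7% of the mean, in line with the 56.67% reported for NCP; the same calculation with the SYP values gives about 53.1%.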

Correlation analysis
Correlation analysis showed no association of the phenological traits, viz., DM (r=0.05) and DF (r=0.07), with SYP. Similarly, PH showed no significant correlation with SYP (r=0.18). On the other hand, both NPB (r=0.48**) and NSB (r=0.59**) showed a significant positive correlation with SYP. NCP emerged as one of the most important indirect traits for selecting high-yielding lines, as it showed a strong correlation with SYP (r=0.79**). Similarly, BM was important in determining seed yield (r=0.82**). TSW and HI also showed significant positive correlations with SYP (r=0.28** and r=0.45**, respectively) (Figure 1). Among the yield-contributing traits, NPB was significantly correlated with NSB (r=0.40**). Similarly, the branch traits NPB (r=0.44**) and NSB (r=0.68**) showed a significant positive correlation with NCP.
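For readers who wish to reproduce this kind of trait association outside GenStat, the sketch below computes a correlation coefficient and the t statistic commonly used to test it (n - 2 degrees of freedom). It assumes that the reported phenotypic correlations are Pearson coefficients, which the study does not state explicitly. The function names and the six data points are hypothetical and are not values from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0 with n - 2 degrees of freedom (|r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical capsule counts and seed yields for six plants.
capsules = [33, 41, 52, 66, 38, 47]
seed_yield = [2.1, 2.9, 3.4, 4.6, 2.2, 3.5]

r = pearson_r(capsules, seed_yield)
print(f"r = {r:.3f}, t = {t_statistic(r, len(capsules)):.2f}")
```

The t statistic is then compared with the critical value of the t distribution for the chosen significance level.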

Principal component analysis
Principal component analysis was performed on the predicted means (BLUPs) of the quantitative traits of linseed. Of the 11 principal components (PCs), only four had eigenvalues greater than 1.00, and together they explained about 79% of the total phenotypic variability (Table 2). PC1 explained the largest share of the variability (34.11%), followed by PC2 (20.39%), PC3 (14.01%), and PC4 (10.50%). The first two principal components together accounted for 54.51% of the total phenotypic variability. SYP, BM, NCP, NSB, NPB, and NSC were the main contributing traits in PC1. In contrast, PH, DM, and DF contributed to PC2 (Figure 2).
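The relation between the eigenvalue criterion and the percentages above can be checked with simple arithmetic. The sketch below assumes that the PCA was run on standardized traits (a correlation matrix), in which case the eigenvalues sum to the number of traits and each eigenvalue equals the explained proportion times 11. This is an assumption, since the study does not state the scaling; the variable names are hypothetical.

```python
N_TRAITS = 11  # number of agronomic traits entering the PCA

# Percentage of phenotypic variability explained, as reported in the text.
explained_pct = {"PC1": 34.11, "PC2": 20.39, "PC3": 14.01, "PC4": 10.50}

# Under the correlation-matrix assumption: eigenvalue = proportion * n_traits.
eigenvalues = {pc: pct / 100.0 * N_TRAITS for pc, pct in explained_pct.items()}

# Running total of explained variability.
cumulative = {}
running = 0.0
for pc, pct in explained_pct.items():
    running += pct
    cumulative[pc] = round(running, 2)

# Components kept by the eigenvalue > 1 rule.
retained = [pc for pc, value in eigenvalues.items() if value > 1.0]

for pc in explained_pct:
    print(f"{pc}: eigenvalue={eigenvalues[pc]:.2f}, cumulative={cumulative[pc]:.2f}%")
print("retained:", retained)
```

Under this assumption all four components have implied eigenvalues above 1, and the rounded percentages sum to 79.01% for four PCs and to 54.50% for the first two, which agrees with the reported 54.51% to within rounding.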

Cluster analysis
The hierarchical cluster analysis following Ward's method resulted in 10 clusters (Table 3 and Figure 3). Cluster 3 was the largest, consisting of 25 lines, followed by cluster 2 (10 lines) and cluster 5 (10 lines). Cluster 6 had only three genotypes, all of them high-yielding lines: EC-41741, NDL-2013-03, and Shikha.
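To make the grouping step concrete, the following is a minimal pure-Python sketch of agglomerative clustering with Ward's minimum-variance criterion, using the Lance-Williams update on squared Euclidean distances. It is not the R routine used in the study, and the study does not state which Ward variant was applied. The function names and the five two-dimensional points are hypothetical.

```python
from itertools import combinations


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ward_clusters(points, n_clusters):
    """Merge points with Ward's criterion until n_clusters groups remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    # Each cluster id maps to the indices of its member points.
    clusters = {i: [i] for i in range(len(points))}
    dist = {
        frozenset((i, j)): squared_distance(points[i], points[j])
        for i, j in combinations(clusters, 2)
    }
    next_id = len(points)
    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest Ward distance.
        pair = min(dist, key=dist.get)
        i, j = tuple(pair)
        d_ij = dist.pop(pair)
        n_i, n_j = len(clusters[i]), len(clusters[j])
        merged = clusters.pop(i) + clusters.pop(j)
        # Lance-Williams update of the distance to every other cluster k.
        for k, members in clusters.items():
            n_k = len(members)
            d_ki = dist.pop(frozenset((k, i)))
            d_kj = dist.pop(frozenset((k, j)))
            dist[frozenset((k, next_id))] = (
                (n_i + n_k) * d_ki + (n_j + n_k) * d_kj - n_k * d_ij
            ) / (n_i + n_j + n_k)
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical standardized trait values for five accessions.
POINTS = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
groups = ward_clusters(POINTS, n_clusters=3)
print(groups)
```

In practice the number of clusters is chosen from the dendrogram; here it is fixed in advance only to keep the example short.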
The Euclidean distance matrix was used to identify the most diverse genotypes among the linseed germplasm, as well as the genotypes most similar to and most distant from the popular variety T-397. Rashmi was the most distant from T-397 (8.6), while SLS-95 was the most similar to it (1.8). Among the 82 genotypes, the most diverse pair of accessions was RL-10129 and Padmini, with a distance of 11.02. The top 10 most diverse pairs of accessions are listed in Table 3.
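The pairwise comparison described above can be sketched as follows. The example standardizes each trait before computing Euclidean distances, which is an assumption, since the study does not say whether the traits were scaled. The accession labels (`A1` to `A4`, `CHECK`) and trait values are hypothetical and are not data from the study.

```python
import math
from itertools import combinations
from statistics import mean, stdev

# Hypothetical values of three traits for a check and four accessions.
TRAITS = {
    "CHECK": [50, 60, 4.0],
    "A1": [51, 62, 4.2],
    "A2": [42, 80, 7.5],
    "A3": [69, 46, 2.1],
    "A4": [55, 70, 5.0],
}


def standardize(table):
    """Scale every trait (column) to zero mean and unit standard deviation."""
    columns = list(zip(*table.values()))
    stats = [(mean(col), stdev(col)) for col in columns]
    return {
        name: [(value - m) / s for value, (m, s) in zip(values, stats)]
        for name, values in table.items()
    }


def distance_matrix(scaled):
    """Euclidean distance for every unordered pair of accessions."""
    return {
        (a, b): math.dist(scaled[a], scaled[b])
        for a, b in combinations(scaled, 2)
    }


scaled = standardize(TRAITS)
distances = distance_matrix(scaled)

# Most diverse pair overall: candidate parents for a cross.
most_diverse_pair = max(distances, key=distances.get)

# Accessions ranked by their distance from the check variety.
from_check = sorted(
    (math.dist(scaled["CHECK"], scaled[name]), name)
    for name in scaled
    if name != "CHECK"
)
nearest_to_check = from_check[0][1]

print("most diverse pair:", most_diverse_pair)
print("nearest to check:", nearest_to_check)
```

Sorting the full distance dictionary in descending order would give a ranked list of diverse pairs analogous to the top 10 pairs reported by the study.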

DISCUSSION
In the present study, we observed large genetic variation for all the important agronomic traits, viz., DF, DM, PH, NCP, TSW, and SYP. The high GCV and PCV values for NPB, BM, and SYP indicate that selection will be rewarding for these traits (Tyagi et al., 2014; Reddy et al., 2013; Mirza et al., 2011; Tadesse et al., 2010). The traits SYP, NSC, and NCP showed high heritability along with high genetic advance. This suggests that additive gene action controls these traits, and that simple selection for them may be successful (Payasi et al., 2000; Naik and Satapathy, 2002; Muhammad et al., 2003; Awasthi and Rao, 2005; Vardhan and Rao, 2006; Iqbal et al., 2013). The study revealed no correlation between TSW and NCP, which indicates an excellent opportunity for independent improvement of the NSC and seed size. The absence of correlation between DM and BM implies that early-maturing, high-yielding lines exist, e.g., EC-704 and SLS-91. Since there was no association between PH and SYP, PH would be of no use as a selection criterion for grain yield in linseed.

[Table: for each trait, the number of lines significantly better than T-397 (range), the value for T-397, and the top five performing lines against the mega cultivar T-397 (range)]
Linseed breeders can use traits such as NPB, NSB, NCP, NSC, TSW, and BM as indirect selection criteria for enhancing grain yield, as these showed a significant positive association with grain yield. Furthermore, independent improvement of NSC and NCP is possible, as there was no correlation between them (Pal et al., 2000; Chimurkar et al., 2001; Bhosle, 2002; Naik and Satapathy, 2002; Akbar et al., 2003; Bhosle and Rao, 2005; Vardhan and Rao, 2006). Based on the cluster analysis, genotypes from Cluster V and Cluster VIII can be utilized in future hybridization programs for the highest grain yield, as they had the highest mean values for NSC and TSW. Germplasm and other sources of genetic diversity are drawn upon periodically to meet the changing needs for improved crop varieties. Besides, there must be significant variation for economic traits in the germplasm for it to be used productively in recombination breeding or selection. Optimal parental diversity is needed to recover transgressive segregants and thereby obtain superior genotypes (Griffing and Lindstrom, 1954; Moll et al., 1962). The genetic diversity of selected parents does not always depend on factors such as geographic diversity, place of release, or degree of ploidy.
Therefore, the classification of germplasm for genetic divergence should be based on sound statistical methods, such as D² statistics and cluster analysis, to identify suitable and diverse genotypes. Cluster analysis categorizes lines into distinct groups or clusters, such that genotypes in different clusters are more diverse than those within a cluster (Ward, 1963), and is useful in selecting the most diverse genotypes to be used as parents in crossing programs. Besides, knowledge about the similarity or dissimilarity between accessions and check cultivars is vital for the efficient use of the accessions in hybridization programs.
Including diverse accessions in the hybridization program is very beneficial, as this leads to new and useful recombinants that can be released as varieties. Based on yield and phenological traits, the cluster analysis grouped the 82 lines into 10 clusters, with similar accessions falling in the same cluster. This makes it easier for breeders to choose trait-specific and diverse accessions for use in a breeding program. Apart from this, a few accessions such as RL-10129 and Padmini, which showed maximum diversity with the popular variety T-397, have been identified for use in breeding programs to develop new cultivars.
To understand the potential of this germplasm for improving cultivated linseed, we compared the performance of these lines with that of the popular check variety T-397. In this study, apart from identifying 18 highly significant promising lines, we observed many trait-specific significant lines. We identified many lines with early flowering, bold seeds, and a higher number of primary branches as well as capsules per plant. Breeders can therefore select trait-specific significant lines and use them in crop improvement programs as needed. For example, early-flowering lines such as EX313-23, FRW-9, and RLC-140 can be used in breeding programs for earliness. Similarly, lines with a higher NCP can be used for breeding promising lines in linseed. Remarkably, the high-yielding and early-flowering genotype EC-704 suggests that there need be no yield trade-off when breeding early, high-yielding lines from this selection.

CONCLUSION
This evaluation of linseed germplasm for variability and identification of promising lines revealed considerable genetic variation at the studied location. High genetic gain for this crop would be possible in this region through the improvement of traits such as single plant yield, number of capsules per plant, and number of seeds per capsule. This study identified some lines, viz., NDL-2013-03, EC-41741, EC-704, and RL-10129, that have the potential to be used as parents in breeding programs and to be released as varieties after multilocation evaluation over years. Also, utilizing these lines as parents in hybridization programs would diversify the currently available germplasm. The high-yielding lines should also be evaluated for their flax fiber content and linoleic acid content for further use in the breeding program.