A comparative analysis of distinctness, uniformity and stability (DUS) data in discriminating selected Southern African maize (Zea mays L.) inbred lines

The ability to discriminate germplasm is important for plant breeding as well as for plant variety protection. To achieve this, plant breeders have been using molecular, physiological and biochemical markers to discriminate and group genotypes, and they continue to look for effective, quick and cheap ways of grouping germplasm. This study was therefore carried out to assess the ability of the traits used for determining the distinctness, uniformity and stability (DUS) of new plant varieties, together with agro-morphological characteristics, to differentiate Southern African maize inbreds. Eighteen maize inbred lines were assessed for their variation based on 25 agronomic and 12 DUS traits. The inbred lines were grouped differently depending on whether qualitative traits, quantitative traits or both combined were used. The correlation between the qualitative and quantitative similarity matrices was low (r=0.048) and non-significant, indicating that both qualitative and quantitative traits should be used for effective maize inbred line discrimination. Both the qualitative and quantitative similarity matrices were highly and significantly (p<0.001) correlated with the mixed-data matrix (r=0.82 and r=0.61, respectively). The grouping of the inbred lines based on Principal Component Analysis (PCA) was similar to that obtained from the similarity matrix of the mixed data. The first principal component, which explained 27.8% of the total variation, was due to grain yield and productive parameters. The second component, explaining 13.2% of the total variation, was due to number of tassel branches (TBNo) and tassel length (TL). The Shannon diversity index showed that the inbred lines were diverse in days to silking, ear diameter, days to maturity, shelling percentage and leaf colour. It is concluded that, for effective discrimination of maize inbred lines, both agro-morphological and DUS traits should be used, especially when few inbreds are being considered.


INTRODUCTION
The development of inbred lines and identification of their best hybrid combinations is critical in an inbred-hybrid oriented breeding programme (Ristanovic et al., 1987). However, the process of developing and selecting inbred lines is costly and time-consuming, as extensive yield trials are required to evaluate F1 performance to identify the parental line combinations. Thus breeders make several crosses and evaluate the F1 to identify inbred lines that are heterotic. In this case, inbred lines with desirable agronomic traits are selected for hybridisation and are maintained (Bertan et al., 2007). Hence, phenotypic diversity of parental lines is necessary to achieve high heterosis in hybrids. Therefore, a breeding programme with diverse inbred lines is most likely to deliver superior hybrids.
Morpho-physiological markers have been used to study the genetic diversity in maize (Beyene et al., 2005; Xiang et al., 2010). In addition, morphological characters have been recognized as universally undisputed descriptors for varietal characterization of crop species and for establishing the distinctness, uniformity and stability (DUS) of crop species in Plant Variety Protection (PVP) systems (Begum and Kumar, 2011). The traits used in assessing crop varieties for DUS have been carefully selected taking into account the plasticity of morphological characteristics and thus are efficient for comparing varieties (Law et al., 2011). However, the measurement of morphological traits is expensive, time-consuming and requires more space (Smykal et al., 2008), and trait expressivity is affected by the environment (Bonow et al., 2009) due to gene x environment interaction (Law et al., 2011a). The limitation of using morphological traits is further compounded by the reduction in the variability of morphological traits in elite germplasm (Bonow et al., 2009; Gunjaca et al., 2008) caused by inbred line recycling (Reif et al., 2010; Ristanovic et al., 1985) and essential derivation (White et al., 2006). Pedigree breeding has also been implicated in reducing the genetic variability of maize (Newton et al., 2010; Reif et al., 2010). The reduction in genetic and morphological variability makes it difficult to distinguish varieties (Begum and Kumar, 2011; Bonow et al., 2009). Despite this drawback, phenotypic characterisation of inbred lines is still important for breeding high yielding genotypes (Hung et al., 2012), as heterosis has been reported for morphological traits in sub-tropical maize (Iqbal et al., 2010). Recently, it has been shown that morphological traits are still important in maize characterisation and discrimination (Law et al., 2011a; Law et al., 2011b). Furthermore, there is a genetic basis for morphological differentiation in plants (Cavender-Bares and Pahlich, 2009). In this respect, a method to identify traits that are reliable and robust, with high discrimination ability, has been developed and described (Law et al., 2011b). All these are aimed at improving the methodology of identifying parents for the generation of superior hybrids.

To improve maize productivity in Zambia, a comprehensive maize breeding programme was initiated in 1979 (Mungoma, 1999). At that time, the breeding programme had the task of developing maize hybrids suited to all agro-ecological zones of Zambia. This meant having a breeding programme that ensured a continuous supply of inbreds with better performance and adaptation. Therefore, improved versions of elite inbred lines were used (ZARI, 1987). Inbred line improvement was achieved by recurrent selection. However, recurrent selection reduces the number of alleles and increases genetic differentiation at the expense of loss of heterozygosity (Solomon et al., 2010). Although the improved maize inbred lines that were developed produced hybrids with improved performance and wide adaptation, there is little information available on the changes in genetic diversity. Monitoring the changes in genetic diversity of elite inbreds, as time progresses, is important to avoid crop vulnerabilities associated with a narrow genetic base as well as for maintaining genetic gain (Smith, 2007). This leads to effective management of genetic diversity, which is necessary for increased crop productivity (Smith, 2007).
Therefore, the aim of the study was to generate information that will result in the effective utilization of historical elite maize inbred lines. The objectives of the study were to characterise and quantify genetic diversity of founder lines using agronomical and DUS traits.

MATERIALS AND METHODS
A total of eighteen (18) maize inbred lines from Zambia, Zimbabwe and CIMMYT-Zimbabwe were used for the study (Table 1). The trial was conducted at the Seed Control and Certification Institute (SCCI) under well fertilised conditions, using a randomised complete block design. During the growing period, data on several morphological traits were collected. The traits were selected from those used by the Variety Testing, Registration and Protection Section of the SCCI in their trials. These traits are modified from the International Union for the Protection of New Varieties of Plants (UPOV) DUS test guidelines for maize (UPOV, 2009). The characteristics are shown in Table 2.

Data analysis
The mean values for eleven morphological characters and scaling values for physiological characters were used to assess the dissimilarity between inbred lines. The matrix of all the quantitative traits was first standardised before calculating the Euclidean distance matrix among the inbreds. A dendrogram was constructed using Ward's method in the Minitab 14 statistical software to provide a general visualisation of the relationship between inbreds based on quantitative traits. Ward's method of clustering was used as it has been shown to be in concordance with pedigree data when phenotypic traits based on the UPOV descriptor are used (Babić et al., 2008). Qualitative traits and a combination of qualitative and quantitative traits were used to generate the similarity between inbred lines based on the Gower similarity matrix using XLSTAT 2013 software (an Excel add-in). The similarity matrix for the qualitative traits only was then submitted to MEGA 5 (Tamura et al., 2011) for clustering using UPGMA. The Gower similarity matrix for the combined qualitative and quantitative traits was submitted to the NTSYSpc version 2.21 L software for neighbour joining clustering. The quantitative data were also subjected to principal component analysis (PCA), using the Minitab 14 statistical software, to identify the traits that are most discriminatory. Furthermore, principal coordinate analysis (PCoA) was performed on the qualitative traits using MVSP for Windows (Kovach, 2007).

*Corresponding author. E-mail: edchazm@gmail.com. Author(s) agree that this article remain permanently open access under the terms of the Creative Commons Attribution License 4.0 International License.

*Characteristics that should be used every growing period for the examinations of all varieties and should always be included in the description of the variety, except when the state of expression of a preceding characteristic or regional environmental conditions render this impossible; + Characteristics that are scored with the help of a drawing or photograph; @ SCCI S/N: Serial number of the characteristic in the SCCI field guide book for DUS evaluation; MG = single measurement of a group of plants or parts of plants; MS = measurement of a number of individual plants or parts of plants; VG = visual assessment by a single observation of a group of plants or parts of plants.
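The Euclidean distance between two inbred lines j and k was computed from the standardised data as:

$$d_{jk} = \sqrt{\sum_{i=1}^{p} \left(x_{ij} - x_{ik}\right)^2}$$

where $d_{jk}$ = Euclidean distance, $x_{ij}$ and $x_{ik}$ are the standardised values of the ith character for the jth and kth inbred lines, respectively, and p is the number of characters.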
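As a minimal sketch of the standardisation and distance step described above, the following Python snippet computes z-scores per trait and the pairwise Euclidean distance between lines. The trait values and function names are hypothetical and are not data or software from this study, which used Minitab 14.

```python
import math
from statistics import mean, stdev


def standardise(rows):
    """Return column-wise z-scores: (value - trait mean) / trait sample SD."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def euclidean_matrix(rows):
    """Return the symmetric matrix of Euclidean distances between rows."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j + 1, n):
            d[j][k] = d[k][j] = math.dist(rows[j], rows[k])
    return d


# Hypothetical trait means (rows = inbred lines, columns = traits).
traits = [
    [150.0, 12.0, 68.0],
    [180.0, 14.5, 72.0],
    [165.0, 13.0, 70.0],
    [200.0, 16.0, 75.0],
]
z = standardise(traits)
dist = euclidean_matrix(z)
```

Standardising first keeps traits measured on large scales (for example grain yield in kg/ha) from dominating the distance. The resulting matrix is what a hierarchical method such as Ward's would take as input.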
Gower's coefficient (Gower, 1971) permits the simultaneous use of variables of different scales of measurement. It is calculated as:

$$S_{ij} = \frac{\sum_{k} W_{ijk}\, S_{ijk}}{\sum_{k} W_{ijk}}$$

where $S_{ij}$ = Gower's similarity between individuals i and j, combining similarities from different traits, $S_{ijk}$ = contribution of the kth variable and $W_{ijk}$ = weight of each variable, which is usually 1 or 0.
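The coefficient can be illustrated with a short Python sketch. It assumes the common convention that a qualitative trait contributes 1 for a match and 0 otherwise, that a quantitative trait contributes 1 minus the absolute difference divided by the trait range, and that the weight is 0 when either value is missing. The records, trait kinds and ranges are hypothetical, and this is not the XLSTAT implementation used in the study.

```python
def gower_similarity(a, b, kinds, ranges):
    """Gower similarity between two records of mixed-type traits.

    kinds[k] is "num" for quantitative or "cat" for qualitative traits.
    ranges[k] is the observed range of a quantitative trait (ignored for "cat").
    A missing value (None) gives that trait a weight of 0.
    """
    num = den = 0.0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue  # weight W_ijk = 0
        if kind == "num":
            s = 1.0 - abs(x - y) / rng if rng else 1.0
        else:
            s = 1.0 if x == y else 0.0
        num += s   # W_ijk * S_ijk with W_ijk = 1
        den += 1.0
    return num / den if den else float("nan")


# Hypothetical records: (grain type, leaf colour score, plant height in cm).
kinds = ["cat", "cat", "num"]
ranges = [None, None, 50.0]
line_a = ("dent", 3, 150.0)
line_b = ("flint", 3, 200.0)
line_c = ("dent", None, 175.0)

s_ab = gower_similarity(line_a, line_b, kinds, ranges)
s_ac = gower_similarity(line_a, line_c, kinds, ranges)
```

Treating ordinal DUS scores as plain categories is a simplification made here for brevity; they could equally be handled as ranked values.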
The Shannon-Weaver diversity index (HS) was computed using the phenotypic frequencies, to assess the phenotypic diversity for each trait for all inbred lines (Shannon and Weiner, 1983; Spellerberg and Fedor, 2003). The HS was evaluated as:

RESULTS AND DISCUSSION

Quantitative traits
Analysis of variance revealed highly significant differences among inbred lines for all the traits except for days to maturity, number of rotten cobs, number of grain rows per cob, ear leaf width (cm), Taxis BelowB, 6 cm upper, 6 cm lower, number of tassel branches, TL long and TUBL short (Table 3). A wide range of expression was also observed for the 25 agronomic traits studied. The widest range was exhibited by grain yield (2584 kg/ha), followed by leaf area (467 cm²). The narrowest range was observed for hundred seed weight (0.03 g), followed by ear diameter (1.16 cm). A wide range in grain yield was also observed by Beyene et al. (2005). This could be attributed to the breeding programme that was initiated to develop inbred lines that would produce hybrids adapted to a wide range of growing conditions (Ristanovic et al., 1985). This breeding programme was initiated after it was concluded that Zambia needed hybrids for different agro-ecological areas and levels of management. On the other hand, the narrow range of time to maturity indicates that there was a shift to develop varieties with medium to intermediate maturing period, to avert the risks involved with maize production. However, the variation in time to maturity can only be appreciated in testcrosses.
Grain yield was significantly correlated with 36% of the traits (9 out of 25) (Table 4). The highest correlation was observed with ear height (r² = 0.63), followed by ear leaf width (r² = 0.62). Cob rot was significantly and negatively correlated with grain yield (r² = -0.61). This implies that the breeding of inbreds tolerant to cob rots should be strengthened. The impact of tassel related traits on seed set has been discussed (Chanda et al., 2010). In this study, tassel axis length below the upper branch and tassel axis above the upper branch were negatively correlated with grain yield. This confirms in part the tendency for breeders to select for reduced tassel traits. Similarly, gray leaf spot (GLS) had a small negative impact on grain yield; hence GLS-tolerant lines need to be developed to produce hybrids with better yield performance.
Ear leaf has been reported to play a critical role in maintaining grain yield through resource remobilization. Yield reductions of 17 to 25% have been reported when the ear leaf alone is removed, and of 40 to 50% when all leaves above the ear are removed. We observed a similar trend in the present study. Furthermore, the results indicate that leaf width had a greater influence than leaf length in this set of inbred lines.
Repeatability was highest for ear length (R=0.93), followed by days to silking (R=0.90) and shelling percentage (R=0.90). Grain yield, number of grain rows per cob, number of grains per row, hundred seed weight, shelling percentage and length of tassel axis above the upper branch had repeatability greater than 0.80. This indicates that all these traits are relatively ...

Based on the Euclidean distance and Ward's method, the 18 lines were assigned to 4 clusters (Figure 1). Cluster 1 consisted of J185, L12, L152, L913 and SC; cluster 2 consisted of L151, L5522, L3233 and N3; cluster 3 consisted of L1212, L3234, L1214, L917, L211 and L911; and cluster 4 consisted of K64r, L2 and L334. Cluster 3 had the highest number of genotypes (6), followed by cluster 1 (5). Cluster 2 had 4 genotypes and cluster 4 had 3. The average trait performance of the inbred lines in each cluster is shown in Table 6.
Cluster 3 was the highest yielding group, associated with longer days to maturity, low cob placement, the longest cobs, heavy kernels and large leaf area. The lowest yielding was cluster 4, associated with early maturity, shortest plant height, shortest cob length, smallest leaf area and lightest kernel weight. Cluster 4, with the highest number of kernels per row and the highest number of grain rows per cob, had been expected to be the highest yielding. Its low yield could be attributed to the low density of the kernels (0.023 vs. 0.042). Therefore, kernel density is important when developing inbreds and should be used as a parameter for line improvement.

Qualitative traits
The intensity of node colour was the same for all the 18 inbred lines; therefore the trait was removed from the analysis. The variation of the inbreds based on qualitative traits is shown in Table 7. A pairwise comparison was carried out to identify characters that clearly distinguished the inbreds. A variety is said to be clearly distinct from another variety if the difference is at least one state (UPOV, 2009).
A pairwise comparison of the traits indicated that grain type was the most distinguishing trait and ear shape the least distinguishing (Table 8).
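The pairwise comparison can be sketched as a simple count, per trait, of the pairs of lines whose state scores differ by at least one state. The scores below are hypothetical, and the function name and the `min_diff` threshold are illustrative assumptions rather than the procedure used to build Table 8.

```python
from itertools import combinations


def distinct_pairs(scores, min_diff=1):
    """Count pairs of lines whose state scores differ by at least min_diff."""
    return sum(1 for a, b in combinations(scores, 2) if abs(a - b) >= min_diff)


# Hypothetical state scores for four inbred lines on two traits.
traits = {
    "grain type": [1, 2, 3, 4],
    "ear shape": [1, 1, 1, 2],
}
counts = {name: distinct_pairs(scores) for name, scores in traits.items()}

# Traits ordered from most to least distinguishing.
ranking = sorted(counts, key=counts.get, reverse=True)
```

With 18 lines there are 153 possible pairs, so a trait's count can be read as a share of 153 when comparing traits.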
The intensity of internode colour, glume ring colour, ear sheath colour, tassel glume colour and leaf sheath colour were the most predominant in the inbred lines (Table 9). According to the study, it is evident that most of the genotypes have absent/very weak intensity of silk colour, internode colour, glume ring colour, ear sheath colour, tassel glume colour and leaf sheath colour.
It is also evident that ear shape, hairiness of leaf margin, leaf colour, attitude of tassel branches, glume ring colour, grain type and leaf attitude can be used for discriminating maize inbred lines. Furthermore, it can be said that there has been a tendency to develop maize inbred lines that are dent, dent-like or flint, with slightly conical (56%) and cylindrical (28%) ears. Due to breeding for the stay green trait, there has been a tendency to develop genotypes with dark green leaves (56%), with only 16% being light green.
The distance based on qualitative traits showed that the longest distance (12.00) was between L3234 and L911, followed by that between L334 and L911 (11.58), while the shortest distance (2.45) was between L3234 and L334 and between L3233 and K64r, followed by that between L12 and SC (2.83) (Table 10). The inbreds were clustered into two major groups, with group 2 having two sub-clusters (Figure 2). Cluster 1 had four genotypes while the rest were in cluster 2, with four in cluster 2a (Figure 2).

Qualitative and quantitative traits
The Gower similarity coefficient, executed in XLSTAT 2013 (an Excel add-in), allows the analysis of mixed data. The longest distance observed was between L911 and L334 (14.06) and the shortest between L917 and L1214 (5.54). Inbred line L1212 was also close to L3234 (5.96) (Table 11). The inbreds were clustered into two major groups, A and B (Figure 3), each with two subgroups. Inbred lines N3, L3233 and SC were grouped together in cluster III. N3 and SC are the original heterotic groups used in Southern Africa, their derivatives being L3233 and L5522, respectively. Thus we expected N3 to be close to L3233, and SC to L5522.
It was expected that sub-lines would be clustered together with the original lines. However, L12, the original line, was not grouped together with its sub-lines L1212 and L1214. Similarly, lines L911, L917 and L913 are all sub-lines of L9 and were not grouped together. This indicates that the sub-lines selected were phenotypically different from the original. Inbred lines N3 and L3233 were clustered together while SC and L5522 were in separate clusters. The observations are in agreement with the findings of Ristanovic et al. (1985), when L5522 was contaminated. The contamination had a great impact on the phenotypic expression of the inbred line. This suggests that breeders should be using SC but not L5522 in their breeding work involving heterotic patterns. However, the line per se cannot be discarded as it may have other important attributes that can be used in breeding.

Principal component and coordinate analysis
Principal coordinate analysis (PCoA) of qualitative traits: The principal coordinate analysis of the qualitative traits resulted in the first axis explaining 42.0% of the variation, with an eigenvalue greater than 1. The second axis, with an eigenvalue less than 1, accounted for 15.6% of the total variation, while the third and fourth axes accounted for 11.8 and 8.9% of the total variation, respectively (Table 13). The sum of the eigenvalues for axes 1 and 2 was 2.39. Four traits, namely leaf attitude, ear sheath colour, tassel glume colour and leaf sheath colour, had high correlations (≥0.40) with the first axis. High correlations with the second axis were also observed for tassel glume ring colour, grain type, attitude of tassel branches and leaf colour. Leaf colour had the highest correlation (0.795) with the first axis and was fourth in the second axis. This implies that leaf colour was very important in discriminating genotypes.

The Shannon index, sometimes referred to as the Shannon-Weaver index, is used to measure diversity. The index has been used to measure the phenotypic diversity for each trait (Shannon and Weaver, 1949). The Shannon index for the 12 qualitative traits ranged from 1.09 to 1.24, with a mean of 1.20 and a range of 0.15 (Table 12). Leaf colour had the highest diversity index (1.24) and the intensity of tassel glume ring colour had the lowest (1.09). The diversity index for four traits, namely ear sheath colour, tassel glume colour, ear shape and leaf sheath colour, was the same (1.23).
On the other hand, the Shannon index for the 25 agronomic traits ranged from 0.95 to 1.26, with a mean of 1.24 and a range of 0.31 (Table 12). The highest index (1.26) was observed for days to silking (Dsilk), days to maturity, ear diameter (ED) and shelling percentage (%). The lowest index (0.95) was recorded for number of rotten cobs only. The Shannon index observed in this study was higher than that reported among maize accessions in Italy (0.789-0.849) and almost comparable to the result for Chinese germplasm (Li et al., 2002; Lucchin et al., 2003). Siopongco et al. (1999) considered Shannon indices of 0.68 and 0.80 to represent medium and high degrees of variation, respectively. In this study, a high degree of variation existed for all the traits studied. Furthermore, the quantitative traits were more diverse than the qualitative traits. Similar findings have been reported in maize (Siopongco et al., 1999).
The high diversity index of LFA and PH could be used in the generation of heterotic hybrids. The results also indicate that there is wide diversity in the qualitative traits among the inbred lines used in the study. Therefore, these traits can be used in the development of identification keys. Traits that are known to be mildly influenced by environment are effective in variety discrimination (Dillmann and Guerin, 1998) and these should be used in developing variety identification keys. Polygenic traits, like kernel type, ear height and earliness, are some of the traits reported to be mildly affected by environment (Dillmann and Guerin, 1998).

Principal component analysis (PCA) of quantitative traits
The principal component analysis of the quantitative traits resulted in the first seven components explaining 84.6% of the total variation, with eigenvalues greater than 1. The first component accounted for 27.5% of the total variation. The second and third components accounted for 15.2% and 12.2% of the total variation, respectively (Table 14). Cumulatively, the first and second components explained 42.6% of the total variation. Grain yield had the highest positive loading (0.276) in the first component, while shelling % and TaxisUpperB had the highest positive loadings in components 2 (0.346) and 3 (0.494), respectively (Table 14). TaxisUpperB, ED, TL long, TUBL short and 6 cm upper had positive loadings in the first three components, while Lfw, LfL, LfA, GYkgha, TBNo and TL short had positive loadings in the first two components. These traits were important for discriminating the maize inbred lines.
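The percentages above follow from the eigenvalues of the PCA. As a hedged sketch, assuming the analysis was run on standardised traits so that the eigenvalues sum to the number of traits, a first component explaining 27.5% of 25 traits would correspond to an eigenvalue of about 0.275 × 25 ≈ 6.9. The Python snippet below shows the same arithmetic on hypothetical eigenvalues and applies the eigenvalue-greater-than-1 retention rule; the function names are illustrative.

```python
def explained_variance(eigenvalues):
    """Return (proportions, cumulative proportions) of total variation."""
    total = sum(eigenvalues)
    proportions = [ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for p in proportions:
        running += p
        cumulative.append(running)
    return proportions, cumulative


def kaiser_retained(eigenvalues):
    """Number of components with an eigenvalue greater than 1."""
    return sum(1 for ev in eigenvalues if ev > 1.0)


# Hypothetical eigenvalues for five standardised traits (they sum to 5).
eigenvalues = [2.5, 1.2, 0.8, 0.3, 0.2]
proportions, cumulative = explained_variance(eigenvalues)
n_retained = kaiser_retained(eigenvalues)
```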

Principal component analysis for quantitative and qualitative traits
When the quantitative traits were converted to categorical data and combined with qualitative data, the first fourteen components explained 97.97% of the total variation, with eigenvalues greater than 1. The first component accounted for 28.5% of the total variation. The second and third components accounted for 13.3 and 10.3% of the total variation, respectively (Table 15). Cumulatively, the first and second components explained 41.7% of the total variation. The PCA grouping of genotypes was similar to that produced by the neighbour joining clustering method (Figure 4). Inbred lines L3233, L152 and SC were grouped differently by the two methods. LfA had the highest factor loading (2.41) in the first component, followed by EH (2.06) and GY (2.02) (Table 15). This axis is considered a productivity and yield axis since it loaded highly for yield and reproductive traits. GY had the highest positive loadings in component 2 (1.63) and component 3 (1.38) (Table 15). GY had the second highest factor loading in the first component. The PCA indicates that ED, 6 cm Upper, PH, TBNo, GY and leaf attitude were important in distinguishing inbreds. Out of these, TL long, GY and TBNo had positive loadings in the first three components. GY was the most important trait as it loaded positively and highly on the first and second factors (2.02 and 1.63, respectively). Based on the PCA scores, the inbreds were divided into 3 clusters (Figure 4).
PCA (Figure 4) and cluster analysis (Figure 3) grouped some inbred lines similarly for cluster I and cluster III. The lines in II and IV (Figure 1) were grouped in cluster II by PCA. This resulted in cluster II having sub-clusters. The differences in the classification of the inbred lines could be attributed to the differences in the data used for quantitative traits (coded or standardised). However, the study demonstrates that either method could be used to provide information about the diversity of maize.

Comparison of dissimilarity matrix derived from qualitative and quantitative traits
The Mantel test statistic (Z) was calculated to measure the degree of relationship between the dissimilarity matrices generated from qualitative, quantitative and combined data. The p-value was calculated using the distribution of r(AB) estimated from 10,000 permutations. The matrix correlation between qualitative and quantitative traits was low (r=0.048) and non-significant. However, the correlations of qualitative and quantitative traits with combined data were high and highly significant (r=0.82 and r=0.61, respectively). According to the Mantel matrix correspondence test, there was an agreement of 82% between the qualitative matrix and the combined-data matrix, while the quantitative traits were 61% in agreement with the combined data. The low correlation between the qualitative and quantitative matrices could be attributed to the different methods used in calculating the dissimilarity matrices, because each dissimilarity matrix has different mathematical properties (Mohammadi and Prasanna, 2003). Since the correlations of the qualitative and quantitative matrices with the mixed data are high while the correlation between them is low, the two data sets need to be combined for analysis. Hence qualitative and quantitative data should be used in combination for assessing genetic diversity.
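A permutation-based Mantel test of the kind described above can be sketched in Python as follows. This is an illustrative, one-sided version that correlates the upper triangles of two matrices and permutes the rows and columns of one of them together; the matrices, the default of 999 permutations and the function names are assumptions, not the software or settings used in the study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(m, order):
    """Upper-triangle entries of m after reordering rows and columns."""
    n = len(m)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(a, b, permutations=999, seed=0):
    """Return (matrix correlation r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    x = upper_triangle(a, identity)
    r_obs = pearson(x, upper_triangle(b, identity))
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of b together
        if pearson(x, upper_triangle(b, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical distance matrices for six lines.
positions = [0, 1, 3, 7, 12, 20]
a = [[abs(p - q) for q in positions] for p in positions]
b = [[abs(p - q) ** 0.5 for q in positions] for p in positions]
r, p_value = mantel(a, b)
```

Permuting whole rows and columns, rather than individual cells, preserves the dependence among distances that share a line, which is why an ordinary correlation test is not used for matrices.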

Conclusions
The study assessed the pattern and extent of the phenotypic diversity of elite inbred lines. The results reveal that phenotypic selection for the creation of sublines and inbred line recycling significantly affected the morphological diversity of inbred lines. This diversity can be exploited for the generation of heterotic hybrids. The observed disparity between clustering and the expected similarity could be attributed to mutation and/or admixtures. Admixtures could have occurred at the time when there was high staff turn-over coupled with reduced government funding to research activities.
Shannon diversity index:

$$HS = -\sum_{i=1}^{n} P_i \ln P_i$$

where $P_i$ is the proportion of inbred lines in the ith class of an n-class character, n = number of phenotypic classes for a character and ln = natural logarithm.

Evenness:

$$J = \frac{HS}{\ln S}$$

where HS = Shannon-Weaver diversity index and ln S = natural logarithm of the inbreds richness.
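The index and its evenness can be computed from class frequencies with a few lines of Python. In this sketch S is taken to be the number of phenotypic classes observed for the trait, which is an assumption about how richness was counted; the leaf colour scores are hypothetical and the function names are illustrative.

```python
import math
from collections import Counter


def shannon_index(states):
    """Shannon diversity index from a list of observed phenotypic classes."""
    counts = Counter(states)
    n = len(states)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def evenness(states):
    """Shannon index divided by ln(S), with S the number of observed classes."""
    s = len(set(states))
    return shannon_index(states) / math.log(s) if s > 1 else 0.0


# Hypothetical leaf colour classes for 18 inbred lines.
leaf_colour = ["dark"] * 10 + ["medium"] * 5 + ["light"] * 3
hs = shannon_index(leaf_colour)
j = evenness(leaf_colour)
```

The index is largest when the lines are spread equally across classes, in which case it equals ln S and the evenness is 1.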

Figure 1. Dendrogram of maize inbred lines based on 24 morphological traits, using Euclidean distance matrix and Ward clustering method.

Figure 2. Dendrogram of maize inbred lines based on 12 qualitative traits, using Gower similarity matrix and UPGMA clustering method.

Figure 3. Dendrogram of maize inbred lines based on 25 quantitative and 12 qualitative traits, using Neighbour Joining clustering method.

Figure 4. Scatter plot of the maize inbred lines based on 37 traits.

Table 1. List and sources of maize inbred lines used in the study.

Table 2. Characteristics used in DUS testing of inbred lines and their acronyms in brackets.

Table 3. Statistics of 25 agro-morphological traits measured in 18 maize inbred lines.

Table 4. Correlation between grain yield and other agronomic traits.

Table 5. Genetic dissimilarity matrix of 18 maize inbred lines based on 25 agronomic traits.

Table 6. Agronomic characteristics of the clusters.

Table 7. Phenotypic variation of maize inbred lines based on 12 qualitative traits.

Table 8. Contribution of each trait to clearly distinguish 18 inbreds.

Table 9. Variation of 18 maize inbred lines for 12 qualitative traits.

Table 10. Genetic dissimilarity of 18 maize inbred lines based on 12 qualitative traits.

Table 11. Gower similarity coefficients based on 13 qualitative and 15 standardised quantitative traits for 18 maize inbred lines.

Table 12. Diversity index for 12 qualitative traits for 18 inbred lines.

Table 13. Principal coordinate analysis of 12 qualitative traits.

Table 14. Principal component analysis of 25 quantitative traits.

Table 15. Principal component analysis of 18 maize inbred lines across 37 traits. All quantitative traits were coded according to the instructions shown in Table 1.