Geographic patterns of phenotypic diversity in sorghum (Sorghum bicolor (L.) Moench) landraces from North Eastern Ethiopia

Understanding the pattern of genetic variability is an important component of germplasm collection and conservation as well as the crop’s improvement process including the selection of parents for making new genetic recombination. Nine hundred seventy four sorghum landraces from North Eastern (NE) Ethiopia were evaluated for agro-morphologic characters to assess geographic patterns of phenotypic diversity and to identify whether there are specific areas of high diversity for particular traits. The Shannon-Weaver diversity index (H′) for qualitative traits ranged from 0.30 to 0.93 (mean = 0.67) for grain covering and grain color, respectively. The landraces also displayed highly significant differences (p<0.01) for all the quantitative traits with days to flowering ranging from 64 to 157 days (range = 93), days to maturity from 118 to 215 (range=97) days, plant height from 115 to 478 cm; 1000-seeds weight from 18 to 73 g, and grain number from 362 to 9623. The first five principal component axes captured 71% of the total variation with days to flowering and maturity, leaf number and length, panicle weight, grain weight and number per panicle, panicle length, length of primary branches, 1000-seeds weight and internode length accounting for most of the variability. Cluster analysis grouped the landraces into ten clusters. The clustering of zones and districts revealed close relationship between geographic locations based on proximities and agro-ecological similarities. Differentiation analysis showed that most of the landraces variability was within rather than between geographic origins of the landraces, indicating weak genetic differentiation among landraces from predefined geographic origins such as political administrative zones and districts. The weak differentiation might be due to frequent gene flow across the study area because of seed exchanges among farmers.

North-eastern (NE) Ethiopia is one of the main sorghum growing areas of the country, with the crop ranking second to tef (Eragrostis tef) in area sown and first in production (CSA, 2003). It is among the important areas for germplasm collection. The high lysine sorghum lines came from Welo collections (Singh and Axtell, 1973). The majority of sorghum production in the country in general, and in NE Ethiopia in particular, depends on landraces (Gebrekidan, 1973; Worede, 1992; Seboka and van Hintum, 2006; Shewayrga et al., 2008). These landraces are a good genepool for sorghum improvement programs that aim to develop high yielding, farmer preferred varieties. Understanding the available genetic variability and its potential use in future breeding programs is an important component of the crop improvement process, including the selection of inbred parental materials for hybridization to create new genetic recombinations. It also helps to devise appropriate sampling procedures for germplasm collection and conservation, including the establishment of a core collection with maximum genetic diversity (Brown, 1995; Kresovich et al., 1995; Hayward and Sackville-Hamilton, 1997; Ramanatha Rao and Hodgkin, 2002; Kahilainen et al., 2014; Govindaraj et al., 2015).
Analysis of morphological diversity is one of the important and useful techniques employed to determine variability in different crops (Assefa and Labuschagne, 2004; Haussmann et al., 1999; Grenier et al., 2004; Gerrano et al., 2014; Dossou-Aminon et al., 2015; Mengistu et al., 2015). In the case of sorghum in Ethiopia, Gebeyehu (1993) recorded significant differences in quantitative traits among 59 landraces from Gambella in Western Ethiopia. McGuire et al. (2002) observed high phenotypic diversity in sorghum landraces from two districts (Meiso and Chiro) in Eastern Ethiopia, where variation among farmers in their ecological conditions and needs contributes to overall varietal diversity. At a wider scale, Ayana and Bekele (1998) reported high morphological diversity among 347 Ethiopian sorghum accessions for qualitative and quantitative traits. Previous studies on sorghum landraces from NE Ethiopia were limited in the nature of sampling, the number of samples considered, the number of sites sampled and the coverage of sorghum growing areas. They included either small samples of accessions from gene banks (Ayana and Bekele, 1998) or collections from a few targeted sites (Abdi et al., 2002). The objectives of the present study were to investigate the extent and geographic pattern of phenotypic diversity in sorghum landraces in terms of districts, zones and altitude classes of NE Ethiopia using a large sample of representative landraces, and to identify whether there are specific areas of high diversity for particular traits.

MATERIALS AND METHODS
A total of 974 landraces, comprising 307 from North Welo, 363 from South Welo, 129 from Oromiya and 175 from North Shewa administrative zones, were evaluated, covering a north to south stretch of more than 350 km (Figure 1). A total of 14 sorghum growing districts were covered in the study: Kobo, Gubalafto, Habru and Meket from North Welo; Ambassel, Tehuledere, Dessie Zuria, Kalu, and Debresina from South Welo; Bati and Artuma Jille from Oromiya; and Kewot, Tegulet and Merabete from North Shewa. Some districts used in the analysis represent two or more of the current administrative units, as most of the passport data available from the Institute of Biodiversity Conservation (IBC) relate to the old classification. For example, Merabete represents all the present districts in that boundary; Kewot represents Kewot and Efratana Gidim; and Artuma Jille represents both Artuma Fursina Jille and Chafa Gola (Dawa Chafa) districts. Landraces were also grouped by altitude of origin into lowland (<1650 m), intermediate (1650-2000 m) and highland (>2000 m above sea level) classes. The landraces were tested at Sirinka (1850 m above sea level), North Welo, Ethiopia. Data were recorded for seventeen quantitative and eight qualitative traits using sorghum descriptors (IBPGR/ICRISAT, 1993). The data for quantitative traits were averages from five randomly selected plants, while the data for qualitative traits were recorded at plot level.
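The altitude grouping described above can be written as a small helper. This is only an illustrative sketch: the function name `altitude_class` is hypothetical, and treating the boundary values 1650 m and 2000 m as intermediate follows the class limits as stated in the text.

```python
def altitude_class(metres):
    """Assign an altitude class from elevation in metres above sea level.

    Class limits follow the text: lowland (<1650 m),
    intermediate (1650-2000 m) and highland (>2000 m).
    """
    if metres < 1650:
        return "lowland"
    if metres <= 2000:
        return "intermediate"
    return "highland"


# The test site, Sirinka, lies at 1850 m above sea level.
print(altitude_class(1850))
```

Under these limits the Sirinka test site itself falls in the intermediate class.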

Data analysis
For qualitative traits, phenotypic frequency distributions and the Shannon-Weaver diversity index (H′) were estimated for all the landraces, districts, altitudes and zones. The Shannon-Weaver diversity index (Ayana and Bekele, 1998) for a trait is given as: $H' = -\sum_{i=1}^{n} P_i \ln P_i / \ln n$, where $P_i$ is the proportion of landraces in the $i$-th class of an $n$-class trait ($n$ being the number of phenotypic classes of the trait).
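As an illustration of how H′ can be computed for a single qualitative trait, the sketch below assumes the normalized form of the index, that is, the sum of −Pi ln Pi over the n classes divided by ln n, so that values lie between 0 (monomorphic) and 1 (all classes equally frequent). The function name `shannon_weaver` and the two-class scoring of stalk juiciness are assumptions made for the example; only the 92% share of non-juicy types comes from the results reported here.

```python
import math
from collections import Counter


def shannon_weaver(observations, n_classes=None):
    """Normalized Shannon-Weaver index H' for one qualitative trait.

    observations: iterable of class labels, one per landrace.
    n_classes: number of phenotypic classes defined for the trait;
               defaults to the number of classes observed (an assumption).
    """
    counts = Counter(observations)
    total = sum(counts.values())
    n = n_classes if n_classes is not None else len(counts)
    if total == 0 or n < 2:
        return 0.0  # no data, or a one-class trait: no diversity
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(n)


# Illustrative only: 92% non-juicy landraces, assuming a two-class trait.
juiciness = ["non-juicy"] * 92 + ["juicy"] * 8
print(round(shannon_weaver(juiciness), 2))
```

A trait dominated by one class gives a low H′, which is the pattern reported for stalk juiciness and grain covering.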
The variability for quantitative traits was described using the mean, range and analysis of variance (ANOVA) with a randomized complete block design. Cluster analysis of the landraces was performed on the quantitative and qualitative traits together, using Gower's generalized distance estimates and Ward's clustering method, with the daisy function in the R software environment (R Core Team, 2010). For the cluster analysis, 21 landraces from Harerge (Eastern Ethiopia) and 9 improved varieties were included for comparison. Cluster analysis was also performed for districts and zones using the means of quantitative traits. Principal component analysis (PCA) was performed after standardizing the quantitative data.
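Gower's distance is what allows quantitative and qualitative traits to enter one cluster analysis. The sketch below shows the basic calculation for a pair of landraces and is not the R daisy implementation used in the study: numeric differences are scaled by the trait's range in the data set, categorical traits score 0 for a match and 1 for a mismatch, and the scores are averaged. The function name, the record layout and the three example records are hypothetical; the ranges for days to flowering (93) and plant height (363 cm) are taken from the values reported in this paper.

```python
def gower_distance(a, b, kinds, ranges):
    """Gower dissimilarity between two records with mixed trait types.

    kinds[k]  : "num" for a quantitative trait, "cat" for a qualitative one.
    ranges[k] : range (max - min) of numeric trait k over the whole data set;
                ignored for categorical traits.
    Traits with a missing value (None) in either record are skipped.
    """
    total = 0.0
    used = 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue
        if kind == "num":
            total += abs(x - y) / rng if rng else 0.0
        else:
            total += 0.0 if x == y else 1.0
        used += 1
    return total / used if used else float("nan")


# Hypothetical records: (days to flowering, plant height in cm, grain color)
kinds = ("num", "num", "cat")
ranges = (93, 363, None)
early_short = (64, 115, "white")
late_tall = (157, 478, "red")
midway = (110.5, 296.5, "white")

print(gower_distance(early_short, late_tall, kinds, ranges))
print(gower_distance(early_short, midway, kinds, ranges))
```

A matrix of such pairwise distances can then be passed to a hierarchical clustering routine. Ward's method is formally defined for Euclidean distances, so using it with Gower distances, as is common practice, is best read as a heuristic.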

Variability for qualitative traits
A high variability for qualitative traits was observed in the landraces. The distribution of phenotypic classes for the entire NE Ethiopia showed the predominance of particular trait classes in the landraces for the qualitative traits. Non-juicy types (92%), awns at maturity (67%), white midrib color (72%), grey and straw glume color (63%), semi-compact to compact head types (68%), mostly starchy and completely starchy (82%), and 25% grain covered (90%) characterize the majority of landraces (Supplementary Table 1). White, red and light red, straw, yellow, and brown seed colors were important, accounting for more than 81% of the landraces. A similar trend was observed in the distribution of the trait classes in the four administrative zones and three altitude ranges, but localities of specific diversity for trait classes were observed. For example, the proportion of red brown seed color, a predominant color of Zengada, was higher in South Welo as compared to the other three zones. Red seed color was not common in North Shewa relative to other zones. The proportion of grey seed color was relatively high in North Welo and Oromiya. These are landraces such as Tikureta (e.g. Gubete, Wanose, leza), Mogayfere and Homdade, which are mainly grown for roasted grain ('eshet') or local beverages. The proportion of black and white glume colors was relatively higher in Oromiya, reaching 19 and 14%, respectively, and was low in the other zones.
In the case of altitudes, awns at maturity, panicle compactness, grain color and endosperm texture showed some clinal variation. Presence of awns at maturity was a frequent trait, ranging from 64% of landraces for the lowland and intermediate altitudes to 82% for the highland areas. Semi-compact elliptic and compact elliptic head types were more frequent in the lowland and intermediate altitude areas, accounting for 67 and 69% of the landraces, respectively. In higher altitude areas, semi-open head types were as important as the semi-compact elliptic types, while compact elliptic types were less frequent. The distribution of grain color showed a predominance of white, yellow, brown, straw, red and light red in the lowland and intermediate areas. In comparison, red brown followed by brown were the frequent grain colors in higher altitude areas. Mostly starchy endosperm types were more frequent in the lowland (55%) and intermediate areas (72%). In contrast, completely starchy endosperm and mostly starchy types were important in the highland areas, accounting for 44 and 39% of the landraces, respectively.
Estimates of H′ revealed high qualitative trait diversity in the sorghum landraces of NE Ethiopia in general (Table 1). The diversity index across NE Ethiopia ranged from 0.30 to 0.93 (mean H′ = 0.67) for grain covering and grain color, respectively. The mean H′ across collection zones ranged from 0.60 for landraces in North Welo to 0.70 for landraces in Oromiya zone. The H′ estimates for the three altitudes showed an increasing trend from highland to lowland for most of the traits, with means ranging from 0.59 to 0.68, respectively. The mean H′ at district level was a reflection of the diversity observed at the altitudinal level. Districts from higher altitude areas like Meket, Dessie Zuria, Debre Sina and Tegulet showed low diversity for many traits, resulting in low mean diversity. The H′ value for grain color ranged from 0.54 to 0.93 for landraces from Dessie Zuria and Ambassel, respectively. Landraces from Gubalafto, Dessie Zuria, Meket and Debre Sina were all non-juicy types. Similarly, landraces from Dessie Zuria were monomorphic for absence of awns at maturity. Overall, awns at maturity, grain color and glume color were the most diverse traits, while grain covering and stalk juiciness displayed the lowest diversity.
Partitioning of the qualitative trait variability into between and within zones of origin, altitudes and districts revealed that most of the variation was within rather than between geographic origins of the landraces (Table 2). Ninety seven percent of the variation was within zones while only three percent was between zones. Stalk juiciness contributed relatively more (10%) to between zones differentiation. Similarly, the differentiation between altitudes appeared to be weak, with 95% of the variation accounted for by within altitude variation. Grain covering followed by stalk juiciness contributed to between altitude differentiation. Landraces between districts displayed relatively higher differentiation as compared to zones and altitudes, and almost all traits contributed to the differentiation.
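The partitioning can be reproduced from the quantities defined in the Table 2 footnote: the within-group proportion is the mean group index divided by the index for the entire data (for zones, H′Z/H′NEE), and the between-group proportion is the remainder, (H′NEE − H′Z)/H′NEE. The sketch below is a minimal illustration of that arithmetic; the function name and the input values are hypothetical and are not results from this study.

```python
def partition_diversity(h_total, h_groups):
    """Split diversity for one trait into within- and between-group shares.

    h_total  : H' computed on the entire data set (H'NEE in Table 2).
    h_groups : H' computed separately for each group (zones, altitudes
               or districts); their mean plays the role of H'Z, H'A or H'D.
    Returns (within, between) as proportions of the total.
    """
    h_mean = sum(h_groups) / len(h_groups)
    within = h_mean / h_total
    between = (h_total - h_mean) / h_total
    return within, between


# Hypothetical values for one trait scored in four zones.
within, between = partition_diversity(0.80, [0.78, 0.80, 0.76, 0.76])
print(f"within = {within:.1%}, between = {between:.1%}")
```

The two shares sum to one by construction, so a within-group share close to 100% means the groups differ little from the pooled data in their class frequencies.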

Variability for quantitative traits
The landraces showed a wide range of variability for the quantitative traits (Table 3). Days to flowering ranged from 64 to 157 days, and days to maturity from 118 to 215 days. Plant height ranged from 115 to 478 cm; panicle weight from 21.8 to 443.4 g; grain weight per panicle from 11.87 to 348.23 g; 1000-seeds weight from 18 to 73 g; threshing percent from 29.6 to 93.6%; and grain number from 362 to 9623.
ANOVA for the entire NE Ethiopia data, zones and districts showed highly significant differences (P<0.01) for all traits among landraces (Table 4). Mean separation values for each trait are given in Supplementary Table 2. Landraces from South Welo had higher values for panicle weight, grain weight per panicle and number of grains per panicle. Landraces from North Welo were relatively early flowering and maturing, while landraces from North Shewa were late flowering and maturing. Similarly, highly significant differences were observed between altitudes for all traits except length of primary branches per panicle. Landraces from highland areas were late flowering and maturing, with significantly lower mean values for panicle weight, grain weight per panicle, 1000-seeds weight and number of grains per panicle as compared to lowland and intermediate altitude landraces. The variation between districts was highly significant for all the traits. Landraces from higher altitude districts like Tehuledere, Dessie Zuria, Meket, Debre Sina and Tegulet were late flowering and maturing, with low mean values for 1000-seeds weight. The variation among landraces was also highly significant within zones, districts and altitudes.

Cluster analysis
The cluster analysis grouped the landraces into ten clusters, with a minimum similarity level of 0.64 (Figure 2A). A total of 179 landraces were grouped in cluster I, 89 in cluster II, 128 in cluster III, 198 in cluster IV, 94 in cluster V, 78 in cluster VI, 61 in cluster VII, 52 in cluster VIII, 48 in cluster IX and 77 in cluster X. The mean values of quantitative traits and the frequency distributions of qualitative traits for each cluster are given in Supplementary Tables 3 and 4, respectively. Cluster I included landraces characterized by above average values for most quantitative traits. However, it had below average values for internode length, peduncle exertion, panicle length and length of primary branches. Straw and grey glume colors, white and yellow seed color and presence of awns at maturity were frequent traits. Semi-compact elliptic head types and mostly starchy endosperm types were the predominant trait classes in the cluster. Cluster II contained landraces with above average values for all quantitative traits except peduncle exertion, with predominant qualitative traits of grey and straw seed color, straw glume color, mostly starchy endosperm and compact elliptic head type. Landraces such as Jamyo and Degalet types, which are preferred for various end-use qualities, were grouped in clusters I and II. Cluster III also showed above average values for all quantitative traits except panicle length and length of primary branches. Semi-compact elliptic followed by compact elliptic panicles, and grey and straw glume colors, characterize most of the landraces in this cluster. Cluster IV displayed above average values for the majority of the quantitative traits except for internode length, leaf width, panicle length and length of primary branches. Landraces like Keteto and Tikureta, as well as some Degalet and Jamyo types, were grouped in this cluster. South Welo, North Welo and North Shewa together accounted for more than 87, 84, 92 and 83% of the landraces in clusters I, II, III and IV, respectively.
The landraces in clusters V, VI, VII and IX showed below average values for most of the quantitative traits. Clusters V and VII include most of Jigurte, Cherekit and other early maturing types, where North Welo accounted for 55% of the landraces in cluster V while South Welo accounted for 51% of the landraces in cluster VI. White, red and light brown seed colors were equally important in cluster V, accounting for more than 65% of the landraces in the cluster. The sweet stalk sorghums were included in cluster VI. Cluster VII contained all the improved varieties as well as landraces from North Welo and Oromiya. White seed was dominant in this cluster. Cluster VIII contained a mixture of landraces, of which more than 54% were from South Welo. The landraces in cluster IX include open panicle and small seeded landraces like Inchiro, Wancho, Wofaybelash, Merere and Slimo. This cluster had the smallest 1000-seeds weight (27.7 g), and many of its landraces (44%) are from South Welo. Light brown seed color predominates in the group, and many of the landraces have seeds partly covered with glumes. Cluster X contained the majority of landraces known as Zengada, a type of landrace widely adapted but mainly grown in higher altitude areas, and South Welo accounted for 54% of the landraces in the group. These landraces were very late maturing, with lower panicle weight, grain weight per panicle and 1000-seeds weight. Red brown seed color, semi-loose and semi-compact panicles, completely starchy endosperm and grey glume color characterized most of the landraces in this cluster.
The clustering of zones revealed the close relationship between landraces from North Welo and South Welo, and between those from North Shewa and Oromiya (Figure 2B). The similarity values ranged from 0.24 between South Welo and the improved varieties to 0.93 between North Welo and South Welo. Landraces from Harerge showed a closer relation with landraces from North Shewa and Oromiya than with other zones. The improved varieties formed a separate cluster. At district level, the grouping appeared to reflect both geographical proximity and ecological similarity (Figure 2C), with similarity values ranging from 0.72 between Artuma Jile and the improved varieties to 0.82 between Kobo and Habru. Landraces from Kobo, Habru, Artuma Jile and Bati grouped closely. Closer to this group was the grouping of landraces from Gubalafto, Shewa Robit, Ambassel, Tehuledere and Kalu. These districts belong to the lowland and intermediate altitude agroecologies. Landraces from high altitude districts such as Dessie Zuria, Debressina and Merabete were relatively closely related. The improved varieties grouped distinctly from the districts.

Principal component analysis
Principal component analysis showed that the first five components, with Eigen values greater than unity, explained 71% of the variability among the landraces (Table 5). The first component was correlated mainly with days to flowering and maturity, followed by leaf number and leaf length. The second component was associated with panicle weight, grain weight per panicle and grain number per panicle, and the third component with panicle length and length of primary branches. The fourth component correlated with 1000-seeds weight, while the fifth component correlated mainly with internode length. Figure 3 shows the principal component loadings for the 17 quantitative traits and the ordination (grouping) of the landraces. Landraces lying on the opposite side of a trait vector arrow have small values for that trait, while landraces lying in the direction of the arrow have high values. The length of each vector is proportional to the contribution of the trait to the grouping of the landraces.
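The retention rule used here, keeping components whose Eigen value exceeds unity, and the share of variability each component explains follow from simple arithmetic on the Eigen values. When the data are standardized, the Eigen values sum to the number of traits, so each share is the Eigen value divided by that sum. The sketch below illustrates this with a toy set of four Eigen values; the function name and the numbers are hypothetical and are not taken from Table 5.

```python
def retained_components(eigenvalues, threshold=1.0):
    """Apply the 'Eigen value greater than unity' rule.

    eigenvalues: Eigen values of the correlation matrix, largest first.
    Returns (kept, cumulative) where kept is a list of
    (component number, Eigen value, share of total variance) and
    cumulative is the combined share of the retained components.
    """
    total = sum(eigenvalues)
    kept = [
        (k + 1, ev, ev / total)
        for k, ev in enumerate(eigenvalues)
        if ev > threshold
    ]
    cumulative = sum(share for _, _, share in kept)
    return kept, cumulative


# Toy example with four standardized traits (Eigen values sum to 4).
kept, cumulative = retained_components([2.0, 1.2, 0.5, 0.3])
for number, ev, share in kept:
    print(f"PC{number}: Eigen value {ev:.2f}, {share:.0%} of variance")
print(f"retained components explain {cumulative:.0%}")
```

With 17 standardized traits, as in this study, the same calculation would divide each Eigen value by 17.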

DISCUSSION
The descriptive, ANOVA, cluster and principal component analyses revealed the presence of high phenotypic variability among the sorghum landraces from the study area at large as well as within each of the four zones. The analyses for quantitative traits revealed wide variability among the landraces. Previous studies (Teshome et al., 1999; Seboka and van Hintum, 2006; Shewayrga et al., 2008) reported that farmers purposely maintain and grow many landraces to address various needs and as a risk aversion strategy, and that the landraces vary in maturity, yield potential, stress tolerance, end-use quality and other agronomic traits. Diversity studies in NE Ethiopia have also shown high diversity for other crop landraces including tef (Assefa et al., 2001; Kefyalew et al., 2000), barley (Abebe et al., 2010; Mekonnon et al., 2015) and durum wheat (Eticha et al., 2005; Mengistu et al., 2015). Generally, NE Ethiopia is one of the crop diversity areas of the country, which is attributed to the wide topographic and agro-ecological variation coupled with subsistence farming requiring landraces that are locally adapted to marginal environments.
Although there was high variability for most of the qualitative traits, some trait classes were more frequent than others. For example, most of the landraces were non-juicy, had starchy endosperm, compact panicles and 25% glume covered grain. Compact panicle, an adaptive trait, is a character of the durra race, which is mainly grown in dry areas (Doggett, 1988; Stemler et al., 1977). These types of landraces are the most preferred by farmers for qualitative and quantitative attributes as well as end use. The predominance of starchy types may be attributed to farmers' intentional selection for suitability for Injera, the staple bread (Gebrekidan and Gebrehiwot, 1982). Besides, the landraces showed low diversity for grain covering and stalk juiciness at all levels of geographic domains. The majority of the landraces had only 25% of the grain covered. Ayana and Bekele (1998) observed an increasing trend of 25% grain covered from the high rainfall areas of Western Ethiopia to the dry areas of Eastern Ethiopia. The same was true for compact panicle types. Grain cover by glumes is related to threshability and seed size, which are important sorghum selection criteria for farmers of the area (Teshome et al., 1999). It is also an adaptive trait, playing an important role in reducing grain mold in high rainfall and humid areas like Western Ethiopia. NE Ethiopia is a dryland area with regular moisture stress (deficit) to crops, where grain mold in the field is not a serious problem. Consequently, landraces with 50% or more of the grain covered with glumes were not important among the landraces evaluated. This result is in line with a previous observation in farmers' fields (Shewayrga et al., 2008). Such landraces were not widely grown, and most of them were small seeded (low 1000-seeds weight) with brown color and often with very loose or open panicles. Some of these landraces, such as Chobe and Ganseber, are mainly for uses of secondary importance like tella and genfo, while some others (e.g. Inchiro) are adapted to marginal environments. Farmers also reported some of these types as volunteers resulting from cross-pollination with wild types (Kilo). The juicy stalk sorghums are maintained for chewing, and farmers usually mix a few sweet sorghum seeds into normal (non-juicy) sorghum fields at planting. The farmers' decision as to which landraces to plant, where and in what proportion involves a number of selection criteria to avoid risk, and the decision is focused mainly on normal types. This could explain the low variability observed for stalk juiciness.
Abbreviations for quantitative traits (table footnote): DF=Days to 50% flowering, DM=Days to 90% maturity, LN=Leaf number, LL=Leaf length (cm), LM=Leaf width (cm), IL=Internode length (cm), LSL=Leaf sheath length (cm), PH=Plant height (cm), PE=Peduncle exertion (cm), PL=Panicle length (cm), NPB=Number of primary branches per panicle, LPB=Length of primary branches (cm), PW=Panicle weight (g), GW=Grain weight per panicle (g), SW=1000 seeds weight (g), ThP=Threshing percent (%), GN=Grain number per panicle.
Cluster analysis grouped the landraces into ten clusters, each having wide within cluster variation. However, the clustering did not correspond well with the geographic origin (administrative boundary) of the landraces. Rather, the grouping appeared to follow similarity in altitudinal and environmental factors as well as spatial proximity. The districts in close proximity and with similar agro-ecology clustered together. Districts from lowland and intermediate altitudes have warmer climates as compared to those at higher altitudes. The physical distances and climatic factors may impose physical and adaptive barriers to gene flow between agro-ecologically less similar and distant districts. The improved varieties formed a separate cluster. These varieties were early maturing and short statured, and most have a genetic background of exotic origin. Ayana et al. (2000) also observed separate grouping of sorghum lines of exotic origin from Ethiopian landraces. The high morphological (phenotypic) variability among the landraces is an important resource for sorghum breeding in the area. The breeding focus on developing early maturing varieties for moisture stress areas does not appear to be fully effective, as the improved varieties are very limited in number and area coverage in the study area (Shewayrga et al., 2008; Seboka and van Hintum, 2006). The improved varieties do not have the qualities the farmers are looking for in a variety (Seboka and van Hintum, 2006), which include not only yield but also stalk yield, food making quality and other attributes. Integrating the attributes of important landraces in the breeding programs, by identifying valuable parents with various traits of economic interest, would be important for increasing the adoption rate of improved varieties.
In summary, the landraces displayed high variability for both quantitative and qualitative traits. But the differentiation among landraces from different geographical domains (zones, altitudes and districts) of the area was generally weak. Seed exchanges among farmers may contribute to frequent gene flow across the region, resulting in weak differentiation. Concerning germplasm conservation, the weak differentiation among geographic domains and the high level of landrace variation observed at different levels of geographic domains suggest that a single large random collection from the whole target area would capture most of the genetic variation present in the sorghum landraces. However, targeted collection would also be important to capture specific but potentially valuable variability, as landraces between districts showed appreciable differentiation. Besides, a farm survey of landraces (Shewayrga et al., 2008) indicated a shift in the sorghum types cultivated in the area, where old and preferred landraces were either lost or marginalized in many localities while new types are coming into the system. Therefore, periodic collection surveys would be important to capture new variability.

Figure 1. Map of Ethiopia and the study area.

Figure 2. Dendrograms showing the clustering patterns for: A) landraces, B) districts and C) zones.

Table 1. Shannon-Weaver diversity index (H′) of eight qualitative traits for sorghum landraces from NE Ethiopia by districts, zones and altitudes.

Table 2. Partitioning of the phenotypic variability into within and between zones of origin, altitudes and districts of NE Ethiopia using the method of Ayana and Bekele (1998).
H′NEE = Diversity index for the entire data of NE Ethiopia; H′Z, H′A and H′D = mean diversity index for each trait for the four zones, three altitudes and fourteen districts, respectively; H′Z/H′NEE, H′A/H′NEE and H′D/H′NEE = proportion of diversity within zones, altitudes and districts, respectively; (H′NEE − H′Z)/H′NEE, (H′NEE − H′A)/H′NEE and (H′NEE − H′D)/H′NEE = proportion of diversity between zones, altitudes and districts, respectively, in relation to total variation. SJ=stalk juiciness, LMC=leaf midrib color, PCS=panicle compactness and shape, AM=awns at maturity, GlC=glume color, GCov=grain covering, GC=grain color, ET=endosperm texture.

Table 3. Variability for quantitative traits described with minimum and maximum values, range and mean for the entire data.

Table 4. Mean square for variation between zones, districts and altitude groups, and between landraces for the entire data (within region) and within each zone from ANOVA for quantitative traits.

Table 5. Principal component matrix showing Eigen values, variance and Eigen vectors for 17 quantitative traits in sorghum landraces.