The population structure of wild sorghum species in agro-ecological zones of Western Kenya

1 Plant Science and Crop Protection Department University of Nairobi, P.O. Box 29053-00625, Nairobi, Kenya. 2 Africa Harvest Biotechnology Foundation International, P.O. Box 642, 00621-Village Market, Nairobi, Kenya. 3 Kibabii University College, Masinde Muliro University of Science and Technology, P.O Box 1699-50200 Bungoma, Kenya. 4 Department of Biochemistry and Biotechnology, School of Pure and Applied Sciences, Kenyatta University, P.O. Box 43844, Nairobi, Kenya.


INTRODUCTION
The sympatric nature of the members of the sorghum genus over time may have contributed to the spontaneous occurrence of Sorghum bicolor alleles in Sorghum halepense and Sorghum sudanense populations (Ejeta and Grenier, 2005; Mutegi et al., 2009). S. sudanense, a cultivated form of S. bicolor ssp. drummondii, has shown natural outcrossing ranging from 0 to 100% on individual plants, with averages of 39 and 57% when most tillers were at anthesis (Pedersen et al., 1998). Morrell et al. (2005) found S. bicolor alleles in S. halepense weeds, suggesting persistent natural and spontaneous outcrossing in the species. The potential for crop-to-wild hybridization in sorghum populations in Ethiopia (a center of origin) and in Niger was found to be widespread (Tesso et al., 2008; Adugna et al., 2013). Crop and wild sorghums have been found intermixed in farming communities, where they were interfertile and had synchronized flowering, with several putative crop-wild hybrids being observed (Tesso et al., 2008). Ancient and recent cross-hybridization events after speciation in the sorghum genus have maintained several crop genes in the wild sorghum species (Morrell et al., 2005). In some cases the spontaneous hybrids formed could have obtained features from both parents, whose heterosis and unique features may have resulted in formation of a bridge species within the sorghum genus. Crop alleles might have an impact on the fitness of the crop x weed hybrids and may enhance or diminish their weedy potential (Hokanson et al., 2010).
Diversification within the sorghum genus draws heavily on disruptive selection (Doggett and Majisu, 1968). This phenomenon favors extreme phenotypes over intermediates, leading to wide variability within the species for specific traits (Rueffler et al., 2006). It has produced wide variability in the morphological features of sorghum and in important food- and forage-associated traits, as a result of artificial and natural selection for more than one level of a particular character within populations of crop and weed sorghums (Doggett, 1988). This variability is also observed at important microsatellite loci in the genus (Yonemaru et al., 2009). Polymorphism at the microsatellite loci shows maintenance of extreme mutations, with losses or additions of several repeats in mononucleotide, dinucleotide, trinucleotide and tetranucleotide SSR motifs. Individual species within the genus also show similar variability at the genotypic and phenotypic levels (Yonemaru et al., 2009).
Analysis of the population structure of crop and wild sorghum species is important in Africa and Ethiopia, which is a probable "center of origin" of the genus. Therefore, wide genetic diversity is expected to be naturally maintained in East and Central Africa regions and other secondary centers of diversity where sorghum plays an important role in human diets. Recent studies have shown that there is heterozygosity excess and diversity in crop and wild sorghum populations in some parts of Kenya (Mutegi et al., 2011), Ethiopia (Adugna et al., 2013) and Africa (Barnaud et al., 2007; Sagnard et al., 2011), with the wild sorghums contributing higher magnitudes of diversity than the crops or landraces. North-West and South-East wild sorghum populations from Kenya had HEQ of 0.34 and 0.56, respectively, indicating no apparent loss of genetic variability from the expected diversity (Muraya et al., 2010).

Mutation models are commonly used for the analysis of population bottlenecks in natural populations. The stepwise mutation model (SMM) considers the size of microsatellite repeats, thus it may be subject to problems of homoplasy in its interpretation. The infinite allele model (IAM) assumes that each mutation can create any new allele randomly and considers any point mutation along a stretch of DNA within a locus to constitute a new allele (Goodman, 1998; Schlotterer, 2000). The two-phase model (TPM) is an intermediate model that provides better analysis of how DNA sequences evolve (Di Rienzo et al., 1994). Excess of HEQ over the observed heterozygosity on loci evolving under the IAM, SMM or TPM models would suggest that environmental evolutionary pressures had minimal effect on reducing the population size and thus the allelic diversity (Cornuet and Luikart, 1996), therefore indicating absence of significant population bottlenecks in the recent past. This situation is sustained despite the pressures that the wild populations would have experienced as a result of agricultural practices like weeding and changes in biotic and abiotic conditions. If such pressures were significant, mutation and genetic drift would have been important in the populations, and an excess of observed heterozygosity over HEQ would have been apparent.

In some members of Poaceae, spontaneous occurrence of crop alleles and crop transgenes in wild and landrace populations has been observed (Gepts and Papa, 2003). In maize (Zea mays), distinct transgenes have been shown to exist in landraces in Mexico, showing evidence of genetic contamination of the centers of origin (Piñeyro-Nelson et al., 2009). Hybridization between maize and teosinte (Zea mays ssp. parviglumis) (wild) has been shown to exist naturally despite the distinct morphological differences between the two (Doebley, 2004). In rice, natural hybridization between O. rufipogon and O. sativa has been reported in important rice cultivation regions globally (Song et al., 2002).
With the recent development and introduction of transgenic crops (James et al., 2010), it is vital to isolate and characterize the role of crop alleles within local wild populations of sorghum and study their spatial and temporal significance (Hokanson et al., 2010). Intra-population and inter-population statistics provide a means to elucidate the diversity of wild sorghum populations in the agro-ecological zones (AEZs) from three counties in Western Kenya. This study evaluated the effect of spontaneous hybridization in conspecific sorghum species on the spatial and temporal allelic composition of wild sorghum populations. The study also evaluated allelic differentiation among wild sorghum species across AEZs in Homabay, Siaya and Busia counties. The local population structure of wild sorghum species in AEZs from the three counties of Kenya was examined. In addition, heterozygosity deficiency under the infinite allele model (IAM), the two-phase model (TPM) and the stepwise mutation model (SMM) was evaluated.

MATERIALS AND METHODS
Three counties (Homabay, Siaya and Busia) were sampled to represent Western Kenya sorghum producing regions around Lake Victoria (Figure 1). Sampling units or farms that had wild sorghum forms within the counties were clustered based on the mapped AEZs. Sorghum farms were randomly sampled and 175 samples were collected from 13 AEZs in the three counties in July 2011 (Table 1). Sampling was done after seed set in farms that had both wild and crop sorghums. Some wild sorghums growing on road reserves and uncultivated land in the clusters were also sampled. The seeds and the first two leaves next to the panicle were collected, labelled and kept on ice. Sampled AEZs in the three counties represented the diversity of wild sorghums in varying ecological conditions (Table 2).

DNA extraction and electrophoresis analysis
Sampled leaves and seeds were maintained on ice during transportation to the laboratory located at the College of Agriculture and Veterinary Sciences (CAVS) (-1° 14' 59.72", +36° 44' 30.79") of the University of Nairobi. The leaves were stored at -20°C in the laboratory freezers while the seeds were dried in the greenhouse. The seeds were germinated and young leaves collected and maintained on ice for DNA extraction. Sampled leaves and young leaves were washed with detergent in running water to remove dust particles and other debris. Leaves were weighed and 0.3 g kept in clean and labelled polyethylene bags on ice until grinding. The seeds from sampled panicles were threshed and kept in labelled plastic bags. Two procedures were applied to obtain genomic DNA from young leaves (two to five weeks old plants), from old leaves (five weeks old to flowering plants) and from dried seeds. Genomic DNA was extracted from young leaves by using a modified CTAB extraction procedure (Doyle and Doyle, 1990; Barnaud et al., 2008). Total nucleic acid extraction from seed was done by a modified CTAB-based seed extraction protocol (Delobel et al., 2007).
Agarose gel electrophoresis for genomic DNA extracted from young leaf, old leaf and seed tissues was done before running multiplex PCR. The multiplex PCR products were separated and analysed using 4% UltraPure™ Agarose (Invitrogen™).

Simple sequence repeat (SSR) marker assay
Primers were designed based on the polymorphic SSR sequences from crop sorghum physical map genomic clones on five linkage groups (chromosomes) found in Phytozome databases. To ensure the specificity of the primers prior to synthesis, e-PCR was performed for all primer combinations. Annealing sites were also confirmed by using the BLASTn procedure against the NCBI GenBank database. The primers were selected on the basis of their polymorphism and ability to distinguish several wild sorghum accessions (Table 3). Three out of the seven primers analysed had a high polymorphic index.
PCR was done in 0.5 ml reaction tubes with a Hybaid® thermocycler. Reaction volumes of 11 µl were used in all experiments. The reaction conditions were set in conventional PCR format using the Invitrogen™ PCR reagent system, Taq DNA polymerase with W-1 (Invitrogen™) and 10 mM dNTP Mix, PCR grade (Invitrogen™). The PCR components in the reaction mix included: nuclease-free water, 1X Taq DNA polymerase reaction buffer [..., 500 mM KCl], 1.5 mM magnesium chloride, 0.2 mM of each of the four dNTPs, 0.05% W-1 detergent, 0.5 mM of each of the two primers, 0.5-1 U Taq DNA polymerase enzyme and 50-100 ng template DNA, depending on the PCR product size and number of alleles. A mineral oil overlay was added to control evaporation of the reaction. A master mix for 50 samples was prepared and placed on ice, and the appropriate volumes were aliquoted into each reaction tube before adding the DNA template.
PCR products were evaluated on 3-4% UltraPure™ agarose gels depending on the fragment size separation requirement. UltraPure™ Agarose (Invitrogen™) at 3-4% was added to 300 ml of 1X TBE buffer (recommended rate, with modification in run time and voltage) at room temperature and stirred to remove clumps. The agarose was melted by heating in a microwave oven until boiling. Ethidium bromide (15 µl of 10 mg/ml) was added after cooling to approximately 60 to 70°C. The gel was poured into gel casting equipment with 50-well combs and left to solidify; air bubbles were removed manually. Mineral oil was removed from the PCR products by pipetting, and 1 µl of loading dye (BlueJuice™ gel loading buffer, Invitrogen™) was added to 11 µl of the PCR products; a 100 bp DNA ladder (Invitrogen™, 1 μg/μl) was included. These were loaded in the wells while the gel was submerged in electrophoresis buffer. Electrophoresis was done at 120 V for 2-4 h to ensure maximum separation of fragments with varying molecular weight. The gel was placed on a UV transilluminator and a photograph taken.

Data analysis
Genotypic and allelic data were recorded after analysis of PCR agarose gels. Total gene diversity (HT) was partitioned into inter-population and intra-population components (Nei, 1973) to derive intra-population gene diversity (HS) and inter-population gene diversity (DST = HT - HS) values. Furthermore, the coefficient of gene differentiation (GST), which represents the relative magnitude of gene differentiation among subpopulations, was calculated as the proportion of inter-population gene diversity to total gene diversity (GST = DST/HT). Expected heterozygosity and diversity indices (Mondini et al., 2009) were estimated using the ARLEQUIN version 3.5 software (Excoffier et al., 2005). Analysis of molecular variance (AMOVA) was done using ARLEQUIN software to estimate the significance of the covariance components allocated within individuals, within populations, within groups of populations and among groups, using non-parametric permutation procedures (Excoffier et al., 1992). The Ewens-Watterson neutrality tests were performed with ARLEQUIN software to test for selective neutrality based on Ewens sampling theory in a population at equilibrium (Ewens, 1972).
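The partitioning described above (DST = HT - HS, GST = DST/HT) can be illustrated with a minimal sketch. It assumes equal weighting of populations and applies no sample-size correction, which may differ from what ARLEQUIN or POPGENE report; the function name and the example frequencies are hypothetical.

```python
def nei_gene_diversity(pop_freqs):
    """Partition gene diversity at one locus (Nei, 1973).

    pop_freqs: list of per-population allele-frequency lists; every inner
    list covers the same alleles in the same order and sums to 1.
    Assumption: populations are weighted equally, no small-sample correction.
    Returns (HT, HS, DST, GST).
    """
    n_pops = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # HS: mean within-population diversity, 1 - sum(p^2) per population.
    hs = sum(1.0 - sum(p * p for p in freqs) for freqs in pop_freqs) / n_pops
    # HT: diversity from allele frequencies averaged over populations.
    mean_freqs = [sum(freqs[k] for freqs in pop_freqs) / n_pops
                  for k in range(n_alleles)]
    ht = 1.0 - sum(p * p for p in mean_freqs)
    dst = ht - hs
    gst = dst / ht if ht > 0 else 0.0
    return ht, hs, dst, gst


# Hypothetical frequencies for two populations and three alleles.
ht, hs, dst, gst = nei_gene_diversity([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
print(f"HT={ht:.3f} HS={hs:.3f} DST={dst:.3f} GST={gst:.3f}")
```

Two populations fixed for different alleles give GST = 1, and two populations with identical frequencies give GST = 0, which brackets the values reported per AEZ.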
Allelic variations on all loci, the Shannon index on the polymorphic information on each locus and the overall allele frequency were calculated. F-statistics (fixation index, FIS; index of deviation from HW equilibrium, FIT; and degree of population differentiation, FST) were estimated for multiple populations (Hartl and Clark, 1989; Weir, 1990; Mohammadi and Prasanna, 2003). The test of Hardy-Weinberg (HW) equilibrium was done by computing expected genotypic frequencies under random mating using the algorithm by Levene (1949), and performing chi-square (χ2) tests. Probability values were used to determine the populations under HW equilibrium. Nei's unbiased genetic distance and genetic identity were estimated with the POPGENE software version 32 (Yeh and Boyle, 1997; Yang and Yeh, 1993).
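As an illustration of the HW test described above, the following sketch computes the chi-square statistic for one locus using Levene's (1949) small-sample expectations for genotype counts. It is a simplified stand-in, not POPGENE's implementation; the function name and the input format (a list of two-allele tuples) are assumptions, and the P value would still have to be looked up from the chi-square distribution with the returned degrees of freedom.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hw_chi_square(genotypes):
    """Chi-square statistic for HW equilibrium at one locus.

    genotypes: list of (allele, allele) tuples, one per diploid individual.
    Expected counts follow Levene (1949):
      homozygote  AiAi: n_i * (n_i - 1) / (2 * (2N - 1))
      heterozygote AiAj: n_i * n_j / (2N - 1)
    where n_i is the count of allele i and N the number of individuals.
    Returns (chi2, degrees_of_freedom).
    """
    n = len(genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    alleles = sorted(allele_counts)
    chi2 = 0.0
    for a, b in combinations_with_replacement(alleles, 2):
        if a == b:
            expected = allele_counts[a] * (allele_counts[a] - 1) / (2 * (2 * n - 1))
        else:
            expected = allele_counts[a] * allele_counts[b] / (2 * n - 1)
        if expected > 0:  # skip classes that cannot occur (singleton alleles)
            chi2 += (observed[(a, b)] - expected) ** 2 / expected
    k = len(alleles)
    return chi2, k * (k - 1) // 2


# Hypothetical sample: 1 AA, 2 AB and 1 BB individual.
chi2, df = hw_chi_square([("A", "A"), ("A", "B"), ("B", "A"), ("B", "B")])
print(chi2, df)
```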
Farm management handbook of Kenya (2005, 2009).

Table 3. SSR loci used to analyze polymorphism in the wild sorghum populations.

Locus    Chromosome    Forward primer (5' -> 3')    Reverse primer (5' -> 3')
[...]    10            TGAATAATGCACGCAGTAGCGTCT     TATTCCCACGGCTCGCTAGCTACT

The classical linkage disequilibrium coefficient (D), measuring deviation from random association between alleles at different loci (Lewontin and Kojima, 1960), and its standardization by the maximum value it can take (Dmax), given the allele frequencies (Lewontin, 1964), were estimated in POPGENE. Dendrograms for populations from AEZs were based on Nei's genetic distances using UPGMA, as adapted from the program NEIGHBOR of PHYLIP version 3.5c. The Wilcoxon test was applied in the BOTTLENECK software to test for heterozygosity deficiency under the infinite allele model (IAM), the two-phase model (TPM) and the stepwise mutation model (SMM). Estimates of HEQ from the different mutation models (IAM, SMM and TPM) and analysis for presence of population bottlenecks were also performed in BOTTLENECK. Dissimilarity matrices were developed in DARwin software using the Dice procedure for presence or absence data:

d_{ij} = \frac{b + c}{2a + (b + c)}    (1)

where dij is the dissimilarity between units i and j; a is the number of variables where xi = presence and xj = presence; b is the number of variables where xi = presence and xj = absence; and c is the number of variables where xi = absence and xj = presence.

Axial and radial phylogenetic trees were developed using the weighted neighbour-joining method (Saitou and Nei, 1987) in DARwin (Equation 2), where i, j and k are elements (units or groups of units); Ci, Cj and Ck are the unit numbers of these elements; and ∂(s,k) is the weighted average of dissimilarities between k and elements i and j.
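The Dice dissimilarity for presence/absence data, as defined by the counts a, b and c above, can be sketched as follows. This is an illustrative re-implementation rather than DARwin's code; the function name is hypothetical, and returning 0 when both units lack every band is an assumption made here to avoid division by zero.

```python
def dice_dissimilarity(x_i, x_j):
    """Dice dissimilarity between two presence/absence (1/0) band profiles.

    a: bands present in both units; b: present in i only; c: present in j only.
    d_ij = (b + c) / (2a + b + c)
    """
    a = sum(1 for p, q in zip(x_i, x_j) if p and q)
    b = sum(1 for p, q in zip(x_i, x_j) if p and not q)
    c = sum(1 for p, q in zip(x_i, x_j) if q and not p)
    denom = 2 * a + b + c
    if denom == 0:
        # Assumption: two all-absent profiles are treated as identical.
        return 0.0
    return (b + c) / denom


# Hypothetical band profiles for two accessions scored at four positions.
print(dice_dissimilarity([1, 1, 0, 0], [1, 0, 1, 0]))  # a=1, b=1, c=1 -> 0.5
```

Shared absences do not enter the index, so two accessions are only considered similar on the basis of bands they both carry.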

Allelic variation for microsatellite loci and allele frequency in populations of wild species
Microsatellite loci SB1764, SB3420 and SB4688 showed polymorphisms within the populations from Homabay, Siaya and Busia counties and in sub-populations clustered along AEZs (UM1, LM1, LM2, LM3 and LM4). Locus SB1764 showed 5 alleles, of which 3.27 were effective. A Shannon information index of 1.325 was observed (Table 4). Five alleles were seen from the analysis of locus SB3420 in all populations, where 3.17 out of the five alleles were effective. A Shannon information index of 1.3 was observed for this locus. For locus SB4688, five alleles were observed in the populations but only 2.92 of the alleles were effective, and a Shannon information index of 1.26 was recorded (Table 4).
PCR amplification of the three loci gave bands ranging from 250 bp to 520 bp in the wild populations from the three Western Kenya counties (Figure 2a, 2b, 2c). Amplifications were obtained on loci SB4688 and SB3420 in wild sorghums collected from Siaya and Homabay. Some wild materials did not amplify on loci SB4688 (lanes 34 to 38, Figure 2a) and SB3420 (lanes 14, 15, 20, 21, 22, 25, 40, 41 and 42, Figure 2b); these were different from the expected alleles in S. halepense, S. verticilliflorum, S. bicolor and S. sudanense (Figure 2c). A 550 bp band from locus SB3420 (lanes 5, 9, 12, 31 and 32, Figure 2b) demonstrated a S. verticilliflorum origin, while 300 and 290 bp bands from the same locus demonstrated a S. bicolor and S. sudanense origin (Figure 2b). Some materials had two alleles, showing recent hybridization events between S. bicolor and S. verticilliflorum on locus SB3420 (lane 4, Figure 2b). This is a result of their proximity in crop sorghum stands (Figure 3).
Locus SB1764 showed high (above 0.7) frequencies of alleles A and C in SYLM3, SYLM4 and BULM4 (Table 5). Alleles A and C of SB1764 were the most frequent in the population (Table 5). Allele C had medium (0.4 to 0.7) frequencies in all AEZs except HBLM4, HBLM1, SYLM4, BULM1 and BULM2 (Table 5). Alleles B, D and E of locus SB1764 had low (below 0.4) frequencies in all AEZs (Table 5). Locus SB3420 alleles had medium (0.4 to 0.7) to low (below 0.4) frequencies in all populations except SYLM1, where allele E had a frequency of 0.7. Allele B had medium frequencies (0.4 to 0.7) in 8 AEZ populations, the exceptions being HBLM3, HBLM4, SYLM1, SYLM2 and BULM4. Medium frequencies were also observed on allele D and allele E (Table 5). Locus SB4688 showed high frequencies on allele C in BULM4 and allele E in SYLM1 and SYLM3. Medium frequencies on this locus were seen on alleles B, C, E, F and G. Low frequencies were observed on alleles B, C, F and G. Allele E was the most frequent in the population (Table 5).
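The effective number of alleles and the Shannon information index reported for each locus are, in their usual formulation, functions of allele frequencies alone. The sketch below assumes ne = 1/Σp² and I = -Σ p ln p; the text does not state the formulas, and the function names and example frequencies are hypothetical.

```python
import math


def effective_alleles(freqs):
    """Effective number of alleles, ne = 1 / sum(p^2) (assumed definition)."""
    return 1.0 / sum(p * p for p in freqs)


def shannon_index(freqs):
    """Shannon information index, I = -sum(p * ln p) (assumed definition)."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Hypothetical, uneven frequencies for a five-allele locus.
freqs = [0.40, 0.30, 0.15, 0.10, 0.05]
print(effective_alleles(freqs), shannon_index(freqs))
```

With five equally frequent alleles, ne would be 5 and I would be ln 5 (about 1.61), so values near 3 and 1.3 for five observed alleles indicate that a few alleles dominate at each locus.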

Genetic diversity within and among wild sorghum populations from AEZs in Western Kenya
Total heterozygosity was high in AEZs from Busia (0.712), followed by those from Homabay (0.665) and Siaya (0.564). Observed heterozygosity (HO) was high (0.762) in BULM1 (1200-1440 masl) wild sorghum groups. However, HO was low towards the lower AEZs LM2, LM3 and LM4 in Busia. A similar trend was seen in Homabay AEZs, with high HO values of 0.62 in HBLM1 and HBLM2 and lower values in HBLM3 (0.444) and HBLM4 (0.567) (Table 6). In Siaya all the zones had HO of less than 0.4 (Table 6).
High expected heterozygosity (HE) values were observed in the lower midlands in Busia. The population from BULM2 had HE of 0.715, while BULM1, BULM3 and BULM4 had HE values of between 0.590 and 0.690. Expected heterozygosity in HBLMs ranged from 0.531 (HBLM1) to 0.683 (HBLM2). Expected heterozygosity was low in AEZs from Siaya (0.453 to 0.557). Expected heterozygosity was greater than the inter-population gene diversity (DST) and the proportion of inter-population gene diversity (GST) in all wild sorghum populations from different AEZs. However, the DST and GST of HBLM1 (0.134 and 0.202), SYLM1 (0.111 and 0.197) and BULM3 (0.116 and 0.163) were higher than those of other AEZs (Table 6).
The degree of inbreeding and therefore heterozygote deficiency (FIS) was high in HBLM1, HBLM2, HBLM3 and HBLM4. HBUM1 had an FIS index of 0.596, which was almost similar to that observed in SYLM1 (0.582) (Table 6); these populations had higher levels of outcrossing.

Table 5. Allele frequency for loci SB1764, SB3420 and SB4688 in the wild sorghum populations obtained from three sorghum growing counties around Lake Victoria in Western Kenya.

Expected heterozygosity on each locus varied across populations in the AEZs. Locus SB1764 gave high HE values in HBLM4 (0.729) (Table 7). Midway HE values of between 0.455 and 0.69 were observed in the rest of the populations. Locus SB3420 showed higher diversity within HBLM3, BULM1, BULM3 and BULM4 (>0.7). The least HE value on the locus was observed in SYLM1 (0.431) (Table 7), suggesting low intra-population gene diversity. Locus SB4688 had higher intra-population gene diversity in BULM2 and BULM4, with HE values of 0.775 and 0.740, respectively. The least HE value on the locus was observed in the SYLM1 population, as was the case with locus SB3420 (Table 7).
Analysis of molecular variance (AMOVA), as a weighted average over the three loci SB1764, SB3420 and SB4688, showed that 3.2% of the total variation was explained by variation among groups, 10.5% by differences among populations within groups and 16.3% by differences among individuals within populations. The largest share, 69.97%, was explained by differences within individuals on the three loci (Table 8).

Population-specific indices in wild sorghum populations from AEZs in Western Kenya
Wild sorghum from BULM2 showed a high degree of inbreeding (FIS of 0.9) on locus SB1764, while locus SB3420 had high inbreeding in HBUM1 (0.7) and SYLM1 (0.81). Locus SB4688 had high inbreeding in HBUM1 (0.83) (Table 9). Midway heterozygosity deficiency (FIS) values were observed on locus SB1764 in HBLM1 (0.4), HBLM3 (0.56), SYLM3 (0.41) and BULM4 (0.43) (Table 9). Similar values were observed on locus SB3420 in SYLM2 (0.48) and BULM3 (0.46). Similar inbreeding degrees were seen on locus SB4688 in SYLM1 (0.59) and SYLM3 (0.57). Low values of less than 0.4 were observed on locus SB1764 in AEZs in Homabay, Siaya and Busia counties. However, negative FIS values were more common on loci SB3420 and SB4688 in sorghum species obtained from HBLM1, HBLM2, HBLM4, BULM2, BULM3 and BULM4 (Table 9). This shows differential heterozygosity deficiency coefficients on the three loci in the wild sorghum population, with higher affinity for fixation on locus SB1764. On a whole population basis of wild sorghums from Western Kenya, the degree of inbreeding (FIS) values were generally [...] on locus SB4688 with an FST index of 0.2074. Locus SB1764 had an FST index of 0.1620, while locus SB3420 had an index of 0.1226. All three loci together gave a degree of population differentiation of 0.1634 (Table 10). Gene flow (Nm) estimated from the degree of population differentiation (FST) had a whole population mean of 1.28. Gene flow was highest on locus SB3420, moderate on locus SB1764 (1.294) and least on locus SB4688 (0.96) (Table 10).
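The gene flow estimates above are consistent with the commonly used relation Nm = 0.25 (1 - FST) / FST. The text does not state the formula, so its use here is an assumption, supported by the fact that it reproduces the reported whole-population mean; the function name is hypothetical.

```python
def gene_flow_nm(fst):
    """Estimate gene flow from FST, assuming Nm = 0.25 * (1 - FST) / FST."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must be in (0, 1]")
    return 0.25 * (1.0 - fst) / fst


# FST values as reported in the text for each locus and for all loci combined.
reported_fst = {"SB4688": 0.2074, "SB1764": 0.1620, "SB3420": 0.1226, "all loci": 0.1634}
for locus, fst in reported_fst.items():
    print(f"{locus}: Nm = {gene_flow_nm(fst):.2f}")
```

Under this relation the combined FST of 0.1634 gives Nm of about 1.28 and the SB4688 value of 0.2074 gives about 0.96, in line with the values cited from Table 10.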

Population equilibrium within wild sorghum populations in agro-ecological zones in Western Kenya
Homogeneity tests of gene frequencies in wild sorghums across counties and AEZs within counties using χ2 tests demonstrated that the wild sorghum populations were within the HW equilibrium (Table 11). This equilibrium was true for all loci (SB1764, SB3420 and SB4688). χ2 computation for wild sorghum from lower midlands in Homabay, Siaya and Busia showed significant adherence to the HW equilibrium. Wild sorghums from HBLM4, SYLM2, SYLM4, BULM1 and BULM4 did not adhere to the HW equilibrium on more than two loci, as shown by the high non-significant P values (Table 11).

Population bottlenecks in wild sorghum populations in the different Agro-ecological zones in Homabay, Siaya and Busia counties
There was no evidence to suggest existence of a population bottleneck in wild sorghums growing in AEZs in Homabay, Siaya and Busia. The Wilcoxon test for heterozygosity deficiency had a probability of 1.00 under the infinite allele model (IAM), 1.00 under the two-phase model (TPM) and 1.00 under the stepwise mutation model (SMM), showing no significant pattern. Heterozygosity excess was not significant under the IAM, TPM and SMM models (p = 0.0625). A similar situation was seen when both heterozygosity excess and deficiency were tested (p = 0.125) (Table 12a).
The sign test showed that the expected number of loci with heterozygosity excess was 1.68 under IAM, 1.74 under TPM and 1.82 under SMM. No loci were observed with heterozygosity deficiency, but none had significant heterozygosity excess under any of the models (Table 12a). Despite the lack of a significant heterozygosity excess or deficiency pattern in the models applied to explain bottlenecks in the populations, there was variation in the calculated heterozygosity values. This was true when compared to the heterozygosity under mutation-drift equilibrium (HEQ) values in the IAM, TPM and SMM models (Table 12b). On locus SB1764 HE was 0.6083, but HEQ was computed at 0.432, 0.648 and 0.543 under the IAM, SMM and TPM models, respectively (Table 12b). Locus SB3420 had HE of 0.5250 but HEQ values of 0.428, 0.647 and 0.538 under IAM, SMM and TPM, respectively (Table 12b). Locus SB4688 had HE of 0.4583 and HEQ values of 0.434, 0.641 and 0.540 under IAM, SMM and TPM, respectively.
Populations from Homabay, Siaya and Busia [...] (Figure 4). Populations from Siaya clustered away from Busia and Homabay; however, three sub-clusters were observed within the major cluster (Figure 4). Populations from Busia clustered away from the population from Homabay. Fewer minor clusters were observed in Busia than in Homabay. Some populations from Siaya were found to cluster with those from Homabay. A similar situation was observed with populations from Busia clustering with those from Siaya and Homabay (Figure 4). Populations sampled from AEZs in the three counties showed a trend in the distribution of wild sorghum genotypes in given ecological zonation (Figure 5). Wild sorghums from LM4 zones formed clusters with wild sorghums from other counties. SYLM4 clustered with populations from lower midlands from Busia. BULM4 clustered with SYLMs, while HBLM4 clustered with SYLMs. Wild sorghums from each of the counties clustered together in most instances irrespective of their AEZ of origin and showed no significant pattern in their distribution (Figure 5).

DISCUSSION
Crop alleles were observed in the wild sorghum populations obtained from sorghum growing Western Kenya counties around Lake Victoria. Crop allele frequencies varied among the loci studied and among the AEZs from which the material was sampled. Low (<0.4), moderate (0.4-0.7) and high (above 0.7) crop allele frequencies were observed in this study. The presence of crop alleles at varying frequencies in wild forms could be directly attributed to the sympatric growth patterns of wild sorghum and the cultivation of crop sorghum. Localized sympatry was observed in all regions of growth, where wild sorghums were seen growing in crop stands as weeds, on border rows, hedges and roadside reserves. Tillering in sorghums (especially wild genotypes) extended their flowering period and increased the chances of interspecific hybridization. The climatic conditions in different geographic locations and the rainfall pattern encourage synchrony in germination and flowering, and possibly interspecific hybridization. Previous studies in sorghum have shown the existence of crop alleles in populations of weed sorghums (Morrell et al., 2005). Proximity of wild sorghums to crop sorghums has been shown to result in both crop-to-weed and weed-to-crop interspecific hybridization (Warwick et al., 2009; Arriola and Ellstrand, 1996; Sahoo and Schmidt, 2010). The persistence of crop alleles in wild populations would be attributed to their weedy nature in given environments (Paterson et al., 1995).
Wild sorghum populations had moderate to high [...] Cameroon (Barnaud et al., 2007) and in Mali and Guinea (Sagnard et al., 2011). Wild sorghum populations exhibit intraspecific and interspecific hybridization. Intraspecific hybridization results in low heterozygosity values in populations, while interspecific hybridization explains gene flow within wild populations and thus the evolution and diversity of biotypes. Several intermediate types may also be observed in populations, which may play the role of "bridge species". However, interspecific hybridization seems to increase variability of traits per species as a result of disruptive selection in wild sorghums (Doggett and Majisu, 1968).
Wild sorghums in Western Kenya were maintained at HW equilibrium due to low inbreeding and high heterozygosity. The degree of inbreeding (FIS) was low in most AEZs except in UM1 from Homabay and LM1 from Siaya. These two regions also had fairly low heterozygosity (HE). FIS and the degree of genetic differentiation of the population (FST) have substantial impact on the HW equilibrium at whole population level. Low FIS, FIT and FST values were observed on a whole population basis, indicating that the population was at HW equilibrium. However, populations from Homabay UM1 and Siaya LM1 had moderate FIS values of 0.59 and 0.58, respectively, and did not conform to the HW equilibrium. The deviation was probably due to farmer selection practices during weeding. Farmers might have allowed some species of wild sorghums to grow to maturity in or around sorghum fields. This may have a similar effect to inbreeding in the wild sorghum population. However, on a whole population basis, inbreeding was low and heterozygosity was high. Previous results show that populations of crop and wild sorghum were heterozygous with low FIS in other parts of Kenya (Mutegi et al., 2011) and in other parts of Africa (Barnaud et al., 2007; Sagnard et al., 2011).
Allele frequencies within populations of wild sorghums were high on the loci assayed. This indicates the presence of gene flow among wild sorghums. Intra-population diversity (HS = HE) was moderate to high in these wild sorghum populations, signifying existence of pollen- or seed-mediated outbreeding and gene flow among the wild sorghums. Thus, there is potential for proliferation and maintenance of exotic crop genes in these wild populations. In addition, low FIS values were observed in the populations, showing low inbreeding. Partitioning of variances shows that variation within individuals in populations was higher than variation among populations, confirming diminished inbreeding within individuals in populations. This also indicates that pollen- and seed-mediated gene flow was important in the wild sorghum populations. Furthermore, the intra-population diversity (HS) was larger than the inter-population diversity (DST) in all populations; thus allelic differences between populations were not large. The flow of genes from population to population could be accomplished by both pollen flow and seed distribution. This is important given the movement towards adopting transgenics in cropping systems in the near future. If this happened, there would be a risk of unintentional escape of transgenes into wild populations, thereby boosting their adaptive advantage and therefore their wild character (Warwick et al., 2009).
The wild sorghum populations did not have significant patterns to explain recent population bottlenecks.Heterozygosity under mutation drift equilibrium (H EQ ) varied under IAM, TPM and SMM models.The H EQ estimated by the mutation models show lower values probably due to selective weeding in sorghum fields and the presence of sorghum wild volunteers.Bird preference for the larger seeded weedy sorghums may have had some impact on reducing the estimated H EQ .Growth of small-scale agriculture in the regions around Lake Victoria may have led to heavy weeding of some wild sorghum species to extinction and therefore contributed to the lower H EQ values estimated in the mutation models.Previous studies in sorghum growing regions of Ethiopia did not show recent bottlenecks in sorghum populations (Adugna and Bekele, 2015).
Phylogenetic analysis of wild sorghums showed that counties and AEZs clustered in different ways. The dendrograms from the counties had consistent clusters, in contrast to those from the various AEZs, which did not show a consistent pattern. For instance, in Figure 4, genetic material from one county clustered away from material from other counties. However, in Figure 5, genetic material from specific AEZs did not cluster together, indicating that wild sorghums were not clearly segregated along AEZ boundaries in Western Kenya. The presence of consistent geographic county clusters and the lack of consistent AEZ clusters indicate a higher influence of human activities than of climatic conditions on the distribution and diversity of wild sorghums. This could be attributed to practices such as contamination of seed with specific weed seed during harvest and selective weeding of the non-crop sorghums. Furthermore, selective maintenance of certain non-crop sorghums on hedges and sharing of seed contaminated with specific wild sorghum seeds could be important in the distribution and diversity of wild sorghums. These practices enhance the populations of given wild sorghums within certain geographic locations. Selective weeding and seed distribution practices seem to differ from county to county, culminating in differences in the diversity of weed types present. Strong selection by man has been shown to influence sorghum phylogeographic position in Africa (de Alencar Figueiredo et al., 2008). The diversification trend of the wild types found in Western Kenya seems to follow the diversification pattern of crop sorghum, which is strongly linked to geographic biotype/race classifications (Deu et al., 1995, 2006; Dje et al., 2000; Barnaud et al., 2007).
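Clustering of this kind is commonly derived from a matrix of pairwise genetic distances, and Table 13 reports Nei's unbiased genetic identity and distance. The sketch below shows how that estimator can be computed for one pair of populations, using the standard small-sample correction for within-population homozygosity. It is a hedged illustration only: the function name, allele labels, frequencies and sample sizes are hypothetical, and the clustering step itself is not shown.

```python
from math import log, sqrt


def nei_unbiased(freqs_x, freqs_y, n_x, n_y):
    """Nei's unbiased genetic identity and distance between two populations.

    freqs_x, freqs_y: one dict per locus mapping allele -> frequency
    n_x, n_y:         number of diploid individuals sampled per population
    Returns (identity, distance). The unbiased estimate can exceed 1 for
    identity (negative distance) when populations are nearly identical.
    """
    n_loci = len(freqs_x)
    jx = jy = jxy = 0.0
    for fx, fy in zip(freqs_x, freqs_y):
        sum_sq_x = sum(p * p for p in fx.values())
        sum_sq_y = sum(p * p for p in fy.values())
        # Small-sample correction applied to within-population homozygosity.
        jx += (2 * n_x * sum_sq_x - 1) / (2 * n_x - 1)
        jy += (2 * n_y * sum_sq_y - 1) / (2 * n_y - 1)
        # Between-population term needs no correction.
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    jx, jy, jxy = jx / n_loci, jy / n_loci, jxy / n_loci
    identity = jxy / sqrt(jx * jy)
    distance = float("inf") if identity <= 0 else -log(identity)
    return identity, distance


# Hypothetical single-locus frequencies for two populations of 10 plants each.
pop_x = [{"a": 0.5, "b": 0.5}]
pop_y = [{"a": 0.5, "c": 0.5}]
identity_demo, distance_demo = nei_unbiased(pop_x, pop_y, n_x=10, n_y=10)
print(identity_demo, distance_demo)
```

Populations that share alleles at similar frequencies give an identity near 1 and a distance near 0, so they join early in a dendrogram, while populations with few shared alleles are placed on separate branches.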

Conclusions
Crop alleles were observed in wild sorghum populations. In addition, the wild sorghum populations had moderate to high diversity on the SSR loci assayed. HE values ranging from 0.453 in LM 1 to 0.715 in LM 2 were obtained in the wild populations. The wild populations had low inbreeding, low genetic differentiation and low to moderate deviation from HW equilibrium. Intra-population diversity (HS) was larger than inter-population diversity (DST) in all wild populations. This shows that pollen- and seed-mediated gene flow is important in the wild sorghum populations.
Analysis of HEQ values under the IAM, TPM and SMM mutation models suggests an absence of recent bottlenecks in the wild populations. Human influence may be more important than climatic conditions in explaining the distribution and diversity of wild sorghum populations. Future concerns over the persistence of robust crop alleles, including transgenes, in wild sorghums will need to be addressed. There is a need to expand the capacity for testing and evaluating the presence and effect of crop genes in wild sorghums growing around crop sorghum production regions.

Figure 1. Sampling of wild sorghums across agro-ecological zones at survey sites in Homabay, Siaya and Busia counties of Kenya.

Figure 2. Agarose gel electrophoresis showing alleles of 50 representative samples of wild populations from Western Kenya counties and AEZs. (a) Locus SB4688, (b) locus SB3420. S. bicolor was used as a positive control in lane 50 (a) and in lane 3 (b); water was used as a negative control and loaded with the 100 bp ladder in lane 26.

Figure 2c. Agarose gel electrophoresis showing expected alleles from sorghum species on loci SB4688 and SB3420.

Figure 3. Wild species in crop sorghum farms in Western Kenya.

Figure 4. Dendrogram showing clusters of populations from Homabay, Siaya and Busia counties of Western Kenya.

Table 1. Number of wild sorghums collected from various agro-ecological zone sites in Homabay, Siaya and Busia counties of Kenya.

Table 2. Agro-ecological zone characteristics in Homabay, Siaya and Busia counties of Kenya where wild sorghums were collected.

Table 4. Observed number of alleles and Shannon index for the SSR loci assayed.

Table 6. Molecular diversity indices of wild sorghum populations obtained from three sorghum-growing counties around Lake Victoria in Western Kenya.

Table 7. Expected heterozygosity (HE) of wild sorghum populations obtained from three sorghum-growing counties around Lake Victoria in Western Kenya.

Table 8. AMOVA showing percentage of variation on each of the three loci assayed.
low on all loci giving a population mean of 0.1514. Locus SB1764 showed an FIS index of 0.3015, locus SB3420 had 0.0336, while locus SB4688 had an index of 0.1196 on all collections from Homabay, Siaya and Busia AEZs (Table 10). The whole population leaned towards Hardy-Weinberg (HW) equilibrium, with an FIT index (deviation from HW equilibrium, = 1 - H/HE) of 0.2901. Attainment of HW equilibrium was observed more on locus SB3420, with an FIT of 0.1521. Locus SB4688 had an index of 0.3022, while deviation from HW equilibrium was most visible on locus SB1764, with an FIT index of 0.4147 (Table 10). Inter-population differences were most observed

Table 12a. Sign and Wilcoxon tests for heterozygosity excess and deficiency under the IAM, SMM and TPM models.

Table 12b. Comparison between observed heterozygosity (Ho), expected heterozygosity (He) and the expected values under the IAM, SMM and TPM models.

Table 13. Nei's unbiased measures of genetic identity and genetic distance among wild sorghum populations from AEZs in Homabay, Siaya and Busia counties.