Genetic diversity study of Kenyan cassava germplasm using simple sequence repeats

Cassava (Manihot esculenta Crantz) is an important food security crop for resource poor rural communities, particularly in Africa. Little is known, however, about the molecular diversity of Kenyan cassava germplasm. This study therefore set out to identify the genetic constitution of cassava accessions from different regions of Kenya using molecular tools. Seven pairs of microsatellite (SSR) primers previously developed from cassava were used to detect 21 polymorphic alleles in a sample of 69 accessions. Cluster analysis of the similarity matrix obtained with the SSR data showed that, at 68%, the 69 accessions fell into five marker based groups. The study showed that SSRs can be used to identify cassava accessions and to assess the level of genetic relatedness among accessions.


INTRODUCTION
Cassava (Manihot esculenta Crantz) belongs to the family Euphorbiaceae. Of the 98 species that belong to the genus Manihot, cassava is the only species that is widely cultivated for food production (Nassar, 2005). It is a perennial shrub grown throughout the tropical and subtropical regions of the world. It originated and was domesticated in the Neotropics. The crop is widely grown between the latitudes 30°N and 30°S, a belt that coincides with most of the developing countries of the world (Phillips, 1974). The crop was introduced into Africa in the 16th century, where it is now cultivated across an extensive area known as the "cassava belt". In Kenya, cassava was introduced between the 16th and 19th century by the European explorers, along with other crops such as beans, maize and sweet potatoes. It adapted well to the environment and by the start of the 20th century it was widely grown in the country (Suttie, 1970). For a long time, this crop has been ranked among the four major food crops in developing countries, after rice, wheat and maize (Cock, 1985). Cassava's adaptability to relatively marginal soils and erratic rainfall conditions, its high productivity per unit of land and labor, the certainty of a yield even under the worst conditions and the possibility of maintaining a continuous supply year round make this crop a basic component of the farming system in most areas of Sub-Saharan Africa (Nweke and Enette, 1999). Other advantages include flexibility in harvesting (year-round availability) and planting periods, and a long period of in-ground storage after maturity. Famine is therefore rare in areas where cassava is grown, since it provides a stable base to the food production system and has the potential for bridging the food gap (Iglesias et al., 1997; Nweke and Enette, 1999; Ekwele et al., 2001).
*Corresponding author. E-mail: enjohn75@yahoo.co.uk. Tel: +254-724-732-721.
In Kenya, cassava is largely grown for household food security and in some areas for sale in fresh or processed form. Total output of cassava is estimated to be 864,000 tons at an average productivity of 9 tons per hectare (Kariuki et al., 2002).
Although genetic resources have traditionally been evaluated on the basis of morphological and agronomic traits, these do not necessarily reflect food value and the inherent genetic relationships among germplasm. Indeed, most morphological and physiological descriptors are greatly influenced by the environment and show continuous variation and high plasticity, with most of them only scorable at maturity. The use of reliable and standardized genetic descriptors is therefore critical in enhancing the efficiency of identification and use of high value plant germplasm to ameliorate hunger and malnutrition (Wachira, 2002). Accurate characterization and evaluation of accessions within the cassava germplasm resources in Kenya, as well as assessment of the level of genetic variation in the resource, are important in devising optimum management strategies for its sustainable utilization and conservation.
A sustainable agricultural system requires that components of diversity be used in a way and at a rate that will not lead to a long term decline of diversity, thus maintaining its potential to meet the needs and aspirations of present and future generations. Genetic diversity is, however, threatened by the introduction and adoption of modern high yielding varieties (Wachira, 2002). A dramatic increase in the use of a small number of highly selected accessions has led to loss of valuable genetic resources. The proportion of genetic diversity captured by the popular varieties has often not been determined, yet it is critical to the sustainable use of cassava genetic resources in Kenya. Since cassava is predominantly vegetatively propagated, over-reliance on a few varieties, which may also share a common ancestry, may reduce on-farm diversity and thus increase the risks posed to cassava farming by coevolving biotic factors such as pests and diseases. Genetic diversity can most efficiently be quantified using molecular markers. Simple sequence repeat (SSR) markers are one such marker system that has been used for many genetic applications, including the assessment of genetic variability in germplasm collections and pedigree reconstruction (Fregene et al., 2003). It is envisaged that this study will help ensure a broad genetic base for future varieties.
The objective of this study was therefore to identify genetic constitution of 69 cassava accessions from different regions of Kenya using SSR molecular tools.

Plant materials
Sixty nine (69) cassava accessions were randomly sampled (Table 1) from the national ex-situ gene banks at Kakamega, Katumani and Kiboko. Among them, a few advanced IITA lines and CIAT accessions were included to act as checks. All of this germplasm was subjected to molecular diversity studies using SSR markers.

DNA isolation
Genomic DNA was isolated using two methods: the 2X CTAB method as described by IAEA (2002) and the method described by Dellaporta et al. (1983). For both methods, 500 µg of young leaves was obtained from six-month-old plants; the leaves had been stored under two different conditions, silica gel dried and frozen (at -80°C).

Determination of DNA concentration, purity and integrity
The isolated DNA was purified, and its quantity and intactness (integrity) were then confirmed by electrophoresis on an agarose gel in 1X TBE buffer alongside uncut, unmethylated lambda (λ) DNA standards (750, 500, 250, 125, 100 and 83 ng) (Sigma, UK). The gel was stained in ethidium bromide (10 µg/ml), visualised on an ultraviolet transilluminator at 254 nm and photographed. The band size and staining intensity of the electrophoresed cassava DNA samples were compared to those of the λ DNA standards to determine concentration. Integrity of the DNA was determined by the absence of smears.

Optimization of SSR-PCRs
PCR optimization experiments were carried out using five DNA samples by varying the concentration of the template DNA, Taq DNA polymerase, annealing temperature, number of cycles and the Mg2+ salt concentration.

SSR amplifications
Optimised SSR assays were performed using fifteen pairs of oligonucleotide primer sequences (Table 2) obtained from Operon Technologies Inc. (USA). Using the optimized PCR assay, the 15 primer pairs were screened on a subset of 5 samples from the entire collection to reveal those that would generate unambiguous polymorphic SSR alleles. Following this initial screening, the 7 SSR primer pairs that amplified clear, reproducible and scoreable SSR allele profiles were selected and used in the analysis of all 69 test cassava accessions. A negative control, in which sterile distilled water replaced the template DNA, was also included. Amplification reactions were performed in a DNA thermocycler (Mastercycler) with a heated lid (94°C), programmed as follows: one hot start cycle of 94°C for 2 min, followed by 30 cycles of 94°C for 1 min, 56°C for 1 min (annealing) and 72°C for 1 min, and a final extension cycle of 72°C for 10 min. The samples were then maintained at 4°C.

Gel electrophoresis
The SSR amplicons were separated according to size by electrophoresis on high resolution (3%) MetaPhor agarose gels run in 1X TBE for 2 h at 100 V. A 100 base pair DNA ladder (Sigma, UK) was used to estimate the sizes of amplification products. Gels were stained in ethidium bromide and visualised on a UV transilluminator at 254 nm.

Scoring of SSR segments
SSR alleles were scored from the reproducible PCRs set up with the different primer pairs. The size of each SSR allele was estimated from the gel photograph by comparison with the 100 base pair ladder marker. Allele profiles were manually scored and compiled into a binary matrix. Each amplified allele was treated as a separate character and scored as present (1) or absent (0). Only intensely stained, unambiguous alleles were scored and used for statistical analysis.
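The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration of how a presence/absence matrix might be assembled, not the procedure used in the study: the function name `build_binary_matrix`, the accession labels and the allele sizes are all hypothetical.

```python
def build_binary_matrix(scored):
    """Convert {accession: {locus: [allele sizes in bp]}} into a 0/1 matrix.

    Every (locus, allele size) pair becomes one column, i.e. one character.
    1 = band present in that accession, 0 = band absent.
    """
    # Collect every distinct (locus, size) pair seen in any accession.
    columns = sorted({(locus, size)
                      for loci in scored.values()
                      for locus, sizes in loci.items()
                      for size in sizes})
    matrix = {}
    for accession, loci in scored.items():
        present = {(locus, size)
                   for locus, sizes in loci.items()
                   for size in sizes}
        matrix[accession] = [1 if col in present else 0 for col in columns]
    return columns, matrix


# Hypothetical scores for three made-up accessions at two loci.
scored = {
    "acc_A": {"SSRY 13": [230], "SSRY 35": [270, 310]},
    "acc_B": {"SSRY 13": [230, 240], "SSRY 35": [270]},
    "acc_C": {"SSRY 13": [240], "SSRY 35": [310]},
}
columns, matrix = build_binary_matrix(scored)
for accession, row in matrix.items():
    print(accession, row)
```

Each row of the resulting matrix corresponds to one accession and each column to one scored allele, which is the layout a binary input file for downstream analysis would typically follow.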

Data analysis
The scored molecular data in binary form (1 = allele present, 0 = allele absent) were configured as an input file and analyzed with POPGENE version 1.31 (Yeh et al., 1997). The proportion of polymorphic alleles (P) was derived as:

$$P = \frac{n_{pj}}{n_{total}}$$

where $n_{pj}$ is the number of polymorphic alleles and $n_{total}$ is the total number of amplified alleles. Single population descriptive statistics were derived. Genetic variability within the test cassava accessions was determined from the average expected heterozygosity (He) of the accessions using the POPGENE software, assuming Hardy-Weinberg equilibrium and no population structure. The index proposed by Nei and Li (1979) was used to calculate genetic identity (Sij) between cultivars i and j as:

$$S_{ij} = \frac{2N_{ij}}{N_i + N_j}$$

where $N_{ij}$ is the number of bands (alleles) in common between cultivars i and j, and $N_i$ and $N_j$ are the numbers of alleles for cultivars i and j, respectively. The similarities were used to derive genetic diversity trees by average linkage cluster analysis (POPGENE version 1.31).
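The two formulas above are simple enough to check by hand, and a short Python sketch may make them concrete. The function names are hypothetical and the two band profiles are invented for illustration; only the allele counts passed to `proportion_polymorphic` (21 polymorphic out of 39 amplified) come from this study. The analysis itself was run in POPGENE, not with this code.

```python
def proportion_polymorphic(n_polymorphic, n_total):
    """P = n_pj / n_total, the share of amplified alleles that are polymorphic."""
    return n_polymorphic / n_total


def nei_li_similarity(profile_i, profile_j):
    """Nei and Li (1979) index S_ij = 2 * N_ij / (N_i + N_j) for two 0/1 profiles."""
    n_ij = sum(1 for x, y in zip(profile_i, profile_j) if x == 1 and y == 1)
    n_i = sum(profile_i)
    n_j = sum(profile_j)
    if n_i + n_j == 0:
        # Neither accession shows any band; returning 0.0 is a convention
        # assumed here, not something stated in the source.
        return 0.0
    return 2 * n_ij / (n_i + n_j)


# Counts reported in this study: 21 polymorphic alleles out of 39 amplified.
p = proportion_polymorphic(21, 39)
print(f"P = {p:.3f}")

# Two invented band profiles sharing two of their three bands each.
s = nei_li_similarity([1, 0, 1, 1], [1, 1, 1, 0])
print(f"S_ij = {s:.3f}")
```

An index of 1.0 means the two profiles carry exactly the same bands, while 0.0 means they share none.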

RESULTS
The two extraction methods revealed that the method of leaf storage (silica gel dried or frozen at -80°C) affected the quality of extractable DNA. Irrespective of leaf storage treatment, all DNA extracted by the CTAB method was degraded and the isolation efficiency was not repeatable between samples. The Dellaporta method yielded high quality DNA from fresh leaf samples (refrigerated) but degraded DNA from silica gel preserved samples. With the Dellaporta protocol, the DNA extracted from silica preserved samples also varied in yield. This variation could be attributed to differing levels of secondary metabolites among the accessions, which would affect the efficiency of DNA extraction. Since the fresh leaf samples exhibited minimal variation in the quantity and quality of extracted DNA, this type of tissue was chosen for subsequent DNA isolation.
The concentration of isolated cassava DNA was estimated by comparing the band size of 3 µl of isolated DNA with that of uncut, unmethylated lambda (λ) DNA standards (750, 500, 250, 125, 100 and 83 ng) (Figure 1). The extracted DNA ranged in concentration from 27.8 to 250 ng/µl. The isolated DNA was also high in molecular weight and was intact. Based on this, PCR optimization was carried out using diluted samples with DNA concentrations of 100, 50, 25 and 18 ng/µl. The 25 ng/µl DNA sample was found to be optimal for PCR assays. All primers were diluted to 1 nm, whereas MgCl2 in the PCR was optimized using five concentrations of 2.5, 3.5, 4.0, 4.5 and 5.0 mM per PCR. The 5.0 mM MgCl2 concentration gave the best observable amplicons. After trying different types of agarose, MetaPhor agarose was chosen for subsequent use because of its ability to resolve alleles that differed by only a few base pairs.
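Bringing stocks of differing concentration to a common working concentration follows the standard relation C1·V1 = C2·V2. The sketch below is a worked illustration only: the function name `dilution_volumes` and the 100 µl final volume are assumptions, while the stock (250 ng/µl) and working (25 ng/µl) concentrations are taken from the text.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock volume, diluent volume) needed to reach target_conc.

    Uses C1 * V1 = C2 * V2. Concentrations must share one unit (e.g. ng/µl)
    and the volumes come back in the unit of final_volume (e.g. µl).
    """
    if target_conc > stock_conc:
        raise ValueError("cannot dilute to a concentration above the stock")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


# Most concentrated extract reported (250 ng/µl) diluted to the 25 ng/µl
# working concentration; the 100 µl final volume is a hypothetical choice.
stock_ul, water_ul = dilution_volumes(250, 25, 100)
print(f"{stock_ul:.1f} µl stock + {water_ul:.1f} µl diluent")
```

An extract at the lower end of the reported range (27.8 ng/µl) would need almost no dilution, so the stock volume approaches the final volume.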
The molecular size of SSR amplicons (alleles) differed with the selected primers and ranged from 230 to 310 bp with primers SSRY 13, SSRY 78 and SSRY 51. The smallest difference between the highest and lowest values of allele size was 10 bp at locus SSRY 13, and the largest difference (40 bp) was detected at locus SSRY 35. The allele sizes scored at the remaining loci differed by between 20 and 30 bp. A representative SSR profile of 19 cassava accessions with primer SSRY 35 is presented in Figure 2. The seven SSR primers that were screened amplified a total of 39 alleles. The number of alleles per primer pair ranged from 4 to 8. The number of polymorphic alleles ranged from 2 to 4. Percentage polymorphism ranged from 42.8 to 75% (Table 3). In most cases, the number of unique alleles (that is, amplified products in just one individual) was positively correlated with the total number of alleles per locus and their size differences. A binary data matrix based on 69 accessions and 21 polymorphic alleles from seven primer pairs was used for statistical analysis. The mean Nei's gene diversity index (He) and Shannon's information index (I), estimates of heterozygosity for the 69 accessions of cassava, are presented in Table 4.

Table 4. Heterozygosity (He): 0.36 ± 0.15; Shannon's index (I): 0.53 ± 0.19.
A genetic identity and distance matrix based on the proportion of shared (common) alleles (Nei and Li, 1979) was derived using POPGENE version 1.31 and used to establish the level of relatedness between the 69 accessions (Table 4). The pair formed by Mue, a land race, and 990072-B, an IITA introduction, was the only one with the lowest genetic identity of 0.1, although there were nine pairs of accessions with a low genetic identity of 0.2, such as Nyamambakaya and Nyakatanegi-2. Otherwise, genetic identity ranged from 0.2, between pairs of accessions such as Nyakatanegi 3 and Nyamambakaya; Kamisi and Obarodak; Opondo and Kiringis; MM96/7688 and Nyakatanegi 3; Kapchetuya and Nyakatanegi 2; Ex-Mariakani and Kiringis; Mue and Obarodak; and Mue and 990072, to 1.0, between accessions such as Agriculture and CK-9, Tamisi and Sifros, Nabwire and Sifros, and Bumba and MM96/1871. Pairs of accessions with an identity of 1.0 could not be distinguished by the 21 polymorphic alleles. This suggests that these paired accessions could be genetically identical and may simply have been given different names. The observation points to possible genetic redundancy among some accessions conserved in the national repository centres in Kenya. The collections held in these centres need to be rationalized to remove the genetically redundant accessions and so contribute to the generation of a true core collection. Other accessions that are popular in Kenya and had a high genetic identity of 0.9 included Kaleso and Kibandameno, the popular landraces of the Kenyan coast; CK-8 and CK-9, together with Nabwire and Bumba, which are Western Province land races; and Kapchelelyo, Kapchetuya and Marakwet, land races grown exclusively in the Kerio Valley of Kenya, among others mainly found in cluster IV.
Accessions with similar names showed varied levels of genetic identity. Varieties Serere, MH95/0183 and Migyera showed a high genetic identity of 0.7, which suggests that these collections with similar names could actually be the same. They were followed closely by accessions MM96/1871 and KME-1, with genetic identities of 0.6 and 0.5, respectively. Among the Kenyan land races, genetic identity ranged from 0.2 to 1 (for example, Mue and Kitwa with Sifros and Tamisi, respectively), while among the IITA lines, identity ranged from 0.2 to 0.9 (for example, MM96/1871 and 990072 with 990183 and 990056, respectively). The distance matrix was used to derive a dendrogram by the unweighted pair group method with arithmetic mean (UPGMA) (Figure 3). The dispersion of cassava accessions into the various groups appeared to be random. The wild accessions collected in the Kerio Valley clustered into different groups: Arror-1 clustered into group II, whereas Arror-2, Arror-3 and Arror-4 clustered in group IV. The two wild accessions Arror-2 and Arror-4 were very close to each other and formed a tight cluster in group IV. The three wild types of cassava found in Kenya were spread across the dendrogram. These wild type cassava accessions may be the only ones to be found in Africa, as has previously been reported by Halsey et al. (2008).
At a distance of 68%, the 69 accessions were clustered into five marker based groups (Table 5). Cluster group IV had the largest number of accessions (42), which comprised two accessions originating from CIAT, 9 lines originating from IITA and 32 accessions originating from local land races. It is important to note that most land races were clustered into this group. Group I formed a distinct cluster, with accession 990072 from IITA and landraces Nyakatanegi-3 and KME 1 appearing to be the most distantly related to all others.
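The grouping step can be illustrated with a small, self-contained average-linkage (UPGMA) routine in Python. This is a sketch under stated assumptions rather than a reproduction of the POPGENE analysis: the four labels and their pairwise distances are invented, distance is taken here as 1 minus similarity, and the cutoff of 0.32 assumes that the 68% threshold refers to similarity. The function name `upgma_clusters` is hypothetical.

```python
def upgma_clusters(names, dist, cutoff):
    """Average-linkage (UPGMA) clustering that stops merging above `cutoff`.

    names  : list of item labels
    dist   : dict mapping frozenset({a, b}) to the distance between a and b
    cutoff : clusters are merged only while the closest pair of clusters
             lies at or below this average distance
    Returns a sorted list of clusters, each a sorted list of labels.
    """
    clusters = [[name] for name in names]

    def average_distance(c1, c2):
        # UPGMA: mean of all pairwise distances between members of c1 and c2.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cutoff:
            break  # remaining clusters are too far apart to merge
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Invented distances (1 - similarity) between four hypothetical accessions.
pairs = {
    ("A", "B"): 0.0, ("A", "C"): 0.6, ("A", "D"): 0.8,
    ("B", "C"): 0.6, ("B", "D"): 0.8, ("C", "D"): 0.2,
}
dist = {frozenset(k): v for k, v in pairs.items()}
groups = upgma_clusters(["A", "B", "C", "D"], dist, cutoff=0.32)
print(groups)
```

Raising the cutoff merges groups into fewer, larger clusters, so the number of marker based groups reported depends directly on the threshold chosen on the dendrogram.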

DISCUSSION
Genetic diversity in cassava has previously been studied using molecular markers. The tools that have been used include isozymes (Sarria et al., 1992), RFLPs (Angel et al., 1992), RAPDs (Tonukari et al., 1997) and SSRs (Fregene et al., 2001; Moyib et al., 2007). In most studies, low to medium genetic diversity has been observed. In the present study there was likewise generally medium genetic diversity among the land races, and between the improved accessions (introductions) from IITA and the Kenyan land races, as shown by the dendrogram. This might be the result of a common source of collection (IITA) from which Kenyan cassava breeders and farmers choose their desirable traits, such as potential for high yields and disease resistance. Because of the wide variability in biochemical quality traits, it is feasible to use some of the accessions assayed in this study as progenitors to introgress useful genes into improved cassava lines. This agrees with the study by Moyib et al. (2007) on Nigerian collections.
SSR primers have shown high levels of polymorphism in many important crops including Sorghum bicolor (Smith et al., 2000), Triticum aestivum L. (Ahmad, 2002) and Cucumis melo L. (Danin-Poleg et al., 2001). SSR primers were also polymorphic in the Kenyan cassava cultivars assessed in this study: each of the seven primer pairs detected polymorphisms among the 69 cassava accessions studied. This study therefore established a set of seven polymorphic SSR primers, SSRY 9, SSRY 13, SSRY 35, SSRY 51, SSRY 66, SSRY 78 and SSRY 106, that could readily be used for genotype identification and genetic diversity studies in Kenyan cultivated cassava. One of these SSR markers, SSRY 51, is located at position N on the genetic linkage map of cassava (Fregene et al., 2003). A few highly polymorphic SSR markers, such as SSRY 66 with a PIC of 75%, and SSRY 78 and SSRY 106, both with a PIC of 60%, can be used in genetic studies of cassava. This would reduce the need to apply many SSR primers for the identification of cassava cultivars in Kenya and hence save time and cut the cost of genotype identification and genetic diversity studies in the species.
The genetic identity of Kenyan land races ranged from 0.2 to 1, while that of IITA introductions ranged from 0.2 to 0.9. This shows that the Kenyan landraces are a rich source of diversity compared with the improved IITA introductions. Nonetheless, the relatively high genetic identity values for some landraces indicate close relatedness. Cassava is routinely propagated vegetatively, and it is likely for two similar accessions to be assigned different names. The close relatedness might also stem from the fact that the Kenyan landraces were domesticated in the same ecological zones from a narrow genetic base, while the improved lines were obtained from different exotic sources that might have diverse ecological ranges. The diversity index, which is the probability that two randomly selected alleles in a given accession are different, estimated by He, was 0.36, and Shannon's index (I) was 0.53, indicating average genetic diversity in cassava. The genetic diversity index (He) of cassava landraces and introductions differed from that reported for the Neotropics and Africa (0.46 to 0.62) but was similar to that found in Guatemala, as reported by Fregene et al. (2003). The reliability of estimates of genetic variation such as He and I, and of genetic distances, depends more on the number of loci than on the number of individuals sampled (Fregene et al., 2003). Estimates of genetic differentiation ranged widely from locus to locus, underscoring the danger of assessing SSR diversity with a small set of SSR markers.
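To make the two diversity indices concrete, the sketch below computes them for presence/absence data under a simplifying assumption: each scored band is treated as a two-state locus whose band frequency p is used directly as an allele frequency, giving h = 1 - p² - q² and I = -(p ln p + q ln q) with q = 1 - p. POPGENE's exact estimator may differ from this, so the code is an illustration of the definitions rather than a reproduction of the reported values; the function names and the example columns are hypothetical.

```python
import math


def gene_diversity(p):
    """Nei's gene diversity for a two-state locus: h = 1 - p^2 - q^2."""
    q = 1.0 - p
    return 1.0 - p * p - q * q


def shannon_index(p):
    """Shannon's information index for a two-state locus: -(p ln p + q ln q)."""
    total = 0.0
    for freq in (p, 1.0 - p):
        if freq > 0:  # a frequency of 0 contributes nothing to the sum
            total -= freq * math.log(freq)
    return total


def mean_indices(band_columns):
    """Average both indices over bands; each column is a 0/1 list over accessions."""
    freqs = [sum(col) / len(col) for col in band_columns]
    mean_h = sum(gene_diversity(p) for p in freqs) / len(freqs)
    mean_i = sum(shannon_index(p) for p in freqs) / len(freqs)
    return mean_h, mean_i


# Two invented band columns scored across four hypothetical accessions.
mean_h, mean_i = mean_indices([[1, 0, 1, 1], [1, 1, 0, 0]])
print(f"mean He = {mean_h:.4f}, mean I = {mean_i:.4f}")
```

Under this two-state treatment He cannot exceed 0.5 and I cannot exceed ln 2 (about 0.69) per band, which gives a sense of scale for the means of 0.36 and 0.53 reported here.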
In the distribution of cassava into clusters, it is important to note that most land races were clustered into group IV. This may indicate a common ancestry for the local landraces. Since IITA is a secondary centre of diversity for cassava, this helps to explain why accessions from there were found in every cluster; it reflects the movement of cassava from its centre of origin to other places within the region. Land races Nyakatanegi 3 and KME 1, together with accession 990072-A, clustered in group I and diverged from the other accessions. This may be due to common ancestry or could be an indicator of duplicates. However, similarities among accessions can also arise from convergent evolution, selection or sharing of common parentage.

Conclusion
The molecular study has shown that SSR markers can be useful in cassava breeding programmes, allowing for the identification of new cultivars as well as the assessment of genetic similarity and diversity among different genotypes. The average level of genetic diversity observed in the Kenyan landraces will benefit cassava germplasm conservation and enhancement efforts in Kenya, and contribute to the elucidation of the forces that shape genetic differentiation in this asexually propagated, allogamous crop.

Recommendations
The Kenyan landraces should be collected, conserved and maintained at the national cassava repository centre.Also, further introduction of IITA improved lines should be rationalized on the basis of their distance from the local landraces.

Figure 1. Lambda (λ) DNA standards (83-750 ng) and DNA samples isolated from 33 cassava accessions electrophoresed on 1% molecular grade agarose gel in TBE buffer (lane numbers represent, in the same order, the accessions in Table 1b).

Figure 2. SSR alleles of 19 cassava accessions amplified by primer SSRY 35. M = 100 bp ladder; lane numbers represent the accession numbers in the same order as Table 1b.

Figure 3. Dendrogram of 69 cassava accessions based on SSR analysis with seven primers, using Nei and Li (1979) genetic distance.

Table 1. Sixty nine accessions of cassava sampled from the national genebanks of KARI-Katumani and KARI-Kakamega.
* Accessions that were used in the biochemical study.

Table 2. Properties of cassava SSR loci and their primer pairs.

Table 3. Number of alleles amplified by SSR primers in test cassava germplasm.

Table 5. Distribution of 69 cassava accessions into clusters based on SSR data.