Assessment of the genetic diversity of Kenyan coconut germplasm using simple sequence repeat (SSR) markers

Genetic diversity and relationships among 48 coconut (Cocos nucifera L.) individuals collected from the coastal lowlands of Kenya were analyzed using 15 simple sequence repeat (SSR) primer pairs. Diversity parameters were calculated using Popgene software version 1.31. The gene diversity values ranged from 0.0408 (CAC68) to 0.4861 (CAC23) with a mean value of 0.2839. The polymorphic information content (PIC) values ranged from 0.0400 to 0.3680 with a mean value of 0.2348. Marker CAC23 had the highest PIC and gene diversity values in this study. Analysis of molecular variance indicated that within-population variation was 98% while variation among populations was low at 2%, suggesting that molecular variation was not defined by region of production. Cluster analysis was performed using the DARwin program version 6.0, and the 48 coconut individuals were clustered into three groups.


*Corresponding author. Email: mauriceoyoo464@gmail.com.
Author(s) agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License.

INTRODUCTION
The coconut palm (Cocos nucifera L., Arecaceae) is a crop mostly cultivated in the coastal lowlands of Kenya and plays an important role in the economy. However, Kenya has been lagging behind in technology development for product diversification and by-product utilization. Besides, coconut grows well in the coastal lowland (CL) agroecological zones, where frequent droughts have been affecting coconut yields, with most trees drying up or tipping off. The unavailability of drought-resistant varieties and poor access to quality planting materials are another major hindrance to growth and productivity in the coconut industry in Kenya (Muhammed et al., 2013). Furthermore, the slow growth and long pre-breeding period of the palm do not promote the genetic enhancement of coconut for productivity and tolerance to biotic and abiotic stresses (Rajesh et al., 2008). Current estimates put the total number of palms in the Kenyan coastal lowlands at about 4.4 million, with an average nut yield of 1.5 t ha⁻¹, while that of copra is as low as 0.45 t ha⁻¹.
Germplasm collections containing a significant amount of genetic diversity within and among coconut populations are essential for an effective crop improvement programme. Therefore, the assessment of genetic diversity within coconut populations is increasingly significant in germplasm conservation and utilization. Since morphological characterization at the coast of Kenya has shown diversity (Oyoo et al., 2015), an assessment of coconut genetic diversity in Kenya should provide sufficient scientific data for germplasm management and utilization. Diversity analysis in coconut palm has been done using morphological traits, biochemical markers and molecular markers (Rajesh et al., 2005). Morphological and biochemical markers are not desirable due to the long juvenile phase of palms, the high cost of conducting investigations, the long duration of field evaluation, the influence of environmental factors on the phenotype and the limited number of available phenotypic markers (Manimekalai et al., 2006). Molecular markers detect variation at the DNA level and are detectable at all stages of development. Since they can also cover the entire genome, DNA markers overcome most limitations of morphological and biochemical markers (Ashburner et al., 1997; Teulat et al., 2000; Upadhyay et al., 2004; Manimekalai et al., 2006). Among the available molecular marker techniques, simple sequence repeat (SSR) markers provide a good signal in evaluating genetic diversity and genetic relationships in plants. The increased number of coconut SSR markers greatly improves on the previously established genetic relationships among coconut populations.
Three types of coconut palms have been described in Kenya: the East African Tall (EAT), the Dwarf types and the hybrids. The EAT yields good quality copra and toddy but has inferior coconut water compared to the Dwarf coconut (Gethi and Malinga, 1997). Just as in Asia, the Talls are naturally cross-pollinating, vigorous growing and comparatively late flowering, and their fruits show intermediate colors of brown, green, yellow and orange among individual palms. Dwarfs, in contrast, are naturally self-pollinating types with a reduced growth habit and early flowering, and they produce a large number of medium to small, distinctly colored (green, yellow, orange or brown) fruits (Dasanayaka et al., 2009). To date, little information is available on the genetic diversity among Kenyan coconut palms. For sustainable breeding, adoption and in situ conservation, it is important to develop a strategy to use coconut diversity for socio-economic benefits (Batugal and Oliver, 2003). In the current study, 48 coconut individuals from four administrative units of the Kenyan coast were assessed using 15 SSR markers.
In Kenya, coconut is cultivated in six counties: Kwale, Kilifi, Mombasa, Tana River, Lamu and Taita Taveta. With the exception of Taita Taveta, these counties have a high concentration of coconut trees and together have a total coastline of 640 km on the Indian Ocean. Coastal Kenya extends from the seafront, which receives ample rainfall, to the far west, which receives 600 mm of rainfall per year, usually poorly distributed. Rainfall is bimodal, with the major rainy season beginning in April and lasting until July and a short rainy season in October to November (Jaetzold and Schmidt, 1983). The coastal lowlands zone is divided into five sub-zones called coastal lowlands (CL): CL2-CL6. These sub-zones are defined mainly by topography, soil and water, which influence agricultural development (Jaetzold and Schmidt, 1983).
The assessment of genetic diversity and structure of germplasm is essential for the efficient organization of breeding material. The objective of this research was to estimate the genetic variation of coconut germplasm on the Kenyan coast and to suggest how that knowledge might lay the groundwork for the genetic improvement and breeding of coconut in Kenya.

Germplasm sampling
Coconut leaf samples were collected in situ and coded as previously described by Oyoo et al. (2015). Each sampled palm was coded to reflect the county, district, division, location (sub-division), village and collection number. For example, the first palm sampled from Kilifi county, Magarini district, Magarini division, Gogoni location, Ngarite village was given the code KLF/MAG/MAG/GOG/NGA/01. Sampling was done in different agro-ecological zones (Table 1) and GPS data were recorded. The focus was on areas where the palms grown were morphologically different, where there was a marked change in altitude or cropping system, where a formidable barrier such as a mountain or a river existed, or where the local people were ethnically different (in terms of dialect) from those at the previous collection site. Priority was given to sampling from farmers' fields while avoiding duplicates. A major challenge was that farmers had no distinct local names for coconut at the coast and insisted 'nazi ni nazi tu' (coconuts are just coconuts). In addition, populations were generally a mixture of parents and progeny, so estimating the age of a population was not easy. The fields were uncultivated and received little or no input. The collected leaf samples were labeled and stored in silica gel.
The SSR primers were developed by Perera et al. (1999, 2000) and Teulat et al. (2000).
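The palm coding scheme described above can be sketched in a few lines of Python. This is only an illustration: the class and function names (`PalmCode`, `format_code`, `parse_code`) are hypothetical, the abbreviations are taken as given because the text does not state how they are derived from place names, and the two-digit zero-padded collection number is inferred from the single worked example.

```python
from typing import NamedTuple


class PalmCode(NamedTuple):
    """One sampled palm, identified by the six fields named in the text."""
    county: str
    district: str
    division: str
    location: str
    village: str
    number: int


def format_code(code: PalmCode) -> str:
    """Join the five place abbreviations and the collection number with '/'."""
    places = [code.county, code.district, code.division, code.location, code.village]
    return "/".join(p.upper() for p in places) + f"/{code.number:02d}"


def parse_code(text: str) -> PalmCode:
    """Split a code back into its fields; strict about the six-field layout."""
    parts = text.strip().split("/")
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {text!r}")
    *places, number = parts
    return PalmCode(*places, int(number))


example = PalmCode("KLF", "MAG", "MAG", "GOG", "NGA", 1)
print(format_code(example))
```

A strict parser of this kind rejects any code that does not have exactly six fields, which is a deliberate choice for catching transcription errors in sample labels.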

DNA extraction
DNA was extracted from dry leaf samples using a CTAB protocol modified from the Doyle and Doyle (1990) method. The modification involved omission of the ammonium acetate step and a longer DNA precipitation time of 12 h. The quality and quantity of the extracted DNA samples were ascertained by running them on a 1% agarose gel and by using a NanoDrop spectrophotometer. The DNA samples were then diluted to a working concentration of 30 ng/µl. The primer sequences and associated information are given in Table 2.
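The dilution to a 30 ng/µl working concentration follows the usual C1·V1 = C2·V2 relation. The sketch below is a minimal helper under stated assumptions: the function name, the 100 µl final volume and the 150 ng/µl stock reading are hypothetical values chosen for illustration; only the 30 ng/µl target comes from the text.

```python
def dilution_volumes(stock_ng_per_ul: float,
                     target_ng_per_ul: float = 30.0,
                     final_volume_ul: float = 100.0) -> tuple[float, float]:
    """Return (stock volume, diluent volume) in µl using C1*V1 = C2*V2."""
    if min(stock_ng_per_ul, target_ng_per_ul, final_volume_ul) <= 0:
        raise ValueError("concentrations and volume must be positive")
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical spectrophotometer reading of 150 ng/µl for one sample.
stock_ul, diluent_ul = dilution_volumes(150.0)
print(f"{stock_ul:.1f} µl stock + {diluent_ul:.1f} µl diluent")
```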

SSR markers amplification and electrophoresis
The PCR amplification was performed in a 10 µl reaction mix consisting of 5 U DreamTaq polymerase (Thermo Scientific Corp, Lithuania), 6x DreamTaq buffer (Thermo Scientific Corp, Lithuania), 2.5 mM of each dNTP (Bioneer Corp, Republic of Korea), MgCl2, 5 µM of each primer (Inqaba Biotec, S.A) and 30 ng DNA template, in an Applied Biosystems 2720 thermocycler (Life Technologies Holdings Pte Ltd, Singapore). The PCR program consisted of initial denaturation at 94°C for 5 min, followed by 35 cycles of denaturation at 94°C for 30 s, annealing at 54-59.7°C (depending on the primer) and extension at 72°C for 1 min, and then one cycle of final extension at 72°C for 10 min. The amplicons were mixed with 6x Orange DNA loading dye (Thermo Scientific Corp, Lithuania) and separated on 2% agarose gels (Duchefa, Netherlands) stained with ethidium bromide (Invitrogen Corp, U.S.A) in 0.5x TBE buffer. The separated amplicons were visualized on an Ebox-VX5 gel visualization system (Vilber Lourmat Inc, France). The alleles were scored as absent or present based on the size of the amplified product, using a 100 bp O'GeneRuler ready-to-use DNA ladder (Thermo Scientific Corp, Lithuania).
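The cycling conditions above can be written down as a small data structure, which makes the step count and total hold time easy to check. This is a sketch only: the `Step` class and function names are hypothetical, and the annealing duration is not stated in the text, so the 30 s default is an assumption. Ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def pcr_program(anneal_temp_c: float, anneal_seconds: int = 30, cycles: int = 35) -> list[Step]:
    """Build the program from the text; anneal_seconds is an ASSUMED value."""
    if not 54.0 <= anneal_temp_c <= 59.7:
        raise ValueError("annealing temperature outside the 54-59.7 °C range in the text")
    steps = [Step("initial denaturation", 94.0, 5 * 60)]
    for _ in range(cycles):
        steps.append(Step("denaturation", 94.0, 30))
        steps.append(Step("annealing", anneal_temp_c, anneal_seconds))
        steps.append(Step("extension", 72.0, 60))
    steps.append(Step("final extension", 72.0, 10 * 60))
    return steps


def hold_time_minutes(steps: list[Step]) -> float:
    """Sum of programmed hold times, excluding ramping."""
    return sum(s.seconds for s in steps) / 60


program = pcr_program(anneal_temp_c=54.0)
print(len(program), "steps,", hold_time_minutes(program), "min of hold time")
```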

Cluster and principal component analyses
Genetic dissimilarities among all the germplasm were calculated in DARwin version 6.0 (Perrier et al., 2003; Perrier and Jacquemoud-Collet, 2006) using the simple matching coefficient. The dissimilarity coefficients were then used to generate an unweighted neighbour-joining tree (Saitou and Nei, 1987) with Jaccard's similarity coefficient. Bootstrapping with 1,000 replicates was used. Principal component analyses were also performed using DARwin version 6.0.
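The two coefficients named above differ in how they treat shared absences. The following sketch computes both on presence/absence (1/0) band profiles using their textbook definitions; it is not DARwin's implementation, and the function names and the two toy profiles are hypothetical.

```python
def simple_matching_dissimilarity(a: list[int], b: list[int]) -> float:
    """Proportion of positions where the two profiles disagree (0/0 counts as a match)."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jaccard_dissimilarity(a: list[int], b: list[int]) -> float:
    """1 - shared presences / positions present in at least one profile (0/0 ignored)."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return 0.0 if either == 0 else 1.0 - both / either


def dissimilarity_matrix(profiles: list[list[int]], metric) -> list[list[float]]:
    """Symmetric matrix of pairwise dissimilarities with a zero diagonal."""
    n = len(profiles)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = metric(profiles[i], profiles[j])
    return matrix


palm_a = [1, 0, 1, 1, 0]  # toy band profiles, not data from this study
palm_b = [1, 1, 0, 1, 0]
print(simple_matching_dissimilarity(palm_a, palm_b), jaccard_dissimilarity(palm_a, palm_b))
```

Because the shared absence in the last position counts as agreement only for simple matching, the same pair of profiles is closer under simple matching than under Jaccard.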

Analysis of diversity parameters
The amplified SSR markers were scored as present (1) or absent (0) and recorded in a binary matrix as discrete variables. Markers which could not be amplified and interpreted consistently were omitted from the analysis. The diversity analysis of the 48 coconut ecotypes (from different AEZ) was conducted using Popgene version 1.31. The following parameters were estimated: number of alleles per locus (N), polymorphic information content (PIC) (Smith et al., 1997), major allele frequency (m), effective number of alleles (ne) (Kimura and Crow, 1964), Nei's (1973) gene diversity (h), Shannon's information index (I) (Lewontin, 1972) and the percentage of polymorphic loci. Other diversity parameters measured were Ht = gene diversity in the total population, Hs = gene diversity within subpopulations, Gst = gene differentiation relative to the total population, which in this case is equivalent to Wright's F statistic (Fst) (Nei, 1978), and Nm = estimate of gene flow from Gst. Analysis of molecular variance (AMOVA) was computed using PowerMarker software version 3 (Liu and Muse, 2005). All 15 loci were polymorphic (100.00%).
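The per-locus parameters listed above can all be derived from the allele frequencies at a locus. The sketch below uses the standard expressions: ne = 1/Σp², h = 1 − Σp², I = −Σp·ln p, and a Botstein-style PIC = 1 − Σp² − ΣΣ2pᵢ²pⱼ². It is not Popgene's code, the function name is hypothetical, and the choice of PIC expression is an assumption made because it reproduces the CAC23 values reported in this paper. The input frequencies 28/48 and 20/48 are likewise an assumed reading of the reported major allele frequency of 0.5833.

```python
import math
from itertools import combinations


def locus_diversity(freqs: list[float]) -> dict[str, float]:
    """Diversity statistics for one locus from its allele frequencies."""
    if any(p < 0 for p in freqs) or abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must be non-negative and sum to 1")
    homozygosity = sum(p * p for p in freqs)           # sum of squared frequencies
    h = 1.0 - homozygosity                             # Nei's gene diversity
    ne = 1.0 / homozygosity                            # effective number of alleles
    shannon = -sum(p * math.log(p) for p in freqs if p > 0)
    pic = h - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {"major_allele_freq": max(freqs), "ne": ne, "h": h, "I": shannon, "PIC": pic}


# Assumed frequencies for a two-allele locus with major allele frequency 0.5833.
stats = locus_diversity([28 / 48, 20 / 48])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```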

Diversity parameters
The major allele frequency ranged from 0.5833 for CAC23 to 0.9792 for CAC68, with a mean of 0.8069 (Table 3). The effective number of alleles ranged from 1.0425 to 1.9459 with a mean value of 1.4470; marker CAC68 had the lowest value while marker CAC23 had the highest. The gene diversity values ranged from 0.0408 to 0.4861 with a mean value of 0.2839. Again, marker CAC68 had the lowest value while marker CAC23 had the highest, suggesting that the CAC23 locus could be useful in revealing the genetic diversity of coconut accessions in Kenya. This observation was confirmed by Shannon's information index, which was highest at locus CAC23 (I = 0.6792) and lowest at locus CAC68 (I = 0.1013). Marker CAC23 also showed the greatest PIC (0.3680), whereas marker CAC68 revealed the lowest (0.0400). An average of 34.9 bands were amplified per marker; the highest number of bands was observed for marker CAC68 and the least (29) for marker CN1H2.
The weighted averages of estimated haplotype diversities in the total population (Ht = 0.4884) and in the subpopulations (Hs = 0.4686) were highest at the CAC23 locus. The lowest values (Ht = 0.0408 and Hs = 0.0382) were recorded at the CAC68 locus (Table 4). The highest gene differentiation relative to the total population, Gst = 0.2016, was recorded for marker CAC04, and the lowest, Gst = 0.0378, was recorded at the CN11E6 locus. The highest estimate of gene flow from Gst, Nm = 28.0937, was recorded for marker CAC56, while the lowest, Nm = 1.9801, was recorded at the CAC04 locus.
Results presented in Table 5 also reveal that genetic similarity was very high among the subpopulations. Genetic identity ranged from 0.9502 (between Kilifi and Lamu counties) to 0.9891 (between Kwale and Lamu counties). Genetic distances were correspondingly very low, ranging from 0.011 (Kwale and Lamu) to 0.0511 (Lamu and Kilifi).
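The relationships between the quantities in this section can be checked with three one-line formulas. The sketch assumes the standard definitions: Gst = (Ht − Hs)/Ht, the gene flow estimate Nm = 0.5(1 − Gst)/Gst, and Nei's genetic distance D = −ln(I) for a genetic identity I. The function names are hypothetical, and whether these are exactly the expressions behind Tables 4 and 5 is an assumption, although the Nm expression does reproduce the reported CAC04 pair (Gst = 0.2016, Nm = 1.9801) to rounding.

```python
import math


def gst(ht: float, hs: float) -> float:
    """Proportion of total gene diversity that lies among subpopulations."""
    if ht <= 0 or hs < 0 or hs > ht:
        raise ValueError("require 0 <= Hs <= Ht and Ht > 0")
    return (ht - hs) / ht


def nm_from_gst(g: float) -> float:
    """Gene flow estimate from Gst (assumed form: 0.5 * (1 - Gst) / Gst)."""
    if not 0 < g <= 1:
        raise ValueError("Gst must be in (0, 1]")
    return 0.5 * (1.0 - g) / g


def nei_distance(identity: float) -> float:
    """Nei's genetic distance from genetic identity."""
    if not 0 < identity <= 1:
        raise ValueError("identity must be in (0, 1]")
    return -math.log(identity)


print(f"Nm for Gst = 0.2016: {nm_from_gst(0.2016):.4f}")
print(f"D for I = 0.9891: {nei_distance(0.9891):.4f}")
print(f"D for I = 0.9502: {nei_distance(0.9502):.4f}")
```

Under these definitions a higher genetic identity always maps to a lower genetic distance, so the county pair with the largest identity is also the pair with the smallest distance.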
Further, when the genetic variation of the coconut germplasm was partitioned by analysis of molecular variance (AMOVA), only 2% of the variation was found among the germplasm grouped by county of production (Table 6), while within-population variation accounted for 98%, further suggesting genetic redundancy of coconut in coastal Kenya with respect to where it is grown.
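The among/within partition reported by AMOVA can be illustrated with a one-way analysis on binary profiles. The sketch follows the usual sums-of-squared-distances formulation with squared Euclidean distances; it is not PowerMarker's implementation, it omits permutation testing, and the function name and toy data are hypothetical.

```python
def amova(profiles: list[list[int]], groups: list[str]) -> dict[str, float]:
    """One-way AMOVA: variance among groups, within groups, and Phi_ST."""
    n = len(profiles)
    labels = sorted(set(groups))
    g = len(labels)
    if len(groups) != n or g < 2 or n <= g:
        raise ValueError("need one label per profile, >= 2 groups and n > number of groups")

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def ssd(indices):
        # Sum of squared distances over all pairs, divided by the group size.
        total = sum(sq_dist(profiles[i], profiles[j])
                    for pos, i in enumerate(indices) for j in indices[pos + 1:])
        return total / len(indices)

    members = {lab: [i for i, lab_i in enumerate(groups) if lab_i == lab] for lab in labels}
    ssd_total = ssd(list(range(n)))
    ssd_within = sum(ssd(idx) for idx in members.values())
    msd_among = (ssd_total - ssd_within) / (g - 1)
    msd_within = ssd_within / (n - g)
    # Average group size corrected for unequal sampling.
    n0 = (n - sum(len(idx) ** 2 for idx in members.values()) / n) / (g - 1)
    var_among = (msd_among - msd_within) / n0
    total_var = var_among + msd_within
    phi_st = var_among / total_var if total_var else float("nan")
    return {"var_among": var_among, "var_within": msd_within, "phi_st": phi_st}


# Toy data: two groups of two palms scored at two loci (not data from this study).
result = amova([[0, 0], [0, 1], [1, 1], [1, 0]], ["A", "A", "B", "B"])
print({k: round(v, 4) for k, v in result.items()})
```

In this formulation the percentage of variation within groups is var_within divided by the sum of the two components, which is the kind of figure reported in Table 6.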

Cluster analysis
The dendrogram was constructed from the genetic dissimilarities among all the individuals using DARwin 6. The 48 germplasm were clustered into three distinct groups (Figure 1). Groups II and III were further [...] KLF/MAG/MAG/NGA/02 and KLF/MAL/MAL/GED/CHA/03. Group II was composed of germplasm sampled from Lamu county, a few materials from Kwale and three from Kilifi. Only one material from Tana River, TR/TD/KIP/KIP/WAK/34, clustered in this group, while the rest of the materials from this county clustered in group III. Group III consisted of germplasm from all four counties; however, samples from Kilifi formed the majority, with up to six samples clustering from node 76 to 80. Clustering failed to group the germplasm solely on the basis of origin.
Principal coordinate analysis (PCoA) (Figure 2) was carried out to determine the genetic relationships among the coconut germplasm studied. The first three axes accounted for the majority of the total cumulative variation (61.64%) (Table 7), with the three axes accounting for 37.61, 14.02 and 10.02%, respectively. The cumulative percent variation within individual counties was higher than that of the overall trial. It was highest in Tana River (97.88%), where it was explained by only two axes, followed by Lamu (91.15%), Kwale (90.51%) and Kilifi (90.19%), in that order. These results are corroborated by Figure 2, which shows that germplasm from Tana River did not cluster around each other.
Coconut is an important tropical economic crop, but not many studies have been conducted on its diversity based on molecular markers. Akpan (1994) and Sugimura et al. (1997) indicated that, due to a lack of available molecular markers, previous assessment and characterization of coconut germplasm relied mostly on morphological and agronomic traits. However, these traits do not generally provide an accurate measure of genetic diversity, because many show complex inheritance and are easily influenced by the external environment. To mitigate this problem, Rivera et al. (1999) described the development and characterization of microsatellite markers for Cocos nucifera using genomic-SSR hybrid screening. Recently, Xiao et al. (2013) used computational methods to utilize sequence information from next-generation sequencing, as described by Fan et al. (2013), to develop SSR markers.
A UPGMA dendrogram was drawn to visualize relationships among the 48 coconut individuals studied. The dendrogram clearly illustrated the population structure of the tested varieties. It showed the presence of subpopulations in all the clusters and different levels of similarity between the localities. It however failed to cluster the germplasm by the counties where they are grown, as would have been expected. This could be a result of farmers exchanging planting materials within coastal Kenya. As described by Aremu et al. (2007), principal component analysis (PCA) reveals the pattern of character variation among genotypes. In this study, PCA failed to cluster coconut materials based on the counties where they are grown. This could be explained by the low coefficient of variation realized in coconut in coastal Kenya, probably reflecting the high level of genetic similarity and very low genetic distances shown in Table 5. This conclusion could also be deduced from the analysis of molecular variance (Table 6), which suggested that more variation existed within populations than among groups based on the areas where they are produced.
PIC provides an estimate of the discriminatory power of a marker by taking into account not only the number of alleles at a locus but also their relative frequencies. Table 3 shows that loci CAC23 (0.368), CAC10 (0.3374), CCZ01 (0.3639) and CAC04 (0.3457) revealed higher PIC levels than the other loci, indicating that these primers are suitable for detecting the genetic diversity of coconut accessions on the Kenyan coast. These markers also revealed a higher weighted average of estimated haplotype diversities in the total population (Ht). The Ht values for these markers as compared to others are relevant in view of the fact that there are only two alleles per locus. Furthermore, SSR analysis showed that the mean gene differentiation (Gst) of coconut in Kenya was 0.0829, suggesting that differentiation among the subpopulations was low. Because there are only two alleles per locus in this study, Gst is identical to Wright's F statistic, Fst. Gst is independent of gene diversity within subpopulations and can be used to compare differentiation in different organisms. For this reason, Hs could be very small while the absolute Gst remains small. Genetic analysis (Table 5) also revealed that genetic similarities were very high while genetic distances were very low. This is an indication that coconuts within a region were not genetically diverse from those of the other regions. Coconut germplasm were sampled from random mating groups or individuals; thus gene diversity may be replaced with heterozygosity, while gene identity may be referred to as homozygosity.

Conclusion
Coconut sampled from different ecological zones failed to cluster by the county or AEZ from which they were sampled. Molecular variability within populations was high while variability among populations was low. This is an indication of the narrow genetic diversity in Kenya. A coconut breeding scheme should therefore be initiated to enhance variability. Such a scheme should include introductions from other areas while conserving the Kenyan coconut germplasm for future crop improvement.

Figure 1. Dendrogram based on genetic dissimilarities between 48 individuals sampled at the Kenyan coast. Numbers represent bootstrap confidence limits for 1000 replicates.

Figure 2. PCoA of axes 1 and 2 based on dissimilarity of 15 loci across 48 coconut individuals from different counties.

Table 2. List of coconut-specific microsatellite primer pairs with their forward and reverse sequences.

Table 3. Diversity parameters for 15 SSR loci used to analyze genetic diversity of coconut germplasm in Kenya.
m* = Major allele frequency; *ne = Effective number of alleles.

Table 4. Nei's (1978) analysis of gene diversity in subdivided and total populations.
Ht = gene diversity in the total population; Hs = gene diversity within subpopulations; Gst = gene differentiation relative to the total population; Nm = estimate of gene flow from Gst.

Table 6. Analysis of molecular variance (AMOVA) for 48 coconut individuals sampled from four counties of the coastal lowlands of Kenya.

Table 7. Eigenvalue and percentage of total variation accounted for by the first three principal component axes.