Genetic diversity of blackberry (Rubus subgenus Rubus Watson) in selected counties in Kenya using simple sequence repeat (SSR) markers

Genetic diversity of blackberry (Rubus subgenus Rubus Watson) is essential for efficient breeding and improvement of its pomological traits and yield. In this research, simple sequence repeats (SSRs) were used to determine the genetic diversity of 90 blackberry accessions collected from six different counties in Kenya. From 11 SSR markers used to genotype the blackberry accessions, a total of 127 alleles were generated. The average number of alleles (A) per locus was 4.00 while the expected heterozygosity (HE) of the SSR loci varied between 0.34 and 0.50, with a mean of 0.467. Polymorphism information content (PIC) values ranged from 0.357 to 0.753 with a mean of 0.520. The HE of the blackberry accessions was higher than the observed heterozygosity (HO), at 0.75 and 0.64, respectively. Analysis of molecular variance (AMOVA) revealed 95% variability within accessions and 5% (P<0.01) among accessions. Cluster analysis using Jaccard's similarity coefficient grouped the accessions into three classes: I, II and III, consisting of 31, 52 and 7 accessions, respectively. The clustering was random and did not group the accessions according to their geographical origin, indicating that accessions found in Kenya are closely related. This study detected considerable levels of genetic diversity within the analyzed accessions, which could be exploited in a blackberry breeding program.


INTRODUCTION
Genetic diversity of plant species is important to their improvement. It provides a starting point for accruing the benefits of genomics research, counteracting genetic erosion and understanding evolutionary relationships, which in turn informs the design of genetic conservation and breeding strategies (Mason et al., 2015; Jacob et al., 2017). As such, knowledge of genetic diversity is vital for incorporating informed breeding methods into crop breeding operations and, through them, for improving plant genetic resources. Conventional breeding of blackberry (Rubus subgenus Rubus Watson) is expensive, time-consuming and labour intensive. Advances in molecular techniques would improve the efficiency, accuracy and cost of breeding this fruit crop. There is a need to use DNA information, simulate the available breeding utilities, identify efficient application schemes, have access to effective services in DNA-based diagnostics and integrate DNA information into breeding operations and decisions (Brennan et al., 2014; Peace, 2017). Genetic diversity based on DNA information has greatly improved breeding of crops by identifying relatedness and phylogeny, by unambiguously ascertaining germplasm identity, and by verifying and deducing paternity/parentage, pedigree and distant ancestry (Alice et al., 1997; Ru et al., 2015).
Blackberry has its centre of origin in Eurasia and Northern America and is widely present as wild types in Kenya (Clark et al., 2007). However, their conservation and breeding status is under threat due to deforestation. The cultivars in Kenya are selections from South Africa and North America. Ploidy levels vary among wild and cultivated blackberry genotypes, ranging from 2n=2x=14 to 2n=18x=126 and including odd-ploids and aneuploids (Meng and Finn, 1999, 2000). The wild types are the sources of genetic diversity. They also act as potential sources of breeding materials for blackberry breeding programs, although sometimes they act as sources of natural pests and predators that affect the blackberry crop (Graham et al., 1997). The plant introductions (PIs), on the other hand, influence the genetic diversity of natural populations by way of gene loss and transfer by pollen.
Blackberry is rich in antioxidants, flavonoids and phenolic compounds and is considered anticarcinogenic against oral, oesophageal and colon cancers (Ames et al., 1993; Moyer et al., 2002; Bowen et al., 2010; Overall et al., 2017). These beneficial health effects are associated with its antioxidant and anti-inflammatory properties and chemopreventive phytochemicals such as flavonols, phenolic acids, ellagic acid, vitamins C and E, folic acid and β-sitosterol (Tulio et al., 2008). There is growing interest in including the fruit crop in diets due to its pharmacological properties and health benefits.
Stafne and Clark (2005) conducted a study on the relatedness of North American blackberry species, using the coefficient of relationship to determine the genetic similarity (GS) of these cultivars based on pedigree analysis, and detected coefficients of relationship of 0.00 to 0.74. The apparently high levels of maximum potential similarity and coefficient of relationship in that research were attributed to higher levels of hybridization in the released cultivars. Most of the studies on genetic diversity of Rubus species have been done in raspberry: Rubus idaeus (Parent et al., 1993; Graham and McNichol, 1995; Graham et al., 1997), Rubus occidentalis (Parent and Page, 1998), Rubus alceifolius (Amsellem et al., 2000), hybrids of Rubus idaeus and Rubus caesius (Alice, 1997), R. occidentalis (Dossett et al., 2012) and Rubus buergeri (Miyashita et al., 2015). These studies used Random Amplified Polymorphic DNA (RAPD), Restriction Fragment Length Polymorphism (RFLP), Sequence Characterized Amplified Region (SCAR), Internal Transcribed Spacer (ITS) and Simple Sequence Repeat (SSR) markers. The use of markers has made it possible to confirm the phylogeny of Rubus and its hybrids and to understand their evolution (Alice, 2002). As a result, there has been increased interest in using molecular markers to facilitate blackberry breeding. Multiplexed DNA fingerprinting, characterization of germplasm, development of primers, genetic maps and blackberry expressed sequence tag (EST) libraries, marker-assisted seedling selection, and Quantitative Trait Loci (QTL) mapping have been used in different DNA-based studies in blackberry (Lewers et al., 2008; Castillo et al., 2010; Castro et al., 2013; Bassil et al., 2016).
Challenges hindering breeding of blackberry include lack of information on the genetic diversity and/or population structure within the present breeding programs and repositories, difficulty in identifying duplicate accessions in germplasm repositories, and difficulty in searching for promising heterotic groups and selecting core collections. There are no improved cultivars, as most blackberry types in Kenya are wild except for two introductions (CV/RBN/01 and CV/BYN/01). The objective of this study was to determine the genetic diversity of wild blackberry types in six counties in Kenya and two plant introductions (PIs) using SSR markers. The study aimed to resolve the taxonomic uncertainty of duplicate accessions in in-situ and ex-situ blackberry gene banks and to document the extent of genetic diversity of the local blackberry.

Plant/Collection of germplasm
The blackberry samples taken were coded to reflect the county, district, division, subdivision, village and the collection number (Supplementary Information 1). Since most of the blackberries collected were either wild or named by farmers or by the communities at different times, it is difficult to establish their genuine distinct names and pedigree. Blackberry types from different nurseries were treated as independent cultivars in this research irrespective of their phenological stages.

Genomic DNA isolation and quantification
Total nucleic acid (DNA) was extracted from each dry young leaf using a modified CTAB protocol (Doyle and Doyle, 1990) for all the 90 accessions. The modification involved omission of the ammonium acetate step. An overnight DNA precipitation time of 12 h was preferred since blackberry leaf samples have a lot of phenolic compounds. The concentration and purity of the extracted DNA samples were ascertained by using a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific Inc., USA) and by resolving on a 1% agarose gel (1 g of agarose powder in 100 ml of sodium borate buffer). Samples with poor quality DNA were re-extracted. The DNA samples were then diluted to a working concentration of 50 ng/µl. All samples exhibited good quality and quantity of DNA for PCR amplification.
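The dilution to the 50 ng/µl working concentration follows the usual C1V1 = C2V2 relation. The following is a minimal Python sketch of that arithmetic only; the stock reading of 320 ng/µl and the 100 µl final volume are hypothetical illustration values, not measurements from this study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=50.0, final_volume_ul=100.0):
    """Return (stock volume, diluent volume) in µl using C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the working concentration")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume


# Hypothetical NanoDrop reading of 320 ng/µl (not a value from the study).
stock_ul, diluent_ul = dilution_volumes(320.0)
print(f"{stock_ul:.3f} µl DNA stock + {diluent_ul:.3f} µl diluent")
```

A sample whose stock concentration is below 50 ng/µl cannot be diluted to the working concentration, which is why the sketch raises an error in that case.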

PCR amplification and microsatellites analysis
Eleven out of thirteen available blackberry SSR primer sets previously described by Castillo et al. (2010) were selected and used to screen 90 blackberry accessions in this study. Primer RhM031 was uninformative, while RiG001 failed to amplify any blackberry and hybrid accessions and was used to identify raspberry genotypes. Consequently, both were excluded from the SSR data analysis.
The SSR primer pairs and sequences are shown in Table 2. The extracted DNA was then subjected to polymerase chain reaction (PCR). PCR amplifications were performed in a 10 µl volume consisting of 1.4 µl of 10× PCR buffer (Thermo Fisher Scientific Inc., USA), 0.1 µl of Taq polymerase (Thermo Fisher Scientific Inc., USA), 0.8 µl each of 10 pmol forward and reverse primers (Inqaba Biotech, S.A), 0.60 µl of 25 mM MgCl2, 4.3 µl of double distilled de-ionized water (ddH2O) and 2 µl of genomic DNA. Amplification was performed in an Applied Biosystems 2720 thermocycler (Life Technologies Holdings Pte Ltd, Singapore) under the following conditions: initial denaturation at 94°C for 5 min; 35 cycles of denaturation at 94°C for 30 s, annealing at 50 to 62°C (Ta, depending on the sequence of the primer) for 30 s and extension at 72°C for 2 min; followed by a terminal extension at 72°C for 10 min.
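The per-reaction volumes listed above can be checked against the stated 10 µl total and scaled up for a batch of reactions. The sketch below is illustrative only: preparing a master mix without template DNA and adding a 10% pipetting overage are assumptions, not steps reported in this study.

```python
# Per-reaction volumes (µl) as listed in the text.
PER_REACTION_UL = {
    "10x PCR buffer": 1.4,
    "Taq polymerase": 0.1,
    "forward primer (10 pmol)": 0.8,
    "reverse primer (10 pmol)": 0.8,
    "25 mM MgCl2": 0.6,
    "ddH2O": 4.3,
    "genomic DNA": 2.0,
}


def master_mix(n_reactions, overage=0.10):
    """Scale every component except template DNA.

    `overage` is an assumed pipetting allowance, not a value from the study.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in PER_REACTION_UL.items()
        if name != "genomic DNA"
    }


total_ul = sum(PER_REACTION_UL.values())
mix_90 = master_mix(90)  # one reaction per accession for a single primer pair

print(f"total per reaction: {total_ul:.1f} µl")
for name, volume in mix_90.items():
    print(f"{name}: {volume} µl")
```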

Gel electrophoresis of PCR products
The PCR products were mixed with 6× Orange DNA loading dye (Thermo Scientific Corp, Lithuania) and separated on 3% agarose gels (Duchefa Biochemie B.V., The Netherlands) stained with 3 µl of ethidium bromide (Invitrogen Corp, U.S.A) in a 1× Sodium Borate (SB) buffer at 60 V and a current of 400 mA for 2 h. The separated amplicons were visualized under an Ebox-VX5 gel visualization system (Vilber Lourmat Inc, France).
The alleles were scored as absent or present based on the size of the amplified product using a 100 bp O'GeneRuler ready-to-use DNA Ladder (Thermo Fisher Scientific Inc., USA).
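Presence/absence scoring of this kind can be illustrated with a short Python sketch. The allele sizes, the observed band sizes and the 5 bp matching tolerance below are hypothetical values chosen for illustration; the study scored bands visually against the ladder and does not report a tolerance.

```python
def score_alleles(observed_sizes_bp, allele_sizes_bp, tolerance_bp=5):
    """Return one 1/0 score per reference allele.

    A reference allele is scored present (1) when any observed band lies
    within `tolerance_bp` of its size; the tolerance is an assumed value.
    """
    return [
        1 if any(abs(band - allele) <= tolerance_bp for band in observed_sizes_bp) else 0
        for allele in allele_sizes_bp
    ]


# Hypothetical reference alleles for one SSR locus and one accession's bands.
reference_alleles = [180, 200, 220]
accession_bands = [182, 219]

scores = score_alleles(accession_bands, reference_alleles)
print(scores)
```

Each position in the resulting vector corresponds to one allele, so stacking the vectors for all accessions gives a binary matrix suitable for similarity analysis.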

Analysis of microsatellite marker data
Molecular data were recorded in binary fashion for the SSR marker loci amplified. Individuals were scored for the presence (1) or absence (0) of each allele, which was treated as a separate locus. The similarity and/or dissimilarity between individual accessions was calculated as a proportion of shared alleles in DARwin version 6.0 (Perrier et al., 2003; Perrier and Jacquemoud-Collet, 2006) using the simple matching coefficient. The dissimilarity coefficients were then used to generate an unweighted neighbour-joining tree (Saitou and Nei, 1987) with Jaccard's similarity coefficient and 1,000 bootstrap replicates in DARwin 6.0. PowerMarker version 3.25 (Liu and Muse, 2005) was used to calculate major allele frequencies (M) and polymorphism information content (PIC) (Botstein et al., 1980) of the SSR primer sets. The genetic distance matrices were also computed in PowerMarker with the proportion of shared alleles distance, Dsa (Chakraborty and Jin, 1993):
$$D_{sa} = 1 - \frac{1}{m}\sum_{j=1}^{m}\sum_{i=1}^{a_j}\min(p_{ij}, q_{ij}) \qquad (1)$$
where pij and qij are the frequencies of the ith allele at the jth locus, m is the number of loci examined and aj is the number of alleles at the jth locus. PowerMarker 3.25 (Liu and Muse, 2005) was also used to calculate deviations from Hardy-Weinberg equilibrium (HWE), effective number of alleles (AE) (Kimura and Crow, 1964), observed heterozygosity (HO) and expected heterozygosity (HE) (Nei, 1973), inbreeding coefficient (FIS) and pairwise genetic distance between populations (FST) (Nei, 1978). GenAlEx 6.5 (Peakall and Smouse, 2012) was used to calculate Shannon's summary statistics and diversity index (I) (Lewontin, 1972) and to perform the analysis of molecular variance (AMOVA).
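The proportion of shared alleles distance can be sketched in a few lines of Python. This assumes the form in which, at each locus, the smaller of the two frequencies of every allele is summed, the sums are averaged over the m loci, and the average is subtracted from one. The function name and the example frequencies are hypothetical, and PowerMarker's own implementation may differ in detail.

```python
def shared_allele_distance(p, q):
    """Shared-allele distance between two sets of allele frequencies.

    `p` and `q` each hold one list of allele frequencies per locus, with
    alleles listed in the same order in both.
    """
    if len(p) != len(q) or not p:
        raise ValueError("both inputs need the same, non-zero number of loci")
    shared = 0.0
    for p_locus, q_locus in zip(p, q):
        if len(p_locus) != len(q_locus):
            raise ValueError("allele lists must match at every locus")
        shared += sum(min(p_i, q_i) for p_i, q_i in zip(p_locus, q_locus))
    return 1.0 - shared / len(p)


# Hypothetical two-locus example: identical at locus 1, fully distinct at locus 2.
pop_a = [[0.5, 0.5], [1.0, 0.0]]
pop_b = [[0.5, 0.5], [0.0, 1.0]]

d_ab = shared_allele_distance(pop_a, pop_b)
d_aa = shared_allele_distance(pop_a, pop_a)
print(d_ab, d_aa)
```

The distance is 0 when the two frequency sets are identical and 1 when no allele is shared at any locus.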

DNA quantification, gel electrophoresis and analysis
All samples extracted exhibited good quality and quantity of DNA for PCR amplification (Plate 1). This was ascertained using a NanoDrop spectrophotometer, with the absorbance ratio at 260/280 nm falling between 1.8 and 2.0. Contamination by either proteins or phenolic compounds was minimal in this study.

Diversity indices of SSR loci in blackberry accessions
The effective number of alleles (AE) per microsatellite locus varied from 1.514 (RhM021) to 1.994 (RhM003 and RiM019) with an average of 1.882 (Table 3). Shannon's diversity index (I) ranged from 0.523 (RhM021) to 0.692 (RiM017 and RhM043), with an average of 0.656 across the primer sets. The average observed and expected heterozygosity values were 0.542 and 0.567, respectively. The lowest HO was 0.189 (RhM019) while the highest was 0.878 (RiM036 and RhM003). Among the blackberry accessions, HE ranged from 0.34 (RhM021) to 0.502 (RhM043), while the inbreeding coefficient (FIS) ranged from -0.863 (RhM003) to 0.711 (RiM019). The pairwise genetic distances (FST) ranged from 0.003 (RhM003) to the highest value, which was detected by RiM017. This study revealed moderate (FST > 0.05) to significant (FST ≥ 0.15) differentiation within the blackberry accessions (Table 4). Additionally, high PIC values were observed for markers RiM019, RiM017, RhM043, RiM015 and RhM001 (Table 3). These markers also had the highest gene diversity indices.
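For reference, the indices reported here can be computed from allele frequencies using their standard textbook definitions. The Python sketch below uses those definitions without sample-size corrections, which is an assumption: PowerMarker and GenAlEx may apply corrections, so the sketch is not expected to reproduce Tables 3 and 4 exactly. The function name and the example input are hypothetical.

```python
import math
from itertools import combinations


def locus_indices(freqs, observed_het=None):
    """Diversity indices for one locus from its allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    sum_sq = sum(p * p for p in freqs)
    he = 1.0 - sum_sq                  # expected heterozygosity (uncorrected)
    ae = 1.0 / sum_sq                  # effective number of alleles
    shannon = -sum(p * math.log(p) for p in freqs if p > 0)
    # PIC: HE minus the sum over allele pairs of 2 * p_i^2 * p_j^2
    pic = he - sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    result = {"AE": ae, "HE": he, "I": shannon, "PIC": pic}
    if observed_het is not None and he > 0:
        # Negative FIS indicates heterozygote excess, positive a deficiency.
        result["FIS"] = 1.0 - observed_het / he
    return result


# Hypothetical biallelic locus with equal frequencies and HO = 0.25.
example = locus_indices([0.5, 0.5], observed_het=0.25)
for name, value in example.items():
    print(f"{name} = {value:.3f}")
```

For a locus with two equally frequent alleles these definitions give AE = 2 and I = ln 2 ≈ 0.693, which are the upper limits for a two-state locus; the largest AE (1.994) and I (0.692) values reported above lie close to them.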

Diversity Indices of blackberry accessions
The effective number of alleles per locus (AE) varied from 1.646 in accession NAK/NJR/NES/NES/TRT/01 to 7.563 in accession KCO/CBA/CHY/CHY/UNL/02 with a mean of 1.438 (Table 4). The average number of alleles per locus (A) for all blackberry populations obtained from all regions was 9.26.
Based on geographical origin, the Polymorphic Information Content (PIC) of 0.794 was observed in accessions from Republic of South Africa, while the highest PIC value of 0.631 was detected in accessions
The hierarchical subdivision of the summary of Shannon's statistics indicated that most molecular variance was within populations, accounting for 90.57% of the total genetic variation, with only 9.43% of the molecular variation attributable to the defined counties (Table 6). Only 9.43% of the molecular variance distinguished the six populations from Nakuru, Kericho, Nandi, Laikipia, Uasin Gishu and the RSA (P<0.01). The analysis of molecular variance (AMOVA) for blackberry partitioned the genetic variance among and within the accessions and revealed that most (95%) of the variability was within the accessions (Table 5). The genetic variance was significant (P<0.01) among the accessions and accounted for 5% of the total variation.
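The AMOVA percentages follow directly from the estimated variance components. The short Python sketch below uses the two estimates quoted in the Discussion (0.198 among and 4.12 within accessions); the function name is hypothetical and the sketch only illustrates the partitioning arithmetic, not the permutation test behind the P value.

```python
def variance_percentages(components):
    """Express each variance component as a percentage of their total."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {source: 100.0 * value / total for source, value in components.items()}


# Estimated variance components as quoted in the Discussion of this study.
estimated = {"among accessions": 0.198, "within accessions": 4.12}

shares = variance_percentages(estimated)
for source, pct in shares.items():
    print(f"{source}: {pct:.1f}%")
```

Rounded to whole numbers, the two shares correspond to the 5% and 95% reported above.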

Cluster analysis and population structure
A UPGMA dendrogram generated from SSR marker information using Jaccard's similarity coefficient showed the phylogenetic relationships among the 90 blackberry accessions. The phylogenetic tree was divided into 3 distinct clusters. However, the cluster analysis failed to clearly cluster the accessions based on their regions of collection (Figure 5). The results showed that accessions collected from different counties, especially those from Kericho County (group II), clustered together. Principal Coordinate Analysis (PCoA) confirmed the results of the cluster analysis and showed that most accessions overlapped (Figures 2 and 3). The first three axes accounted for 55.48% of the total variation, with the axes explaining 30.04, 13.53 and 11.91% of the variation, respectively, at a 95% confidence interval (Figure 4).
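Both coefficients named in the Methods operate on the binary allele profiles, and they differ in how joint absences are treated. The Python sketch below computes the two dissimilarities for three hypothetical profiles; the accession labels and scores are invented for illustration, and the tree building and PCoA steps performed in DARwin are not reproduced.

```python
def jaccard_dissimilarity(x, y):
    """1 - (alleles present in both) / (alleles present in at least one)."""
    both = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(x, y) if a == 1 or b == 1)
    if either == 0:
        return 0.0  # convention chosen here for two all-absent profiles
    return 1.0 - both / either


def simple_matching_dissimilarity(x, y):
    """Proportion of positions that differ; joint absences count as matches."""
    return sum(1 for a, b in zip(x, y) if a != b) / len(x)


# Hypothetical binary allele profiles (1 = allele present, 0 = absent).
profiles = {
    "ACC_A": [1, 0, 1, 1, 0],
    "ACC_B": [1, 1, 1, 0, 0],
    "ACC_C": [0, 1, 0, 0, 1],
}

names = sorted(profiles)
jaccard = {
    (m, n): jaccard_dissimilarity(profiles[m], profiles[n])
    for m in names for n in names
}
for m in names:
    print(m, [round(jaccard[(m, n)], 2) for n in names])
```

A dissimilarity matrix of this form is the usual input for hierarchical clustering such as UPGMA and for PCoA.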

DISCUSSION
The observed heterozygosity (HO) and expected heterozygosity (HE) were estimated to show the level of polymorphism and usefulness of the SSR markers used in this study (Table 3). HE ranged from 0.341 (RhM021) to 0.502 (RhM043), whilst HO ranged from 0.189 (RhM019) to 0.878 (RiM036 and RhM003). Studies independently conducted by Marulanda et al. (2007) and Castillo et al. (2010) reported HE values of 0.00 to 0.33 and 0.21 to 0.98, respectively. The HE range of 0.41 to 0.90 in the accessions revealed genetic diversity in almost all the populations of the blackberry studied, although low HO values were also observed (HO=0.27) in some accessions. Often, high HE values are observed when wild populations are grown in close proximity to cultivated populations, and this may explain the high HE values obtained in the cultivated types. The HE values obtained in this study ranged between 0.41 and 0.90 and, according to Nybom (2004), these values are within the range of long-lived perennials (HE=0.68), although some accessions may be endemic to their areas of collection and hence of limited distribution (HE=0.42), and others may be dispersed by gravity (HE=0.47). The low HO values obtained in some blackberry accessions could be a result of imbalanced population sampling among the different regions of germplasm collection in both wild and cultivated blackberries.
The average FST value of 0.057 obtained in this study shows the presence of heterozygotes in the blackberry accessions. The pairwise genetic distance between populations (Nei, 1978) ranged from 0.003 (RhM003) to 0.171 (RiM017) based on the SSR markers (Table 4). The pairwise genetic variation (FST) generated from this study indicates moderate (FST > 0.05) to significant (FST ≥ 0.15) differentiation within the blackberry accessions or, in this case, between and within wild and cultivated blackberry types. The multilocus estimate of genetic distance (FST) based on SSR loci also revealed that there were genetically distinct accessions, with RiM017 (FST = 0.17) and RhM011 (FST = 0.14) being the best markers for identification of admixtures. The hybridity in these accessions can be maintained if the accessions are propagated clonally.
The inbreeding coefficient, determined by Wright's (1978) fixation index (FIS), which is a measure of heterozygote deficiency or excess, ranged from -0.863 (RhM003) to 0.711 (RiM019). Only one marker (RiM019) showed some evidence of excessive inbreeding (FIS ≥ 0.5). Most of the accessions showed moderate to high inbreeding levels (Table 4). This may be explained by the reproductive and invasive nature of the blackberry genotypes. Most invasive plants are clonally propagated and are usually self-compatible, which could also lead to increased inbreeding levels and decreased variation (Amsellem et al., 2000; Liu et al., 2006). Inbreeding levels in invasive species can sometimes be synonymous with clonal propagation, where a species grows vigorously, enabling faster spread. In such cases, the molecular variation obtained in the clonal invasives can be due to characteristics other than genetic diversity. This could also imply the presence of polyploids among the accessions and a subsequent dispersal mechanism across the counties of germplasm collection. Some of the regions of collection were geographically adjacent and could be considered as one large single population. Apart from the reproductive nature (clonal), blackberry genetic variability is also determined by the effect of cross-pollination between polyploid species, which in turn influences the seed and fruit quality whilst increasing the ploidy levels and taxonomic proximity (Kollmann et al., 2000). Some outcrossing accessions were observed (those with negative FIS) (Table 5). These accessions also had the highest HE and HO indices and could be selected as parents in a breeding program, as they have the greatest genetic diversity.
AMOVA revealed significant differences (P≤0.05) in the partitioning of genetic variance within and among the accessions. SSR markers showed greater divergence within than among the accessions (Table 5). The genetic variance within the blackberry accessions was 95%, with an estimated variation of 4.12. The summary of Shannon diversity statistics also showed greater within-population than among-population genetic diversity, accounting for 90.57 and 9.43%, respectively (Table 6). This illustrates that much of the genetic diversity in blackberry accessions found in Kenya resided within the accessions. In a study to evaluate the genetic diversity of wild and cultivated Rubus species in Colombia using AFLP and SSR markers, Marulanda et al. (2007) detected a considerable within-population SSR variation of 80.4%. The AMOVA showed less estimated variation among accessions in different regions (0.198), accounting for only 5% of the total variation. The lower genetic diversity among the accessions may be attributed to a limited number and frequency of plant introductions, the method of reproduction (in this case, clonal), frequent self-fertilization and a method of dispersal that can result in redundancies, especially in geographical locations of close proximity. Blackberry is often an invasive plant in nature and, with multiple introductions, invasive plants tend to exhibit high levels of genetic diversity (Roman and Darling, 2007); thus, the low estimated variance among accessions may be due to fewer introductions into their native habitat.
A UPGMA dendrogram generated using Jaccard's similarity coefficient grouped the accessions into three clusters: I, II and III, consisting of 31, 52 and 7 accessions, respectively (Figure 5). All three clusters had sub-clusters, indicating high levels of intra-accession heterogeneity.
Group II consisted mainly of the accessions from Nakuru. The cultivated blackberry cultivars were also clustered in this group. None of the accessions grouped on the basis of area of collection. This can be explained by the diverse folk nomenclature in the collection areas, which in turn influences redundancies in germplasm distribution, by the method of dispersal of the germplasm, and by the outcrossing nature of some blackberry species. Geographically adjacent areas may have had the same types of blackberry accessions, with the discriminating differences used during germplasm sampling for molecular characterization being due to environmental effects. Additionally, the invasive nature of the blackberries could have been a major driver of these results.
Pattern recognition using PCoA failed to group accessions according to their areas of origin, suggesting high levels of uniformity across the geographical locations of germplasm collection (Figures 3 and 4). This is, however, not synonymous with higher homozygosity or a narrow genetic base for the blackberry species found in Kenya, because PCoA conducted solely on accessions from each region of collection revealed considerable genetic diversity within the accessions (Figure 2). Blackberries have a varied genetic base that includes numerous species, and there could be a selective advantage of heterozygotes, as shown by the results obtained in this study.

Conclusion
There exists considerable genetic diversity within each county among the blackberry accessions studied. However, between one county and another, low indices of diversity were observed. Findings from this research revealed that even with hybridization and inbreeding depression, there is still a wide array of genes to be explored in breeding blackberry in Kenya. The best markers for genotyping blackberry in this study were RiM017, RhM043, RiM015 and RhM001. Characterization of population structure and genetic diversity, identification of duplicate accessions in gene banks and selection of core collections of blackberry in Kenya are now achievable.

Figure 1. Map showing occurrence of blackberry germplasm in 6 counties in Kenya, as in Table 1, generated using ArcGIS.

Figure 2. PCoA of axes 1 and 2 based on dissimilarity of 11 SSR markers across 90 blackberry accessions from different regions in Kenya.

Table 2. Sequences, annealing temperatures and band sizes of the 13 primer sets used to screen 90 blackberry accessions collected from different regions in Kenya.

Table 3. Estimates of genetic diversity of SSR loci used to screen 90 blackberry accessions sampled from 6 counties in Kenya.

Table 4. Genetic diversity indices for a population of 90 accessions of blackberry studied from 6 counties in Kenya.

Table 5. Analysis of Molecular Variance (AMOVA) of the diversity of 90 blackberry accessions collected from selected counties in Kenya.

Table 6. Shannon statistics summary of the 90 blackberry accessions sampled from selected counties in Kenya.