The use of multiplexed simple sequence repeat (SSR) markers for analysis of genetic diversity in African rice genotypes

Rice is an emerging food and cash crop in Eastern Africa. Thousands of germplasm accessions have been introduced from major rice breeding centers, such as the International Rice Research Institute (IRRI) and AfricaRice, but the genetic variability among the introduced rice germplasm is unknown. Knowledge of genetic diversity would be useful in designing measures for comprehensive breeding and conservation. To address this knowledge gap, 10 highly polymorphic rice simple sequence repeat (SSR) markers were used to characterize 99 rice genotypes, determine their diversity and place them in their population groups. The SSR markers were multiplexed in 3 panels to increase their throughput. An average of 15.9 alleles per locus was detected, ranging from 6 alleles detected by marker RM7 to 30 by marker RM333. The UPGMA dendrogram based on Nei’s genetic distance cluster analysis revealed 5 genetic groups among the genotypes tested. Analysis of molecular variance indicated that 97% of the diversity observed was explained by differences in the genotypes themselves, and only 3% was due to the sources from which the genotypes were obtained. This study sets the stage for further diversity analysis of all the available germplasm lines using SSR markers to ensure effective utilization and conservation of the germplasm.


*Corresponding author. E-mail: bonnymickael@gmail.com.
Author(s) agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License.

INTRODUCTION
Rice is an important food and commercial crop in Africa, but in a country like Uganda, domestic consumption is higher than production (FAOSTAT, 2012). Uganda's rice cultivars (Oryza sativa, 2n = 24, AA) include NERICA lines, landraces, and varieties developed by the Cereals Program of Uganda's National Crops Resources Research Institute (NaCRRI). In an effort to identify the best and most diverse candidates with resistance to local stresses that may provide rapid genetic improvement and be incorporated into Uganda's national rice breeding efforts, NaCRRI breeders have imported thousands of rice lines from the International Rice Research Institute (IRRI) and the AfricaRice center and are evaluating them alongside locally adapted lines and breeding materials. These available germplasm lines are morphologically diverse but lack adequate documented information on their genetic potential and diversity. This hinders their use as a source of desired genes and their effective conservation for future use. Constraints to rice production in Uganda include both biotic and abiotic stresses, and these are continually evolving. Genetically diverse materials are therefore required to check the genetic erosion that results from the continued adoption of only a few varieties, and thereby to maintain and/or increase the region's rice production.
Genetic diversity in rice germplasm can be assessed using both observed morphological traits and molecular markers (Chakravarthi and Naravaneni, 2006). Although morphological traits have been used as markers for assessing genetic diversity in the past, they are often influenced by the environment and limited in number, and are therefore unreliable on their own (Miller et al., 1989). Several types of genetic (DNA) markers are now available, including random amplified polymorphic DNA (RAPD), amplified fragment length polymorphisms (AFLP), restriction fragment length polymorphisms (RFLP), single nucleotide polymorphisms (SNP) and simple sequence repeats (SSR), among others. The marker types differ in cost, time requirement, degree of polymorphism detected, application and principle. SSRs are the preferred markers for genetic analysis in rice and have been used in a number of studies (Akagi et al., 1997) because of their abundance and even distribution throughout the genome. As genetic markers, they are codominant, detect high levels of allelic variation, and can be multiplexed to increase throughput (Guichoux et al., 2011). They are also technically efficient, cost-effective and able to analyze both the indica and japonica rice groups (McCouch et al., 2002). Morishima and Oka (1981) divided the cultivated species of rice into two groups: indica and japonica. The domestication process was believed to have caused the differences between these two groups, including their reproductive barriers (Harushima et al., 2002). Furthermore, three morphological groups were described by ecological distribution: tropical japonica, temperate japonica, and indica (Glaszmann and Arraudeau, 1986). McCouch et al. (2002) developed a high-density rice genetic map from fully sequenced BAC (Bacterial Artificial Chromosome) and PAC (P1-derived Artificial Chromosome) clones representing 83% of the total rice genome. The map consists of 2340 validated markers, and the information was integrated into Gramene, a comparative grass genome database (http://archive.gramene.org/markers/microsat/ssr.html), to increase the density and utility of the SSR map in rice (McCouch et al., 2002).
In this study, the utility of this vast genetic resource was evaluated by identifying SSR markers reliable for use in rapid molecular characterization of the collection of rice genotypes available at NaCRRI. As a pilot project in the program, only one marker was picked per rice chromosome, based on its degree of polymorphism as reported by other rice researchers (Drame et al., 2011; Chakravarthi and Naravaneni, 2006; Ni et al., 2002). The aim was to detect genetically diverse lines and classify them into their different groups.

Plant material
Ninety-nine rice genotypes (Table 1) were used in this study. They included the local varieties Supa, Kaiso, Sindano, NERICA 1, NERICA 2 and NERICA 10, introductions from IRRI and AfricaRice, interspecific and intraspecific breeding lines from the NaCRRI Cereals Program, and collections from Ugandan farmers. The study lines were selected on the basis of their phenotypic diversity observed in the field. They were planted at NaCRRI in central Uganda (located at 0° 32' N latitude and 32° 53' E longitude, at an altitude of 1,150 m asl). One month after establishment, fresh young leaves were harvested from each genotype for DNA analysis.

DNA extraction and quantification
Genomic DNA was extracted from about 100 mg of frozen leaf tissue at the Forestry and Agricultural Biotechnology Institute (FABI) of the University of Pretoria, South Africa, using the Qiagen DNeasy plant kit (QIAGEN, 2006). Aliquots of 10 μl of the freshly extracted DNA were stained with SYBR Green, electrophoresed on a 1% agarose gel and visualized under a UV transilluminator (Bio-Rad) to assess their quality. The concentration of DNA in the samples was determined using a NanoDrop ND-1000 spectrophotometer, and the samples were then diluted to 5 ng/μl prior to use (QIAGEN, 2006).
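The dilution to a 5 ng/μl working concentration follows the usual C1V1 = C2V2 relationship. The following is a minimal sketch in Python; the stock reading of 50 ng/μl and the final volume of 100 μl are hypothetical values chosen for illustration and are not reported in this study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock volume, diluent volume) in microlitres, from C1*V1 = C2*V2."""
    if stock_ng_per_ul <= 0 or stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock concentration must be positive and at least the target")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume

# Hypothetical example: a 50 ng/ul stock diluted to 5 ng/ul in 100 ul.
stock_ul, diluent_ul = dilution_volumes(50.0, 5.0, 100.0)
```

With these assumed inputs, one part of stock is mixed with nine parts of diluent.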

Polymerase chain reaction (PCR)
The diluted DNA samples were amplified using 14 SSR markers (RM1, RM154, RM7, RM261, RM249, RM3, RM125, RM223, RM316, RM333, RM206, RM20A, RM273 and RM252) selected on the basis of their polymorphism level reported in rice by Chakravarthi and Naravaneni (2006) and Drame et al. (2011). The sequences of these molecular markers were obtained from the Gramene website (www.gramene.org/microsat/2013). The primers were synthesized by Inqaba Biotechnologies Inc (Pretoria, South Africa). Before the primers were labeled with fluorescent dyes, PCR was performed on ten randomly selected DNA samples to confirm amplification and polymorphism of the markers. The PCR amplifications were carried out separately for each marker in a 96-well DNA Engine Peltier Thermal Cycler (Bio-Rad). The 10 μl PCR mix consisted of 5 ng/μl DNA, 1x PCR buffer (Fermentas: 10 mM Tris-HCl pH 8.3, 50 mM KCl, 1.5 mM MgSO4, 1.5 mM MgCl2), 2 mM dNTPs mix, 0.2 μM each of the forward and reverse primers, 25 mM MgCl2, and 0.5 U Taq polymerase (Fermentas). The PCR program was 94°C for 5 min, followed by 35 cycles of denaturation at 94°C for 30 s, annealing at 55°C for 30 s and extension at 72°C for 30 s, with a final extension step at 72°C for 5 min. Four markers (RM1, RM261, RM3 and RM316) were not as polymorphic as expected and were therefore not labeled.

Multiplexing and genotyping
The forward primer for each of the remaining polymorphic markers was labeled at the 5' end with one of the following fluorescent dyes: 6-FAM (blue), VIC (green), NED (yellow) or PET (red) (Life Technologies, USA). Standard SSR analysis yields genotypic information at only one locus per reaction; to increase throughput, multiplex PCR was used to amplify three or four loci in the same reaction (Guichoux et al., 2011). Primers with the same annealing temperature were given different dye colors because the alleles they amplify overlap in size (Masi et al., 2003). The primers multiplexed in this way were combined in the same PCR cocktail to make one panel (Table 2).
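The dye-assignment rule within a panel can be illustrated with a short Python sketch. It assumes that the deciding criterion is overlap of allele size ranges, and it uses hypothetical marker names and size ranges, because the values in Table 2 are not reproduced here. It is a simple greedy scheme, not necessarily the procedure followed in this study.

```python
DYES = ["6-FAM", "VIC", "NED", "PET"]  # the four dyes named in the text

def ranges_overlap(a, b):
    """True if two inclusive (min_bp, max_bp) allele size ranges overlap."""
    return a[0] <= b[1] and b[0] <= a[1]

def assign_dyes(panel):
    """Greedy assignment: markers with overlapping size ranges get different dyes."""
    assigned = {}
    for marker in sorted(panel):
        taken = {assigned[m] for m in assigned
                 if ranges_overlap(panel[m], panel[marker])}
        free = [dye for dye in DYES if dye not in taken]
        if not free:
            raise ValueError(f"no free dye for {marker}; move it to another panel")
        assigned[marker] = free[0]
    return assigned

# Hypothetical markers and size ranges (bp), for illustration only.
example_panel = {"marker_A": (100, 140), "marker_B": (120, 180), "marker_C": (200, 250)}
panel_dyes = assign_dyes(example_panel)
```

In this example the two markers with overlapping ranges receive different dyes, while the third, whose alleles fall in a separate size window, can reuse a dye.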

Capillary electrophoresis and allele calling
The MicroAmp plates were loaded into a 24-capillary 3500 GeneScan genetic analyzer (Life Technologies, USA). Each 96-well plate required four injections, each lasting 45 min. The fragments generated were analyzed using GeneMapper V4.1 software (Life Technologies) to score the exact lengths of the alleles (Applied Biosystems Inc., USA). A single peak was detected at a particular SSR locus for a homozygote, and a pair of peaks for a heterozygote. The internal lane size standard GeneScan-500 LIZ was used to automatically calculate fragment sizes in the range of 35 to 500 base pairs. The size standard peaks corresponding to 35 to 90 base pairs were excluded from the analysis because of their proximity to primer peaks (Life Technologies, 2014). To reduce genotyping errors and increase the precision of allele sizing, the auto-binning method of GeneMapper V4.1 was used to create bins representing the mean size (in base pairs) of each allele category. These adjusted bins were embedded into GeneMapper V4.1 and were used to correct allele sizing (allele calling) when the data were reanalyzed (Life Technologies, 2014).
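The idea behind binning and allele calling can be sketched in Python as follows. This is not GeneMapper's auto-binning algorithm; it is a simplified stand-in in which raw fragment sizes that lie within an assumed tolerance of 0.8 bp of each other are pooled into one bin, each bin is labeled by its rounded mean size, and one or two called alleles are read as a homozygote or a heterozygote. All sizes shown are hypothetical.

```python
from statistics import mean

def build_bins(sizes, tolerance=0.8):
    """Pool sorted fragment sizes into bins; a gap larger than `tolerance` starts a new bin."""
    bins, current = [], []
    for size in sorted(sizes):
        if current and size - current[-1] > tolerance:
            bins.append(mean(current))
            current = []
        current.append(size)
    if current:
        bins.append(mean(current))
    return bins

def call_allele(size, bins):
    """Label a peak with the rounded mean size of its nearest bin."""
    return round(min(bins, key=lambda b: abs(b - size)))

def call_genotype(peaks, bins):
    """One distinct allele is read as a homozygote, two as a heterozygote."""
    alleles = sorted({call_allele(p, bins) for p in peaks})
    if len(alleles) == 1:
        return (alleles[0], alleles[0])
    if len(alleles) == 2:
        return (alleles[0], alleles[1])
    raise ValueError("more than two alleles at a diploid locus")

# Hypothetical raw sizes (bp) observed at one locus across several samples.
locus_bins = build_bins([120.1, 120.3, 119.9, 124.2, 123.8])
homozygote = call_genotype([120.2], locus_bins)
heterozygote = call_genotype([119.9, 124.1], locus_bins)
```

Rounding the bin mean is a simplification; two neighboring bins could round to the same label, which dedicated software avoids by using the known repeat motif length.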

Data analysis
The population structure of the genotypes was determined using the model-based program STRUCTURE V2.3.4 (Pritchard et al., 2000). The maximum number of populations for the simulation was set at 10 (K = 10), the length of the burn-in period at 5,000 and the number of Markov chain Monte Carlo (MCMC) repetitions after burn-in at 50,000. The admixture model of the program was used, with correlated allele frequencies. The threshold for assigning a sample to a particular population was set at 75% genomic ancestry. The genetic distance between the genotypes was calculated with DARwin V5 software using the simple matching method. A phylogenetic tree was built using the Unrooted Weighted Neighbor-Joining (UWNJ) algorithm. GenAlEx 6.5 software (Peakall and Smouse, 2006) was used to compute molecular diversity statistics, such as heterozygosity, fixation index and Shannon's information index, and to perform the analysis of molecular variance (AMOVA).
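Two of the steps above lend themselves to a brief Python sketch: a simple matching dissimilarity between two diploid multilocus genotypes, and assignment of a sample to a population at the 75% ancestry threshold. The dissimilarity is written here as one minus the average proportion of shared alleles per locus, which is an assumed form of the simple matching index for allelic data; the genotypes and ancestry coefficients are hypothetical.

```python
from collections import Counter

def simple_matching_dissimilarity(geno_a, geno_b, ploidy=2):
    """1 - mean over loci of (number of matching alleles / ploidy)."""
    if len(geno_a) != len(geno_b) or not geno_a:
        raise ValueError("genotypes must cover the same non-empty set of loci")
    shared = 0.0
    for locus_a, locus_b in zip(geno_a, geno_b):
        # Multiset intersection counts alleles present in both individuals.
        matches = sum((Counter(locus_a) & Counter(locus_b)).values())
        shared += matches / ploidy
    return 1.0 - shared / len(geno_a)

def assign_population(ancestry, threshold=0.75):
    """Return the 1-based population with ancestry >= threshold, or None if admixed."""
    best = max(range(len(ancestry)), key=lambda k: ancestry[k])
    return best + 1 if ancestry[best] >= threshold else None

# Hypothetical genotypes at two loci (allele sizes in bp).
sample_1 = [(120, 124), (98, 98)]
sample_2 = [(120, 120), (98, 102)]
distance = simple_matching_dissimilarity(sample_1, sample_2)

# Hypothetical ancestry coefficients for K = 3.
clear_member = assign_population([0.90, 0.05, 0.05])
admixed = assign_population([0.50, 0.30, 0.20])
```

Samples that fall below the threshold in every population are treated as admixed rather than forced into a group.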

Allelic diversity of the rice genotypes
The SSR markers used were able to detect diversity in the rice genotypes (Table 3). All 10 markers were polymorphic, detecting a total of 159 alleles, ranging from 6 to 30 alleles per locus with an average of 15.9 ± 0.9 alleles/locus, which indicates that this set of 10 markers revealed a high level of genetic variability throughout the germplasm. The markers were picked from different linkage groups (chromosomes) of the rice genome, and they amplified varying numbers of alleles. The numbers of alleles observed in this research for markers RM125 (7) and RM7 (6) are comparable to those noted by earlier researchers on African rice (Drame et al., 2011). For all loci, the observed heterozygosity (Ho) was lower (mean = 0.254 ± 0.043) than the expected heterozygosity (He) (mean = 0.629 ± 0.032), suggesting a clear departure from Hardy-Weinberg equilibrium. This departure is likely attributable to forces akin to inbreeding within groups (Masudaab et al., 2009) or to a lack of distinctly isolated populations in the rice germplasm available in Uganda. The Fst values were generally low, ranging from 0.047 to 0.192, indicating that low levels of genetic differentiation were present in the populations sampled from IRRI, AfricaRice and NaCRRI, as reported by Ogumbayo et al. (2005) regarding genotypes from AfricaRice.
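The per-locus statistics discussed above can be computed from diploid genotypes as in the following Python sketch. It assumes the common definitions Ho = proportion of heterozygous individuals, He = 1 - Σp², and fixation index F = 1 - Ho/He; the genotypes are hypothetical.

```python
from collections import Counter

def locus_diversity(genotypes):
    """Number of alleles, Ho, He and F for one locus from (allele, allele) tuples."""
    if not genotypes:
        raise ValueError("at least one genotype is required")
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    f = 1.0 - ho / he if he > 0 else float("nan")
    return {"Na": len(counts), "Ho": ho, "He": he, "F": f}

# Hypothetical genotypes at one locus: three homozygotes and one heterozygote.
stats = locus_diversity([(120, 120), (120, 124), (124, 124), (120, 120)])
```

Applying the same relationship to the reported means gives roughly 1 - 0.254/0.629 ≈ 0.60, although a mean of per-locus values reported by software can differ from this ratio of means.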

Diversity groups of Uganda rice genotypes
The grouping of the rice populations was determined using ancestry-based grouping and a phylogenetic tree. Using a cluster analysis of the allelic data (Figure 1), five genetic groups were identified in the rice germplasm used in the study. The classification of the genotypes agreed with their parentage, which comprised the aus, indica, tropical japonica, temperate japonica and Basmati groups of rice. A similar population structure was documented by Garris et al. (2005) and Ni et al. (2002). Garris et al. (2005) reported 5 populations that corresponded to indica, aus, aromatic, temperate japonica and tropical japonica using samples from Asia, the Americas and Africa. The statistical program used in this study (STRUCTURE V2.3.4) tends to give more populations than are biologically relevant. Therefore, to verify the identified groups, the method described by Evanno et al. (2005) was used.
This method tests the number of populations that are statistically significant in the samples when patterns of dispersal among them are not homogeneous, as in the case of allelic variability (Evanno et al., 2005). Using the rate of change in the log probability of the data between successive values of K (ΔK), where K is the number of populations, the most likely number of populations was identified as K = 5. The results, presented in Figure 2, revealed the structure of the rice genotypes as five populations consisting of admixed genotypes. This admixture is probably because rice breeding centers share germplasm freely and make crosses with whichever cultivars show desirable traits, resulting in common alleles in all populations (Ni et al., 2002). This gene flow (gene migration) causes a marked change in allelic frequency (Beringer, 2003), so that alleles of various individuals end up being present in all populations. No "island" population of rice genotypes was therefore observed in this study, indicating that the genetic base was small, as suggested by Cuevas-Pérez et al. (1992).
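The ΔK statistic can be sketched in Python as below. The sketch assumes several replicate STRUCTURE runs per K and computes ΔK(K) = |L(K+1) - 2L(K) + L(K-1)| / sd(L(K)), with L(K) taken as the mean log probability across replicates at each K. The log probabilities are hypothetical, and the number of replicate runs is an assumption, as it is not stated in this study.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Map K -> delta K from replicate log probabilities {K: [lnP run 1, lnP run 2, ...]}."""
    means = {k: mean(runs) for k, runs in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        # Delta K needs both neighbors, so the smallest and largest K are skipped.
        if k - 1 not in means or k + 1 not in means:
            continue
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # undefined when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / spread
    return result

# Hypothetical log probabilities from three replicate runs at each K.
example_runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4500.0, -4503.0, -4497.0],
    3: [-4480.0, -4470.0, -4490.0],
    4: [-4470.0, -4460.0, -4480.0],
}
dk = delta_k(example_runs)
best_k = max(dk, key=dk.get)
```

The K with the largest ΔK is taken as the most likely number of populations; in this made-up example that is K = 2.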
Grouping the genotypes according to differences in their alleles revealed one more group than was reported by Drame et al. (2011), who found four distinct genetic groups of African rice using samples of wild rice (O. glaberrima, Oryza longistaminata and Oryza barthii) and 6 cultivated genotypes. However, the present study used cultivated rice, some of which were NERICA varieties that are crosses between wild and cultivated rice, and the material was therefore expected to have broader genetic diversity than shown in the earlier study by Drame et al. (2011). The groups observed in this study were not clearly separated from each other because of admixture, which suggests that most of the genotypes were derived from crosses among the groups. Genotypes 11 (IR 83372-B-B-133-2), 3 (BGM) and 57 (NERICA 1/WAC 117) clustered alone in group 1 (Figure 1), suggesting a unique set of alleles that was not present in other groups and indicating that these genotypes could be useful as parents to broaden the diversity of rice breeding materials (Cuevas-Pérez et al., 1992).

Assigning genotypes to their centers of origin
The AMOVA results, presented in Table 4, show that most of the variation occurred among individual genotypes (56%) and within individuals (41%), together accounting for 97% of the total. Only 3% of the variation was associated with the origins of the genotypes, suggesting that the diversity available among and within the rice genotypes was sufficient to be used for improving rice productivity in Uganda. The four origins of the 99 rice genotypes used in this study were NaCRRI (61), IRRI (24), AfricaRice (12) and Ugandan farmers (2), and analysis using GenAlEx software was expected to assign the genotypes to their respective centers of origin. The results (Table 5) indicated that, with the centers of origin fixed as populations in the analysis, 53% of the genotypes were assigned to their own center of origin, while 47% were assigned to other populations (Other) at P < 0.1. The markers used were thus able to differentiate about half of the genotypes by their centers of origin. The other 47% were assigned to different origins, probably because they resulted from crosses between genotypes that could have been introduced from other centers (Guimaraes, 2009). Neither of the two varieties obtained from farmers in Uganda (Supa and Sindano) was assigned to farmers. Their origin remains unclear, though they are believed to have originated from Tanzania (Kijima et al., 2012), where Sindano is a landrace (FAO, 1987).

Conclusion
The use of molecular markers to determine genetic diversity in germplasm was demonstrated to be a feasible approach in Uganda. The SSR markers RM333, RM20A and RM206 were the most informative in this research. The rice germplasm from IRRI, AfricaRice and NaCRRI was assigned to 5 groups.

Table 1. List of 99 genotypes used in the study and their alleles detected with 10 markers.

Table 2. Three multiplex panels consisting of 10 rice SSR markers used in this study for diversity analysis. The allele size range, fluorescent dye, repeat motif, chromosome number and annealing temperature are given for each SSR marker.

Marker Annealing temp. (°C) Chromosome no. Allele size range (bp) Dye color Repeat motif
The labeled SSR markers were synthesized by Life Technologies Inc., USA.

Table 3. Estimated genetic diversity parameters obtained at each locus across 99 genotypes.

Table 4. Summary of AMOVA in the genotypes. A probability value of 0.001 was used based on permutation across the full data set.

Table 5. Assignment of population to origin or "other" categories.