Characterization of thirteen microsatellite loci from the Ghanaian antimalarial plant Cryptolepis sanguinolenta

Cryptolepis sanguinolenta (Lindl.) Schlechter (Periplocaceae) is an herbaceous plant used in traditional medicine to treat malaria. Populations of the species are diminishing due to overharvesting and lack of conservation. Codominant microsatellite markers that can be used to characterize genetic diversity and population structure are currently not available. Therefore, 75 candidate microsatellite loci were isolated from genomic sequence data and screened for their ability to reveal polymorphisms. From the 75 candidate loci, 13 polymorphic microsatellite loci were optimized for future population genetics studies. Twenty-two C. sanguinolenta samples were collected from eight different geographical locations in Ghana. The number of alleles per locus ranged from 3 to 7, with a mean of 4.4. Expected heterozygosity ranged from 0.24 to 0.77, and all but one locus deviated significantly from Hardy-Weinberg equilibrium. Mean genetic differentiation was 0.06 across all loci, indicating relatively low genetic diversity in these samples. These microsatellite loci should be useful for studying genetic diversity, gene flow and population structure, as well as for breeding and conservation of C. sanguinolenta.


INTRODUCTION
Cryptolepis sanguinolenta (Lindl.) Schlechter (Periplocaceae) is an important perennial medicinal plant species indigenous to West and Central Africa with a long history of use in the treatment of malaria (Ankrah, 2010; Tempesta, 2010; Osei-Djarbeng et al., 2015). An aqueous extraction of its roots, the portion of the plant with the highest concentration of active antimalarial ingredients, yields indoloquinoline alkaloids, primarily cryptolepine, an N-methyl derivative of the indoloquinoline compound quindoline (Dwuma-Badu, 1978; Tachie et al., 1991). The extract may also have some anticancer properties (Ansah and Mensah, 2013). The harvested roots are sold in local markets. Unsustainable, destructive harvesting of the plants for their roots has resulted in a substantial decrease in wild populations (Jansen and Schmelzer, 2010). The widespread use of the species as a medicinal plant, coupled with over-harvesting, calls for the formulation of effective management plans and for conservation through cultivation, with the ultimate goal of selecting plants that produce high levels of active ingredients for breeding (Amissah et al., 2016).
There is very limited information about the genetic diversity in populations of C. sanguinolenta. Amplified Fragment Length Polymorphism (AFLP) analysis revealed low genetic diversity among 116 plants sampled from three regions in Ghana, but did not characterize the population structure or gene flow within the collection sites (Amissah et al., 2016). Microsatellites, or simple sequence repeats (SSRs), are codominant molecular markers, in contrast to dominant AFLP markers. They are typically polymorphic (Gupta et al., 1996) and well suited for studying population genetic diversity and dynamics, and are most often employed to evaluate population structure, gene flow and inbreeding (Arnold et al., 2002; Zhang and Hewitt, 2003).
Currently, no microsatellite markers are available to genetically characterize C. sanguinolenta individuals and populations. This study generated genome sequence data for C. sanguinolenta and identified 13 polymorphic microsatellites, which were used to characterize plants from eight locations in Ghana, West Africa. The microsatellites from this study will be used to characterize a larger sample of the population and provide information on genetic diversity, population structure and gene flow of the species. The microsatellites should also be useful for a breeding program to create elite genotypes that produce high quantities of antimalarial compounds.

MATERIALS AND METHODS
A DNA library (target read length 400 bp) was constructed from a C. sanguinolenta individual that was collected in Hweehwee Oboyan, Ghana. One microgram of DNA was used for library preparation using the Ion Xpress™ Plus gDNA Fragment Library Preparation kit (Life Technologies, Carlsbad, CA). Prior to sequencing, the library was quantified with the Ion Library Quantitation Kit (Life Technologies) and diluted to 20 pM. The library was then prepared for sequencing using the Ion PGM™ Template OT2 400 Kit (Life Technologies). The library was loaded on an Ion 318™ Chip v2 (Life Technologies) and sequenced on the Ion PGM™ System (Life Technologies) using the Ion PGM™ Sequencing 400 Kit (Life Technologies). Raw sequencing reads were trimmed and then assembled de novo into contigs using the default parameters in CLC Genomics Workbench (Qiagen). Only contigs with ≥ 20X coverage were searched for microsatellite motifs (di- to hexa-nucleotide), and primers were designed with BatchPrimer3 (You et al., 2008) using default settings.
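The motif search step can be illustrated with a minimal sketch. It is not the pipeline used in this study: the minimum repeat counts in MIN_REPEATS and the function name find_ssrs are hypothetical, since the thresholds applied to the contigs are not stated here, and only perfect (uninterrupted) repeats are reported.

```python
import re

# Hypothetical minimum repeat counts per motif length (di- to hexa-nucleotide).
# These thresholds are assumptions for illustration, not values from the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif copy; \1{n-1,} demands n-1 or more extra copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit (e.g. ATAT, AAA).
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


# Made-up contig fragment containing an (AT)8 and a (GAT)5 repeat.
example = "GGC" + "AT" * 8 + "CCG" + "GAT" * 5 + "TTT"
print(find_ssrs(example))
```

Each hit gives the 0-based start position, the repeat unit and the number of copies; flanking sequence around such hits is what a primer design tool would then take as input.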
Genomic DNA was isolated from leaves of 22 C. sanguinolenta plants from 8 locations using a CTAB protocol (Porebski et al., 1997), and one DNA sample of Pityopsis graminifolia (Michx.) Small from Tennessee (USA) was used as a negative control for amplification. Amplifications were carried out in 10 µl reactions as described in Hatmaker et al. (2015) using 75 primer pairs. The thermal cycler conditions consisted of 94°C for 3 min and 35 cycles of 94°C for 40 s, 55°C for 40 s and 72°C for 30 s, followed by 72°C for 4 min. Allelic products were separated via electrophoresis using the QIAxcel Capillary Electrophoresis System (Qiagen, Valencia, California, USA) and the data were binned into allelic classes according to the protocols described by Hatmaker et al. (2015). Data were analysed for number of alleles per locus, observed heterozygosity, expected heterozygosity, probability of deviation from Hardy-Weinberg equilibrium at P<0.05, Shannon's information index, and genetic differentiation (FST) using the program GenAlEx 6.5 (Peakall and Smouse, 2012).
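For readers unfamiliar with these summary statistics, the following minimal sketch shows how the number of alleles, observed heterozygosity, expected heterozygosity and Shannon's information index can be computed for a single locus from diploid genotypes. It uses the standard textbook definitions (He = 1 - Σp², I = -Σp ln p) without any small-sample correction; it is not the GenAlEx implementation, and the function name and the allele sizes in the example are made up for illustration.

```python
from collections import Counter
from math import log


def locus_stats(genotypes):
    """Summarize one locus; genotypes is a list of (allele1, allele2) tuples."""
    n = len(genotypes)
    # Count every allele copy across all diploid individuals (2n copies in total).
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = [c / (2 * n) for c in counts.values()]
    ho = sum(1 for a, b in genotypes if a != b) / n   # observed heterozygosity
    he = 1.0 - sum(p * p for p in freqs)              # expected heterozygosity
    shannon = -sum(p * log(p) for p in freqs)         # Shannon's information index
    return {"Na": len(freqs), "Ho": ho, "He": he, "I": shannon}


# Hypothetical allele sizes (bp) for four plants at one locus.
print(locus_stats([(150, 150), (150, 154), (154, 158), (150, 150)]))
```

An observed heterozygosity well below the expected value at most loci, as reported in this study, is the kind of pattern that a Hardy-Weinberg test then evaluates formally.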

RESULTS AND DISCUSSION
A total of 4,910,467 reads (mode = 405 bp) and 1.3 Gb were generated from sequencing with a single Ion 318™ Chip (v2) on the Ion PGM™ System. After quality trimming, the reads were assembled de novo into 38,029 contigs with an average length of 1,721 bp and an N50 of 1,670 bp. Only contigs with ≥ 20X coverage (n=1,928) were screened for microsatellites, and 821 were detected. Tri- and tetra-nucleotide motifs were the most common, with 36.1% of the motifs in each of these classes. The next most common class was the dinucleotide motifs at 13%. A total of 574 primer pairs were designed, and of these, 75 were randomly selected for amplification and detection of polymorphic alleles.
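As a reminder of what the N50 statistic summarizes, the sketch below computes it from a list of contig lengths: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size. The function name and the example lengths are made up for illustration and are not the assembly reported here.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig where the cumulative length covers half the assembly.
        if 2 * running >= total:
            return length
    raise ValueError("lengths must contain at least one contig")


# Toy assembly of five contigs (bp).
toy = [2000, 3000, 4000, 5000, 6000]
print(n50(toy), sum(toy) / len(toy))
```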
Seventy of the seventy-five primer pairs amplified microsatellite loci. Nineteen of the primer pairs consistently yielded more than two products, twenty-five amplified only one allele per locus (monomorphic) for all plants included in the study, and another thirteen amplified only two alleles (either as homozygotes or heterozygotes) per locus. The remaining thirteen primer pairs were polymorphic, revealed more than two alleles per locus, and were used for analysis in this study (Table 1). DNA from P. graminifolia did not amplify with any of the primer pairs, supporting the accuracy of our results. The number of alleles per locus was relatively low and ranged from 3 to 7 (mean 4.4), as has been observed in other endemic perennial species (Arroyo et al., 2016). The mean observed frequency of heterozygotes (Ho) in the total sample was 0.44, which deviated from the expected heterozygosity (He), indicating the potential presence of population structure among the eight collection areas (Hadziabdic et al., 2012). Expected heterozygosity ranged from 0.24 to 0.77, and all but one locus deviated significantly from Hardy-Weinberg equilibrium. However, more samples from each of these areas will be needed to determine population structure and gene flow. Shannon's diversity index averaged 0.85 among all loci. This index measures the number of unique genotypes in a sample population and how evenly the genotypes are distributed (Brown and Weir, 1983). A genetic differentiation (FST) value of 0.06 was calculated across the 13 loci. Three of the loci had a calculated FST ≥0.05, which showed moderate diversity; however, 10 of the loci had an FST ≤0.05, which indicated relatively low genetic diversity in these samples and may portend the same for the general population (Hartl and Clark, 2007).
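To make the FST values concrete, the sketch below computes a frequency-based FST for one locus as (HT - HS) / HT, where HS is the mean expected heterozygosity within collection areas and HT is the expected heterozygosity of the pooled allele frequencies. This is one common definition and assumes equal weighting of areas; the exact estimator applied by GenAlEx may differ in detail, and the function names and allele frequencies below are made up for illustration.

```python
def expected_het(freqs):
    """Expected heterozygosity from a list of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def fst(subpop_freqs):
    """FST for one locus; subpop_freqs holds one allele-frequency list per area.

    Every list must give the alleles in the same order. Areas are weighted
    equally, which is an assumption of this sketch.
    """
    k = len(subpop_freqs)
    hs = sum(expected_het(f) for f in subpop_freqs) / k       # mean within-area He
    pooled = [sum(column) / k for column in zip(*subpop_freqs)]
    ht = expected_het(pooled)                                  # total He
    return (ht - hs) / ht


# Two hypothetical collection areas, one locus with two alleles.
print(round(fst([[0.5, 0.5], [0.9, 0.1]]), 3))
```

Values near zero arise when allele frequencies are similar across collection areas, and the value grows as the areas diverge.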
Annual and herbaceous perennial plants, in comparison to woody plant species, have lower mean genetic diversity due to a lack of polymorphic loci as well as narrow geographical distribution (Hamrick et al., 1992). Our finding agrees with the report by Amissah et al. (2016), who used AFLP analysis and found low (25%) genetic diversity among 116 sampled plants. Additional samples and analyses are needed to confirm our preliminary findings using microsatellite analysis.

CONCLUSION
The study provides defined codominant molecular markers for C. sanguinolenta. A total of 13 primer pairs can be used to amplify polymorphic microsatellite loci that should provide a more complete assessment of the genetic diversity, gene flow and population structure of this species. Additionally, these markers should be a valuable tool for breeding elite genotypes and for conservation of C. sanguinolenta.

Table 1. Primer sequences, repeat motifs, annealing temperatures (Ta), number of alleles (A), allele size range, observed (Ho) and expected heterozygosity (He), Shannon's information index (I), and genetic differentiation (FST) of 13 microsatellite loci isolated from Cryptolepis sanguinolenta.