Assessment of plantain (Musa sapientum L.) accessions genotypic groups relatedness using simple sequence repeats markers

Plantains are important sources of high-calorie energy in Ghana. They are also of great socio-economic importance in the country and are very important sources of rural income. Although several species exist all over the world, plantains belonging to the AAB group are unique to West Africa, and Ghanaian collections have unique features and a peculiar taste. Morphological and biochemical characterization are the popular techniques used to characterize plantain genotypes in Ghana, and there are only limited reports on molecular characterization of plantain genotypes. Characterization based on morphological characteristics alone may be limited, since the expression of quantitative traits is subject to strong environmental influence. Molecular characterization techniques, in contrast, are capable of identifying polymorphism represented by differences in DNA sequences. The objective of this research was therefore to conduct molecular characterization of Ghanaian local accessions of plantain and to assess the relationships among known genotypic groups (populations). This study sampled 40 accessions of plantain representing four popular genotypic groups. Simple Sequence Repeats (SSRs) were used to assess diversity in reference to a set of global Musa collections. Prior to analysis, the 40 accessions were grouped into populations as French plantain, True Horn, False Horn, and Hybrid. PopGene version 32 was used to analyze the data. The analysis revealed that the overall plantain population had a Shannon's Informative Index (I) value of 0.61±0.28, 100% polymorphism for all loci, and values of 2.7±0.67 and 1.81±0.45 for na and ne, respectively. Average heterozygosity was 0.34±0.17, and loci mMaCIR231 and mMaCIR07 were the most informative, having I values of 0.85 and 0.81, respectively. The Fis and Fit values were both negative, indicating a lack of inbreeding, and the gene flow value was 0.533. The study also revealed relationships among the various populations (French plantain, True Horn, False Horn, and Hybrid) on the basis of molecular characterization.


INTRODUCTION
Plantains (Musa spp.) are major food crops widely grown across the world's tropical and subtropical regions. The fruits are highly nutritious, containing high amounts of carbohydrates, minerals such as Ca and K, as well as vitamins A and B (http://healthyeating.sfgate.com/benefits-eating-plantains-3634.html). An estimated 20 million people eat banana and plantain as their major source of dietary carbohydrate.
These crops serve as an important source of revenue for many small-scale farmers (Bioversity International, 2007). Most of the world's bananas and plantains are grown on small farms for local consumption (Ortiz and Vuylsteke, 1996). Banana and plantain production in sub-Saharan Africa therefore provides a good source of income and serves as an important component of the daily diet.
Two main centers of banana and plantain cultivation are found in Africa: the wet tropical zones of West and Central Africa, and the East African Highlands (De Langhe et al., 1995). In the west and central humid tropical areas, a very distinct type of cooking banana (plantain, AAB) is widely cultivated. Plantains are relatively rare in most of Asia as well as in other parts of Africa, and their origin in West Africa is shrouded in mystery. It is thought that they have been cultivated in this region for more than 3,000 years, but the identity of the people responsible for such cultivation is unknown (De Langhe, 1996). It is possible that the same proto-Polynesians who carried the banana east to the Pacific islands also carried it to West Africa (De Langhe, 1996; De Langhe and De Maret, 1999). This hypothesis fits with the finding that plantains must have reached Africa more than 3,000 years ago, but archaeological evidence for such voyages is unlikely to be found. Plantains constitute over 70% of the bananas and plantains grown in this area (Mbida et al., 2000).

Recently, plantain production in West and Central Africa has been burdened by diseases, including black Sigatoka disease (Dzomeku et al., 2016). Attempts to deal with these problems have led to the production of hybrid varieties through breeding programs to develop lines with resistance or tolerance. These hybrids have been introduced into some of the West and Central African countries, and their acceptability by consumers may be based on various preferences, including taste, consistency and cooking properties. Insight into their genetic make-up may contribute information that is vital for both breeders and consumers. Several methods have been used to investigate the genetic variability present in Musa germplasm (Silva et al., 2015; Hippolyte et al., 2010). The development and application of technologies based upon molecular markers provide the only tools that are able to reveal polymorphism at the DNA sequence level and that are adequate to detect genetic variability between individuals and within populations (Kresovich et al., 1995), which will facilitate breeding efforts to improve the crop against biotic and abiotic stresses (Rodrigues et al., 2017). Recently, several molecular tools have been used to assess the molecular make-up of Musa species (Christelová et al., 2016). Microsatellites or simple sequence repeats (SSRs) are among several molecular markers used to characterize and assess genetic variability of the genus Musa, because they are highly polymorphic, multi-allelic, codominant, reproducible, easy to interpret, and amplified via polymerase chain reaction (PCR) (Crouch et al., 1999). Christelová et al. (2016) used molecular and cytological tools to characterize Musa germplasm collections, and this provided insight into the diversity of banana. Biswas et al. (2015) conducted a genome-wide computational analysis of Musa microsatellites and introduced a concise procedure for SSR marker development.
The principal edible species are in the section Eumusa of the genus Musa and comprise M. acuminata Colla (2n=2x=22; A genome; 600 Mbp) and M. balbisiana Colla (2n=2x=22; B genome; 550 Mbp), and their hybrids, the triploid lines (2n=3x=33) with genome constitutions AAA (dessert or export banana), AAB (plantain) and ABB (cooking banana) (Simmonds, 1962; Gowen, 1995). On the basis of morphological characteristics, plantains in Ghana can be classified into three main subgroups, namely False Horn "Apantu group", French "Apem group", and True Horn "Asamienu group" (Dankyi et al., 2007). The French plantains have the bunch complete at maturity, with many hands of numerous, rather small fingers. The bunch axis is covered with neutral flowers and male flowers, and the male bud is large and persistent. The False Horn plantains have an incomplete bunch with no male bud at maturity. The hands consist of large fingers followed by few neutral flowers. The True Horn plantain's bunch is incomplete at maturity. The hands are few in number and consist of a few but very large fingers. There are no neutral flowers or male bud; the True Horn plantain resembles the False Horn but lacks neutral flowers and has larger fingers. The available cultivars in Ghana comprise 10 of False Horn, four of French plantain and two of True Horn. Hence, there is the need to study diversity among local and introduced varieties at the molecular level. This study seeks to determine the genetic relationships among genotypic groups of elite local triploid (AAA, AAB, and ABB) and tetraploid hybrid (ABBB) accessions of Musa.
The objective of this study was to assess relatedness among the collection of Musa sapientum genotypic groups (populations) using B-genome derived Simple Sequence Repeat (SSR) markers. This provided a fingerprint for the unique plantain genotypes in Ghana within the West African sub-region. Also, this study estimated the genetic variation or genetic diversity within and among populations, estimated the genetic population structure, and determined with reference to known morphological traits, if SSR based on known selected set of microsatellites (in reference to a reference set of Musa

*Corresponding author. E-mail: mariandquain@gmail.com. Author(s) agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License

Genotype sampling
Plantain genotypes used for the study were selected on the basis of known morphological classification traits and genotypic groups. The samples were collected from farmers' fields and backyard gardens in the Ashanti and Eastern Regions of Ghana. Samples collected were of the French plantain (15 entries), True Horn (1 entry), False Horn (22 entries) and introduced hybrid (2 entries). Known morphological data were used to cluster the collections into genotypic groups, referred to in this study as "Populations" (Table 1).

Genomic DNA isolation
During collection in the field, young tissues of the cigar leaf with an approximate weight of 0.2 g were harvested, washed and kept in liquid nitrogen for isolation of genomic DNA. In the laboratory, the phenol-chloroform-isoamyl alcohol-based DNA extraction protocol (Egnin et al., 1998) was used to extract genomic DNA from the samples. A spectrophotometer (Biochrom Libra S12) was used to estimate the quantity and quality of DNA at 260 nm (OD260) and 280 nm (OD280). The DNA was resolved in a 0.8% agarose gel in TAE buffer stained with ethidium bromide. The DNA in the gel was visualized with an ultraviolet transilluminator in an Alpha Imager.
Agarose gel electrophoresis was also used to assess DNA quality and quantity.
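The spectrophotometric readings described above are usually turned into a concentration and a purity ratio. The sketch below illustrates that arithmetic under stated assumptions: the conversion of 50 ng/µl per OD260 unit for double-stranded DNA (1 cm path length) is a widely used convention that is not given in the text, and the readings and dilution factor are hypothetical.

```python
def dna_concentration_ng_per_ul(od260, dilution_factor=1.0):
    """Estimate double-stranded DNA concentration from absorbance at 260 nm.

    Assumes the common conversion of 50 ng/uL per absorbance unit
    (1 cm path length); this factor is not stated in the text.
    """
    return od260 * 50.0 * dilution_factor


def purity_ratio(od260, od280):
    """Return the OD260/OD280 ratio used as a rough purity indicator."""
    if od280 <= 0:
        raise ValueError("OD280 must be positive")
    return od260 / od280


# Hypothetical readings for one leaf sample, diluted 1:10 before measurement.
concentration = dna_concentration_ng_per_ul(0.50, dilution_factor=10)
ratio = purity_ratio(0.50, 0.27)
print(f"{concentration:.0f} ng/uL, OD260/OD280 = {ratio:.2f}")
```

A ratio near 1.8 is commonly taken to indicate DNA that is reasonably free of protein contamination; the gel check described above remains the complementary test of integrity.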

Microsatellite analysis
The Simple Sequence Repeat (SSR) microsatellite-based analysis was conducted using the standardized approach for characterization of unknown samples at the molecular level. Eighteen microsatellite markers were used to screen the 40 plantain genotypes, and their scores were compared with those from a reference sample set (Christelová et al., 2011). These SSR loci are well distributed within the Musa genome (Lagoda et al., 1998; Crouch et al., 1998; Hippolyte et al., 2010). The size and composition of the reference set defined the limits for the precision with which the unknown samples were characterized. There were only two representatives of African plantains in the reference DNA collection. The SSR patterns of each individual were analyzed following the protocol of Roy et al. (1996), as applied with the automated infrared fluorescence technology of a LICOR IR2 sequencer (LICOR, Lincoln, USA). For a given SSR locus, the forward SSR primer was designed with a 5'-end M13 extension (5'-CACGACGTTGTAAAACGAC-3'). The PCR amplification was performed in a 384-well Eppendorf Mastercycler with a PCR master mix containing 10 ng of Musa DNA in a 10 µl final reaction volume, PCR buffer (10 mM Tris-HCl (pH 8), 50 mM KCl, 0.1% Triton X-100 and 1.5 mM MgCl2), 8 pmol M13-labelled primer, 200 µM deoxynucleoside triphosphates (dNTPs), 1 U Taq DNA polymerase (Life Technologies, U.S.A.) and 0.06 µM of M13 primer-fluorescent dye IR700 or IR800 (Biolegio, Netherlands). The PCR program had an initial denaturation step at 94°C for 5 min, followed by a touch-down protocol with an initial decrease of the annealing temperature by 1°C for the first cycles, depending on the primer pairs used. A fixed annealing temperature was then applied for a further 35 cycles, with denaturation at 94°C for 45 s, annealing at the lowest primer Tm (between 43 and 52°C) for 60 s and elongation at 72°C for 60 s. A final elongation step at 72°C for 5 min was added to all the protocols. A Musa standard, prepared as a mix of three Musa accessions (Pisang Jari Buaya, Popoulou/Maia Maoli and Tomolo), was added in order to improve the determination of allele sizes. The ladder used had the range 71-367 bp. The IR700- and IR800-labeled PCR products were diluted 8-fold and 5-fold, respectively, prior to electrophoresis on a 6.5% polyacrylamide gel. The band sizes were determined by the IR fluorescence scanning system of the sequencer. Information on the reference set of samples used is available at http://www.musagenomics.org.
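The thermal program above can be written out as an explicit list of steps, which makes the touch-down and fixed phases easier to compare across primer pairs. This is only a sketch: the number of touch-down cycles is not stated in the text and is a hypothetical parameter here, the 1°C drop is assumed to apply per cycle, and the touch-down cycles are assumed to use the same hold times as the fixed cycles.

```python
def touchdown_program(final_anneal_c, touchdown_cycles=5, fixed_cycles=35):
    """Return the thermal program as a list of (step, temperature_C, seconds)."""
    program = [("initial denaturation", 94.0, 300)]
    # Touch-down phase: start above the final annealing temperature and
    # drop 1 degree C per cycle (the cycle count is an assumed parameter).
    anneal_temps = [final_anneal_c + n for n in range(touchdown_cycles, 0, -1)]
    # Fixed phase: 35 cycles at the lowest primer Tm, as described in the text.
    anneal_temps += [final_anneal_c] * fixed_cycles
    for anneal in anneal_temps:
        program.append(("denaturation", 94.0, 45))
        program.append(("annealing", float(anneal), 60))
        program.append(("elongation", 72.0, 60))
    program.append(("final elongation", 72.0, 300))
    return program


# Hypothetical primer pair with a lowest Tm of 50 C and three touch-down cycles.
program = touchdown_program(final_anneal_c=50, touchdown_cycles=3)
total_minutes = sum(seconds for _, _, seconds in program) / 60
print(f"{len(program)} steps, {total_minutes:.1f} min of hold time")
```

The total counts hold times only; ramping between temperatures depends on the cycler and is not modelled.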

Genomic data analysis
Molecular data analysis was performed in the phylogenetic package Phylip using the restdist program and the UPGMA algorithm. The outtree files can be visualized in any tree-building/editing program (Treeview or Figtree software). The data were treated as co-dominant marker data. Although the number of alleles per locus ranged from 5 to 21, the data were scored as diploid data in order to analyse genetic variation among genotypes using PopGene version 32 (Yeh et al., 1997); hence, alleles beyond the diploid set of alleles were ignored.
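To make the clustering step concrete, the following is a minimal pure-Python sketch of UPGMA (average linkage) on a pairwise distance table. It is not the Phylip implementation used in the study; the function name, the three labels and the distances are hypothetical and serve only to show how clusters are merged and how inter-cluster distances are averaged by cluster size.

```python
def upgma(labels, distances):
    """Cluster labels by UPGMA; return a list of (left, right, height) merges.

    distances maps a pair of labels, e.g. ("A", "B"), to their distance.
    """
    sizes = {name: 1 for name in labels}
    d = {frozenset(pair): float(value) for pair, value in distances.items()}
    merges = []
    while len(sizes) > 1:
        closest = min(d, key=d.get)          # pair with the smallest distance
        a, b = sorted(closest)
        height = d[closest] / 2              # branch height of the new node
        merged = f"({a},{b})"
        new_size = sizes[a] + sizes[b]
        for other in sizes:
            if other in (a, b):
                continue
            # Size-weighted average of the distances from the two merged clusters.
            d[frozenset((merged, other))] = (
                sizes[a] * d[frozenset((a, other))]
                + sizes[b] * d[frozenset((b, other))]
            ) / new_size
        # Drop every distance that still refers to the clusters just merged.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        del sizes[a], sizes[b]
        sizes[merged] = new_size
        merges.append((a, b, height))
    return merges


# Hypothetical distances among three accessions.
merges = upgma(["A", "B", "C"], {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6})
for left, right, height in merges:
    print(left, right, height)
```

Each merge records the two clusters joined and the height at which they join, which is the information a tree viewer needs to draw the dendrogram.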
With each marker labelled as a locus, the PopGene version 32 genetic analysis package was used to analyse the data. The absence of an amplification product with any of the 18 primers in an individual was treated as missing data. The genetic variation at each locus was characterised in terms of the number of alleles (na), the effective number of alleles (ne) (Kimura and Crow, 1964), and Shannon's Informative index (I) (Lewontin, 1972). Heterozygosity was also summarised at each locus in terms of observed heterozygosity (Ho) and expected heterozygosity (He).
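The per-locus statistics named above follow directly from allele frequencies. The sketch below shows the standard definitions for one locus scored as diploid codominant data, with missing genotypes skipped. It is an illustration, not the PopGene code: the function name and the genotypes are hypothetical, and He is computed here as the uncorrected 1 - Σp², so values may differ slightly from software that applies small-sample corrections.

```python
import math
from collections import Counter


def locus_statistics(genotypes):
    """Summarise one locus from diploid genotypes.

    genotypes is a list of (allele, allele) tuples; None marks missing data.
    """
    scored = [g for g in genotypes if g is not None]
    if not scored:
        raise ValueError("no scored genotypes at this locus")
    counts = Counter(allele for g in scored for allele in g)
    total = sum(counts.values())
    freqs = [c / total for c in counts.values()]
    homozygosity = sum(p * p for p in freqs)
    return {
        "na": len(freqs),                                # observed number of alleles
        "ne": 1 / homozygosity,                          # effective number of alleles
        "I": -sum(p * math.log(p) for p in freqs),       # Shannon's index
        "Ho": sum(1 for a, b in scored if a != b) / len(scored),
        "He": 1 - homozygosity,                          # uncorrected expected heterozygosity
    }


# Hypothetical locus: four scored accessions and one failed amplification.
stats = locus_statistics([("A", "B"), ("A", "B"), ("A", "A"), ("B", "B"), None])
print(stats)
```

With two equally frequent alleles, ne equals na (2) and I equals ln 2, which is the maximum for a biallelic locus.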
The gene flow (Nm) was estimated from the genetic differentiation coefficient (Fst) as Nm = 0.25(1-Fst)/Fst. For data analysis, the samples were assigned to four genotypic groups on the basis of known morphological traits: French Plantain (Pop1), True Horn (Pop2), False Horn (Pop3), and Hybrid (Pop4). In addition, all the plantain samples were considered as a single population (PopAll). A dendrogram to establish relatedness among populations was generated based on Nei's genetic distances.
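Nei's genetic identity and distance between two populations can be computed from their allele frequencies. The sketch below assumes the original (1972) formulation, I = Jxy / sqrt(Jx Jy) and D = -ln I, summed over loci; the function name, the data layout and the allele frequencies are hypothetical.

```python
import math


def nei_identity_distance(pop_x, pop_y):
    """Return (identity, distance) between two populations.

    Each population maps locus -> {allele: frequency}; only loci present
    in both populations are used.
    """
    loci = sorted(set(pop_x) & set(pop_y))
    jx = jy = jxy = 0.0
    for locus in loci:
        fx, fy = pop_x[locus], pop_y[locus]
        jx += sum(p * p for p in fx.values())
        jy += sum(p * p for p in fy.values())
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    if jx == 0 or jy == 0 or jxy == 0:
        raise ValueError("identity is undefined or zero for these populations")
    identity = jxy / math.sqrt(jx * jy)
    return identity, -math.log(identity)


# Hypothetical frequencies at a single locus.
pop_a = {"locus1": {"A": 1.0}}
pop_b = {"locus1": {"A": 0.5, "B": 0.5}}
identity, distance = nei_identity_distance(pop_a, pop_b)
print(f"I = {identity:.4f}, D = {distance:.4f}")
```

A matrix of such pairwise distances among Pop1 to Pop4 is the kind of input a UPGMA dendrogram of populations is built from.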

RESULTS AND DISCUSSION
Out of the 20 SSR markers tested, 18 provided applicable and scorable data. There were a total of 232 allele calls when the 40 Ghanaian accessions were analyzed together with a reference set of Musa accessions. All the 40 samples fell within the subgroup of triploid AAB African plantains (Figure 1). Considering the reference set of Musa genotypes used, out of the 232 allele calls, only 52 amplifications were within the Ghanaian genome, representing only 22.4% of known alleles when the selected set of SSR markers was used (Christelová et al., 2011). This indicates that more SSR markers need to be screened to generate informative loci that can detect variation among the Ghanaian genotypes. Quain et al. (2010) used 49 Musa SSR primers, of which 46 amplified a total of 233 alleles, giving an average of 5.09 alleles per locus within a range of 1-13 alleles among 10 Musa accessions. Brown et al. (2009) reported that 15 decamer RAPD markers were used to screen 27 Musa accessions. Samarasinghe et al. (2010) used MaSSR primers and +ve AGMI primers to characterize 27 Musa cultivars from the AA and BB genome (Supplementary Table 1). Six of the primers used amplified a total of 38 alleles in the collections. In the current study, all the band scores were polymorphic. Zhang et al. (2009) used ISSR markers to reveal genetic diversity among natural populations of Ottelia acuminata, with 79.44% polymorphic bands and an average of 6.3 bands per genotype. In the current study, there were 1.3 alleles per genotype among the Ghanaian samples on the basis of band amplification, and 5.8 alleles per genotype when the total number of alleles was considered in reference to the reference set of samples; both values are lower than that reported by Zhang et al. (2009). Resmi et al. (2011) sampled 38 banana cultivars representing the AA, AB, BB, AAA, and ABB genomic groups. Using STMS, 15 primer pairs of the Ma series specific to Musa species were screened for usefulness. Ten out of the 15 were selected for the analysis on the basis of PCR amplification and allele scoring consistency. The 10 markers used revealed 27 alleles. In the present study, the size of amplification fragments ranged from 111 to 458 bp. Resmi et al. (2011) reported amplified fragment sizes ranging from 50 to 290 bp. Considering all the 232 alleles called in the present study, the mean number of alleles per locus was 12.88, which is comparable to that reported by Creste et al. (2004). Similarly, Grapin et al. (1998) reported a mean number of 8 alleles per primer. Other researchers working on Musa genotypes reported average numbers of alleles of 3.32 (Ge et al., 2005), 2.7 and 8.3 (Resmi et al., 2011, 2016), and 2.56 (Oriero et al., 2006).
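The summary figures quoted in this paragraph can be reproduced from the four counts given in the text (232 allele calls, 52 alleles in the Ghanaian samples, 18 loci, 40 accessions). The short check below uses only those counts; the variable names are illustrative.

```python
total_allele_calls = 232   # Ghanaian accessions plus reference set
ghanaian_alleles = 52      # alleles amplified in the Ghanaian samples
scorable_loci = 18
accessions = 40

share_of_known_alleles = 100 * ghanaian_alleles / total_allele_calls
mean_alleles_per_locus = total_allele_calls / scorable_loci
ghanaian_alleles_per_genotype = ghanaian_alleles / accessions
total_alleles_per_genotype = total_allele_calls / accessions

print(f"share of known alleles: {share_of_known_alleles:.1f}%")
print(f"mean alleles per locus: {mean_alleles_per_locus:.3f}")
print(f"alleles per genotype: {ghanaian_alleles_per_genotype:.1f} and {total_alleles_per_genotype:.1f}")
```

The mean number of alleles per locus evaluates to 12.888..., which matches the reported 12.88 when truncated rather than rounded.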
The data generated were handled as a diploid codominant data set, and the allelic frequency was calculated for all genotypes at each locus. The overall allelic frequencies for the 18 loci, determined using PopGene version 32, are presented in Table 2. The highest frequency value (0.9868) was obtained for allele B of locus mMaCIR39, although allele A of loci mMaCIR24 and mMaCIR150 also had high frequencies of 0.9744 and 0.9865, respectively. The lowest frequency value of 0.0132 was obtained for allele C of locus mMaCIR03, allele D of locus mMaCIR231, allele C of locus mMaCIR01, and alleles C and D of locus mMaCIR07.
The Phylip phylogenetic package under the UPGMA algorithm was used to develop the tree. The clustering according to the dendrogram generated has two major groups (Figure 1) and seven separate clusterings of individuals. One of the seven clusters was   Christelová et al., 2011), respectively, clustered within the groups of French plantain and false horn.
Although there were some distinct clusters of French plantain and false horn, some clusters had those two genotypes interlacing. Similarly, Amorim et al. (2008) did not obtain complete separation among improved, wild and cultivated hybrids of diploid genotypes using SSR markers. Rodrigues et al. (2017), however, reported that in investigating genetic variability in banana diploids, there was no separation of genotypes based solely on geographic origin, although genotypes were grouped based on their genomic constitution.
Genetic heterozygosity analysis at all the loci revealed the extent of polymorphism. These results are presented in Table 3. The genetic polymorphism ranged from 33.33% in Pop2 (True Horn) to 94.44% in Pop3 (False Horn). In Pop3, 17 out of the 18 loci were polymorphic, and in Pop2, 6 out of the 18 loci were polymorphic. When all the genotypes were analyzed together, all the 18 loci were polymorphic, resulting in 100% polymorphism. The loci sample size was least in Pop2 (0-2), in Pop1 (French Plantain) it was 26-30, and Pop 4 (Introduced Hybrid) recorded the highest range at 38 to 42. When all the genotypes were analyzed together, the loci sample size ranged from 70 to 78. The corresponding observed number of alleles was determined; in Pop1, the alleles ranged from 1 to 2, and 12 of the 18 loci had two effective alleles. The Pop1 mean effective number of alleles was 1.77±0.44. In Pop2, nine of the loci did not have alleles in the sample used; the average effective number of alleles was 1.67±0.5. In Pop3, locus mMaCIR07 registered the highest value (2.19) for the effective number of alleles. However, on average, Pop4 had the highest effective number of alleles, with a value of 1.88±0.32. When all the genotypes were analyzed, the number of observed alleles ranged from 2 to 4, with locus m231 recording the highest value of 2.16 as the effective number of alleles. On average, 1.81±0.45 alleles were effective. Resmi et al. (2011) reported that 90% of ten loci used to screen 38 Musa samples were polymorphic, where the highest polymorphism was observed with primers Ma 1 to 17 and Ma 3 to 60 with four alleles. Their percentage of polymorphic loci ranged from 60 to 80% among the 5 major genomic groups.
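The percentages of polymorphic loci quoted above follow from counting the loci with more than one allele. The sketch below reproduces the 94.44% and 33.33% figures from the counts in the text (17 of 18 and 6 of 18). The per-locus allele counts are hypothetical placeholders chosen only to match those counts, and the function name is illustrative.

```python
def percent_polymorphic(alleles_per_locus):
    """Percentage of loci showing more than one allele."""
    polymorphic = sum(1 for n in alleles_per_locus if n > 1)
    return 100 * polymorphic / len(alleles_per_locus)


# Pop3 (False Horn): 17 polymorphic loci and 1 monomorphic locus.
pop3 = [2] * 17 + [1]
# Pop2 (True Horn): 6 polymorphic loci; nine loci had no alleles scored,
# so the remaining three are taken as monomorphic for illustration.
pop2 = [2] * 6 + [1] * 3 + [0] * 9

print(f"Pop3: {percent_polymorphic(pop3):.2f}%")
print(f"Pop2: {percent_polymorphic(pop2):.2f}%")
```

Loci with no scored alleles are kept in the denominator here, which is what reproduces the reported 33.33% for Pop2.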
The Shannon's information index (Lewontin, 1972) was calculated to provide a relative estimate of the degree of variation within each population, as well as within the collected genotypes, as presented in Table 4. The mean measure of genetic diversity was 0.52±0.20, 0.46±0.35, 0.57±0.28, and 0.61±0.22 for Pops 1, 2, 3 and 4, respectively. When all genotypes were assessed, the mean Shannon's information index was 0.61±0.22. Resmi et al. (2011) reported an I value of 0.70±0.38, which is higher than the value reported in the present study. Locus mMaCIR24 gave no measure of genetic diversity in Pops 1 and 3.

Zygosity refers to the similarity of alleles for a trait in an organism. If both alleles are the same, the organism is homozygous for the trait. If the alleles are different, the organism is heterozygous for that trait. Heterozygosity is a measure of genetic variation in natural populations. High heterozygosity indicates a large amount of genetic variability, whereas low heterozygosity means little genetic variability. Usually, the observed level of heterozygosity (Ho) is compared with the level expected under Hardy-Weinberg equilibrium. If the observed heterozygosity is lower than expected, this is attributed to forces such as inbreeding. If the observed heterozygosity is higher than expected, it might be suspected that the genotypes show an isolate-breaking effect (the mixing of two previously isolated populations). The expected heterozygosity (He) is defined as the estimated fraction of all individuals who would be heterozygous for any randomly chosen locus. He differs from Ho because it is a prediction based on the known allele frequencies from a sample of individuals. Deviation of the observed from the expected can be used as an indicator of important population dynamics.

In this study, heterozygosity was determined at all the loci and the averages are presented in Table 4. In all the populations, the average expected heterozygosity was lower than the observed. Resmi et al. (2011) reported an average genetic diversity, computed in terms of Nei's expected heterozygosity, of 0.42±0.22, which is higher than that reported in the present study (0.39±0.20), although the value recorded for Pop4 was similar to that reported by Resmi et al. (2011). The results indicated that the Ghanaian accessions were heterozygous, and this may be responsible for the morphological variations among the popular genotypes in Ghana. Oriero et al. (2006) reported an average observed heterozygosity of 0.63 among 40 Musa accessions, which is lower than the value recorded in the present study. In a population, heterozygosity is measured by determining the proportion of genes that are heterozygous and the number of individuals that are heterozygous for each particular gene. Average expected heterozygosity indicates genetic diversity in a population. Resmi et al. (2011) reported average genetic diversity among 5 groups ranging from 0.20 to 0.42, whereas Shannon's informative index ranged from 0.46 to 0.61.

The fixation index (FST), which is a measure of population differentiation, is also referred to as the F-statistics. In this study, the F-statistics and gene flow at all the loci were determined, and the average FST value was 0.319. The average Fis (average inbreeding coefficient with all genotypes) value was -0.97, which is indicative of the absence of inbreeding among the populations. In a population, gene flow or gene migration is the transfer of alleles of genes from one population to another. The average gene flow (Nm) value for all the populations at the various loci was 0.533. As Musa species are parthenocarpic, the finding that there has not been inbreeding among the genotypes was expected; consequently, the level of gene flow was low (0.533). To promote genetic diversity, gene flow should be encouraged, and this can be achieved by utilizing in vitro micro-propagation to introgress genes from varieties that produce seeds into the cultivated plantains in West Africa. Oriero et al. (2006) also reported negative Fis and Fit values, with observed heterozygosity higher than the expected.
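As a consistency check on the values reported above, the gene flow estimate can be recomputed from the average FST using the relation given in the data analysis section, Nm = 0.25(1-Fst)/Fst. The sketch also shows why an excess of observed over expected heterozygosity yields a negative inbreeding coefficient, using the common estimate F = 1 - Ho/He; the Ho and He values in that second example are hypothetical, and the function names are illustrative.

```python
def gene_flow(fst):
    """Gene flow estimated from Fst as Nm = 0.25 * (1 - Fst) / Fst."""
    if not 0 < fst <= 1:
        raise ValueError("Fst must lie in (0, 1]")
    return 0.25 * (1 - fst) / fst


def inbreeding_coefficient(observed_het, expected_het):
    """Estimate F = 1 - Ho/He; negative when heterozygotes are in excess."""
    if expected_het <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1 - observed_het / expected_het


nm = gene_flow(0.319)                       # average Fst reported in the text
f_example = inbreeding_coefficient(0.6, 0.4)  # hypothetical Ho and He

print(f"Nm = {nm:.3f}")
print(f"F = {f_example:.2f}")
```

The recomputed Nm of about 0.534 agrees with the reported 0.533 to within rounding. A value below 1 is generally read as limited gene flow among populations.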

Conclusion
This work, to the best of our knowledge, is one of the first reports on the application of SSR markers to study molecular diversity in genotypes of plantain in Ghana. The SSR markers used in this study have limitations, as they could not adequately group the collections into their genotypic groups. There is thus a need to develop SSR markers that will detect more variation among our genotypic groups.
The data generated would thus contribute to the development of a database for Ghanaian plantain germplasm. Informative molecular markers identified in this study can be used to support plantain germplasm fingerprinting.

Figure 1. Dendrogram generated with the Phylip phylogenetic package under the UPGMA algorithm.

Table 1. Plantain genotypes sampled for analysis.

Table 3. Summary statistics of population genic variation statistics for all loci.

Table 4. Summary of heterozygosity statistics for all loci.

Table 5. Nei's original measures of genetic identity and genetic distance.