Genetic diversity in cocoa (Theobroma cacao L.) plus trees in Tamil Nadu by simple sequence repeat (SSR) markers

The range of polymorphism of 27 cocoa plus trees screened in the major cocoa growing regions of Tamil Nadu was assessed using 10 simple sequence repeat (SSR) markers. The gene diversity, genetic differentiation and genetic similarities were analyzed for the cocoa trees. The number of alleles detected by different primers ranged from 0 to 3 and the level of polymorphism was 0 to 100%. The polymorphism information content (PIC) value ranged from 0.000 to 0.677. The higher the PIC value, the more informative the SSR marker. Hence, primer mTcCIR33 was found to be highly informative. The Jaccard’s similarity coefficient for the SSR data set varied from 0.39 to 1.00. The SSR marker profiles resulted in nine clusters at nearly 54% similarity. From this study, it can be inferred that diversity exists in cocoa plantations in Tamil Nadu and can be exploited in crop improvement research.


INTRODUCTION
Cocoa, botanically known as Theobroma cacao L., belongs to the family Malvaceae (Alverson et al., 1999). In India, cocoa is cultivated predominantly in four states, viz., Kerala, Andhra Pradesh, Tamil Nadu and Karnataka. It occupies an area of 46,318 ha with a production of 12,954 MT, and the national productivity is 550 kg dry beans per ha. Kerala leads in cocoa production, with 6344 MT from an area of 11,044 ha.
The productivity of cocoa beans in the state is 592 kg per hectare. Tamil Nadu occupies third position in cocoa cultivation with an area of 9347 ha. It produces 900 MT of cocoa beans with a productivity of 443 kg dry beans per hectare (DCCD, 2012). Three distinct varieties of cocoa, viz., Forastero, Criollo and Trinitario, are cultivated, of which Forastero accounts for 90% of the cocoa beans produced in the world. It is found widely in West Africa and Brazil. The second type is Criollo, which produces fine and flavor beans and is mostly grown in parts of the Caribbean, Venezuela, Papua New Guinea, the West Indies, Sri Lanka, East Timor and Java. The third type is the Trinitario variety, which is a cross between Criollo and Forastero (Anonymous, 2010). Cocoa is highly cross-pollinated owing to the prevalence of self-incompatibility, which gives rise to high variability in cocoa populations. In India, the introductions made in the early 20th century presumably included Criollos and Forasteros.
The exact sources from which cocoa was introduced into India are not known. Around 1930, all the Forastero plants available in the country were removed and only Criollos were maintained. When large-scale cultivation of cocoa was taken up in the 1960s, seeds of presumably fresh cocoa beans were introduced from selected plants from Malaysian estates and other sources, and subsequent introductions were made in the 1970s from the Cocoa Research Institute of Ghana and Kew Botanical Garden, United Kingdom. Further systematic introductions have been made since 1990 from the University of Reading, United Kingdom. Crop improvement research in cocoa has been in progress in India for more than 40 years, especially at Kerala Agricultural University and the Central Plantation Crops Research Institute, Vittal, and many improved selections and hybrids have been evolved (Nair et al., 1990).
The F1 progenies of these cultivars have been introduced in most of the plantations in Tamil Nadu, India. Tamil Nadu is a potential state for cocoa production on account of the larger area under irrigated coconut and cocoa in Coimbatore and Erode districts. The trees have adapted to the local climatic conditions of the region over many years. A high degree of genetic variability has been observed in these plantations. However, the genetic diversity of cocoa existing in Tamil Nadu state has not been scientifically documented. Systematic identification, documentation and conservation of the genetic diversity of cocoa, either ex situ or in situ, will be a viable approach for the improvement of cocoa in Tamil Nadu. Therefore, this study was conducted with the following objectives: to assess the genetic diversity of cocoa in Tamil Nadu using molecular markers and to construct a phylogenetic tree displaying the relationships among the trees for further crop improvement work.

DNA extraction
DNA was extracted from young leaves of cocoa following the protocol of Echevarria-Machado et al. (2005) for Malvaceae plant species with slight modification, which was based on the extraction procedure of Dellaporta et al. (1983). Tender leaf material (0.1 g) was homogenized with 1 ml of extraction buffer and transferred to a 2.0-ml Eppendorf tube, and 100 µl of 20% sodium dodecyl sulfate (SDS) was added. After mixing, the mixture was incubated at 65°C for 10 min. Then 500 µl of 5 M potassium acetate was added. The tube was shaken vigorously and incubated for 20 min on ice. The tubes were centrifuged at 12,000 rpm for 20 min. The supernatant was transferred to a new 1.5-ml tube, to which 300 µl of silica was added and mixed manually for 3 to 5 min. The tubes were centrifuged at 12,000 rpm for 30 s. The pellet was washed twice with 70% ethanol and dried. The pellet was re-suspended in 50 µl of distilled water and incubated at 55°C for 5 min. The tubes were centrifuged at 12,000 rpm for 2 min and the supernatant was transferred to a new 500-µl Eppendorf tube. An aliquot was used for quantification of total DNA by gel electrophoresis.
DNA was visualized and quantified on 0.8% (w/v) agarose gels containing 3 µl / 40 ml of agarose solution in 1X TBE buffer. The gel was run at 80 V for 45 min. Then, the gel was visualized under a UV lamp and documented using Alpha Imager. DNA concentration for polymerase chain reaction (PCR) amplification was estimated by comparing the band intensity of a sample with that of standard lambda DNA of known concentration. Dilutions were made by dissolving the genomic DNA in an appropriate quantity of TE buffer (pH 8.0).

DNA amplification
DNA from 27 cocoa candidate plus trees was amplified using a set of 10 simple sequence repeat (SSR) primers. The sequence details of the SSR primers are given in Table 1. The reaction mixture consisted of 2 µl of template DNA, 1.2 µl of dNTP, 2 µl of 10 µM primer, 1.5 µl of 10X assay buffer, 0.20 µl of 1U Taq DNA polymerase and 9.5 µl of sterile water. The annealing temperature was calculated from the melting temperature of each primer. Denaturation and extension temperatures were set at 95°C and 72°C, respectively. After running 35 cycles in PCR (make: MJ Research), the products were run in 3% agarose gel containing 5 µl of ethidium bromide dye. Gels were documented using the Alpha Imager TM 1200 Documentation and Analysis System of Alpha Innotech Corporation, USA. The primers that produced amplification were used for analyzing diversity in all the selected trees. Allele sizes of PCR products were determined by reference to the 100 bp ladder (Bangalore Genei Cat. No. 105656).
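The reaction set-up above can be checked with a few lines of arithmetic. The sketch below sums the listed component volumes and estimates an annealing temperature from a primer sequence. The Wallace rule and the 5°C offset below the melting temperature are common rules of thumb assumed here for illustration; the text does not state which rule was actually used, and `EXAMPLE_PRIMER` is a made-up sequence, not one of the mTcCIR primers.

```python
# Component volumes (µl) for one PCR reaction, as listed in the text.
REACTION_UL = {
    "template DNA": 2.0,
    "dNTP": 1.2,
    "primer (10 µM)": 2.0,
    "10X assay buffer": 1.5,
    "Taq DNA polymerase (1U)": 0.20,
    "sterile water": 9.5,
}

# HYPOTHETICAL 18-mer used only to exercise the functions below.
EXAMPLE_PRIMER = "ATGCATGCATGCATGCAT"


def total_volume(mix):
    """Total reaction volume in µl, rounded to two decimals."""
    return round(sum(mix.values()), 2)


def wallace_tm(primer):
    """Melting temperature (°C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def annealing_temp(primer, offset=5):
    """ASSUMPTION: anneal `offset` °C below the Wallace-rule Tm."""
    return wallace_tm(primer) - offset


if __name__ == "__main__":
    print("Total volume (µl):", total_volume(REACTION_UL))
    print("Tm (°C):", wallace_tm(EXAMPLE_PRIMER))
    print("Annealing (°C):", annealing_temp(EXAMPLE_PRIMER))
```

The Wallace rule is only a rough guide for short oligonucleotides; in practice the annealing temperature is usually refined empirically for each primer pair.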

Polymorphism survey of SSR markers
Polymorphism survey of SSR markers was carried out by considering only the clear and unambiguous bands. Markers were scored for the presence and absence of the corresponding band among the different trees. The scores '1' and '0' were given for the presence and absence of bands, respectively.
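The scoring scheme can be sketched as a small presence/absence table. In the example below, the tree names, allele labels and band calls are all invented for illustration. The definition of percent polymorphism (share of band positions that vary among trees) is an assumption, since the text does not spell out how the percentage was computed.

```python
# HYPOTHETICAL allele labels (locus_size-in-bp) and band calls.
EXAMPLE_ALLELES = ["locusA_200", "locusA_220", "locusB_300"]
EXAMPLE_OBSERVED = {
    "T1": {"locusA_200", "locusB_300"},
    "T2": {"locusA_220", "locusB_300"},
    "T3": {"locusA_200", "locusA_220", "locusB_300"},
}


def score_bands(observed, alleles):
    """Turn per-tree sets of observed bands into 1/0 score vectors."""
    return {
        tree: [1 if allele in bands else 0 for allele in alleles]
        for tree, bands in observed.items()
    }


def polymorphism_percent(matrix):
    """ASSUMED definition: percent of band positions that differ among trees."""
    rows = list(matrix.values())
    n_cols = len(rows[0])
    variable = sum(
        1 for col in range(n_cols) if len({row[col] for row in rows}) > 1
    )
    return 100.0 * variable / n_cols


if __name__ == "__main__":
    scores = score_bands(EXAMPLE_OBSERVED, EXAMPLE_ALLELES)
    for tree, row in scores.items():
        print(tree, row)
    print("Polymorphism (%):", polymorphism_percent(scores))
```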

Cluster analysis
The data obtained by scoring the SSR profiles of different primers were subjected to cluster analysis. A similarity matrix was constructed using Jaccard's coefficient, and the similarity values were used to construct a dendrogram by the unweighted pair-group method using arithmetic averages (UPGMA) with the Sequential Agglomerative Hierarchical and Nested (SAHN) function (Sneath and Sokal, 1973). Data analysis was done using NTSYSpc version 2.02i (Rohlf, 1998).
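The analysis itself was run in NTSYSpc. Purely as an illustration of the two steps described (Jaccard similarity, then UPGMA agglomeration), a minimal pure-Python sketch might look like the following. The four profiles are invented, the function names are hypothetical, and the 0.54 cut simply mirrors the roughly 54% similarity level at which clusters are read off in this study.

```python
from itertools import combinations

# HYPOTHETICAL 0/1 band profiles for four trees.
EXAMPLE_PROFILES = {
    "T1": [1, 0, 1, 1],
    "T2": [1, 0, 1, 1],
    "T3": [0, 1, 1, 0],
    "T4": [0, 1, 0, 0],
}


def jaccard(a, b):
    """Shared bands divided by bands present in either profile."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # ASSUMPTION: two profiles with no bands at all count as identical.
    return both / either if either else 1.0


def similarity_matrix(profiles):
    """Pairwise Jaccard similarities keyed by frozenset({name1, name2})."""
    return {
        frozenset((p, q)): jaccard(profiles[p], profiles[q])
        for p, q in combinations(profiles, 2)
    }


def upgma(profiles):
    """Average-linkage merges as (members_a, members_b, similarity)."""
    sim = similarity_matrix(profiles)
    clusters = [(name,) for name in profiles]
    merges = []

    def average(c1, c2):
        # Unweighted arithmetic mean over all between-cluster tree pairs.
        total = sum(sim[frozenset((p, q))] for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=lambda pair: average(*pair))
        merges.append((a, b, average(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


def clusters_at(profiles, cut):
    """Groups obtained by applying only merges with similarity >= cut."""
    groups = {(name,) for name in profiles}
    for a, b, s in upgma(profiles):
        if s >= cut:
            groups -= {a, b}
            groups.add(a + b)
    return sorted(groups)


if __name__ == "__main__":
    for a, b, s in upgma(EXAMPLE_PROFILES):
        print(a, "+", b, "at similarity", round(s, 3))
    print(clusters_at(EXAMPLE_PROFILES, 0.54))
```

Replaying the merge list against a threshold works because UPGMA merge similarities never increase from one step to the next, so cutting the dendrogram at a given level is equivalent to stopping the agglomeration there.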

Genetic diversity estimation
After visualizing the gel, amplified fragments of each SSR marker were scored as '1' and '0', where '1' indicated the presence of a specific allele (band) and '0' indicated its absence. The polymorphism information content (PIC) of the SSR markers was calculated using the formula developed by Anderson et al. (1993). The PIC value of each locus was calculated as

$$\mathrm{PIC}_j = 1 - \sum_{i=1}^{L} P_{ij}^{2}$$

where $P_{ij}$ is the relative frequency of the $i$th allele for locus $j$, summed across all the alleles ($L$) over all lines. PIC provides an estimate of the discriminatory power of a locus by taking into account not only the number of alleles that are expressed, but also the relative frequencies of those alleles. PIC values range from 0 (monomorphic) to 1 (very highly discriminative, with many alleles in equal frequencies). Genetic diversity estimate related analyses were done using NTSYSpc ver. 2.02i (Rohlf, 2000). Genetic similarities (GS) between pairs of trees were measured by the Dice similarity coefficient, based on the proportion of shared alleles, with the SIMQUAL module. Genetic distances between pairs of lines were estimated as GD = 1 - GS. The clustering of trees was done based on the similarity matrix using the UPGMA algorithm following the SAHN module.
The clustering result was used to construct a dendrogram following TREE module (Ali et al., 2008).
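The quantities in this section can be illustrated with a short sketch. It assumes the commonly used form of the Anderson et al. (1993) index, PIC = 1 minus the sum of squared allele frequencies at a locus, together with the standard Dice coefficient and GD = 1 - GS. The frequencies and profiles are invented, and the function names are hypothetical; NTSYSpc, not this code, was used in the study.

```python
def pic(allele_freqs):
    """PIC for one locus: 1 minus the sum of squared allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def dice(a, b):
    """Dice similarity of two 0/1 profiles: 2 * shared / (bands_a + bands_b)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    n_a, n_b = sum(a), sum(b)
    # ASSUMPTION: two profiles with no bands at all count as identical.
    return 2 * shared / (n_a + n_b) if (n_a + n_b) else 1.0


def genetic_distance(a, b):
    """GD = 1 - GS, with GS taken as the Dice coefficient."""
    return 1.0 - dice(a, b)


if __name__ == "__main__":
    print("Monomorphic locus:", pic([1.0]))
    print("Two alleles at 0.5 each:", pic([0.5, 0.5]))
    # HYPOTHETICAL profiles sharing one band out of 3 and 2 bands.
    tree_a, tree_b = [1, 0, 1, 1], [0, 1, 1, 0]
    print("Dice GS:", dice(tree_a, tree_b))
    print("GD:", genetic_distance(tree_a, tree_b))
```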

RESULTS AND DISCUSSION
All the SSR markers used in this study were developed at CIRAD by Lanaud et al. (1999). Among the 10 SSR primers used in the present study, 7 primers produced discrete, scorable and unambiguous bands. The details of the SSR primers used for assessing the molecular diversity among the 27 candidate plus trees are presented in Table 1. The PCR product size obtained by the amplification of SSR primers ranged from 190 to 350 bp. The amplification patterns of the SSR markers mTcCIR11 and mTcCIR33 in the 27 candidate plus trees are presented in Figures 2 and 3, respectively.

Allele diversity of SSR marker analysis
The allele diversity and polymorphism information content (PIC) were calculated for all the SSR markers used in this study (Table 2). The highest PIC was recorded by the primer mTcCIR33 (0.667) and the lowest by the primers mTcCIR22, mTcCIR28 and mTcCIR40 (0.000). The higher the PIC value, the more informative the SSR marker. Hence, primer mTcCIR33 was found to be highly informative.
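To give a feel for these two extremes, here is a worked example assuming the usual form of the index, PIC = 1 minus the sum of squared allele frequencies. A locus with three equally frequent alleles reaches 0.667, and a locus with a single allele gives 0. This is only one frequency configuration consistent with the reported values, not a statement about the actual allele frequencies, which are not reproduced here.

```latex
% Three alleles at equal frequency p_1 = p_2 = p_3 = 1/3 (illustrative assumption)
\mathrm{PIC} = 1 - \sum_{i=1}^{3} \left(\tfrac{1}{3}\right)^{2}
             = 1 - \tfrac{3}{9}
             = \tfrac{2}{3} \approx 0.667

% Monomorphic locus: a single allele with frequency 1
\mathrm{PIC} = 1 - 1^{2} = 0
```

With at most three alleles per locus, 2/3 is also the largest value this index can take, since equal frequencies minimize the sum of squares.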

Cluster analysis
The banding pattern of the SSR markers, scored in the form of binary data, was used for computing Jaccard's similarity index. The similarity index values obtained for each pairwise comparison among the 27 candidate plus trees are presented in Table 4. The similarity coefficients based on 10 SSR markers ranged from 0.39 to 1.00. Among the 27 candidate plus trees studied, the highest similarity index (1.00) was recorded between the candidate plus trees SMJ 10 and VPS 13 and also between SME 21 and SME 24. The lowest similarity index (0.39) was recorded between the candidate plus trees SMJ 25 and KUL 25. The similarity values obtained for each pairwise comparison of SSR markers among the 27 candidate plus trees were used to construct a dendrogram based on hierarchical clustering, and the results are presented in Figure 1. The 27 candidate plus trees were grouped into eight clusters at nearly 54% similarity level. The cluster size varied from eight (cluster IV) to one (clusters II, V and VI).
The list of all the eight clusters along with the trees included is presented in Table 3. Cluster I consisted of the candidate plus trees SMJ 33, SMJ 34 and SMJ 50. Cluster II consisted of SMJ 3. The cluster III
Among the primers used in the study, the primer mTcCIR11 showed a 330 bp allele specific to the SME 5 candidate plus tree, and the primer mTcCIR15 showed 250 and 230 bp alleles specific to SMJ 3. Similar studies were carried out by Motamayor (2001) and Motamayor et al. (2002) in Central America, Efombagn et al. (2006) in South Cameroon, Johnson et al. (2009) in Trinidad and Tobago, and Zhang et al. (2009) and Aikpokpodion et al. (2009) in West Africa. These findings suggest that ample diversity exists in cocoa in Tamil Nadu. These trees have to be observed for both yield and quality parameters for a few more years, and promising trees have to be clonally raised and tested before being used in breeding programmes.

Figure 2. SSR marker profile of cocoa candidate plus trees generated by the primer mTcCIR11.

Figure 3. SSR marker profile of cocoa candidate plus trees generated by the primer mTcCIR33.

Table 1. List of primers used for SSR analysis.

Table 2. SSR marker profile across candidate plus trees of cocoa.

Table 3. Cluster composition of candidate plus trees of cocoa for SSR markers.

Table 4. Genetic similarity coefficient among the candidate plus trees of cocoa based on SSR analysis.