Identification and characterization of Curcuma comosa Roxb., phytoestrogens-producing plant, using AFLP markers and morphological characteristics

‘Waan chak modlook’ (Curcuma comosa Roxb.) is an important medicinal plant of Thailand, but the high similarity in morphological characteristics observed among Curcuma spp. may cause confusion in its utilization. The amplified fragment length polymorphism (AFLP) marker was used to identify and elucidate the phylogenetic relationships among 97 accessions of ‘Waan chak modlook’ collected throughout Thailand. Nine AFLP primer combinations generated a total of 202 bands, of which, 158 bands were polymorphic, with an average of 17.56 bands per primer pair. Pairwise similarity estimated between all samples ranged from 0.39 to 1.00 with an average of 0.67. The phylogenetic tree derived from AFLP data showed that the ‘Waan chak modlook’ accessions were divided into five clusters. Based on morphological characterizations, all samples could be assigned to four species: Curcuma sp.; Curcuma latifolia Rosc.; Curcuma elata Roxb.; and C. comosa. The results indicated that there were other Curcuma species that were misused as C. comosa. The DNA fingerprint data along with the morphological data provided the keys for the accurate identification of C. comosa from the other three related species.


INTRODUCTION
The genus Curcuma consists of about 80 species that are widespread in the tropics of Asia and extend to Africa and Australia. More than 50 species have been found in Thailand (Larsen, 1964). Locally, Curcuma comosa Roxb. is called 'Waan chak modlook' and has been used generally in Thai indigenous medicine as an anti-inflammatory agent for the alleviation of postpartum uterine pain, enhancement of uterine involution and reduction of uterine inflammation after delivery (Jantaratnotai et al., 2006; Sodsai et al., 2007). Pharmacological research found that the rhizome of C. comosa contains phytoestrogens, which are plant-derived, estrogen-like compounds (Kurzer and Xu, 1997). Several pure compounds have been isolated from the rhizomes of C. comosa and two major groups of structures, sesquiterpenes and diarylheptanoids, have been identified (Suksamrarn et al., 2008; Xu et al., 2008). Some of the isolated diarylheptanoids and their modified analogues exhibit estrogenic activity comparable to or higher than that of the phytoestrogen genistein (Suksamrarn et al., 2008). Recently, phytoestrogens have been used as dietary supplements or alternative remedies and have high potential to replace estrogen, which has been reported to have side-effects (Gruber et al., 2002; Yeh, 2007).
The identification of C. comosa has been ambiguous, usually relying on the appearance of the master rhizome of the raw herb. This may not be sufficient, because many of the Curcuma species have very similar rhizome morphology. Misidentification of this herbal plant can lead to its substitution with potentially toxic plants. For example, Pimkaew et al. (2008) reported that the rhizome of Curcuma latifolia Rosc. is morphologically similar to that of C. comosa, but has less estrogenic activity and is very toxic. The hexane extract of C. latifolia causes enlargement of the liver, kidney and spleen in guinea pigs. C. latifolia has also been reported to have been misused as a substitute for C. comosa (Soontornchainaksaeng and Jenjittikul, 2010). Currently, several companies have extensive manufacturing facilities for C. comosa supplements in the form of a tonic juice for women. Thus, C. comosa is widely cultivated for economic purposes. If the plant source is misidentified, it could affect many consumers severely. The use of morphological markers or other methods for the accurate identification of C. comosa and its related species is therefore necessary. Nowadays, molecular markers are powerful tools, not only for the evaluation of genetic diversity, but also for the identification of species. Many researchers have reported the use of DNA markers to identify plant species. Sreeja (2002) characterized five Curcuma spp., namely Curcuma longa, Curcuma zedoaria, Curcuma caesia, Curcuma amada and Curcuma aromatica, based on random amplified polymorphic DNA (RAPD) data. Theerakulpisut et al. (2005) also reported the use of RAPD markers to classify members of the genus Zingiber. Amplified fragment length polymorphism (AFLP) is a DNA marker technique developed by Vos et al. (1995). This technique is an effective, cost-efficient and reproducible method for revealing DNA polymorphisms without any prior knowledge of the genome of the species being studied. In the current experiment, AFLP markers were used to identify and elucidate the phylogenetic relationships among 97 accessions of 'Waan chak modlook' collected throughout Thailand. Moreover, the morphological characteristics were observed. The DNA fingerprint data, along with the morphological data, were used for the accurate identification of C. comosa.

Plant material and genomic DNA extraction
A total of 97 accessions of Curcuma spp., locally called 'Waan chak modlook', were collected from cultivated sites in 38 provinces throughout Thailand. All samples were grown at the National Corn and Sorghum Research Center, Pakchong, Nakhon-Ratchasima province and the following characteristics were observed in each accession: plant height, shape of master rhizome, inside color of master rhizome, presence of sessile tubers, presence of a red stripe along the midrib, and inflorescence and flower morphology. Total genomic DNA was extracted from leaf tissue according to the CTAB method, following the procedures of Agrawal et al. (1992). The concentration of DNA was quantified by measuring the absorbance of UV light (260 nm) with a spectrophotometer and was then adjusted to 50 ng/µL for AFLP analysis.
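The quantification and adjustment step can be illustrated with a short calculation. This is a minimal sketch, not the authors' procedure: it assumes the widely used convention that an A260 of 1.0 at a 1 cm path length corresponds to roughly 50 ng/µL of double-stranded DNA, and the absorbance reading, dilution factor and final volume are hypothetical values chosen only for illustration.

```python
def dsdna_ng_per_ul(a260, dilution_factor=1.0, path_cm=1.0):
    """Estimate dsDNA concentration from absorbance at 260 nm.

    Assumption (not from the source text): A260 = 1.0 at a 1 cm path
    length corresponds to about 50 ng/uL of double-stranded DNA.
    """
    return a260 * 50.0 * dilution_factor / path_cm


def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=50.0, final_volume_ul=100.0):
    """Return (stock volume, diluent volume) in uL to reach the target concentration."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical reading: A260 = 0.25 measured on a 20-fold diluted extract
stock = dsdna_ng_per_ul(0.25, dilution_factor=20)
stock_ul, diluent_ul = dilution_volumes(stock)
print(stock, stock_ul, diluent_ul)
```

With these illustrative numbers, the extract would be diluted one part stock to four parts diluent to reach the 50 ng/µL working concentration.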
The pre-selective amplification reaction was performed using 2 µL of digestion/ligation reaction solution in 25 µL of PCR reaction solution, containing 200 mM Tris-HCl pH 8.4, 500 mM KCl, 1.5 mM MgCl2, 0.2 mM of each dNTP, 0.2 pmol of EcoRI and MseI adapter-directed primers (each possessing a single selective base, E+1; M+1) and 1 U of Taq DNA polymerase (Invitrogen, Brazil). PCR reactions were performed with the following profile: 94°C for 3 min, 30 cycles of 30 s denaturing at 94°C, 30 s annealing at 56°C and 60 s extension at 72°C, ending with 5 min at 72°C to complete extension. After checking for the presence of a smear of fragments (100 to 1000 bp in length) by agarose gel electrophoresis, the amplification product was diluted 20 times in 0.1 × TE. Selective amplification (second PCR) of the diluted pre-amplification products was carried out using nine primer combinations (Table 1).
Selective PCR reactions were performed with the following profile: 94°C for 60 s, 36 cycles of 30 s denaturing at 94°C, 30 s annealing and 60 s extension at 72°C, ending with 10 min at 72°C to complete extension. Annealing was initiated at a temperature of 65°C, which was then reduced by 0.7°C per cycle for the next 12 cycles and maintained at 56°C for the subsequent 23 cycles. The second PCR products were mixed with 10 µL of loading dye (98% formamide, 10 mM EDTA, 0.01% w/v bromophenol blue and 0.01% w/v xylene cyanol), denatured at 95°C for 5 min and separated on 6% denaturing polyacrylamide gels (6% polyacrylamide 29:1, 7 M urea) in 1 × TBE buffer. The gels were pre-run at 300 V for about 30 min before 10 µL of the mix was loaded. Gels were run at 300 V for about 2.5 h. The AFLP fragments were visualized by silver staining (Benbuasa et al., 2006).
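The touchdown annealing profile of the selective PCR can be written out cycle by cycle. The sketch below is one reading of the description, assuming one cycle at 65°C, twelve cycles each 0.7°C lower than the previous one, and 23 cycles at 56°C, which together give the stated 36 cycles; the function name and parameters are illustrative only.

```python
def touchdown_annealing_schedule(start_c=65.0, step_c=0.7, touchdown_cycles=12,
                                 plateau_c=56.0, total_cycles=36):
    """Return the annealing temperature (deg C) for each selective PCR cycle.

    Assumed interpretation: the first cycle anneals at start_c, each of the
    next touchdown_cycles cycles is step_c lower than the one before, and
    all remaining cycles anneal at plateau_c.
    """
    temps = [start_c]
    for k in range(1, touchdown_cycles + 1):
        temps.append(round(start_c - step_c * k, 1))
    while len(temps) < total_cycles:
        temps.append(plateau_c)
    return temps


schedule = touchdown_annealing_schedule()
for cycle, temp in enumerate(schedule, start=1):
    print(f"cycle {cycle:2d}: anneal at {temp:.1f} C")
```

Under this reading the last touchdown cycle anneals at 56.6°C, so the step down to the 56°C plateau is slightly smaller than 0.7°C.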

Data analysis
For the diversity analysis, each PCR product was assumed to represent a single locus and was scored as present (1) or absent (0). A binary matrix was imported into NTSYS-pc version 2.20k (Rohlf, 2005) for cluster analysis. Genetic similarity among all accessions was calculated according to Jaccard's similarity index (JSI) (Jaccard, 1908) by the SIMQUAL subprogram, and the SAHN subprogram was used for cluster analysis by the UPGMA method (unweighted pair-group method with arithmetic means) (Sneath and Sokal, 1973). A co-phenetic matrix was produced using the hierarchical cluster system, by means of the COPH routine, and correlated with the original similarity matrix for the AFLP data, in order to test for agreement between the clusters in the dendrogram and the JSI matrix. The genetic relationships between the accessions were portrayed in a dendrogram, based on the results from cluster analysis. The polymorphic information content (PIC), which is an index of the polymorphism of each amplified DNA fragment, was calculated by Equation 1 (Anderson et al., 1993):

$$\mathrm{PIC} = 1 - \sum_{i} P_i^{2} \qquad (1)$$

where $P_i$ is the frequency of the i-th allele.
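The two per-marker statistics described above can be sketched in a few lines. This is a minimal illustration, not the NTSYS-pc implementation: it assumes the usual form of the PIC expression of Anderson et al. (1993), PIC = 1 − Σ Pi², and treats band presence and absence as the two "alleles" of a dominant marker, which bounds PIC at 0.50. The three accessions and their band scores are invented for the example.

```python
def jaccard_similarity(a, b):
    """Jaccard index between two 0/1 band profiles of equal length.

    Shared absences (0, 0) are ignored, as in Jaccard's coefficient.
    """
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        raise ValueError("index is undefined when neither profile has a band")
    return shared / either


def pic_dominant(band_column):
    """PIC = 1 - sum(p_i^2) for one band, with presence/absence as two alleles.

    This two-allele treatment is an assumption made for the sketch.
    """
    p = sum(band_column) / len(band_column)
    return 1.0 - p ** 2 - (1.0 - p) ** 2


# Hypothetical binary matrix: rows are accessions, columns are AFLP bands
profiles = {
    "acc_A": [1, 1, 0, 1, 0],
    "acc_B": [1, 1, 0, 0, 0],
    "acc_C": [0, 1, 1, 1, 1],
}

print(jaccard_similarity(profiles["acc_A"], profiles["acc_B"]))
columns = list(zip(*profiles.values()))
print([round(pic_dominant(col), 3) for col in columns])
```

A band present in every accession (the second column here) is monomorphic and has a PIC of zero, while a band present in half of the accessions reaches the maximum of 0.50.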

RESULTS
A total of 202 bands, ranging from 100 to 1,100 bp, were scored. The average number of bands per primer combination was 22.44, while the range for the nine primer combinations was 16 to 31 (Table 1). The number of polymorphic bands was 158 (78.22% of the total bands) with an average of 17.56 bands per primer pair. The E-ACC/M-CTG primer combination produced the highest percentage of polymorphisms (100%), while the lowest percentage was obtained from E-AAG/M-CAG (58.33%). The genetic similarity among all accessions was calculated by Jaccard's coefficient. The results showed that the genetic similarity varied from 0.39 to 1.00. The mean similarity was 0.67. The PIC value ranged from 0.00 to 0.50 (mean 0.25).
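The summary figures follow directly from the band counts. As a quick check, using only the totals reported in the text (the function name is illustrative):

```python
def band_summary(total_bands, polymorphic_bands, primer_pairs):
    """Return (mean bands per pair, mean polymorphic bands per pair, % polymorphic)."""
    mean_bands = round(total_bands / primer_pairs, 2)
    mean_polymorphic = round(polymorphic_bands / primer_pairs, 2)
    percent_polymorphic = round(100 * polymorphic_bands / total_bands, 2)
    return mean_bands, mean_polymorphic, percent_polymorphic


# 202 bands in total, 158 of them polymorphic, nine primer combinations
print(band_summary(202, 158, 9))
```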
UPGMA analysis of the genetic similarity estimates was performed (Figure 1). All samples were separated into five major clusters at a cut-off genetic similarity value of about 0.63. The clustering of the accessions based on genetic similarity did not correlate with the region of origin of the samples. The co-phenetic correlation coefficient (r value) between the AFLP-based phylogenetic tree and the similarity matrix was 0.99. The morphological characters of each cluster were recorded in Table 2. The samples were identified using available flora and monographs (Hooker, 1894; Backer and Bakhuizen Van Den Brink, 1968). The 'Waan chak modlook' accessions in clusters IV and V could be assigned to C. comosa. The samples in cluster II could be identified as C. latifolia (accession nos. 39, 48, 49, 56 and 59), while the samples in cluster III could be identified as Curcuma elata Roxb. (accession nos. 96 and 97). However, the samples in cluster I could not be identified at the species level, because their morphological characteristics were incompatible with the previous taxonomic reference data. The typical rhizome characteristics of 'Waan chak modlook' samples are shown in Figure 2.
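The clustering, cut-off and co-phenetic correlation steps can be sketched as follows. This is a plain-Python illustration of average-linkage (UPGMA) clustering on a similarity table, not the SAHN/COPH routines of NTSYS-pc; the four accessions and their pairwise similarities are hypothetical, and only the 0.63 cut-off is taken from the text.

```python
from itertools import combinations
from math import sqrt


def upgma(names, sim):
    """Average-linkage clustering; sim maps frozenset({a, b}) -> similarity.

    Returns the merges in order as (cluster_1, cluster_2, similarity).
    """
    def mean_sim(c1, c2):
        # Unweighted mean over all between-cluster pairs of accessions
        total = sum(sim[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [frozenset([n]) for n in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: mean_sim(*pair))
        merges.append((c1, c2, mean_sim(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges


def cut_at(names, merges, cutoff):
    """Apply only the merges made at or above the similarity cut-off."""
    groups = {frozenset([n]) for n in names}
    for c1, c2, s in merges:
        if s >= cutoff:
            groups -= {c1, c2}
            groups.add(c1 | c2)
    return groups


def cophenetic_correlation(names, sim, merges):
    """Pearson r between original and tree-implied (co-phenetic) similarities."""
    coph = {}
    for c1, c2, s in merges:
        for a in c1:
            for b in c2:
                coph[frozenset((a, b))] = s  # level at which a and b first join
    pairs = [frozenset(p) for p in combinations(names, 2)]
    x = [sim[p] for p in pairs]
    y = [coph[p] for p in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


# Hypothetical Jaccard similarities among four accessions
names = ["A", "B", "C", "D"]
sim = {frozenset(k): v for k, v in {
    ("A", "B"): 0.9, ("C", "D"): 0.8, ("A", "C"): 0.4,
    ("A", "D"): 0.3, ("B", "C"): 0.5, ("B", "D"): 0.4,
}.items()}
merges = upgma(names, sim)
groups = cut_at(names, merges, cutoff=0.63)
r = cophenetic_correlation(names, sim, merges)
print(sorted(sorted(g) for g in groups), round(r, 3))
```

In this toy example the cut-off separates the accessions into two groups, and a co-phenetic correlation close to 1 indicates that the tree distorts the original similarities very little.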

DISCUSSION
Currently, little is known about the genome of C. comosa in Thailand, because it has not been examined. The AFLP technique is most appropriate for the analysis of unknown genomes (Tomkins et al., 2001) as it is more reproducible than other molecular marker systems and AFLP profiles do not alter with minor variations in experimental conditions (Singh et al., 1999). In the current study, the AFLP technique proved to be useful in investigating the genetic variation of 'Waan chak modlook'. After the classification data based on AFLP markers were obtained, the morphological characters of each cluster were observed. Interestingly, the samples within the same cluster shared some phenotypic characteristics. The agreement observed between the AFLP-based dendrogram and the classification based on morphological characters showed that AFLPs can be successfully applied to study genetic relationships between Curcuma spp. All of the primer pairs applied in the current study revealed high levels of polymorphism, and similar levels were observed among almost all of the primer combinations tested, confirming that high genetic diversity exists within the Curcuma genomes. A similar result was reported by Syamkumar and Sasikumar (2007), who used inter-simple sequence repeats (ISSR) and RAPD markers to evaluate the genetic diversity of 15 Curcuma species in India. Since 'Waan chak modlook' is a clonally propagated plant, lower genetic diversity was expected (Eckert, 1999). However, this plant was only recently collected from its natural habitats for cultivation on farms for economic purposes, and the selection of favored phenotypes by farmers or breeders had just begun. Thus, the genetic variability of 'Waan chak modlook' remained high and the germplasm collection in this study was a valuable source of promising raw material for further crop improvement.
As shown in Figure 1, the genetic relationships among all samples were clear. The dendrogram showed that 'Waan chak modlook' samples were separated into five major clusters. The co-phenetic correlation coefficient (r value) between the AFLP-based phylogenetic tree and the similarity matrix was very high (0.99), demonstrating a good fit between the phylogenetic tree clusters and the similarity matrix from which they were derived, and confirming that the clusters in the phylogenetic tree were reliable. The morphological classification results indicated that only the 'Waan chak modlook' samples in clusters IV and V were C. comosa. It is interesting to note that the samples in cluster I (57 samples) belonged to an unidentified Curcuma species and have been widely mistaken for medicinal C. comosa because of their similar master rhizome morphologies. Because the samples in cluster I were widely used, they may contain phytoestrogenic compounds. Future research should explore the pharmacological activities of these samples.
The other related species of C. comosa were identified as C. latifolia, C. elata and Curcuma sp., of which C. latifolia and C. elata have been reported previously to have been misused as C. comosa. Soontornchainaksaeng and Jenjittikul (2010) collected 'Waan chak modlook' from 16 sites in Thailand for observation of the chromosome numbers and some morphological characteristics. The results revealed that 'Waan chak modlook' can be separated into five cultivars belonging to three species (C. comosa, C. elata and C. latifolia); they also reported that C. comosa consisted of two cultivars. One cultivar has a cylindrical spike that is 13 to 17 cm long and 5 to 8 cm wide, whereas the cylindrical spike of the other cultivar is shorter (10 to 15 cm) but wider (8 to 12 cm). The chromosome number of the former cultivar was 2n = 42 and that of the latter was 2n = 63, or seldom 2n = 62 or 64. The dendrogram in the current study showed that C. comosa could be divided into two clusters, which was in agreement with the two cultivars that Soontornchainaksaeng and Jenjittikul (2010) distinguished on the basis of chromosome number.
The results from morphological characterization in the current study provided useful keys to identify four Curcuma species. C. comosa samples (clusters IV and V) could be distinguished from the other three related species (clusters I, II and III) by the following characteristics: peduncle length, lower leaf surface, presence of sessile tubers and master rhizome. C. comosa samples had short peduncles (2 to 5 cm), while the other species had long peduncles (8 to 15 cm). C. comosa samples had a glabrous lower leaf surface, whereas the samples in clusters II and III had a pubescent lower leaf surface. All the samples in clusters I, II and III had lateral sessile tubers, but C. comosa did not (Figure 2). When the master rhizomes or sessile tubers of C. comosa were cut or broken, fine spindles were not found and the inside texture of the rhizomes was fine. Another unique characteristic was the young mango-like odour observed in C. comosa rhizomes.
In the case of Curcuma sp. (cluster I) and C. latifolia (cluster II), the morphological characteristics that could differentiate these two species were: lower leaf surface; the ratio of coma bract length to flower bract length; and bract apexes. The Curcuma sp. samples had a glabrous lower leaf surface, while C. latifolia had a pubescent lower leaf surface. The ratio of the coma bract length to flower bract length of the samples in cluster I was 2:1 or 1:2, whereas for the samples in cluster II it was 1:1. The bract apex of samples in cluster I was obtuse or rounded, while for the cluster II samples it was acute. C. latifolia and C. elata are morphologically very similar. The only characteristic that could be used to separate these two species was a red stripe along the midrib on the upper leaf surface, found on C. latifolia.
The morphological characteristics that could be used to separate C. comosa cluster IV from cluster V were inflorescence shape and flower color. The inflorescence shape of C. comosa in cluster IV was a slender cylindrical spike, while the inflorescence shape of cluster V was a large cylindrical spike with the same length as cluster IV, but larger in diameter. Moreover, the cluster IV samples had a light yellow flower, but the samples in cluster V had a white flower. The samples in cluster I could not be identified at the species level, because their morphological characteristics were incompatible with the previous taxonomic data. It appears that further taxonomic study is necessary to identify the Curcuma species in this cluster.
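The distinguishing characters described above can be arranged as a simple decision key. The sketch below is a simplified illustration only: the function and argument names are hypothetical, the peduncle thresholds are taken from the ranges reported here (2 to 5 cm and 8 to 15 cm), and the bract ratio, bract apex and rhizome characters are omitted for brevity.

```python
def identify_waan_chak_modlook(peduncle_cm, lower_leaf_pubescent, has_sessile_tubers,
                               red_midrib_stripe=False, flower_color=None):
    """Assign a sample to a species/cluster from a few field characters.

    Simplified key based on the characters reported in this study;
    combinations that fit none of the descriptions return 'undetermined'.
    """
    # C. comosa: short peduncle, glabrous lower leaf surface, no sessile tubers
    if peduncle_cm <= 5 and not lower_leaf_pubescent and not has_sessile_tubers:
        if flower_color == "light yellow":
            return "C. comosa (cluster IV)"
        if flower_color == "white":
            return "C. comosa (cluster V)"
        return "C. comosa"
    # Related species: long peduncle and lateral sessile tubers
    if peduncle_cm >= 8 and has_sessile_tubers:
        if not lower_leaf_pubescent:
            return "Curcuma sp. (cluster I)"
        if red_midrib_stripe:
            return "C. latifolia (cluster II)"
        return "C. elata (cluster III)"
    return "undetermined"


print(identify_waan_chak_modlook(3, False, False, flower_color="white"))
print(identify_waan_chak_modlook(12, True, True, red_midrib_stripe=True))
```

A key of this kind is only as reliable as the characters supplied to it, so samples that return 'undetermined' would still need examination against the full descriptions in Table 2.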
This study provided the key characteristics to distinguish the medicinal plant 'Waan chak modlook' (C. comosa) from the other three related species, which were always misidentified. Only the master rhizomes were sold on the market. The master rhizomes of the other species, from which the sessile tubers were always removed by the merchants, closely resemble that of C. comosa (Figure 2). Thus, it is hard for customers to identify the master rhizome of C. comosa accurately. Any Curcuma L. master rhizome with a large ovoid to ovate-spheroidal shape and a diameter of about 8 to 15 cm was always called 'Waan chak modlook' (Soontornchainaksaeng and Jenjittikul, 2010). Misidentification of the herbal plant as C. comosa could lead to substitution with potentially toxic plants. In the current study, the unique master rhizome characteristics of C. comosa were: the absence of fine spindles when the rhizomes were cut or broken, a young mango-like odour and a fine texture inside. The customer may use these characteristics as a simple method to identify C. comosa master rhizomes. With regard to above-ground characteristics, C. comosa can be distinguished from the other three related species by the peduncle length and lower leaf surface. Pimkaew et al. (2008) reported that the rhizome of C. latifolia has less estrogenic activity and is very toxic. In addition to C. latifolia, two Curcuma species (C. elata and Curcuma sp.) were found to have been mistaken for C. comosa. However, the pharmacological activities and toxicities of C. elata and the unidentified Curcuma sp. have not been investigated yet. Furthermore, the DNA fingerprint data revealed that C. comosa samples could be classified into 24 genotypes. These different genotypes might contain various levels of biologically active compounds. Further investigation should focus on the pharmacological and toxicological characterization of the related species and the various genotypes of C. comosa. It may be possible to find a correlation between the species or genotypes and the variation in the biologically active compounds. This information would be beneficial for plant breeders in selecting the proper varieties or species that provide appropriate levels of phytoestrogens and are not toxic, in consideration of the safety of consumers.

Conclusion
C. comosa, locally called 'Waan chak modlook', is an important indigenous plant in Thailand. The AFLP technique could be used to evaluate genetic relationships among 'Waan chak modlook' samples. The results revealed that the germplasm collection of 'Waan chak modlook' had great genetic diversity. The morphological characteristics of all samples were observed and the results indicated that there were three Curcuma species (C. latifolia, C. elata and Curcuma sp.), which were misused as C. comosa. The current study also provided some morphological traits that could be used to distinguish C. comosa from the other three related species.

Figure 1. UPGMA-derived phylogenetic tree illustrating the relationship among 97 accessions as inferred by AFLP analysis.

Figure 2. Rhizomes of 'Waan chak modlook' samples: (A) typical C. comosa rhizomes (clusters IV and V) and (B) typical rhizomes of samples in clusters I to III. The arrowhead indicates a sessile tuber.

Table 1. Average number of bands, number of alleles and proportion of polymorphic bands obtained for the 97 accessions from nine selective primer combinations.

Table 2. Morphological characteristics of samples in each cluster.