Analysis of genetic diversity and construction of core collection of local mulberry varieties from Shanxi Province based on ISSR marker

1 Weed Research Laboratory, Life Sciences College of Nanjing Agricultural University, Nanjing 210095, China. 2 The Sericultural Research Institute, Chinese Academy of Agricultural Sciences, Zhenjiang 212018, China. 3 Biological and Environmental Engineering College, Jiangsu University of Science and Technology, Zhenjiang 212018, China. 4 The Key Laboratory of Genetic Improvement of Silkworm and Mulberry, Ministry of Agriculture, Zhenjiang 212018, China.


INTRODUCTION
The cultivated mulberry in China can be divided into eight eco-types. The local mulberry varieties of Shanxi Province form one of these eight eco-types and have become one of the most important components of the mulberry gene bank of China. Through natural and artificial selection, the local mulberry varieties of Shanxi Province have adapted to the natural environment and developed unique botanical and biological characteristics (Pan et al., 2000).
Molecular markers are unaffected by the environment, detectable at all stages of development and abundant throughout the entire genome. ISSR markers, which are simple and reproducible, have been used in cultivar identification, genome mapping, genetic distance analysis and population genetics studies. Vijayan et al. (2004) analyzed the genetic diversity of Indian wild mulberry species with 17 ISSR markers. Awasthi et al. (2004) identified the relationships among 15 mulberry species with six ISSR markers. Zhao et al. (2005, 2006a, b, c, 2007, 2008) studied the genetic diversity and phylogeny of cultivated and wild mulberry, diploid and homologous triploid mulberry, bred varieties, Feng Wei Sang and different ecotypes with ISSR markers. Prasanta et al. (2008) analyzed the genetic variability and the association of ISSR markers with some biochemical traits in mulberry (Morus spp.) genetic resources available in India. Huang et al. (2008) analyzed the genetic relationships of local varieties of Morus alba L. from Shandong Province. Zhang et al. (2010) analyzed the genetic relationships of local mulberry varieties from the lower reaches of the Yellow River based on ISSR markers.
In 1984, the concept of the core collection was first proposed by the Australian scholar Frankel: a small sample of genetic resources that represents as much as possible of the diversity of the whole collection. The aim of constructing a core collection is to give priority to its evaluation and utilization and to improve the management of the genetic resource bank.
Traditional methods of constructing core collections were based on morphological or isozyme markers. Because morphological markers are easily influenced by environmental factors, the results may not be accurate. DNA markers are rapid, accurate, efficient and not influenced by environmental factors, so they offer an efficient way to construct core collections. The construction of core collections from small germplasm samples with molecular markers such as RFLP, RAPD, SSR and AFLP has been reported. Sun et al. (2001) used RFLP to construct the core collection of common wild rice (Oryza rufipogon Griff.) and Asian cultivated rice (Oryza sativa L.). Hintum et al. (1994) used different DNA markers to construct the core collection of European spring barley (Hordeum vulgare S.) and compared the core collection with the initial collection. Shen et al. (2001) used SSR to construct and evaluate the core collection of 120 accessions of Yunnan local rice (O. sativa L.). Skroch et al. (1998) used RAPD to construct a core collection of Mexican common bean. Liu et al. (2006) used SSR and AFLP to construct and evaluate the core collection of 110 accessions of pomelo (Citrus grandis Osbeck); the results showed that the core collection represented the initial collection well.
To facilitate the preservation and further evaluation of mulberry germplasm and to promote the management of the national mulberry germplasm gene bank in China, this study examined the diversity of 73 local varieties from Shanxi Province with ISSR markers and obtained a UPGMA dendrogram. Based on the clustering results and the dendrogram, a core collection was constructed by stepwise clustering with random sampling and was then evaluated with the relevant parameters of genetic diversity.

Plant materials
The 73 local mulberry varieties used in this study were obtained from the national mulberry gene bank of the Sericultural Research Institute, Chinese Academy of Agricultural Sciences (CAAS), Zhenjiang, Jiangsu Province, China. The county of origin and the number of each tested variety are listed in Table 1.

DNA isolation
Total DNA was extracted from approximately 1.5 g of young leaves with the modified CTAB method (Zhao et al., 2000). The genomic DNA was quantified on 0.8% agarose gels, and the samples were stored at -20°C for ISSR analysis.

ISSR amplification, separation and visualization
Twenty-two ISSR primers (synthesized by Shanghai Bioasia Technology Co. Ltd., China) were screened using DNA samples from five varieties: Hong Ge Lu (73), Da Jing Sang (11), Ling Gu Da Ye (52), Jin Cheng Bai Pi Sang (56) and Yang Cheng Hei Ge Lu (61). Amplifications for the screened primers and DNA samples were conducted independently two to three times with the same procedure to verify the reproducibility and consistency of the ISSR markers. Fifteen of the 22 primers were chosen for ISSR analysis of genetic diversity because they produced reproducible bands (Table 2). PCR reactions were carried out in a volume of 15 µl containing 10 ng of total DNA, 10 × PCR buffer (200 mmol/l Tris-HCl pH 8.4, 2.5 mmol/l MgCl2, 500 mmol/l KCl), 0.25 mmol/l of each dNTP, 6 pmol/l of each primer and 1 U of Taq DNA polymerase. The optimum annealing temperature was determined for each primer. PCR cycling conditions for all mulberry varieties (Flexigene thermal cycler) were: an initial denaturation step of 2 min at 94°C; 36 cycles of 40 s at 94°C, 45 s at the primer-specific annealing temperature and 90 s at 72°C; and a final extension of 7 min at 72°C. Amplified DNA fragments were separated in 2.2% agarose gels at 90 W for 4 h in 1 × TBE buffer (100 mmol/l Tris-borate, pH 8.0, 2 mmol/l EDTA). The gel was stained with ethidium bromide, visualized under ultraviolet light and photographed using a Kodak Digital Science 1D-EDAS 120 computerized gel analysis system. Molecular sizes of the amplified fragments were roughly estimated using a 2000 bp ladder (TaKaRa Dalian Biotechnology Co., Ltd., China).

Diversity data analysis
DNA banding patterns generated by ISSR were scored for the presence (1) or absence (0) of each amplified band. All ISSR assays were repeated twice, and only distinct, reproducible, well-resolved bands were scored. The total number of PCR amplification bands and the number of polymorphic bands were counted for each primer, and the percentage of polymorphism was calculated. Nei's gene diversity (Nei and Li, 1979), Shannon's information index, genetic similarity, genetic distance estimated by Nei's coefficient between pairs of varieties and dendrograms based on the unweighted pair group method with arithmetic averages (UPGMA) were analyzed using Popgene software, version 3.5.
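The scoring and counting steps described above can be illustrated with a minimal sketch. It assumes the 0/1 scores are held as one list per variety, and it uses the Nei and Li (1979) coefficient, 2·n_xy / (n_x + n_y), for pairwise similarity; the function names and the toy matrix are hypothetical, and whether the software applies exactly this formula is an assumption, not something stated in the text.

```python
def polymorphism_summary(matrix):
    """Count total and polymorphic bands in a 0/1 matrix (rows = varieties)."""
    n_loci = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_loci)]
    # A band is polymorphic if it is present in some varieties and absent in others.
    polymorphic = sum(1 for col in columns if 0 < sum(col) < len(col))
    return n_loci, polymorphic, 100.0 * polymorphic / n_loci


def nei_li_similarity(x, y):
    """Nei and Li (1979) similarity between two 0/1 banding profiles."""
    shared = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    total = sum(x) + sum(y)
    if total == 0:
        raise ValueError("similarity is undefined when neither profile has bands")
    return 2.0 * shared / total


# Hypothetical toy data: three varieties scored at five band positions.
scores = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 0, 1],
]
summary = polymorphism_summary(scores)
similarity_1_3 = nei_li_similarity(scores[0], scores[2])
print(summary, similarity_1_3)
```

Genetic distance can then be taken as one minus the similarity, which is one common convention and again an assumption here.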

Construction method of core collection
A dendrogram of all 73 local mulberry varieties was generated from the genetic similarity coefficients by the UPGMA clustering method. Based on the clustering results and the dendrogram, the core collection was constructed by stepwise clustering with random sampling: from each group of two accessions with similar genetic variation in the dendrogram, one accession was chosen at random for the next round of clustering, and an accession that formed a group on its own passed directly into the next round. The sample retained from the first round was then clustered and sampled again in the same way. When the sample number met the designed standard, clustering was stopped and the remaining accessions constituted the core collection. The difference in genetic diversity between the core collection and the initial sample was measured by t-tests of means, coefficients of variation and ranges (Hu et al., 2000, 2001).
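The stepwise procedure can be sketched as follows. This is a simplified illustration, not the authors' implementation: instead of rebuilding a full UPGMA dendrogram in each round, it approximates the two-accession groups by mutual nearest neighbours (pairs in which each accession is the other's closest relative). The function names, the `dist` callable and the toy positions are all hypothetical.

```python
import random


def mutual_nearest_pairs(ids, dist):
    """Pairs (a, b) in which each accession is the other's nearest neighbour."""
    nearest = {
        a: min((b for b in ids if b != a), key=lambda b: dist(a, b))
        for a in ids
    }
    return [(a, b) for a, b in nearest.items() if a < b and nearest[b] == a]


def stepwise_rounds(ids, dist, min_size, rng):
    """Return the successively smaller samples, one per sampling round."""
    current = sorted(ids)
    rounds = []
    while len(current) > max(min_size, 1):
        pairs = mutual_nearest_pairs(current, dist)
        if not pairs:
            break
        # One member of each pair is discarded at random; unpaired accessions pass through.
        dropped = {rng.choice(pair) for pair in pairs}
        current = [x for x in current if x not in dropped]
        rounds.append(current)
    return rounds


# Hypothetical toy example: six accessions placed on a line.
positions = {1: 0.0, 2: 1.0, 3: 3.0, 4: 4.0, 5: 8.0, 6: 9.0}
rounds = stepwise_rounds(
    positions, lambda a, b: abs(positions[a] - positions[b]), 2, random.Random(0)
)
print([len(r) for r in rounds])
```

Each round's sample is a candidate core collection; the diversity parameters of every round can then be compared to decide where to stop.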

Core collection data analysis
From the original data matrix (composed of 0 and 1) obtained with the ISSR markers, we calculated the number of polymorphic loci, the percentage of polymorphic loci, the number of observed alleles, the number of effective alleles, Nei's gene diversity and Shannon's information index of the samples with PopGene32 software and performed t-tests with SPSS 13.0 software (Liu et al., 2006).

Levels of polymorphism revealed by ISSR-PCR markers
In prescreening assays with five mulberry varieties and 22 ISSR primers, 15 primers generated bright, polymorphic amplification products and were used in further analysis (Table 2). A total of 129 reliable fragments were obtained. The number of bands per primer ranged from 5 to 12, with an average of 8.6 bands per primer. Among them, 115 bands were polymorphic, accounting for 89.15%. The number of polymorphic bands per primer ranged from 4 to 12, with an average of 7.7 per primer. The results of PCR amplification are given in Figure 1.
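The summary figures in this paragraph follow directly from the band counts, as the short check below shows. The variable names are illustrative only.

```python
primers = 15
total_bands = 129
polymorphic_bands = 115

bands_per_primer = round(total_bands / primers, 1)
polymorphic_per_primer = round(polymorphic_bands / primers, 1)
percent_polymorphic = round(100.0 * polymorphic_bands / total_bands, 2)

print(bands_per_primer, polymorphic_per_primer, percent_polymorphic)
```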

Genetic variation and cluster analysis of local varieties mulberry from Shanxi Province
Using the data from all PCR amplification bands produced by the 15 ISSR markers, the genetic similarity matrix among all varieties used in this study was obtained by multivariate analysis using Nei's coefficient. Similarity coefficients ranged from 0.5891 to 0.9457, with an average of 0.7674. The highest genetic similarity coefficient (0.9457) was found between Bai Ge Lu No.1 hao (2) and Bai Ge Lu No.2 (3), indicating that they are closely related. The lowest genetic similarity coefficient (0.5891) was found between Ge Mo Sang (28) and He Kou No.23 (71), indicating that they are relatively distantly related. Averaged over loci, the observed number of alleles, the effective number of alleles, Nei's gene diversity and Shannon's information index were 1.8915, 1.4771, 0.2780 and 0.4197, respectively. The genetic diversity information is given in Table 3.
A dendrogram was obtained by the UPGMA method using all amplified fragments of the 15 ISSR primers (Figure 2). The clustering results showed that the tested varieties could be divided into three groups (49 mulberry cultivars were clustered into Group i, 23 into Group ii and only one cultivar, Jin Newer Sang, into Group iii).
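The per-locus indices reported here have standard definitions, sketched below for a two-allele locus. The sketch assumes the allele frequency p is supplied directly; how the software estimates p from dominant-marker band data is not stated in the text and is left out. The function name is hypothetical. The last lines show that the reported mean observed number of alleles (1.8915) is consistent with counting two alleles at each of the 115 polymorphic loci and one at each of the 14 monomorphic loci.

```python
import math


def locus_indices(p):
    """Return (ne, h, i) for a two-allele locus with frequencies p and 1 - p."""
    freqs = [p, 1.0 - p]
    sum_sq = sum(f * f for f in freqs)
    ne = 1.0 / sum_sq                                   # effective number of alleles
    h = 1.0 - sum_sq                                    # Nei's gene diversity
    i = -sum(f * math.log(f) for f in freqs if f > 0)   # Shannon's information index
    return ne, h, i


ne, h, i = locus_indices(0.5)
print(ne, h, i)

# Mean observed number of alleles from the band counts given in the text.
total_loci, polymorphic_loci = 129, 115
mean_na = (2 * polymorphic_loci + (total_loci - polymorphic_loci)) / total_loci
print(round(mean_na, 4))
```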

Core collection construction
With the stepwise clustering and random sampling method, six primary core collection groups (i, ii, iii, iv, v, vi) were obtained, composed of 62, 47, 30, 21, 18 and 15 accessions, respectively. The corresponding sampling ratios were 84.93, 64.38, 41.10, 28.77, 24.66 and 20.55%. The number of polymorphic loci, the percentage of polymorphic loci, the number of observed alleles, the number of effective alleles, Nei's genetic diversity and Shannon's information index of Groups i to vi were calculated with PopGene32 software (Table 4). Comparing the genetic data of the different groups, we found that the number of effective alleles, Nei's genetic diversity and Shannon's information index of Group iv, composed of 21 samples, were the highest among all groups, although its number of polymorphic loci, percentage of polymorphic loci and number of observed alleles were lower than those of the initial group and Groups i, ii and iii. When the sampling rate fell to 24.66%, some of the molecular marker loci were lost due to sampling. Therefore, a sampling rate of 28.77% was the best, and Group iv could preserve the original diversity of the samples. We therefore regarded Group iv, composed of 21 samples, as the core collection. The core collection consisted of: Nan He No.7 (1), Jin Niu Er Sang (9), Da Jin Sang (11), Jin Luo Sang (16), Hong Yan Sang (20), Wu Zhi Sang (24), Zhang Zhuang No.5 (25), Xian Yi No.5 (29), Hong Ya Sang No.1 (32), Da Hei Lian (36), Zhong Yang No.3 (39), Bai Guo San (44), Ling Lu Sang (47), Jin Cheng Huang Lu Tou No.1 (54), Jin Hei Ge Lu (59), Hei Ge Lu No.4 (62), Jin Cheng Bai Ge Lu No.1 (66), Heng He Hong Ge Lu (67), Nan He No.26 (70), He Kou No.23 (71) and Yang Cheng Huang Ge Lu (72).
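The sampling ratios quoted for the six groups are simply each group size divided by the 73 initial varieties, as this short check illustrates. The variable names are illustrative only.

```python
initial_size = 73
group_sizes = {"i": 62, "ii": 47, "iii": 30, "iv": 21, "v": 18, "vi": 15}

# Sampling ratio of each group as a percentage of the initial sample.
sampling_ratio = {
    group: round(100.0 * size / initial_size, 2)
    for group, size in group_sizes.items()
}
for group, ratio in sampling_ratio.items():
    print(f"Group {group}: {ratio:.2f}%")
```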

Comparison of core collection with initial sample
The core collection comprised 28.77% of the initial sample; nevertheless, its retention rates for the number of polymorphic loci, the percentage of polymorphic loci, the number of observed alleles, the number of effective alleles, Nei's genetic diversity and Shannon's information index were 93.91, 93.91, 97.13, 101.48, 104.25 and 103.36%, respectively (Table 5), indicating that the core collection retained the basic structure and the rich genetic diversity of the initial sample.
We performed t-tests on the parameters of the core collection and the initial sample with SPSS software. The results showed that the core collection represented the initial sample well (Table 6). As seen from Table 6, the variances of the effective number of alleles (NE), Nei's gene diversity (H) and Shannon's information index (I) of the core collection were similar to those of the initial sample, and the differences in NE, H and I between the core collection and the initial sample were not significant at the 0.05 level; the only exception was the observed number of alleles (NA).

DISCUSSION
China holds over 3000 accessions of mulberry germplasm, comprising 15 species and 4 subspecies. As the amount of mulberry germplasm gradually increases, its conservation, evaluation, research, utilization and management will become more and more difficult. Constructing a core collection is therefore of great significance for the management, utilization, evaluation and identification of germplasm resources. Correct evaluation of the genetic similarity of different accessions is the premise for constructing the core collection; appropriate sampling methods and a reasonable sampling percentage are also of great importance. Generally, the sampling percentage is set according to the size of the initial collection (Boukema et al., 1997). A low sampling percentage, such as 5 to 10%, is adopted when the initial collection is large, whereas a high sampling percentage, such as 20 to 30%, is adopted when the initial collection is small (Frankel and Brown, 1984). The sample in this study comprised 73 varieties, that is, a small sample. When the sampling rate in this study fell to 24.66%, some of the molecular marker loci were lost due to sampling. Therefore, 28.77% was taken as the best sampling rate; in other words, the core collection in this study was constructed in line with common practice for small initial collections.
Chen et al. (2008) established a core collection of mulberry germplasm resources from Shandong and Hebei provinces based on ISSR markers. In that study, the core collection retained 23.91% of the initial collection, and the retention rates of the number of polymorphic loci, the percentage of polymorphic loci, the number of observed alleles, the number of effective alleles, Nei's genetic diversity and Shannon's information index reached 89.02, 89.03, 95, 102.24, 103.99 and 101.26%, respectively. In our study, the core collection retained 28.77% of the initial collection, and the retention rates of the same parameters reached 93.91, 93.91, 97.13, 101.48, 104.25 and 103.36%, respectively. Most of the parameters in our study were higher than those of Chen et al. (2008), which suggests that this study produced a good and representative core collection.
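The comparison of retention rates between the two studies can be tabulated with a minimal sketch. The figures are those quoted in the text; the `retention_rate` helper and the variable names are illustrative, and the helper simply expresses a core-collection value as a percentage of the initial-sample value.

```python
def retention_rate(core_value, initial_value):
    """Core-collection value as a percentage of the initial-sample value."""
    return 100.0 * core_value / initial_value


labels = ["NPL", "PPL", "NA", "NE", "H", "I"]
chen_2008 = [89.02, 89.03, 95.00, 102.24, 103.99, 101.26]
this_study = [93.91, 93.91, 97.13, 101.48, 104.25, 103.36]

# Parameters for which this study's retention rate exceeds that of Chen et al. (2008).
higher = [name for name, a, b in zip(labels, chen_2008, this_study) if b > a]
print(higher)
```

On these figures, five of the six parameters are higher in this study; the effective number of alleles (NE) is the exception.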
Although the local mulberry varieties of Shanxi Province are abundant and distributed across various localities, including Jincheng, Lingchuan, Gaoping and Changzhi counties and other places, Xu (1991) proposed that Qinshui county and/or Yangcheng county were the origin of the main mulberry varieties in Shanxi Province. The clustering results of this study are consistent with Xu's view. The 73 varieties from Shanxi Province were clustered into three categories and nine sub-categories (A, B, C, D, E, F, G, H, I). At the category level, the 22 varieties from Yangcheng county and the 13 varieties from Qinshui county were distributed in every one of the three categories; at the sub-category level, eight of the nine sub-categories (A, B, C, D, E, F, G, I) contained varieties from Yangcheng county and/or Qinshui county, with sub-category H the only exception. This pattern suggests that the other counties of Shanxi Province introduced mulberry from Yangcheng county and/or Qinshui county over time. After introduction and domestication, local growers might have used the varieties from Yangcheng and Qinshui counties as female or male breeding material; that is, there was gene flow among them, and the local mulberry varieties all over Shanxi Province share a common ancestry. The clustering results of this study thus provide evidence for Xu's view that "the origin of the main mulberry varieties in Shanxi Province was Yangcheng county and/or Qinshui county".

Figure 1. Electrophoretic pattern of 73 mulberry varieties amplified by primer ISSR02. The numbers in the figure are the same as those listed in Table 1. M is the DNA marker (DL2000).
NPL = number of polymorphic loci; PPL = percentage of polymorphic loci; NA = observed number of alleles; NE = effective number of alleles; H = Nei's gene diversity; I = Shannon's information index.

Table 1. Origins and names of the local mulberry varieties.

Table 2. List of primers, amplification conditions and polymorphism of ISSR markers used.

Table 3. List of genetic diversity indices.

Columns: observed number of alleles (NA); effective number of alleles (NE); Nei's gene diversity (H); Shannon's information index (I).
Figure 2. A dendrogram obtained by UPGMA for 73 mulberry cultivars based on ISSR markers. The numbers in the figure are the same as those listed in Table 1.

Table 4. Comparison of genetic diversity among different sampling groups. PS = percentage of sample; NPL = number of polymorphic loci; PPL = percentage of polymorphic loci; NA = observed number of alleles; NE = effective number of alleles; H = Nei's gene diversity; I = Shannon's information index.

Table 5. Comparison of the genetic diversity between initial sample and core collection.

Table 6. T-test results (mean, standard deviation, mean difference, standard deviation of the difference and t value) between initial sample and core collection.
NA = observed number of alleles; NE = effective number of alleles; H = Nei's gene diversity; I = Shannon's information index; * indicates significant difference at the 0.05 level between the core collection and initial sample. ** indicates no significant difference at the 0.05 level between the core collection and initial sample.