Genetic diversity and demographic history of wild yak (Bos grunniens mutus) inferred from mtDNA D-loop sequences

Academy of Animal Science and Veterinary Medicine, Qinghai University, Xining 810016, China. College of Life Science and Technology, Southwest University for Nationalities, Chengdu 610041, China. International Livestock Research Institute, P. O. Box 30709, Nairobi 00100, Kenya. College of Animal Science and Veterinary Medicine, Shenyang Agricultural University, Shenyang, 110161, China. Datong Breeding Farm of Qinghai Province, Datong 810102, China.


INTRODUCTION
The wild yak (Bos grunniens mutus), a member of the family Bovidae, is a rare and valuable wild animal found on the Qinghai-Tibetan Plateau at altitudes above 4,500 m. At present, only about 15,000 wild yaks survive, all of them in China (Wiener et al., 2003), and the wild yak is considered an endangered species. Over the past decades, researchers have analyzed the genetic diversity of wild yak at the morphological, physiological, biochemical and chromosomal levels (Li and Lu, 1990; Li et al., 1998). In recent years, preliminary analyses at the molecular level (Wang, 2000; Hu, 2001; Guo et al., 2006) have indicated abundant genetic diversity in the wild yak population. However, little is known about the demographic history of the wild yak. The mtDNA control region has proven to be a powerful tool for investigating intra- and inter-species genetic variation, population structure and demographic history (Wolf et al., 1999). Therefore, in this study, we sequenced the mtDNA D-loop of six wild yaks and downloaded 15 corresponding wild yak sequences. Based on these 21 wild yak mtDNA D-loop sequences, we analyzed the genetic diversity of the wild yak and tested whether the population has undergone expansion. The results are useful for the conservation and utilization of wild yak genetic resources.

*Corresponding author. E-mail: maziwise2004@yahoo.com. Fax: 86-971-5318783.

Sample collection
Fresh blood samples of six wild yaks were collected from the Datong Breeding Farm of Qinghai Province, China. The animals had been captured in infancy on the Qinghai-Tibetan Plateau, but no precise details of the capture locations are available.
The other mtDNA D-loop sequences used in this study were downloaded from GenBank. They comprised 15 complete or partial D-loop sequences of wild yak (accession numbers AY749414, AY722118 and DQ139202-DQ139214) and one complete D-loop sequence of cattle (B. taurus; accession number V00654), which served as an out-group.

DNA extraction, PCR amplification and sequencing
DNA was extracted from the frozen blood of the six wild yaks using a standard phenol-chloroform method (Sambrook and Russell, 2001). The mtDNA D-loop region was amplified using the primers PF 5'-cta cag tct cac cgt caa cc-3' and PR 5'-ggg gtg tag atg ctt gc-3'. The polymerase chain reaction (PCR) mixture contained 50 to 100 ng of wild yak genomic DNA, 20 pM of each primer, 0.50 U ExTaq DNA polymerase (TaKaRa, Dalian, China), 10× ExTaq Buffer (Mg2+ free), 0.25 mM dNTPs (deoxynucleotides), 2.5 mM MgCl2 and ddH2O in a final volume of 25 µl. The cycling conditions were an initial denaturation at 95°C for 4 min, followed by 35 cycles of 95°C for 1 min, 58.5°C for 45 s and 72°C for 1 min 30 s, and a final extension at 72°C for 5 min. The PCR products were electrophoresed on a 1.0% agarose gel and purified using the DNA Agarose Gel Extraction Kit (Omega) according to the manufacturer's instructions. The purified fragments were cloned into the pMD18-T vector and subsequently transformed into Escherichia coli JM109. After 15 to 20 h, single colonies were inoculated to obtain the recombinant plasmid. The recombinant plasmid DNA was extracted and then sequenced using an ABI 3730 automated sequencer (Applied Biosystems).

Data analysis
All nucleotide sequences of the wild yak were aligned using the Clustal W multiple alignment program in BioEdit 7.0.9 (Hall, 1999) and refined manually. Variable sites, average nucleotide composition, nucleotide diversity (π) and haplotype diversity (h) within the population were estimated using DnaSP 4.10.1 (Rozas et al., 2003) and Arlequin 3.11 (Excoffier et al., 2005). A neighbor-joining tree of all haplotypes was constructed in Mega 4.0 (Kumar et al., 2004) under Kimura's two-parameter model, with the cattle sequence (accession number V00654) as an out-group.
Two different approaches were used to investigate the demographic history of the whole sample and of each lineage. First, Fu's Fs (Fu, 1997) and Tajima's D (Tajima, 1989) statistics were used to test whether the sequences conformed to the expectations of neutrality, because both tests are appropriate for short DNA sequences. Fu's Fs is very sensitive to demographic expansion, which generally leads to a large negative Fs value, whereas a significant D value may result from factors such as population expansion or a bottleneck. Secondly, the observed distribution of pair-wise differences between sequences (Slatkin and Hudson, 1991) was examined. Populations that have been stable over time are predicted to have a bimodal or multimodal mismatch distribution, whereas a unimodal distribution is generally found in populations that have passed through a bottleneck or a recent demographic expansion. In addition, the goodness-of-fit between the observed and expected distributions was tested by calculating the sum of squared deviations (SSD) and the raggedness index (r) with 1000 bootstrap replicates. These demographic analyses were also carried out using DnaSP 4.10.1 and Arlequin 3.11.
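As an illustration of the two distance-based quantities named above, the following minimal Python sketch computes a Kimura two-parameter distance between two aligned sequences and the nucleotide diversity of a small alignment. It is not the code used by DnaSP, Arlequin or Mega; the function names, the toy sequences and the handling of gaps (pairwise deletion in the distance, plain character mismatches in the diversity) are assumptions made for the example.

```python
import math
from itertools import combinations

# Purine-purine and pyrimidine-pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Assumption: sites with a gap or an ambiguous base in either
    sequence are skipped (pairwise deletion).
    """
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        n += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    p, q = ts / n, tv / n  # proportions of transitions and transversions
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def nucleotide_diversity(seqs):
    """Return (k, pi): mean pairwise differences and k divided by length.

    Assumption: every mismatching character counts as one difference.
    """
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    return k, k / length


if __name__ == "__main__":
    toy = ["AAAA", "AAAT", "AATT"]  # hypothetical sequences
    print(nucleotide_diversity(toy))
    print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))
```

In the toy call, the single A/G transition among ten compared sites gives P = 0.1 and Q = 0, so the distance reduces to -0.5 ln(0.8).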
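The two approaches can be sketched in a few lines of Python. The block below is a hedged illustration rather than the DnaSP or Arlequin implementation: it computes Tajima's D from the sample size, the number of segregating sites and the mean number of pairwise differences (Tajima, 1989), then builds an observed mismatch distribution and a raggedness index in the commonly used sum-of-squared-steps form. Function names and the toy sequences are hypothetical, and significance testing (coalescent simulation or bootstrap) is deliberately left out.

```python
import math
from collections import Counter
from itertools import combinations


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D for n sequences, S segregating sites and mean pairwise differences k."""
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    s = seg_sites
    # Difference between the two estimators of theta, scaled by its standard deviation.
    return (mean_pairwise_diff - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


def mismatch_distribution(seqs):
    """Relative frequency of sequence pairs differing at 0, 1, 2, ... sites."""
    diffs = [sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)]
    counts = Counter(diffs)
    return [counts.get(i, 0) / len(diffs) for i in range(max(diffs) + 1)]


def raggedness(freqs):
    """Sum of squared steps between adjacent classes, closing with a zero class."""
    padded = list(freqs) + [0.0]
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


if __name__ == "__main__":
    toy = ["AAAA", "AAAT", "AATT"]  # hypothetical sequences
    freqs = mismatch_distribution(toy)
    print(freqs, raggedness(freqs))
    print(tajimas_d(10, 5, 1.2))  # hypothetical inputs
```

A smooth, unimodal distribution yields a small raggedness value, whereas a ragged, multimodal one yields a larger value, which is the contrast the mismatch test relies on.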

Sequence variation and nucleotide diversity
The mtDNA D-loop sequences of the six wild yaks were PCR-amplified and sequenced, and all sequences were deposited in GenBank (accession numbers FJ548840-FJ548845). These were combined with the other 15 wild yak D-loop sequences, and all 21 sequences were aligned using DQ139203 (Hap1, Figure 1) as the reference. A 637-bp fragment of the hypervariable region of the mtDNA D-loop was used to analyze genetic diversity. There were 45 variable sites within the fragment (Figure 1), accounting for 7.06% of all sites. Of the 44 nucleotide substitution sites, ten were singleton variable sites and 34 were parsimony-informative sites. The 45 variable sites comprised one in/del, 41 transitions and three transversions. The average nucleotide composition of all sequences was 31.86% A, 27.70% T, 15.45% G and 24.99% C, and the average A + T content (59.56%) was clearly higher than the G + C content (40.44%). The nucleotide diversity and the mean number of pair-wise differences were 0.024430 ± 0.012685 and 15.561905 ± 7.238573, respectively, indicating relatively rich genetic diversity in wild yak.
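The summary figures reported above are mutually consistent, which a short arithmetic check makes explicit. The Python sketch below uses only numbers stated in the text; the variable names are hypothetical, and the relation π = k / L (mean pair-wise differences divided by fragment length) is the standard definition, assumed here to be the one the software applied.

```python
# Values taken from the text above.
FRAGMENT_LENGTH = 637            # bp
VARIABLE_SITES = 45
MEAN_PAIRWISE_DIFF = 15.561905   # k
COMPOSITION = {"A": 31.86, "T": 27.70, "G": 15.45, "C": 24.99}  # percent

# Share of variable sites in the fragment, in percent.
variable_percent = 100 * VARIABLE_SITES / FRAGMENT_LENGTH

# Nucleotide diversity as mean pairwise differences per site.
pi = MEAN_PAIRWISE_DIFF / FRAGMENT_LENGTH

# Base composition totals.
at_content = COMPOSITION["A"] + COMPOSITION["T"]
gc_content = COMPOSITION["G"] + COMPOSITION["C"]

# One in/del, 41 transitions and three transversions make up the variable sites.
site_types_total = 1 + 41 + 3

print(f"variable sites: {variable_percent:.2f}%")
print(f"pi = {pi:.6f}")
print(f"A+T = {at_content:.2f}%, G+C = {gc_content:.2f}%")
print(f"site types sum to {site_types_total}")
```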

Haplotype diversity and phylogenetic tree
Fifteen haplotypes were identified among the 21 wild yak D-loop sequences on the basis of nucleotide variation (Figure 1). The number of sequences per haplotype varied: the most frequent haplotypes (Hap2 and Hap11) comprised three sequences each, Hap3 and Hap13 comprised two sequences each, and the remaining haplotypes comprised one sequence each. Compared with the previous study (Guo et al., 2006), three haplotypes (Hap4, Hap8 and Hap12) were detected in wild yak for the first time in this study. Haplotype diversity (h) was 0.9619 ± 0.0260, indicating rich genetic diversity in the wild yak population. Based on Kimura's two-parameter model, the pair-wise genetic distances between haplotypes ranged from 0.010 ± 0.004 to 0.046 ± 0.009, and the mean pair-wise genetic distance among all haplotypes was 0.025 ± 0.004.
Using the cattle (B. taurus) sequence (accession number V00654) as an out-group, a neighbor-joining tree of the 15 haplotypes from the 21 wild yak D-loop sequences was constructed from Kimura's two-parameter genetic distances (Figure 2). The tree showed that all haplotypes could be classified into two lineages (termed lineages A and B), which included eight and seven haplotypes and represented 11 and 10 D-loop sequences, respectively.
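The reported haplotype diversity can be reproduced from the haplotype counts given above. The sketch below assumes the usual unbiased estimator h = n / (n - 1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i; the function name is hypothetical and the standard error is not computed.

```python
def haplotype_diversity(counts):
    """Haplotype (gene) diversity from the number of sequences per haplotype."""
    n = sum(counts)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts))


# Counts from the text: Hap2 and Hap11 with three sequences each,
# Hap3 and Hap13 with two each, and the remaining 11 haplotypes with one each.
counts = [3, 3, 2, 2] + [1] * 11
h = haplotype_diversity(counts)
print(f"{len(counts)} haplotypes, {sum(counts)} sequences, h = {h:.4f}")
```

With these counts the sum of squared frequencies is 37/441, so h = (21/20) × (404/441), which rounds to the reported 0.9619.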

Population expansion analysis
Population expansion analysis of the whole population and of each lineage was carried out to investigate the demographic history of the wild yak. Neither the selective neutrality tests nor the mismatch distribution analysis supported the hypothesis that the wild yak had passed through a population expansion. First, neither Fu's Fs nor Tajima's D was statistically significant for the whole population or for either lineage (P > 0.10) (Table 1). Secondly, the mismatch distributions were all ragged and multimodal, indicating that neither the whole population nor either lineage of the wild yak has undergone a population expansion (Figure 3). In addition, the model of recent population expansion was rejected by significant SSD and r values for all tested datasets (Table 1). We therefore concluded that the wild yak population had not passed through a population expansion during the past decades and that its size may have remained stable.

DISCUSSION
Many kinds of molecular markers have made it possible to assess the genetic diversity of wild yak accurately. In this study, the analysis of the 21 wild yak mtDNA D-loop sequences showed that the D-loop control region is an A+T-rich region of the mtDNA genome, which is consistent with results from vertebrate mtDNA D-loop sequences (Brown et al., 1986). In addition, the haplotype diversity and nucleotide diversity of the wild yak in our study are high (h = 0.9619 ± 0.0260 and π = 0.024430 ± 0.012685), indicating rich genetic diversity in the wild yak population. This result is similar to those obtained for wild yak from microsatellite markers (Wang, 2000; Hu, 2001) and partial mtDNA D-loop sequences (Guo et al., 2006). However, compared with the previous study (Guo et al., 2006), this study identified three new haplotypes (Hap4, Hap8 and Hap12) in wild yak, which indicates that the wild yak population holds more genetic variation than previously documented.
In our study, the 15 D-loop haplotypes of wild yak were classified into two lineages, which is consistent with the result of Guo et al. (2006), who classified 10 D-loop haplotypes of wild yak into two lineages (one with four haplotypes and the other with six). Based on mtDNA D-loop sequences of 31 domestic yaks, Lai et al. (2005) analyzed the population expansion of domestic yak and showed that domestic yak had not passed through a population expansion in the past. In contrast, a study by Guo et al. (2006) of 278 domestic yaks and 13 wild yaks showed that every clade in yak had undergone population expansion. However, the demographic history of the wild yak had not been investigated independently and no information was available. This study is the first to suggest that no population expansion events occurred in the demographic history of the wild yak.
The wild yak is a wild species that lives only on the Qinghai-Tibetan Plateau. Our study showed that the genetic diversity of the wild yak population is relatively high. Thus, if proper protection and utilization measures are taken, it is possible to conserve this genetic diversity and increase the population size. The following strategies for the conservation and utilization of this species should therefore be considered. First, to avoid inbreeding among related individuals and a decline in population genetic diversity, appropriate genetic markers should be used to identify pedigrees and build pedigree records for the wild yak population. Secondly, because some haplotypes (Hap1, Hap4 to Hap10, Hap12, Hap14 and Hap15) were each represented by only one individual in this study, these living individuals should be given more chances to produce offspring in order to preserve these rare haplotypes. Thirdly, according to the results of this study, the wild yak population can be divided into two lineages, which implies that the wild yak may be derived from two independent populations, may trace back to two independent domestication sites, or may descend from a domesticated hybrid population in which a stress condition led to the differentiation of two lineages. The wild yak population can therefore be regarded as two gene banks to be protected in the future. Fourth, more samples need to be collected for population genetics research on wild yak in order to provide theoretical guidance for the protection and utilization of this species.

Figure 1. Polymorphic sites within the 15 mtDNA D-loop region haplotypes of wild yak. The GenBank accession numbers that share the same haplotype are listed in the right column. Dots (.) denote nucleotides identical to those of the reference sequence (Hap1, DQ139203). Horizontal lines (-) denote nucleotide in/dels. The top three rows of numbers give the positions of the polymorphic sites and should be read from top to bottom.

Figure 3. Expected (solid lines) and observed (broken lines) mismatch distribution for whole population and two lineages of wild yak: (a) whole population; (b) lineage A; (c) lineage B.