Isolation, characterization, and phylogenetic analysis of copia-like retrotransposons in the Egyptian cotton Gossypium barbadense and its progenitors

We used the polymerase chain reaction to analyze copia-like retrotransposons in the Egyptian cotton and its progenitors. All three cotton species studied contain reverse transcriptase fragments from copia-like retrotransposons. Sequence analysis of these reverse transcriptase fragments reveals that each is different from the others, with predicted amino acid diversities between 9 and 94%. The detection of stop codons and insertions/deletions in the derived amino acid sequences of the Gossypium RT clones suggests that these clones represent defective retrotransposons. The presence of these sequences in G. barbadense progenitors, however, suggests the presence of active retrotransposons capable of producing new functional copies at a rate sufficient to compensate for the mutational loss of old ones. Phylogenetic analysis provided strong bootstrap support for a monophyletic origin of plant copia-like retrotransposons, yet showed high diversity within all species. Our results suggest that both vertical transmission of copia-like retrotransposons within G. barbadense lineages and horizontal transmission between G. barbadense and its progenitors have played major roles in the evolution of copia-like retrotransposons in Gossypium.


INTRODUCTION
Copia-like retrotransposons are one of the best-characterized groups of plant retrotransposons (for review see Kumar and Bennetzen, 1999). They have been reported in a wide range of plant taxa, including angiosperms, gymnosperms, ferns, lycopods, and bryophytes (Konieczny et al., 1991; Flavell et al., 1992; Voytas et al., 1992; Friesen et al., 2001). Their ubiquity in the plant kingdom suggests that they are of very ancient origin (Bennetzen, 2000). In addition, their abundance has played a major role in plant genome structure and evolution (Bennetzen, 2002).
The phylogenetic relationships of the approximately 50 diploid and 5 allotetraploid species of Gossypium are well characterized (reviewed in Wendel and Cronn, 2002). The five allotetraploid Gossypium species (designated AD-genome) arose from a single recent allopolyploidization event that united the genomes of their parental diploids (Wendel, 1989). Copia-like retrotransposons were previously identified in G. hirsutum (Vanderwiel et al., 1993), and fluorescent in situ hybridization was used to study their chromosomal distribution (Hanson et al., 1999). In the current study, we isolated, cloned, and sequenced part of the reverse transcriptase (RT) domain of copia-like retrotransposons in the Egyptian allotetraploid cotton, G. barbadense, and its progenitors. Our results revealed that all three cotton species studied here contain RT fragments from copia-like retrotransposons, suggesting that copia-like retrotransposons are a standard component of the Gossypium genome and supporting the view that copia-like retrotransposons represent a major component of plant genomes.

MATERIALS AND METHODS
Plant materials and genomic DNA extraction
Gossypium species and cultivars, listed in Table 1, were kindly provided by Dr. Percival. Total DNA was extracted using the Qiagen DNeasy kit (Qiagen, Germany).
Table 1. Gossypium species used in this study: isolated clones and their GenBank accession numbers.

Species        Clone    Accession number
G. darwinii    Dar      U75245

PCR
Total DNA was subjected to PCR with degenerate primers (Voytas et al., 1992) designed to amplify an approximately 280 bp region of the copia-like reverse transcriptase: 5'-GGAATTCGAYGTNAARACNGCNTTYYT-3' and 5'-GGGATCCAYRTCRTCNACRTANARNA-3', where N = A+C+G+T, R = A+G, and Y = T+C. DNA amplifications were carried out in an ABI GeneAmp PCR System 9700 thermal cycler with an initial denaturing step at 95°C for 5 min, followed by 45 cycles (each cycle consisting of denaturing at 94°C for 30 s, annealing at 47°C for 1 min, and extension at 72°C for 2 min) and a final extension step at 72°C for 10 min.
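The degeneracy of each primer, that is, the number of distinct oligonucleotides present in the primer pool, follows directly from the ambiguity codes defined above. The short Python sketch below is an illustration only and is not part of the original protocol; the function name `degeneracy` and the constant names are hypothetical, and the primer strings are copied from the text with whitespace removed.

```python
# Ambiguity codes as defined in the text: N = A+C+G+T, R = A+G, Y = T+C.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        # Each position multiplies the pool size by its number of allowed bases.
        total *= len(IUPAC[base])
    return total

# Primer sequences as quoted in the text (5' to 3').
FORWARD = "GGAATTCGAYGTNAARACNGCNTTYYT"
REVERSE = "GGGATCCAYRTCRTCNACRTANARNA"

print("forward primer degeneracy:", degeneracy(FORWARD))  # 1024
print("reverse primer degeneracy:", degeneracy(REVERSE))  # 2048
```

Pools of this size are one reason a low annealing temperature such as the 47°C used here is commonly chosen for degenerate PCR, since only a small fraction of the primer molecules matches any given template exactly.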

Cloning and sequencing of PCR-amplified fragments
PCR-amplified fragments of the expected size were excised from the agarose gel and purified using the Qiagen Gel Extraction kit (Qiagen, Germany). Purified DNA fragments were then cloned into the pCR 4-TOPO vector with the TOPO TA cloning kit (Invitrogen, USA) in the competent E. coli strain TOP10. Plasmid DNA was isolated using the QIAprep Spin Miniprep kit (Qiagen, Germany) and sequenced in both directions using the BigDye Sequencing Kit and an ABI 377 DNA sequencer (ABI, USA).

Accession numbers
The DNA sequences reported in the current study were deposited in the NCBI nucleotide sequence database, GenBank, and are listed in Table 1.

RESULTS AND DISCUSSION
PCR amplification with degenerate primers for the copia-like reverse transcriptase (RT) domain (Voytas et al., 1992) produced 5 putative RT clones: 3 from G. barbadense, and 1 each from G. arboreum and G. darwinii (Table 1). A BLAST search confirmed the RT nature of the cloned products. The high amino acid similarities observed in the BLAST search support the interpretation that the 5 sequences generated in this study represent portions of the RT genes of copia-like retrotransposons.
Extreme sequence diversity amongst RT genes of the copia-like retrotransposons has been observed both within and between plant species (Flavell et al., 1992). Furthermore, elements within one species are often more closely related to elements present in other plant species than to one another (Voytas et al., 1992). A similar pattern of sequence heterogeneity is observed in G. barbadense and its progenitors. Nucleotide and amino acid sequences were aligned and compared using CLUSTALW (Figure 1). In addition, pairwise comparisons (Table 2) showed amino acid diversities of 28% (Bah163/Bah185) and 26% (Bah163/Ash) within G. barbadense cultivars, and 9% (Arb/Dar) and 10% (Arb/Bah163) between Gossypium species. The detection of stop codons and of insertions/deletions that cause frame shifts in the derived amino acid sequences of the Gossypium RT clones suggests that these clones represent defective retrotransposons. A number of the retrotransposons in the Egyptian cotton and its progenitors are therefore evidently not functional and must currently be evolving as pseudogenes. The long-term survival of these sequences, however, suggests the presence of active retrotransposons capable of producing new functional copies at a rate sufficient to compensate for the mutational loss of old ones. It is noteworthy that the majority of plant copia-like retrotransposons are thought to be rarely active (Kumar and Bennetzen, 1999).
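The two checks described in this paragraph, screening a conceptual translation for in-frame stop codons and computing pairwise amino acid diversity from an alignment, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the function names are hypothetical, the standard genetic code is assumed, diversity is taken to mean the percentage of differing residues over gap-free aligned positions, and the example sequences are invented toy data rather than the Gossypium clones.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(nt: str, frame: int = 0) -> str:
    """Translate a nucleotide string in the given frame (0, 1 or 2).

    Stop codons are written as '*'; codons containing ambiguous bases as 'X'.
    """
    nt = nt.upper()
    codons = (nt[p:p + 3] for p in range(frame, len(nt) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def stop_positions(protein: str) -> list:
    """Return the 0-based positions of stop codons in a translated sequence."""
    return [i for i, aa in enumerate(protein) if aa == "*"]

def percent_diversity(a: str, b: str) -> float:
    """Percentage of differing residues between two aligned sequences.

    Columns in which either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    differences = sum(x != y for x, y in pairs)
    return 100.0 * differences / len(pairs)

# Toy data (hypothetical, not taken from the clones described here).
protein = translate("ATGGATTAAGGA")
print(protein, stop_positions(protein))              # MD*G [2]
print(percent_diversity("DVKTAFLHGD", "DVKSAFLNGD"))  # 20.0
```

An internal stop codon in every reading frame, or a translation that only matches known RT sequences after switching frame partway through, is the kind of evidence that points to a defective copy.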
Relationships among the derived amino acid sequences of the 5 clones, and between these and other retrotransposons, were investigated by constructing a neighbor-joining tree (Saitou and Nei, 1987), with accession numbers shown on the tree and Ty1 as the outgroup (Figure 2). The neighbor-joining phylogram provided strong bootstrap support for a monophyletic origin of plant copia-like retrotransposons, yet showed high diversity within all species. Bah163 from G. barbadense has the strongest affinity with Dar from G. darwinii, with 93% amino acid identity. On the other hand, Bah185 from G. barbadense has the strongest affinity with Ash, also from G. barbadense, with 94% identity.
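For readers unfamiliar with the method, the core of neighbor joining (Saitou and Nei, 1987) can be sketched as follows. The text does not state which program was used to build the tree, so this is only a compact illustration of the algorithm itself: the function name, the `node0`, `node1`, ... labels for internal nodes, and the five-taxon distance matrix are all hypothetical and unrelated to the Gossypium data.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Cluster taxa by neighbor joining.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns (joins, last_edge), where each join is
    (child_a, child_b, branch_a, branch_b, new_node) and last_edge is
    (node_a, node_b, length) connecting the final two nodes.
    """
    d = dict(dist)
    nodes = list(names)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence: sum of distances from each node to all others.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Choose the pair that minimises Q(i, j) = (n - 2) d(i, j) - r_i - r_j.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        # Branch lengths from the joined pair to their new parent node.
        li = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"node{len(joins)}"
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                ) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
        joins.append((i, j, li, lj, u))
    a, b = nodes
    return joins, (a, b, d[frozenset((a, b))])

# Hypothetical five-taxon distance matrix used only to exercise the code.
taxa = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
distances = {
    frozenset((taxa[x], taxa[y])): matrix[x][y]
    for x, y in combinations(range(len(taxa)), 2)
}
joins, last_edge = neighbor_joining(taxa, distances)
for join in joins:
    print(join)
print(last_edge)
```

In this toy example the first pair joined is a and b, with branch lengths 2.0 and 3.0. Bootstrap values such as those reported in Figure 2 are obtained by repeating the whole procedure on alignments resampled column by column and counting how often each grouping recurs.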
This study aimed to use the PCR technique to study the evolution of copia-like retrotransposons in the Egyptian cotton and its progenitors. We have shown that the amplified fragments comprise a very heterogeneous collection of RT sequences. These results are consistent with previous studies, which concluded that copia-like retrotransposons are present as highly heterogeneous populations within all higher plants (Flavell et al., 1992; Voytas et al., 1992). Such sequence heterogeneity contrasts strongly with the limited diversity seen in such retrotransposons in yeast and Drosophila (Peterson-Burch and Voytas, 2002). Plant genomes seem to be inherently more susceptible to the generation of sequence diversity in copia-like retrotransposons than the genomes of Drosophila and yeast (Bennetzen, 2002). This has been attributed to the distinctive strategy adopted by plants for determination of their germlines, and to their great tolerance of chromosomal alterations (Flavell et al., 1992). In this regard, the fact that G. barbadense is an allopolyploid cotton that appears to have arisen through transoceanic dispersal of an A-genome taxon to the New World, followed by hybridization with an indigenous D-genome diploid (Wendel and Cronn, 2002), raises the question of what effect hybridization may have had on the sequence diversity of copia-like retrotransposons in G. barbadense. Further experimental data, such as copy number determination, chromosomal distribution, and sequencing of large contiguous regions of the cotton genome, will add significantly to fundamental knowledge about the population heterogeneity of copia-like retrotransposons in the Gossypium genome.
The origin and evolutionary relationships of copia-like retrotransposons remain debated and controversial (Eickbush and Furano, 2002). Previous studies have suggested that three possible mechanisms, not mutually exclusive, may account for the evolution of copia-like retrotransposons in plants (Stuart-Rogers and Flavell, 2001; Peterson-Burch and Voytas, 2002). First, there may have been an explosive radiation of retrotransposons in the common ancestor of all plants, with most of these elements maintained with only limited further divergence. Second, multiple horizontal transfers of retrotransposons may have occurred between physically and phylogenetically distant populations or taxa. Finally, the constraints on retrotransposon evolution may be such that these elements have reached near-identical ranges of sequence diversity in widespread modern plant taxa. According to our phylogenetic analyses, the evolution of copia-like retrotransposons in G. barbadense and its progenitors is dominated by germ line vertical transmission. Our data, however, revealed that the closest homologue of Bah163 from G. barbadense is Dar from G. darwinii. Horizontal transfer of retrotransposons has been suggested as an explanation of this phenomenon in plants (Friesen et al., 2001). This suggestion is supported by the fact that G. darwinii is believed to have diverged from a common ancestor about 4-11 million years prior to being united in a common polyploid nucleus (Wendel, 1989). A more comprehensive survey of copia-like retrotransposons in Gossypium species is certainly required to further clarify their evolutionary relationships.
In conclusion, we suggest that both vertical transmission of copia-like retrotransposons within G. barbadense lineages and horizontal transmission between G. barbadense and its progenitors have played major roles in the evolution of copia-like retrotransposons in Gossypium.

Figure 2. Phylogenetic tree showing the relationships between reverse transcriptase amino acid sequences of G. barbadense and its progenitors and plant, yeast, and Drosophila copia-like retrotransposons. The neighbor-joining method (Saitou and Nei, 1987) was used to construct the tree. The numbers on the branches represent bootstrap support for 1,000 replicates. Names refer to the accession numbers of the nucleotide sequences that encode the corresponding reverse transcriptase genes.

Table 2. Pairwise amino acid comparisons of putative copia-like RT sequences in G. barbadense and its progenitors.