Cloning and bioinformatics analysis of a ubiquitin gene of the rice stem borer, Chilo suppressalis Walker (Lepidoptera: Pyralidae)

Ubiquitin, which mediates selective protein degradation, may play an important role in the regulation of insect growth and development. In this study, the coding sequence of a ubiquitin gene from larvae of the rice stem borer, Chilo suppressalis Walker (Lepidoptera: Pyralidae), named CsUB (GenBank Accession No. GU238420), was cloned by RT-PCR and sequenced, using primers designed from the sequences of ubiquitin genes from Homo sapiens, Drosophila melanogaster and lepidopteran insects. Sequence analysis showed that the coding sequence is 228 bp long and encodes 76 amino acids, with a calculated molecular weight of 8.50 kDa and a theoretical isoelectric point of 5.26. No signal sequence or transmembrane domain was found. Multiple sequence alignment indicated that the CsUB gene sequence shared a high degree of homology (more than 72% similarity) and short genetic distances (lower than 0.360) with other known ubiquitin gene sequences of invertebrates and vertebrates. In the genetic diversity analysis, a total of 104 polymorphic sites were detected among 18 ubiquitin gene sequences and 18 haplotypes were identified. Abundant genetic diversity and strong codon usage bias were indicated by the haplotype diversity (1.000), average number of nucleotide differences (47.475), nucleotide diversity (0.20866), effective number of codons (44.526), codon bias index (0.559) and scaled Chi-square (0.779). The predicted secondary structure of the CsUB protein consisted of about 32.89% extended strands, 36.84% random coils, 15.79% alpha helices and 14.47% beta turns. Subcellular localization analysis assigned the CsUB protein to the cytoplasm, cell nucleus, mitochondrion, cytoskeleton and plasma membrane with scores of about 47.80, 26.10, 17.40, 4.30 and 4.30%, respectively. Sequence, homology and structural analyses confirmed that the CsUB gene was highly conserved during evolution and belongs to the ubiquitin gene family. The results may provide fundamental data for further studies on the expression characteristics and physiological functions of the CsUB gene.


INTRODUCTION
Ubiquitin is a highly conserved 76 amino acid protein which is widely distributed in eukaryotic cells and is linked to a vast range of proteins (Ciechanover, 1998; Goldstein TC, et al; Yamao, 1999). Based on sequencing of either cDNA clones or the protein, the amino acid sequence of ubiquitin proved to be identical in various species (Arribas et al., 1986; Bond and Schlesinger, 1985). In comparison with the Homo sapiens sequence, there are only 1 to 5 amino acid substitutions among plants, animals, yeast and other organisms (Gill, 2004). Selective protein degradation is mainly carried out by the ubiquitin system, which plays important roles in many cellular functions, including immune regulation, cell cycle control, signal transduction, transcriptional regulation, nuclear transport and membrane receptor control by endocytosis (Bai J, et al; Hershko, et al; Pickart et al., 2004; Patterson, 2006). Ubiquitin also appears conjugated to certain nuclear, cytoplasmic and cell-surface proteins without causing their degradation (Leung et al., 1987). For example, Mezquita et al. (1997) proposed that ubiquitin conjugation plays an essential role in spermatogenesis. Functional ubiquitin is produced from two different types of ubiquitin genes, named polyubiquitin and ubiquitin extension genes (Callis and Vierstra, 1989). Functionally, ubiquitin is a very important protein that takes part in mediating intracellular ATP-dependent protein degradation by the 26S proteasome (Vierstra, 1996). Ubiquitin is therefore useful for investigating the cell physiology of agriculturally important pests and has been a major focus of research in recent years.
Therefore, the main purposes of this study were: (1) to clone the cDNA of a ubiquitin gene from rice stem borer larvae (named CsUB) and (2) to characterize it and predict its structure using bioinformatics analysis.

MATERIALS AND METHODS
Rice stems containing larvae of the rice stem borer, C. suppressalis Walker, were originally collected from paddy fields in the suburbs of Hefei City, China. The larvae were reared at 28°C on fewflower wildrice, Zizania latifolia, as described by Zheng et al. (2009).

Total RNA isolation and RT-PCR
Total RNA was isolated from rice stem borer larvae using the RNeasy kit (Qiagen, Valencia, CA, USA). RNA samples were prepared and stored at -70°C. RNA concentration and purity were assessed spectrophotometrically by measuring absorbance at 260 and 280 nm in a biophotometer (Eppendorf, Germany). Two micrograms of RNA were used as the template to synthesize first-strand cDNA. Based on a multiple alignment of highly conserved ubiquitin sequences from various insects (Ostrinia furnacalis, D. melanogaster, H. armigera, B. mandarina) available in the GenBank database, one set of degenerate primers was designed corresponding to the conserved sites of the ubiquitins: forward, 5'-ATGCARATHTTYGTNAARAC-3'; reverse, 5'-RCCACCNCGVAGNCKVARSAC-3'. The polymerase chain reaction (PCR) temperature profile was 94°C for 5 min, followed by 33 cycles of 94°C for 30 s, 45°C for 30 s and 72°C for 1 min, and a final extension step at 72°C for 10 min.
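The degeneracy of the two primers (the number of distinct oligonucleotides each IUPAC-coded sequence represents) can be checked with a short calculation. The following is a minimal sketch in Python, assuming the standard IUPAC nucleotide ambiguity codes and the primer sequences as printed above with whitespace removed; the function and variable names are illustrative only.

```python
from math import prod

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())

# Primer sequences taken from the text (5' to 3'), whitespace removed.
FORWARD = "ATGCARATHTTYGTNAARAC"
REVERSE = "RCCACCNCGVAGNCKVARSAC"

if __name__ == "__main__":
    for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
        print(f"{name}: {len(primer)} nt, degeneracy {degeneracy(primer)}")
```

Under these assumptions the forward primer is 96-fold degenerate and the reverse primer 2304-fold degenerate, which is consistent with the low annealing temperature (45°C) used in the PCR profile.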

Cloning and nucleotide sequencing
The PCR products were purified using the Cycle Pure kit (Omega Bio-Tek, Norcross, GA, USA) and ligated into a pGEM-T vector (Promega, Madison, WI, USA) according to the manufacturer's instructions. Afterwards, the plasmids were transformed into competent Escherichia coli DH5α cells, which were then plated on carbenicillin-containing LB agar plates. After 15 h of incubation, the colonies formed were checked by colony PCR; plasmids from several positive colonies were purified using the Plasmid mini prep kit (Omega Bio-Tek) and sent to Shanghai BioAsia Biotech Company (China) for sequencing.

Cloning and sequence analysis of CsUB gene
Based on the highly conserved sequences of the ubiquitins, the cDNA designated CsUB was isolated from rice stem borer larvae. Cloning and sequence analysis of the CsUB cDNA (GenBank Accession No. GU238420) yielded a 228 bp sequence containing an initial ATG codon and encoding a predicted protein of 76 amino acids (Figure 1), with a calculated molecular weight of 8.50 kDa, a theoretical isoelectric point of 5.26, 10 negatively charged residues, 12 positively charged residues, an extinction coefficient of 1490, an estimated half-life of 30 h, an instability index of 30.86 and a grand average of hydropathicity of -0.441. No signal sequence or transmembrane domain was identified using the SignalP 3.0 Server and TMHMM 2.0 Server, respectively. The lengths of the nucleotide and amino acid sequences of the cloned CsUB cDNA were in good agreement with previously reported sizes of other ubiquitins.
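Two of the physicochemical parameters listed above, the counts of charged residues and the grand average of hydropathicity (GRAVY), are simple functions of the amino acid sequence. The sketch below illustrates the arithmetic in Python. Because the CsUB sequence itself is not reproduced in the text, the well-known human ubiquitin sequence is used purely as a stand-in, so the printed values are not expected to match those reported for CsUB. The Kyte-Doolittle hydropathy scale is assumed; the tool actually used by the authors is not stated.

```python
# Kyte-Doolittle hydropathy values (assumed scale for the GRAVY calculation).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Human ubiquitin (76 residues): a stand-in only, NOT the CsUB sequence.
HUMAN_UB = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQ"
    "QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)

def charged_residues(seq: str) -> tuple[int, int]:
    """Return (negatively charged D+E, positively charged K+R) residue counts."""
    negative = sum(seq.count(aa) for aa in "DE")
    positive = sum(seq.count(aa) for aa in "KR")
    return negative, positive

def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy value over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

if __name__ == "__main__":
    # A 228 bp coding sequence corresponds to 228 / 3 = 76 codons.
    print("codons in 228 bp:", 228 // 3)
    print("length:", len(HUMAN_UB))
    print("negative, positive:", charged_residues(HUMAN_UB))
    print("GRAVY:", round(gravy(HUMAN_UB), 3))
```

A negative GRAVY value, as reported for CsUB (-0.441), indicates an overall hydrophilic protein.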

Homology analysis of CsUB gene
We aligned the CsUB gene sequence with other known ubiquitin gene sequences of invertebrates and vertebrates using DNAStar (version 5.01) and MEGA (version 3.1) software. The alignments displayed a high degree of homology (more than 72% similarity in all the matches) and short genetic distances (lower than 0.360) (Table 1). Multiple sequence alignment suggested that the CsUB gene was highly conserved during evolution and belongs to the ubiquitin gene family.
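The relationship between percentage similarity and genetic distance can be illustrated with a small calculation. The following Python sketch computes the uncorrected p-distance between two aligned sequences and applies the Jukes-Cantor correction. The substitution model used in MEGA for Table 1 is not stated in the text, so the choice of Jukes-Cantor here is an assumption made for illustration, and the function names are hypothetical.

```python
import math

def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences (gaps skipped)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (x, y)
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x != "-" and y != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

if __name__ == "__main__":
    # 72% similarity corresponds to a p-distance of 0.28.
    print(round(jukes_cantor(0.28), 4))
```

Under this assumed model, the lowest reported similarity (72%, p = 0.28) gives a corrected distance of about 0.35, which is of the same order as the upper bound of 0.360 reported above.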
One hundred and twenty-four (124) monomorphic sites and 104 polymorphic sites were detected among 18 ubiquitin gene sequences by genetic diversity analysis with DnaSP (version 4.0) software. Singleton variable sites and parsimony-informative sites numbered 15 (6.58%) and 89 (39.04%), respectively. In addition, 18 haplotypes were identified. Haplotype diversity, average number of nucleotide differences and nucleotide diversity were 1.000, 47.475 and 0.20866, respectively. The test statistic calculated using the total number of mutations was not significant (P > 0.10). Codon usage analysis showed that the effective number of codons, codon bias index and scaled Chi-square were 44.526, 0.559 and 0.779, respectively, indicating a strong codon usage bias among the 18 ubiquitin gene sequences. Multiple sequence alignment indicated that the numbers of synonymous and nonsynonymous sites were 51.67 to 56.33 and 171.67 to 176.33, respectively (Table 2). Nonsynonymous sites were thus about three times as numerous as synonymous sites.
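The diversity statistics reported here follow from standard definitions that are easy to reproduce on a small alignment. The sketch below, in Python, counts polymorphic sites and computes haplotype diversity, the average number of pairwise nucleotide differences and nucleotide diversity. It uses a toy alignment rather than the 18 sequences analysed in this study, ignores gap handling, and is not a reimplementation of DnaSP; all names are illustrative.

```python
from collections import Counter
from itertools import combinations

def polymorphic_sites(seqs: list[str]) -> int:
    """Number of alignment columns containing more than one nucleotide."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))

def haplotype_diversity(seqs: list[str]) -> float:
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))

def mean_pairwise_differences(seqs: list[str]) -> float:
    """Average number of nucleotide differences (k) over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)

def nucleotide_diversity(seqs: list[str]) -> float:
    """Nucleotide diversity (pi): k divided by the number of aligned sites."""
    return mean_pairwise_differences(seqs) / len(seqs[0])

if __name__ == "__main__":
    toy = ["ATGCAG", "ATGCAA", "ATTCAA"]  # toy alignment, not real data
    print(polymorphic_sites(toy), haplotype_diversity(toy))
    print(mean_pairwise_differences(toy), nucleotide_diversity(toy))
```

When every sequence in a sample is a distinct haplotype, as with the 18 haplotypes found among 18 sequences here, the haplotype diversity formula evaluates to 1, in agreement with the reported value of 1.000.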

Structural analysis of CsUB protein
Phosphorylation sites were predicted by the NetPhos 2.0 Server at Ser57 and Thr22 of the CsUB amino acid sequence. The predicted secondary structure of the CsUB protein consisted of about 32.89% extended strands, 36.84% random coils, 15.79% alpha helices and 14.47% beta turns. Subcellular localization analysis assigned the CsUB protein to the cytoplasm, cell nucleus, mitochondrion, cytoskeleton and plasma membrane with scores of about 47.80, 26.10, 17.40, 4.30 and 4.30%, respectively.
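As a consistency check, the secondary structure percentages can be converted back into residue counts for a 76 residue protein. The following Python sketch does this under the assumption that each percentage is a whole number of residues divided by 76 and rounded to two decimal places; the dictionary and function names are illustrative.

```python
LENGTH = 76  # residues in the deduced CsUB protein

# Secondary structure composition as reported in the text (percent).
REPORTED_PERCENT = {
    "extended strand": 32.89,
    "random coil": 36.84,
    "alpha helix": 15.79,
    "beta turn": 14.47,
}

def residue_counts(percentages: dict[str, float], length: int) -> dict[str, int]:
    """Convert percentage composition to the nearest whole number of residues."""
    return {state: round(pct * length / 100) for state, pct in percentages.items()}

if __name__ == "__main__":
    counts = residue_counts(REPORTED_PERCENT, LENGTH)
    for state, count in counts.items():
        print(f"{state}: {count} residues")
    print("total:", sum(counts.values()))
```

Under this assumption the four states correspond to 25, 28, 12 and 11 residues, which together account for all 76 residues.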
The three-dimensional structure of the CsUB protein was built by homology-based modeling, using the structure of H. sapiens ubiquitin as the template. The resolution of the template and the E-value were 2.60 Å and 1.13e-32, respectively. The sequence similarity between CsUB and the H. sapiens ubiquitin homologue was about 97.33%, indicating that the target sequence was well compatible with the template (Figures 2 and 3). The overall folding pattern contained a 3.5-turn alpha helix (aquamarine blue), one 3₁₀-helix (buff), five beta strands (navy blue, yellow, green, yellow green and sorrel) and seven reverse turns (Arnold et al., 2006). Evaluation of the atomic empirical mean force potential showed that only two amino acid residues of the CsUB protein did not yield favorable values relative to closely related proteins.

DISCUSSION
Ubiquitin is a small protein consisting of 76 amino acids that plays a major role in both the cellular stress response and protein degradation in eukaryotes (Masatoshi et al., 2000). In this study, the cDNA sequence of the CsUB gene of rice stem borer larvae was cloned and reported for the first time. The nucleotide sequence was more than 72% similar to those of other known invertebrates and vertebrates (Table 1). This finding agrees with the reports of Glickman and Ciechanover (2002) and Sharp and Li (1987). The primary structures of these ubiquitins among invertebrates and vertebrates showed a high level of similarity and short genetic distances. The result showed that the CsUB gene was highly conserved during evolution and belongs to the ubiquitin gene family. In addition, the close similarity in three-dimensional molecular models observed here between the ubiquitins of rice stem borer larvae and H. sapiens was also observed in B. mandarina (Zhang et al., 2008). Therefore, it could be concluded that the ubiquitin genes of various species might have originated from the same ancestral gene. Since the ubiquitin genes of these organisms are so highly conserved, the ubiquitin protein might not be suitable as a phylogenetic marker for an evolutionary clock. However, some of the differing relationships that appeared in this study could be due, to a certain extent, to genetic differentiation of organisms exposed to environmental stress for a long time (Zhang et al., 2008; Li et al., 1998).
This strong sequence conservation suggests that the vast majority of the amino acids that make up ubiquitin are essential, as apparently any mutation that occurred over evolutionary history has been removed by natural selection (Glickman and Ciechanover, 2002). In this study, 104 polymorphic sites were detected and 18 haplotypes were identified among 18 ubiquitin gene sequences by the genetic diversity analysis. The results indicated that these ubiquitin genes had strong genetic adaptability. For example, Jin et al. (2008) reported that a ubiquitin antibody of M. domestica reacted positively with ubiquitin fusion proteins of both M. domestica and S. litura, and suggested that it retained the original immunogenicity. Abundant genetic diversity and strong codon usage bias were indicated by the haplotype diversity (1.000), average number of nucleotide differences (47.475), nucleotide diversity (0.20866), effective number of codons (44.526), codon bias index (0.559) and scaled Chi-square (0.779). Statistical analysis suggested that there was a single major trend in codon usage variation among the genes encoding specific proteins (Ghosh et al., 2000). It was presumed that the CsUB gene might be an important candidate marker gene of signal transduction during immune regulation in C. suppressalis larvae, but this remains to be studied further.
In addition, the numbers of synonymous and nonsynonymous sites were 51.67 to 56.33 and 171.67 to 176.33, respectively, among the 18 ubiquitin gene sequences in this study (Table 2), so nonsynonymous sites were about three times as numerous as synonymous sites. Thus, following Guo (1993), we deduced that the CsUB gene was under positive selection during molecular evolution. Many reports have confirmed that ubiquitin gene expression is enriched in the midgut, fat body, Malpighian tubule and flight muscle, and that ubiquitin plays a very important role in insect life activities (Barrio et al., 1994). Since the 1980s, ubiquitin studies have grown enormously and ubiquitin-dependent proteolytic pathways have been shown to play major roles in a legion of biological processes (Varshavsky, 1997). It is well known that the degradation of abnormal proteins is an integral component of cell physiology. So far, however, little is known about the physiological functions of ubiquitin in insect growth and development, such as molting, pupation and metamorphosis. Although a novel ubiquitin gene was cloned from rice stem borer larvae in this study, its expression under various growth conditions of the larvae and the underlying mechanisms need to be investigated further to identify its precise biological properties in the various physiological processes.

Figure 1. Nucleotide and deduced amino acid sequences of the coding sequence of the CsUB gene (GenBank Accession No. GU238420), displayed using DNAMAN (version 5.2) software.

Figure 2. Theoretical three-dimensional structure model of the deduced CsUB protein, based on the crystal structure of H. sapiens ubiquitin as the template and generated using SWISS-MODEL and WebLab viewer.

Figure 3. Evaluation of atomic empirical mean force potential.

Table 2. Synonymous sites and nonsynonymous sites of 18 ubiquitin gene sequences among invertebrates and vertebrates using DnaSP (version 4.0) software.