cDNA, genomic sequence cloning and overexpression of ribosomal protein S20 gene (RPS20) from the Giant Panda (Ailuropoda melanoleuca)

RPS20 is a component of the 40S small ribosomal subunit, encoded by the RPS20 gene, which is conserved among eukaryotes, prokaryotes and archaebacteria. The cDNA and the genomic sequence of RPS20 were cloned from the Giant Panda (Ailuropoda melanoleuca) using RT-PCR and touchdown-PCR, respectively. Both sequences were analyzed preliminarily, and the cDNA of the RPS20 gene was also overexpressed in Escherichia coli BL21. The cloned cDNA is 392 bp in size and contains an open reading frame of 360 bp encoding 119 amino acids. The genomic sequence is 1205 bp long and comprises 4 exons and 3 introns. Alignment analysis indicated that the coding sequence shares high homology with those of Homo sapiens, Pongo abelii, Macaca fascicularis, Mus musculus, Bos taurus and Rattus norvegicus (93.1, 92.5, 92.2, 91.1, 90.6 and 90.0%, respectively). The amino acid sequence encoded by the RPS20 gene of the Giant Panda is identical (100% homology) to those of H. sapiens, Mac. fascicularis, Mus musculus, B. taurus and R. norvegicus, and differs only from that of P. abelii (99.88%). Primary structure analysis revealed that the molecular weight of the putative RPS20 protein is 13.373 kDa with a theoretical pI of 9.95. Topology prediction showed one ATP/GTP-binding site motif A, one ribosomal protein S10 signature site, 5 protein kinase C phosphorylation sites and 3 casein kinase II phosphorylation sites in the RPS20 protein of the Giant Panda. The RPS20 gene can be expressed in E. coli, and the RPS20 protein fused to an N-terminal GST tag accumulated as a polypeptide of the expected size, 39 kDa.


INTRODUCTION
The mammalian ribosome is composed of 4 RNA species and approximately 80 different proteins (Yoshihama et al., 2002; Hwang et al., 2004). Increasing evidence suggests that many ribosomal proteins are involved not only in the basic machinery of protein synthesis and its regulation, but also in various extraribosomal activities, including the regulation of cell proliferation, DNA repair, transcription and RNA processing (Wool et al., 1995; Wool, 1996). The ribosomal protein S20 gene encodes a ribosomal protein that is a component of the 40S subunit and is conserved among eukaryotes, prokaryotes and archaebacteria. The protein belongs to the S10P family of ribosomal proteins and is located in the cytoplasm. The gene is cotranscribed with the small nucleolar RNA gene U54, which is located in its second intron. As is typical for genes encoding ribosomal proteins, multiple processed pseudogenes of this gene are dispersed through the genome (Chan et al., 1997; Chan and Wool, 1990; Chu et al., 1993). Two transcript variants encoding different isoforms have been identified for this gene. Apart from being part of the ribosome, RPS20 and its Escherichia coli homologue RPS10 may also have roles in transcriptional control (Nodwell and Greenblatt, 1993; Hermann-LeDenmat et al., 1994). Further research indicates that mutations in genes encoding ribosomal proteins, RPS20 among them, can cause the Minute phenotype in Drosophila and mice and Diamond-Blackfan syndrome in humans, as well as p53-mediated dark skin and other pleiotropic effects (McGowan et al., 2008). Medulloblastoma outcome is also adversely associated with overexpression of RPS20 on the long arm of chromosome 8 (De Bortoli et al., 2006).
*Corresponding author. E-mail: hwr168@yahoo.com.cn. Tel./Fax: +86-0817-2568653.
In this study, RT-PCR was used to amplify the cDNA of the RPS20 gene from total RNA, and touchdown-PCR was used to amplify the genomic sequence of RPS20 from DNA, both isolated from the skeletal muscle of the Giant Panda. The sequence characteristics of the protein encoded by the cDNA were then analyzed and compared with those reported for human and other mammalian species. The cDNA was also overexpressed in E. coli using the pGEX 4T-1 plasmid. The study provides scientific data for investigating the hereditary traits of this gene in the Giant Panda and for formulating protective strategies for the species.

Sample
Skeletal muscle was collected from a dead Giant Panda at the Wolong Conservation Center of the Giant Panda, Sichuan, China. The collected skeletal muscle was frozen in liquid nitrogen and then used for DNA and RNA isolation.

DNA and RNA isolation
Genomic DNA was isolated from Giant Panda muscle tissue according to the literature (Sambrook et al., 1989). The DNA obtained was dissolved in TE buffer and kept at -20°C.
Total RNA was isolated from about 400 mg of muscle tissue using the Total Tissue/Cell RNA Extraction Kit (Waton Inc., Shanghai, China) according to the manufacturer's instructions, then dissolved in RNase-free ddH2O and kept at -70°C. DNA and RNA sample quality was checked using Experion (Bio-Rad), and quantification was performed spectrophotometrically.

Primer design, RT-PCR, cloning of cDNA sequence and sequencing
The PCR primers were designed with Primer Premier 5.0, based on the mRNA sequences of RPS20 from Homo sapiens (NM_001023), Pongo abelii (NM_001133984), Macaca fascicularis (AB169614), Mus musculus (NM_026147), Rattus norvegicus (NM_001007603) and Bos taurus (NM_001034438). The specific primers for the cDNA sequence were designated RPS20-F and RPS20-R. Total RNA (1 µg) was reverse-transcribed into first-strand cDNA using a reverse transcription kit with oligo(dT) as the primer, followed by PCR amplification according to the manufacturer's instructions (Promega, Shanghai, China). Reverse transcription reactions were performed in duplicate. The absence of genomic DNA contamination was confirmed by PCR amplification of RNA samples without cDNA synthesis. After amplification, PCR products were separated by electrophoresis in a 1.5% agarose gel with 1× TAE buffer, stained with ethidium bromide and visualized under UV light. The expected fragments were recovered and purified from the gel using a DNA harvesting kit (Omega) and then ligated into the pMD19-T vector (TaKaRa) at 16°C for 2 h. The recombinant molecules were transformed into E. coli competent cells (DH5α), which were then spread on LB plates containing 50 µg/mL ampicillin, 200 mg/mL IPTG and 20 mg/mL X-gal. Plasmid DNA was isolated and digested with PstI and ScaII to verify the insert size. Plasmid DNA was sequenced by Huada Zhongsheng Scientific Corporation (Beijing, China).

Cloning the genomic sequence of RPS20
The PCR primers were designed based on the cDNA sequence of RPS20 from the Giant Panda obtained above. The specific primers for the genomic sequence were designated RPS20-F′ and RPS20-R′. The genomic sequence of the RPS20 gene was amplified using touchdown-PCR under the following conditions: 94°C for 30 s, 56°C for 45 s and 72°C for 2 min in the first cycle, with the annealing temperature decreased by 0.2°C per cycle; after 20 cycles, the conditions were changed to 94°C for 30 s, 52°C for 45 s and 72°C for 2 min for another 20 cycles. The amplified fragment was likewise purified, ligated into the cloning vector and transformed into E. coli competent cells. Finally, the recombinant fragment was sequenced by Huada Zhongsheng Scientific Corporation.
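The annealing temperatures of the touchdown program can be listed cycle by cycle. The following is a minimal sketch, not the authors' thermocycler program: the function name and parameters are hypothetical, and it assumes that the first cycle anneals at 56°C and that each later touchdown cycle is 0.2°C lower.

```python
def touchdown_annealing_temps(start=56.0, step=0.2, touchdown_cycles=20,
                              final=52.0, final_cycles=20):
    """Return the annealing temperature (in deg C) for every cycle.

    Assumption: cycle 1 uses `start`, and each following touchdown
    cycle is `step` degrees lower than the previous one.
    """
    temps = [round(start - step * i, 1) for i in range(touchdown_cycles)]
    temps += [final] * final_cycles  # fixed-temperature phase
    return temps


temps = touchdown_annealing_temps()
print(len(temps))   # total number of cycles
print(temps[0])     # first touchdown cycle
print(temps[19])    # last touchdown cycle
print(temps[20])    # first cycle of the fixed phase
```

Under this reading, the last touchdown cycle anneals at 52.2°C, so the switch to the fixed 52°C phase continues the same 0.2°C step rather than introducing a jump.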

Construction of the expression vector and overexpression of recombinant RPS20
The PCR fragment corresponding to the RPS20 polypeptide was amplified from the RPS20 cDNA clone with the following forward and reverse primers:
RPS20-F′′: 5'-CCGGAATTCGGAACAAGTCGGTCAGGAAG-3' (EcoRI)
RPS20-R′′: 5'-CCGGTCGACAACCTCAACTCCTGGCTCAA-3' (SalI)
The PCR was performed at 94°C for 2 min, followed by 35 cycles of 30 s at 94°C, 45 s at 56°C and 1 min at 72°C, and a final 7 min at 72°C. The amplified PCR product was digested and ligated into the corresponding sites of the pGEX 4T-1 vector (Stratagen, Shanghai, China). The resulting construct was transformed into the E. coli BL21 strain (Novagen, Shanghai, China). Expression was induced with IPTG (isopropyl-β-D-thiogalactopyranoside) at an OD600 of 0.6, and the culture was grown for a further 4 h at 37°C, with BL21 transformed with the empty vector as a control. Cultures were centrifuged at 10000 g for 5 min at room temperature after induction for 0.5, 1, 1.5, 2, 2.5, 3 and 4 h, respectively. The culture supernatant was concentrated with methanol and chloroform (3:1, v/v), and SDS-PAGE (SDS polyacrylamide gel electrophoresis) was performed to investigate protein production and purity using slab gels containing 12% (w/v) polyacrylamide on a Mini-PROTEAN II slab cell apparatus (Bio-Rad, Hercules, CA). Protein samples were visualized by Coomassie brilliant blue R-250 staining.
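The restriction sites built into the two expression primers can be checked with a simple string search. This is an illustrative sketch only: the helper name is hypothetical, and the recognition sequences used (GAATTC for EcoRI, GTCGAC for SalI) are the standard ones rather than values taken from the text.

```python
# Standard recognition sequences (not stated in the text itself).
RECOGNITION_SITES = {"EcoRI": "GAATTC", "SalI": "GTCGAC"}

# Primer sequences as given in the methods, written 5' to 3'.
PRIMERS = {
    "RPS20-F''": "CCGGAATTCGGAACAAGTCGGTCAGGAAG",
    "RPS20-R''": "CCGGTCGACAACCTCAACTCCTGGCTCAA",
}


def find_sites(seq):
    """Map each enzyme found in seq to the 1-based start of its site."""
    found = {}
    for enzyme, site in RECOGNITION_SITES.items():
        pos = seq.find(site)
        if pos != -1:
            found[enzyme] = pos + 1
    return found


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

Each primer carries exactly one of the two sites, preceded by three extra bases at the 5' end, which is consistent with directional cloning into the vector.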

Analysis of the cDNA of RPS20 from the Giant Panda
A cDNA fragment of about 400 bp was amplified from the Giant Panda with primers RPS20-F and RPS20-R (Figure 1). The length of the cloned cDNA is 392 bp. A BLAST search showed that the cloned cDNA sequence shares high homology with RPS20 from several reported mammals, including H. sapiens, P. abelii, Mus musculus, B. taurus and Rattus norvegicus. On the basis of this high identity, we concluded that we had cloned the cDNA encoding the Giant Panda RPS20 protein. The RPS20 cDNA sequence, which contains a 5'-untranslated sequence of 32 bp, was submitted to GenBank (accession number: FJ903447). An ORF of 360 bp encoding 119 amino acids was found in the cDNA (Figure 2).
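The reported sizes are mutually consistent, as a short calculation shows. This sketch assumes that the 360 bp ORF length includes the stop codon (as the asterisk in Figure 2 suggests); the function name is hypothetical.

```python
def protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


CDNA_BP = 392            # cloned cDNA length
ORF_BP = 360             # open reading frame length
FIVE_PRIME_UTR_BP = 32   # 5'-untranslated sequence length

print(protein_length(ORF_BP))        # 119
print(CDNA_BP - FIVE_PRIME_UTR_BP)   # 360
```

The 32 bp 5'-untranslated sequence and the 360 bp ORF together account for the full 392 bp, so on these figures the clone ends at the stop codon.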

Analysis of the genomic sequence of RPS20 from the Giant Panda
A DNA fragment of about 1200 bp was amplified with primers RPS20-F′ and RPS20-R′ (Figure 3). The length of the cloned DNA fragment is 1205 bp. The cDNA sequence and the DNA fragment sequence of RPS20 amplified from the Giant Panda were compared using DNAMAN version 6.0. The result indicated that the cDNA sequence matches 4 segments of the DNA fragment exactly, which shows that the amplified DNA fragment is the genomic sequence of RPS20 from the Giant Panda. The genomic sequence of RPS20 has been submitted to GenBank (accession number: FJ903448).

Overexpression of the RPS20 gene in E. coli
The RPS20 cDNA was amplified by PCR and cloned into the pGEX 4T-1 plasmid, resulting in a gene fusion coding for a protein bearing a GST-tag extension at the N-terminus, and was then overexpressed in E. coli. Expression was tested by SDS-PAGE analysis of protein extracts from recombinant E. coli BL21 (Figure 4). The results indicated that the RPS20 protein fused to the N-terminal GST tag accumulated as a polypeptide of the expected size, 39 kDa, which formed inclusion bodies. The recombinant protein was apparently expressed after half an hour of induction, and the yield kept increasing with induction time.
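The expected size of the fusion protein can be estimated by adding the mass of the tag to the predicted mass of RPS20. This is a rough sketch: the GST tag mass of about 26 kDa is an assumption that is not stated in the text, linker residues are ignored, and the function name is hypothetical.

```python
GST_TAG_KDA = 26.0    # assumption: approximate mass of the GST tag
RPS20_KDA = 13.37271  # predicted mass of RPS20 reported in this paper


def fusion_mass_kda(tag_kda, protein_kda):
    """Approximate fusion protein mass, ignoring any linker residues."""
    return tag_kda + protein_kda


expected = fusion_mass_kda(GST_TAG_KDA, RPS20_KDA)
print(f"{expected:.1f} kDa")  # 39.4 kDa
```

Under this assumption the estimate agrees with the band of about 39 kDa observed on the gel.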

DISCUSSION
The cDNA sequence of RPS20 and the deduced amino acid sequence of the Giant Panda were aligned with those of other reported mammals, including H. sapiens, P. abelii, Mac. fascicularis, Mus musculus, B. taurus and R. norvegicus, using DNAstar Lasergene software. The homologies of the coding sequence between the Giant Panda and the 6 mammals above are 93.1, 92.5, 92.2, 91.1, 90.6 and 90.0%, respectively. The homologies of the deduced amino acid sequence are all 100%, except for P. abelii (99.88%). These results indicate that the coding sequence of RPS20 and the deduced amino acid sequence are highly conserved. At the nucleotide level, the Giant Panda shares the highest homology with H. sapiens. Phylogenetic analysis clearly separated the vertebrates from the invertebrates. Physical and chemical analysis revealed that the molecular weight of the putative RPS20 protein among these mammals is 13.37271 kDa with a theoretical pI of 9.95. The protein possesses extensive secondary structure. A quantitative estimate of the content of α-helices and β-strands shows that 18.5% of the protein sequence is folded in α-helices and 33.6% in β-strands. Of the amino acid residues, 21.85% are basic, 11.76% are acidic, 43.7% are hydrophobic and 59.66% are polar. Most of the basic and hydrophobic residues are scattered throughout the whole RPS20 protein, except for the C-terminus.
The phosphorylation states of several ribosomal proteins are important for their functions (Banham et al., 1993; Simonin et al., 1995; Kim et al., 1996; Patel et al., 1996; Proud, 1996). Topology prediction showed one ribosomal protein S10 signature, one ATP/GTP-binding site motif A (P-loop), four protein kinase C phosphorylation sites and 3 casein kinase II phosphorylation sites in the RPS20 protein of the Giant Panda (Figure 5). Alignment analysis of these RPS20 proteins revealed that the functional sites are entirely identical among these mammals except for P. abelii. Among the polymorphic sites, site 66 is located in a protein kinase C phosphorylation site, and it produces the single difference in functional sites between P. abelii and the other 6 mammalian species. In other words, the mutation at amino acid site 66 results in the loss or gain of 1 protein kinase C phosphorylation site. These facts suggest that the variation at this site has no effect on the structure and function of the RPS20 protein and may have arisen during the evolution of these species. However, what changes in the structure and function of RPS20 are caused by other mutations outside the functional sites needs further study.
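Predicted sites of this kind are commonly found by scanning a protein sequence for short consensus patterns. The sketch below is illustrative and is not the tool used in this study: it assumes the PROSITE-style consensus patterns for protein kinase C sites ([ST]-x-[RK]), casein kinase II sites ([ST]-x(2)-[DE]) and the ATP/GTP-binding site motif A ([AG]-x(4)-G-K-[ST]), and it runs on a made-up toy peptide, not on the Giant Panda RPS20 sequence.

```python
import re

# PROSITE-style consensus patterns rewritten as regular expressions.
MOTIFS = {
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",   # [ST]-x-[RK]
    "CK2_PHOSPHO_SITE": r"[ST]..[DE]",  # [ST]-x(2)-[DE]
    "ATP_GTP_A": r"[AG].{4}GK[ST]",     # [AG]-x(4)-G-K-[ST]
}


def scan_motifs(seq):
    """Return {motif: [(1-based start, matched text), ...]} for seq."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead wrapper lets overlapping matches be reported.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in regex.finditer(seq)]
    return hits


# Hypothetical toy peptide used only to exercise the scanner.
toy = "MASGRKAAGTPEGKSLLTEEDA"
for motif, found in scan_motifs(toy).items():
    print(motif, found)
```

Because the protein kinase C pattern is only three residues long, a single substitution at the first or third position of a site is enough to create or abolish a predicted site, which is the kind of change described for site 66.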
The genomic sequence of RPS20 is 1205 bp in size. A comparison of the genomic and cDNA nucleotide sequences indicated that the genomic sequence of RPS20 possesses 4 exons and 3 introns, which is also supported by restriction mapping of the genomic and cDNA sequences. Compared with some mammals, including H. sapiens (NC_000008), B. taurus (NC_007312), Mus musculus (NC_000070) and R. norvegicus (NC_005104), the 4 exons, which make up the cDNA sequence of the RPS20 gene after RNA splicing, are highly conserved and remain essentially the same. The restriction sites in the exons are the same in both the cDNA and the genomic sequences. In contrast, the genomic sequences, the introns, the 5′-untranslated sequence and the 3′-untranslated sequence differ in length (Table 1). The variations in the lengths of the introns determine the lengths of the RPS20 genes.
The RPS20 gene obtained is expressed efficiently in a prokaryotic organism, E. coli, using the pGEX 4T-1 plasmid, and the fusion protein obtained matches the expected 39 kDa polypeptide. These results suggest that the protein is active and is indeed the protein encoded by RPS20 from the Giant Panda. The expression product obtained could be used for purification and for further study of its function.
The cDNA and the genomic sequence of RPS20 were cloned from the Giant Panda for the first time. Both were sequenced and analyzed preliminarily, and the cDNA of the RPS20 gene was also overexpressed in E. coli BL21; this is the first report on the RPS20 gene from the Giant Panda. The data will enrich and supplement the available information about RPS20. In addition, they will contribute to the protection of gene resources and to the discussion of genetic polymorphism.

Figure 2. Nucleotide sequence of the cDNA encoding the Giant Panda RPS20 and the amino acid sequence deduced from its ORF. * indicates the stop codon.

Figure 4. Proteins extracted from recombinant E. coli BL21 strains were analyzed by SDS-PAGE and stained with Coomassie blue R-250. Numbers on the right show the molecular weight, and the arrow indicates the recombinant protein bands after induction with IPTG for 0, 0.5, 1, 1.5, 2, 2.5, 3 and 4 h (lanes 2 to 9, respectively). Lane 1 shows the products of the E. coli strain carrying the empty vector.

Table 1. Comparison of gene structures.