Comparative analysis of the miRNA profiles from Taenia solium and Taenia asiatica adults

Taenia solium and Taenia asiatica are zoonotic parasites transmitted between pigs and humans, with pigs acting as the intermediate host and humans as the definitive host. The aim of the present study was to compare the microRNA (miRNA) profiles of T. solium and T. asiatica by Solexa deep sequencing and bioinformatic analyses. A total of 18.26 and 15.24 million high-quality reads were obtained for T. solium and T. asiatica, respectively, and three new miRNAs were found in each species. The two cestodes had nearly the same level of total sRNA, but their known miRNAs differed markedly. Guanine (G) was the most frequently used first nucleotide in T. solium, and uracil (U) in T. asiatica. Furthermore, there were great differences in the kinds and copy numbers of miRNAs between the two Taenia species, which might indicate that the two parasites have evolved differently. To our knowledge, this is the first report characterizing the miRNA profiles of these two Taenia cestodes, and it lays a foundation for further functional studies of miRNAs of T. solium and T. asiatica.


INTRODUCTION
Human taeniasis is caused by intestinal infection with adult Taenia spp., with Taenia saginata, Taenia asiatica and Taenia solium as the most common causative agents (Anantaphruti et al., 2007; Ito et al., 2011). Owing to the eating habits of local people in Asian countries, pork and pig visceral organs, including the liver, omentum, serosa and lung, are main dishes in the daily diet. Moreover, in some rural areas, people are likely to consume undercooked or raw meat (Eom and Rim, 1993; Ito et al., 2004; Murell, 2005). This eating habit has led to a high occurrence of taeniasis, especially that caused by T. solium and T. asiatica.
T. solium can be transmitted between humans and pigs (O'Neal et al., 2011). In humans, the larval stage can parasitize the central nervous system, leading to neurocysticercosis with symptoms of epilepsy, or even death (Garcia et al., 2005a, b). This disease is a serious public health problem (Afonso et al., 2008; Sorvillo et al., 2011).
T. asiatica is morphologically and phylogenetically close to T. saginata and T. solium, and the pig is its intermediate host (Eom and Rim, 1993). T. asiatica is an important human parasite in Asian countries including Korea, China, Taiwan, Thailand, Indonesia, Vietnam, Japan and the Philippines (Nakao et al., 2002). This species has T. saginata-like morphology but a T. solium-like life cycle (Bowles and McManus, 1994; Hoberg et al., 2000). Currently, it is still not clear whether T. asiatica causes human cysticercosis, and whether this parasite is also distributed outside Asia (Jeon et al., 2005; Jeon and Eom, 2007).
*Corresponding author. E-mail: chenjiaxu1962@163.com.
Because of their similarities in morphology, identification and differentiation of T. asiatica and T. solium can be difficult (Beveridge and Gregory et al., 1976; Eom et al., 2002; Proctor, 1972). Furthermore, if the samples come from different hosts or different developmental stages, morphology alone will not be enough for identification and differentiation. With the development of molecular technologies, PCR-based techniques have been developed for the identification and differentiation of Taenia species, including T. solium and T. asiatica (Al-Sabi and Kapel, 2011; Jeon et al., 2011a, b). However, there are currently no published data on the microRNA (miRNA) profiles of T. solium or T. asiatica. Herein, we developed an integrative approach combining Solexa deep sequencing with bioinformatic analysis to compare the miRNA profiles of the two Taenia species.

MATERIALS AND METHODS
Samples
Adult Taenia tapeworms were collected from the stools of two patients of Miao nationality in Guizhou province, China. The two samples were identified by morphology. T. solium: The adult measures 2 to 10 m and contains 1000 proglottids. The scolex bears a rostellum with two crowns of hooks (Beaver et al., 1984). The gravid segment measures about 12 by 6 mm, and the lateral uterine branches are visible by the mass of eggs. Gravid segments are less mobile and are usually expelled with the stools, separately or in groups of three to six proglottids (WHO, 1983). T. asiatica: Adults are large tapeworms (mean 341 cm long and 9.5 mm wide) with on average 712 segments. The scolex is spheroidal, with a cuspidal rostellum. The cervical swelling is distinct. Proglottids are rectangular. Anterior proglottids are wide and short; the posterior proglottids are long and narrow.
Free proglottids bear a posterior protuberance. Mature proglottids have two ovary lobes that are unequal in size. The vaginal sphincter is round to oval (Eom and Rim, 1993). The uterus has numerous lateral branches, between 16 and 32. Proglottids leave the host singly and spontaneously, independently of defaecation, as in T. saginata (Fan, 1988; Ito et al., 2003). The samples were also identified by polymerase chain reaction (PCR) (Jeon et al., 2011a, b). Gravid proglottids of each species were incubated separately in physiological saline for 3 h at 37°C and washed three times to remove contamination from the host. The parasites were then stored at -80°C until use.
The two patients were volunteers who accepted treatment with pumpkin seed and betel nut in order to obtain live adult worms (Feng, 1956), and all work was approved by the Chinese Center for Disease Control and Prevention committees.

Total RNA isolation and small RNA preparation
Samples were weighed and ground into a fine powder with a mortar in liquid nitrogen. Total RNA was isolated with Trizol reagent (Invitrogen) according to the manufacturer's protocol. The total RNA was then examined on a 1% agarose gel, and the concentration was determined using a BioPhotometer (Eppendorf). The purified total RNA was stored at -80°C until use.
A Novex 15% TBE-Urea gel (Invitrogen) was used for small RNA isolation according to previously reported protocols (Lau et al., 2001; Chen et al., 2010a, b). The RNA fragments of 20-30 bases were reverse transcribed with an RT-PCR kit (Invitrogen) and purified on a 6% TBE PAGE gel. The purified small cDNA fragments were then stored at -80°C until use.

High-throughput sequencing and computational analysis
High-throughput sequencing was performed with a Solexa sequencer at Huada Genomics Institute Co. Ltd, China. After removal of adaptors, adaptor-adaptor ligation contaminants, low-quality sequences and reads smaller than 18 nt, the total reads of the small cDNA fragments were obtained. Non-coding RNAs were then removed by searching against the GenBank and Rfam (version 9.1) databases (http://www.sanger.ac.uk/software/Rfam/mirna).
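The read-cleaning steps described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the pipeline actually used for this study: the adaptor sequence `ADAPTOR_3P`, the `min_overlap` parameter and the use of an `N` base call as a stand-in for the low-quality filter are all hypothetical; only the 18 nt minimum length comes from the text.

```python
# Placeholder 3' adaptor sequence; the paper does not state the adaptor used.
ADAPTOR_3P = "TCGTATGCCGTCTTCTGCTTG"
MIN_LENGTH = 18  # reads shorter than 18 nt are discarded, as stated in the text


def trim_3p_adaptor(read, adaptor=ADAPTOR_3P, min_overlap=8):
    """Cut the read at the first occurrence of the adaptor's leading bases."""
    pos = read.find(adaptor[:min_overlap])
    return read if pos == -1 else read[:pos]


def clean_reads(raw_reads):
    """Return adaptor-trimmed reads that pass the quality and length filters."""
    cleaned = []
    for read in raw_reads:
        insert = trim_3p_adaptor(read)
        # Assumption: an ambiguous base call marks a low-quality read.
        if "N" in insert:
            continue
        # Adaptor-adaptor ligation products leave an empty or very short
        # insert, so the length filter removes them as well.
        if len(insert) < MIN_LENGTH:
            continue
        cleaned.append(insert)
    return cleaned


if __name__ == "__main__":
    example = [
        "TGAGATCGCGATTACAGATGAT" + ADAPTOR_3P,  # 22 nt insert, kept
        "ACGT" + ADAPTOR_3P,                    # too short, dropped
        ADAPTOR_3P,                             # adaptor only, dropped
    ]
    print(clean_reads(example))
```

In practice a dedicated trimming tool with mismatch-tolerant adaptor matching and per-base quality scores would replace the exact string search used here.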
The copy numbers of conserved miRNAs of the two Taenia species were statistically analyzed to compare their expression. The fold change was obtained with the formula: log2 (copy number of T. solium / copy number of T. asiatica), and the P-value was calculated as described previously (Allen et al., 2005; Schwab et al., 2005; Xu et al., 2010).
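As an illustration of the fold-change formula above, the following Python sketch computes log2 (copy number of T. solium / copy number of T. asiatica) for one miRNA. The function name, the optional `pseudocount` argument and the example copy numbers are assumptions introduced here for illustration; the text does not say how miRNAs with zero copies in one species were handled, and the P-value calculation is not reproduced.

```python
import math


def log2_fold_change(copies_solium, copies_asiatica, pseudocount=0.0):
    """log2(copy number in T. solium / copy number in T. asiatica).

    `pseudocount` is a hypothetical option for miRNAs absent from one
    species; with the default of 0 such miRNAs raise an error instead.
    """
    numerator = copies_solium + pseudocount
    denominator = copies_asiatica + pseudocount
    if numerator <= 0 or denominator <= 0:
        raise ValueError("copy numbers must be positive (or use a pseudocount)")
    return math.log2(numerator / denominator)


if __name__ == "__main__":
    # Made-up copy numbers, not values from Table 3.
    print(log2_fold_change(200, 100))  # positive: higher in T. solium
    print(log2_fold_change(100, 200))  # negative: higher in T. asiatica
```

A positive value means the miRNA has more copies in T. solium, a negative value means more copies in T. asiatica, and a value near zero means similar copy numbers.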

RESULTS
Profile differences of short RNAs from the two cestodes
After Solexa deep sequencing and removal of the 5' and 3' adaptors, contaminants formed by adaptor-adaptor ligation and low-quality tags, 18.59 and 15.54 million reads were obtained for T. solium and T. asiatica, respectively, of which 18.26 and 15.24 million were high-quality reads. Analysis of the length distribution showed some differences between T. solium and T. asiatica. In T. solium, the most abundant reads were 22 nt long (21.42%), followed by reads of 21, 20, 23 and 24 nt (16.86%, 14.14%, 13.71% and 7.53%, respectively) (Figure 1A). In T. asiatica, reads were strongly concentrated at 22 and 21 nt, with percentages of 28.81% and 19.55% (Figure 1B).
After reads smaller than 18 nt were discarded, there were 17.51 and 14.90 million clean reads for T. solium and T. asiatica, with 12.28 and 14.72 million unique reads, respectively. Among the clean reads, 78.63% and 85.79% (mapped to Schistosoma japonicum) were identified as non-coding sRNA. The total percentage of ncRNA was at nearly the same level in the two species. Known miRNAs accounted for 1,931,115 reads (11.02%) in T. solium and 1,131,646 reads (7.06%) in T. asiatica, with 140 (0.01%) and 135 (0.01%) unique reads, respectively (Table 1). As Table 1 shows, the ratio of unique reads was the same in the two species, but there was a large difference in total sRNAs, which is consistent with T. asiatica being a Taenia species independent of T. solium. Apart from miRNAs, the total percentage of other ncRNA in T. solium (8.97%) was nearly twofold that in T. asiatica (4.97%); this comprised rRNA (4.41%), snRNA (0.09%), snoRNA (0.00%) and tRNA (4.47%) in T. solium, compared with 0.65%, 0.02% and 4.30% in T. asiatica. Besides the ncRNAs mentioned above, 1.38 (78.63%) and 1.28 (85.79%) million total reads in T. solium and T. asiatica, respectively, did not match the public databases and were marked as unannotated reads (Table 1).
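The length distribution reported above (Figure 1) is a simple tally of read lengths expressed as percentages. A minimal Python sketch of that calculation is given below; the function name and the toy reads are assumptions for illustration, and the sketch counts every read once, so a table of unique sequences would first need to be expanded or weighted by copy number.

```python
from collections import Counter


def length_distribution(reads):
    """Return {read length: percentage of all reads}, sorted by length."""
    counts = Counter(len(read) for read in reads)
    total = sum(counts.values())
    return {
        length: 100.0 * n / total
        for length, n in sorted(counts.items())
    }


if __name__ == "__main__":
    # Toy reads, not data from the study.
    toy_reads = ["A" * 22, "C" * 22, "G" * 21, "U" * 20]
    for length, pct in length_distribution(toy_reads).items():
        print(f"{length} nt: {pct:.2f}%")
```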

sRNA distribution between T. solium and T. asiatica
A total of 32.41 million reads from T. solium and T. asiatica together were analyzed. The two cestodes shared 21.78 million sRNA reads (67.21%), and only 14.47% (4.69 million reads) and 18.33% (5.94 million reads) were specific to T. solium and T. asiatica, respectively (Table 2 and Figure 2A).
Of the 32.41 million total reads, 2.59 million were unique. Among the unique reads, only 4.35% (0.11 million unique reads) were shared between the two cestodes; 1.12 million (43.11%) were specific to T. solium, while 1.36 million (52.54%) were specific to T. asiatica. The number of T. asiatica-specific unique ncRNAs was thus higher than the number of T. solium-specific ones (Table 2 and Figure 2B).
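The distinction between total reads and unique reads in the comparison above can be made concrete with a short Python sketch. It assumes each library is available as a mapping from sequence to copy number (the names `partition_reads` and `summarize` and the toy data are hypothetical) and splits the sequences into common and species-specific groups, reporting both the unique-sequence share and the total-read share of each group.

```python
from collections import Counter


def summarize(sequences, *count_tables):
    """Return (number of unique sequences, summed copy number)."""
    total = sum(table[seq] for table in count_tables for seq in sequences)
    return len(sequences), total


def partition_reads(counts_a, counts_b):
    """Split two sequence -> copy-number tables into common/specific groups."""
    shared = counts_a.keys() & counts_b.keys()
    only_a = counts_a.keys() - shared
    only_b = counts_b.keys() - shared
    groups = {
        "common": summarize(shared, counts_a, counts_b),
        "a_specific": summarize(only_a, counts_a),
        "b_specific": summarize(only_b, counts_b),
    }
    all_unique = sum(unique for unique, _ in groups.values())
    all_total = sum(total for _, total in groups.values())
    return {
        name: {
            "unique": unique,
            "total": total,
            "unique_pct": 100.0 * unique / all_unique if all_unique else 0.0,
            "total_pct": 100.0 * total / all_total if all_total else 0.0,
        }
        for name, (unique, total) in groups.items()
    }


if __name__ == "__main__":
    # Toy libraries, not data from the study.
    solium = Counter({"AAA": 50, "CCC": 5, "GGG": 5})
    asiatica = Counter({"AAA": 30, "UUU": 10})
    for name, stats in partition_reads(solium, asiatica).items():
        print(name, stats)
```

In the toy data, a single highly abundant shared sequence makes the common group large in total reads but small in unique sequences, which is the same pattern the text reports for the two cestodes.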

Expression difference of known miRNAs
The expression level of known miRNAs in T. solium and   and miR-71 were expressed at nearly equal levels (shown in blue) in both species (Figure 3). Among all known miRNAs of both species, five were expressed with copy numbers higher than 1000 (miR-7-5p, miR-71, miR-277, miR-219-5p and miR-2b-3p). In T. solium, miR-10-5p was expressed with a copy number higher than 100, whereas miR-307 and miR-124-3p had copy numbers lower than 100. The pattern differed somewhat in T. asiatica, where miR-307 and miR-10-5p had copy numbers lower than 100. The copy number of miR-124-3p was higher than 100 in T. asiatica, whereas the number in T. solium was only half of that (Table 3).
miR-124-3p is thus one example of the miRNAs differentially expressed between the two parasites. This miRNA was expressed at a higher level in T. asiatica than in T. solium, which may indicate that it plays a more essential role in the life cycle of T. asiatica.

Nucleotide bias analysis of miRNA profiles
The two Taenia species differed in their first-nucleotide usage. Guanine (G) was the most frequently used first nucleotide in T. solium, with a percentage of 57.40%, followed by U (27.97%) and A (11.30%), while C was seldom used as the first nucleotide (3.30%) (Appendix 1). In T. asiatica, uracil (U) was the most frequently used first nucleotide, with a percentage of 46.84%, followed by C (20.67%) and G (19.46%), while A was seldom used as the first nucleotide (13.03%) (Appendix 1). The nucleotide bias at each position of known miRNAs also differed between T. solium and T. asiatica (Appendix 1). In T. solium, nucleotide A was mainly distributed at positions 8-14, 22 and 23. Nucleotide U had its highest percentages at the 16th (82.31%) and 4th (80.27%) positions and did not appear at the end of miRNAs. Overall, (G+C) was more predominant than (A+U) in the middle of known miRNAs, especially at the 3rd and 8th-11th positions, where it reached more than 80%. Another interesting observation was that the percentage of C was very low at all positions (1-24), ranging from 0 to 9.30%, except at position 21 (26.89%). In T. asiatica, nucleotide A was mainly distributed at the end positions (18-24). Nucleotide U had its highest percentages at the 4th (70.87%) and 16th (70.86%) positions. (A+U) was more predominant than (C+G) at the end of known miRNAs, especially at positions 22-24, where it reached 100%.
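The first-nucleotide and per-position percentages discussed above can be derived from a table of known miRNA sequences and their copy numbers. The Python sketch below shows one way to do this; weighting each sequence by its copy number, the function name `positional_composition` and the toy sequences are assumptions, since the text does not state whether the percentages were computed over total or unique reads.

```python
from collections import Counter


def positional_composition(mirna_counts, max_position=24):
    """Percentage of A, C, G and U at each position (index 0 = position 1).

    `mirna_counts` maps an miRNA sequence to its copy number; weighting by
    copy number is an assumption made for this sketch.
    """
    table = []
    for pos in range(max_position):
        tally = Counter()
        for seq, copies in mirna_counts.items():
            if pos < len(seq):  # shorter miRNAs do not contribute here
                tally[seq[pos]] += copies
        total = sum(tally.values())
        table.append({
            nt: (100.0 * tally[nt] / total if total else 0.0)
            for nt in "ACGU"
        })
    return table


if __name__ == "__main__":
    # Toy sequences and copy numbers, not data from Appendix 1.
    toy = {"UGAG": 3, "GGCA": 1}
    composition = positional_composition(toy, max_position=4)
    print("first nucleotide:", composition[0])
    print("G+C at position 2:", composition[1]["G"] + composition[1]["C"])
```

The first entry of the returned list gives the first-nucleotide bias, and sums such as G+C or A+U at a given position can be read directly from the corresponding entry.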

Novel miRNAs in T. solium and T. asiatica
Because no genome sequences of Taenia spp. are currently available in public databases, the genome of the most closely related species, S. japonicum, was used as the reference genome. A total of 2.90 (16.56%) and 1.71 (11.48%) million reads for T. solium and T. asiatica, respectively, were perfectly mapped, including 19,120 and 10,468 unique reads, respectively. The unannotated unique reads that could not be matched to the S. japonicum genome were marked as potential novel candidates and used for novel miRNA prediction. Altogether, three novel miRNAs were found in each of T. solium and T. asiatica (Appendix 2 and 3).

DISCUSSION
In previous studies, genetic markers such as the first and second internal transcribed spacers (ITS-1, ITS-2) and the mitochondrial cytochrome c oxidase subunit (cox1) have been used to reveal genetic differentiation between T. solium and T. asiatica (Jeon et al., 2011a, b; Nkouawa et al., 2009). Because of their substantial differences in reproductive biology, host preference and gene sequences, these two Taenia taxa are considered separate species (Jeri et al., 2004; Jeon et al., 2011a, b; Nkouawa et al., 2009). T. solium and T. asiatica have complex life cycles and can be transmitted between pigs and humans. The present study compared the miRNA profiles of adult T. solium and T. asiatica. At different developmental stages of their life cycle, Taenia may express different miRNAs to regulate gene expression, which warrants further study.
The present study revealed that the two Taenia cestodes shared a very high percentage of total reads (67.21%) but only a very small percentage of unique reads (4.35%). This is possibly due to highly redundant expression of a few kinds of sRNAs, and these sRNAs might be essential for fundamental metabolism. On the other hand, the total percentage of ncRNA was nearly at the same level in the two cestodes (19.99% in T. solium and 12.57% in T. asiatica).
Although the predominant miRNAs (such as miR-7-5p, miR-71, miR-277, miR-219-5p and miR-2b-3p), which account for most of the copy numbers, were expressed at the same level in the two Taenia taxa, some miRNAs were expressed differently between T. solium and T. asiatica, such as miR-10-5p in T. solium and miR-124-3p in T. asiatica. This indicates that these two closely related Taenia species differ in their gene expression patterns. The two species also differ in the nucleotides used at different positions of their miRNAs.

Conclusions
This study represents the first characterization of T. solium and T. asiatica miRNAs, which will help to better understand and explore the complex biology of these zoonotic parasites and to develop effective control methods for the diseases they cause. The reported data on T. solium and T. asiatica miRNAs should provide a valuable reference for further miRNA studies.

Figure 1. Length distribution of small RNA from T. solium and T. asiatica by Solexa deep sequencing (A: Length distribution of small RNA from T. solium; B: Length distribution of small RNA from T. asiatica).

Figure 2. Coverage of sRNA in T. solium and T. asiatica by Solexa sequencing (A: Coverage of sRNA in T. solium; B: Coverage of sRNA in T. asiatica).

Table 1. Detailed classification of reads of T. solium and T. asiatica.

Table 2. Common and species-specific reads of Taenia solium and Taenia asiatica.