Proteomic analysis of the diagnostic biomarker for childhood infectious mononucleosis

To investigate differences in serum protein spectra between the acute and recovery stages of infectious mononucleosis (IM) in children and to screen for potential protein biomarkers of childhood IM, serum protein fingerprints were obtained from healthy children (controls) and from children with acute upper respiratory infection (AURI), acute IM and recovery IM using surface enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) with gold chips. Data were analyzed with Biomarker Wizard 3.1, and diagnostic models were established with Biomarker Patterns System 5.0. Within the analyzed mass-to-charge (M/Z) range, six protein peaks (five down-expressed and one over-expressed) differed significantly between the acute IM group and the control group (P < 0.05). One down-expressed protein peak differed significantly between the acute and recovery IM groups (P < 0.05). Two protein peaks differed significantly between the acute IM and AURI groups (P < 0.05); one was down-expressed and the other over-expressed. No significant difference in protein expression was found between the recovery IM group and the controls (P > 0.05). Diagnostic models were established from the significant peak and evaluated for their specificity and sensitivity in diagnosing IM. Taken together, the results suggest that the protein at 6421.5 (M/Z value) may be a serum biomarker for IM; a protein database search indicated that the protein at 6421.5 is a new protein; and the diagnostic models based on this peak could accurately distinguish acute IM children from normal children and from recovery IM children. SELDI-TOF-MS is an effective tool for finding disease-related proteins.


INTRODUCTION
Infectious mononucleosis (IM) is an acute infectious disease caused by Epstein-Barr virus (EBV) infection. It is commonly seen in children and adolescents.
Normally, the course of acute IM lasts 2 to 3 weeks. Because EBV infection is latent, the symptoms (such as low-grade fever, lymphadenopathy and fatigue) can last from several weeks to several months. In some cases, laboratory abnormalities also resolve very slowly. IM is a self-limited disease and has a good prognosis except in cases with complications.
IM can be complicated by upper respiratory tract obstruction, spontaneous rupture of the spleen, acute diffuse encephalomyelitis, etc. Although the incidence of these complications is very low, they can be fatal (Bahadori et al., 2007; Khoo et al., 2007; Stephenson and Dubois, 2007). Some IM cases are also complicated by cholecystitis, appendix mass and symptoms of other systems (Daffinoti et al., 2011; Keramidas et al., 2007; Lagona et al., 2007). Moreover, the association of IM with hemophagocytic syndrome has attracted increasing attention in recent years. EBV is also an important tumor-related virus. It is closely correlated with lymphoma, gastric carcinoma, nasopharyngeal carcinoma, post-transplant lymphoproliferative syndrome, etc. Thus, IM poses a serious threat to children's health.
*Corresponding author. E-mail: wenjunliucn@163.com.
Research has shown that 1% of the population worldwide suffers from EBV-related tumors (Kimura et al., 2008; Delecluse et al., 2007). The main methods for diagnosing EBV infection are serological methods that detect specific antibodies such as IgG and IgM. However, these antibodies may not yield positive results or reach a detectable titer, and they are not detected by molecular biological methods such as PCR or in situ hybridization. In addition, because IM lacks specific signs and symptoms, missed diagnosis and misdiagnosis often occur. Therefore, there is an urgent need for a specific biomarker that can serve as a laboratory parameter for early IM diagnosis, monitoring of the patient's condition and follow-up.
Serum proteomics offers a solution to this problem. Proteomics is a recently emerged discipline that studies all the proteins expressed in a particular cell, tissue or organism, as well as their activities (Petricoin et al., 2002). Proteins are the end products of gene expression. Because physiological changes in organs and tissues lead to proteomic changes in blood, and different diseases have different serum polypeptide spectra, a proteomic approach may help to obtain a biomarker for IM directly (Petricoin and Liotta, 2004; Wulfkuhle et al., 2003).
Diseases lead to dynamic changes in proteins, and proteins whose early changes can be confirmed have the potential to become early clinical diagnostic indicators (Garrisi et al., 2008; Mehrotra and Dwijendra, 2011; Somasundaram et al., 2009; Whelan et al., 2008). Dynamic observation of proteins can therefore identify indicators of disease at an early stage. At present, the main techniques in proteomics include two-dimensional gel electrophoresis, mass spectrometry, bioinformatics and SELDI-TOF-MS (Marshall et al., 2003; Merrell et al., 2004; Pandey and Mann, 2000). SELDI-TOF-MS is a proteomic technique that has emerged in recent years. It can be applied directly to samples such as serum, urine, cerebrospinal fluid and serous cavity effusion without any special treatment, which represents a great advance in the clinical application of mass spectrometry (Cadron et al., 2009; Wu et al., 2006). On this basis, sera from healthy children and from children with AURI (acute upper respiratory infection), acute IM and recovery IM were analyzed in this study using surface enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS).
The pathogenesis of IM was investigated from the perspective of protein expression, and a specific biomarker was screened out for early IM diagnosis, monitoring of the patient's condition after treatment and follow-up.

MATERIALS AND METHODS
Patients, controls and serum samples
All cases were collected from January 2010 to March 2011. The subjects were divided into four groups: the control group (randomly selected from children who underwent a physical examination in the Department of Paediatrics and Child Health), the AURI (acute upper respiratory infection) group, the acute IM group and the recovery IM group. Patients in the latter three groups were recruited from the Department of Paediatrics of the Affiliated Hospital of Luzhou Medical College, Sichuan Province, China. The control group comprised 11 healthy children (5 boys and 6 girls) with an average age of 3.6 ± 1.6 years. The AURI group comprised 12 children (7 boys and 5 girls) with an average age of 3.7 ± 1.5 years, lymphocytes > 50%, atypical lymphocytes of 1 to 9% and a negative serological test. The acute IM group comprised 26 children (12 boys and 14 girls) with an average age of 3.5 ± 1.4 years. Children were assigned to the acute IM group according to the diagnostic criteria for IM in the seventh edition of Zhu Futang Textbook of Pediatrics (Hu and Jiang, 2005); all were newly diagnosed and had not received any treatment. The recovery group comprised 18 children (8 boys and 10 girls) with an average age of 4.5 ± 4.7 years. All had been diagnosed with IM and had received symptomatic treatment with ganciclovir and other drugs for 14 days; routine blood tests showed that the percentage of atypical lymphocytes was below 10%, and the clinical symptoms had resolved (fever, sore throat and cervical lymph node swelling had disappeared). There were no significant differences in age or sex among the four groups.
The diagnostic criteria for IM were based on Zhu Futang Textbook of Pediatrics (seventh edition) (Hu and Jiang, 2005). Clinical symptoms (at least three of the following must be present): (1) fever; (2) sore throat; (3) swollen lymph glands; (4) an enlarged liver; (5) an enlarged spleen. Hemogram: at least 50% lymphocytes or a total lymphocyte count greater than 5.0 × 10^9/L in peripheral blood, and at least 10% atypical lymphocytes or a total atypical lymphocyte count greater than 1.0 × 10^9/L in peripheral blood. Epstein-Barr virus antibody test: the antibody to Epstein-Barr nuclear antigen (EBNA) is negative in the acute phase, together with one of the following: (1) the VCA-IgG antibody is positive in the initial stage and later turns negative; (2) paired serum VCA-IgG antibody titers > 1:4; (3) there is a transient increase in EA antibody; (4) the VCA-IgG antibody is positive in the initial stage and the EBNA antibody later turns positive.

Serum collection and processing
Venous blood (3 ml) was collected and kept at 4°C. After 1 h, the blood sample was centrifuged at 3000 rpm for 5 min at 4°C. The serum was transferred into an Eppendorf tube and centrifuged again for 5 min. A 50 μl aliquot was transferred into a tube and stored at -80°C for later use. Before use, the serum sample was thawed on ice and then centrifuged at 10000 rpm for 2 min at 4°C. The isolated serum was mixed with 10 μl of phosphate-buffered saline (PBS) at a fixed ratio; twice the volume of half-saturated sinapinic acid (SPA) was then added, and the mixture was mixed well and allowed to stand for 5 to 10 min.

Chip pretreatment and sample detection
Preliminary experiments were carried out to find a chip that gave good protein separation and clear differences between protein peaks. The gold chip resolved protein peaks well, was easy to operate and was low in cost. Gold chips were mounted in the sample injector and the chip numbers were recorded. 50 μl of acetone was applied to each sample injection hole of the chip; the chip was placed in the chromatography freezer and patted dry after 5 min (600 rpm). 50 μl of hydrochloric acid was then added, and the chip was placed in the chromatography freezer and patted dry after 5 min (600 rpm). Next, 50 μl of hydrochloric acid/methanol mixture was added, and the chip was placed in the chromatography freezer and patted dry after 5 min (600 rpm). Lastly, 50 μl of methanol was added, and the chip was placed in the chromatography freezer, patted dry after 5 min (600 rpm) and then air-dried for later use (the sample must be injected within 10 min). 2 μl of serum solution was added to each hole of the activated gold chip. Holes were reserved for the controls.
After drying, 1 μl of half-saturated SPA was added to each hole. The samples were detected after drying.

Data collection
Before all samples were tested, the instrument was calibrated with All-in-one protein standard molecular chips (Ciphergen, USA) to a mass deviation ≤ 0.1%. The serum proteins bound to the chip surface were detected with a PBS II/C protein fingerprint spectrometer (Ciphergen Biosystems, Fremont, CA, USA). The optimized M/Z range was set to 2000 to 20000 with a highest detectable molecular weight of 100000; the laser intensity was set to 210 and the detector sensitivity to 9. Raw data were automatically collected and stored by ProteinChip Biomarker Software version 3.1 (Ciphergen Biosystems, Fremont, CA, USA). All obtained serum protein spectra were pre-processed.
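As a worked illustration of the calibration tolerance, the sketch below converts a relative mass deviation of 0.1% into an absolute M/Z window. The function names are hypothetical, and applying the tolerance to the 6421.5 peak is only an example; it is not a description of how the instrument software performs calibration.

```python
def mass_window(mz, max_deviation=0.001):
    """Return the (low, high) M/Z window for a relative deviation (0.001 = 0.1%)."""
    delta = mz * max_deviation
    return mz - delta, mz + delta


def within_tolerance(observed, expected, max_deviation=0.001):
    """True if the observed M/Z lies within the relative tolerance of the expected M/Z."""
    return abs(observed - expected) <= expected * max_deviation


# Example: the window around M/Z 6421.5 at a 0.1% mass deviation.
low, high = mass_window(6421.5)
print(f"{low:.4f} to {high:.4f}")
```

At this tolerance, a peak expected at 6421.5 may be observed roughly 6.4 M/Z units to either side.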

The establishment of IM serum fingerprint screening models
The marker protein peak was used to construct diagnostic models for acute IM, IM staging (acute and recovery) and recovery IM, respectively. For each model, samples were divided into a modeling group and a verification group by the blind method. The values of the marker protein peak in the modeling group were input into Biomarker Pattern Software (BPS) to establish the diagnostic model, and the values of the marker protein peak in the verification group were then input into the corresponding diagnostic model for verification.

Statistical analysis and differentially expressed protein screening
The serum protein fingerprints of the four groups were analyzed with Biomarker Wizard 3.1 software. The signal-to-noise ratio (S/N) was set to 5 and 2, respectively, for peak filtering.
The frequency threshold for a significant protein peak was 15%. The F test was used to compare the preliminarily screened protein peaks among groups, and P < 0.05 was considered statistically significant. Grouped data were further analyzed with SPSS 11.5 software, and the M/Z value of the protein peak that was statistically significant and could serve as the serum biomarker was screened out. The data were also supplied to a back-propagation (BP) artificial neural network to establish the neural network diagnostic models.
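The screening logic can be sketched as follows. This is a minimal illustration, assuming that the F test is a one-way analysis of variance across groups and that a peak passes the frequency filter when it is detected in at least 15% of spectra; both are assumptions, the intensities are invented, and the function names are hypothetical. The P value (which requires the F distribution) is not computed here.

```python
from statistics import mean


def passes_frequency(detected, threshold=0.15):
    """True if the peak is detected in at least `threshold` of the spectra."""
    return sum(detected) / len(detected) >= threshold


def f_statistic(groups):
    """One-way ANOVA F statistic for per-group lists of peak intensities."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of samples
    group_means = [mean(g) for g in groups]
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Invented intensities for one peak in three groups (illustration only).
example = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(f_statistic(example))
```

A larger F value indicates that the between-group variation in peak intensity is large relative to the within-group variation; the statistic would then be compared against the F distribution with (k - 1, n - k) degrees of freedom to obtain P.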

Verification of differentially expressed proteins
The M/Z values that showed significant differences were input into the Protein Data Bank (PDB), and the proteins corresponding to these values were obtained.

RESULTS
Analysis of mass spectrometric detection
The optimized M/Z range for detection of the 67 serum samples was 2000 to 20000. Protein spectra of all serum samples were obtained from the proteins bound to the gold chips using the PBS II/C protein fingerprint spectrometer, and these spectra were analyzed. Six protein peaks differed significantly between the acute IM group and the control group (P < 0.05), of which one was over-expressed and five were down-expressed. One protein peak was significantly higher in the acute IM group than in the recovery group (P < 0.05). Five protein peaks in the recovery group differed significantly from the control group (P < 0.05), of which one was over-expressed and four were down-expressed. Two protein peaks differed significantly between the acute IM group and the AURI group (P < 0.05), of which one was over-expressed, but no significant difference was found between the recovery group and the AURI group (P > 0.05). The M/Z value of every over-expressed protein peak was 6421.5. The protein peak at an M/Z value of 6421.5 was low in the control group, higher in the AURI group and highest in the acute IM group.
In addition, the protein peak at 6421.5 (M/Z value) in the recovery group was lower than that in the acute IM group but higher than that in the control or AURI group (Table 1, Figures 1 and 2).

Protein identification
The M/Z values of the protein peaks that differed significantly among groups were input into PDB, and the corresponding proteins were obtained (Table 2).

Establishment of diagnostic models
Analysis with Biomarker Wizard 3.1 software showed that the protein peak at 6421.5 was low in the control group, higher in the recovery group and highest in the acute IM group (Figure 1). Pairwise comparisons among these three groups showed significant differences at this value. The protein corresponding to 6421.5 was therefore selected as the serum biomarker in this study, and the diagnostic models were built on it. The diagnostic model for acute IM was constructed from the protein peak at 6421.5. The 37 samples were divided into a modeling group (n = 19) and a verification group (n = 18). The peak values at 6421.5 (M/Z) of the 19 modeling samples were input into BPS, and the diagnostic model for acute IM was constructed. This model included an input layer, a hidden layer and an output layer. The output value was bounded between 0 and 1; 1 and 0 were the desired output values for acute IM patients and controls, respectively (that is, an output close to 1 was judged as patient and an output close to 0 as healthy, with 0.5 as the criterion).
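The structure of such a model can be sketched as a small feed-forward network with one input, two hidden nodes and one output, whose output is compared with the 0.5 criterion. This is only an illustration: the sigmoid activation, the weights, the biases and the intensity values are all hypothetical and are not the fitted parameters of the model in this study.

```python
import math


def sigmoid(x):
    """Logistic function; maps any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(peak, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: one input value, len(w_hidden) hidden nodes, one output."""
    hidden = [sigmoid(w * peak + b) for w, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def classify(output, cutoff=0.5):
    """Outputs at or above the cutoff are judged as acute IM, otherwise as control."""
    return "acute IM" if output >= cutoff else "control"


# Hypothetical parameters for a 1-2-1 network (not taken from the study).
W_HIDDEN, B_HIDDEN = [1.0, -1.0], [-5.0, 5.0]
W_OUT, B_OUT = [4.0, -4.0], 0.0

for intensity in (1.0, 10.0):  # invented peak intensities at M/Z 6421.5
    out = forward(intensity, W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
    print(intensity, round(out, 3), classify(out))
```

With these illustrative parameters, a low intensity gives an output near 0 and a high intensity gives an output near 1, mirroring the way the desired outputs of 0 and 1 are separated by the 0.5 criterion.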
Our results showed that BPS could discriminate the 19 modeling samples from each other when the input layer had 1 node, the hidden layer 2 nodes and the output layer 1 node. When the 18 verification samples were tested by the blind method, a sensitivity of 100% and a specificity of 100% were obtained without any misclassification (Table 3).
The diagnostic model for IM staging was constructed from the protein peak at 6421.5. The 44 samples were divided into a modeling group (n = 22) and a verification group (n = 22). The peak values at 6421.5 (M/Z) of the 22 modeling samples were input into BPS, and the diagnostic model for IM staging was constructed. This model included an input layer, a hidden layer and an output layer. The desired outputs for recovery IM and acute IM were set to 1 and 0, respectively, and 0.5 was taken as the criterion. Our results showed that BPS could discriminate the 22 modeling samples from each other when the input layer had 1 node, the hidden layer 2 nodes and the output layer 1 node. When the 22 verification samples were tested by the blind method, a sensitivity of 88.9% and a specificity of 84.6% were obtained with three misclassifications (Table 4).
The diagnostic model for recovery IM was constructed from the protein peak at 6421.5. The 29 samples were divided into a modeling group (n = 15) and a verification group (n = 14). The peak values at 6421.5 (M/Z) of the 15 modeling samples were input into BPS, and the diagnostic model for recovery IM was constructed. This model included an input layer, a hidden layer and an output layer. The desired outputs for recovery IM children and normal children were set to 1 and 0, respectively, and 0.5 was taken as the criterion. Our results showed that BPS could discriminate the 15 modeling samples from each other when the input layer had 1 node, the hidden layer 2 nodes and the output layer 1 node. When the 14 verification samples were tested by the blind method, a sensitivity of 100% and a specificity of 100% were obtained without any misclassification (Table 5).
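The reported sensitivity and specificity follow from the counts of correct and incorrect predictions in the verification group. The sketch below shows the arithmetic for the IM staging model; the counts used (9 positives with 1 missed, 13 negatives with 2 misjudged) are an assumption chosen to be consistent with the reported 22 samples, three misclassifications, 88.9% and 84.6%, and are not taken from Table 4.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true positives detected
    specificity = tn / (tn + fp)  # fraction of true negatives detected
    return sensitivity, specificity


# Assumed counts consistent with the reported staging results (22 samples, 3 errors).
sens, spec = sensitivity_specificity(tp=8, fn=1, tn=11, fp=2)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

For the acute IM and recovery IM models, which had no misclassification, both ratios equal 1, that is, 100%.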

DISCUSSION
Infectious mononucleosis (IM) is an uncommon, benign, self-limiting lymphoproliferative disorder characterized by primary EBV infection of B lymphocytes and massive proliferation of activated cytotoxic T cells (Verbeke et al., 2000). EBV is the principal etiological agent of IM and is associated with several human malignant diseases such as Hodgkin's lymphoma, nasopharyngeal carcinoma, gastric carcinoma and other carcinomas. At least 125,000 new cases of IM are reported in the United States each year, and 200,000 new cases of EBV-associated malignancies are reported each year worldwide (Cohen et al., 2011). Participants at the February 2011 meeting at the U.S. National Institutes of Health on Epstein-Barr virus (EBV) vaccine research recommended that future clinical trials have two goals, namely prevention of infectious mononucleosis and prevention of EBV-associated cancers, facilitated by the identification of disease-predictive surrogate markers (Cohen et al., 2011). The reliable bases for IM diagnosis include epidemiological data, typical clinical manifestations (fever, pharyngotonsillitis, lymphadenopathy and hepatosplenomegaly), atypical lymphocytes in peripheral blood > 10%, positive specific IgM (that is, VCA-IgM), etc. (Bell et al., 2006; Cheng et al., 2007; Siennicka and Trzcinska, 2007), of which the increase in atypical lymphocytes and positive specific IgM are the two most important. However, owing to the diversity of serological responses to EBV infection, some cases may not show an increase in atypical lymphocytes (< 10%).
One study showed that patients with atypical lymphocytes > 10% accounted for 41.8% of IM patients, and that children with atypical lymphocytes > 10% accounted for 21.39% in the first week of the IM course and 71.94% in the second week (Tsai et al., 2005). Meanwhile, the positive rate of specific VCA-IgM was 25% in children with acute IM, and some cases showed delayed appearance, lasting absence or long-term persistence of anti-VCA-IgM (Dohno et al., 2010). Besides the main methods of diagnosing EBV infection, which detect specific antibodies, molecular biological methods such as PCR or in situ hybridization have recently drawn more attention (Bocian et al., 2011). A review of articles on the diagnosis of IM found that the evaluated diagnostic methods were real-time PCR (RT-PCR), IgM/IgG antibodies, measurement of Epstein-Barr virus viral load (EBV-VL) in peripheral blood, neutrophil/lymphocyte/monocyte counts, C-reactive protein values and the monospot test.
RT-PCR and measurement of EBV-VL may provide useful tools for the early diagnosis of infectious mononucleosis in cases with inconclusive serological results (Vouloumanou et al., 2012). Flow cytometric (FC) immunophenotyping is a method of choice in the diagnosis of lymphoproliferative disorders. In FC analysis of a case of acute IM, the lymphocytes showed good expression of HLA-DR along with partial down-regulation of CD5. Serological testing showed IgM antibodies against the EBVN1 antigen of EBV at a significant titer, confirming the diagnosis of acute IM due to EBV infection (Tembhare et al., 2010). Children with IM expressed a higher level of CCR3+ and a lower level of CCR5+, and there was a tendency toward Th2 polarization with an imbalance of T helper cell subsets. CCR3+ and CCR5+ may be important targets for judging the severity of IM (Qi et al., 2011). Another study showed that IL-18 was markedly elevated during acute EBV infections and EBV-associated diseases, while ferritin concentrations were also elevated during acute EBV infection and correlated with IL-18. Therefore, IL-18 and ferritin may represent infection markers for viral infections such as EBV, similar to CRP for bacterial infections (van de Veerdonk et al., 2012). A study that analyzed genotypes in children with infectious mononucleosis (IM) and acute lymphocytic leukemia (ALL) found that children carrying the GSTT1 or GSTM1 null genotype have a high risk of IM or ALL. GSTT1 and GSTM1 might play a potential role in the pathogenesis of both IM and ALL (Li et al., 2012). Epstein-Barr virus (EBV) genotypes can be distinguished by gene sequence differences in EBV nuclear antigens 2, 3A, 3B and 3C, and in the BZLF1 promoter zone (Zp).
A novel variant previously identified in Chinese children with infectious mononucleosis, Zp-V1, was also found in 3 of 18 samples of infectious mononucleosis, where it coexisted with the Zp-P prototype. The expression levels of 29 chronic active EBV infection-associated cellular genes were also compared in the three EBV-related disorders using quantitative real-time reverse transcription polymerase chain reaction analysis. Two upregulated genes, RIPK2 and CDH9, were identified as common specific markers for chronic active EBV infection in both in vitro and in vivo studies (Imajoh et al., 2012). Because there are no specific signs and symptoms of IM, it is not feasible to diagnose IM from a single laboratory parameter, which clearly increases the difficulty of diagnosis. Different diagnostic methods are continuously being explored to find significant serum biomarkers for early IM diagnosis, monitoring of the patient's condition after treatment and follow-up. One study indicated that the rapid and simple IMFA is suitable for point-of-care testing and may be used as a first-line assay for the diagnosis of EBV IM, especially in young children (Bravo et al., 2009). In this study, we obtained serum protein mass spectra from the acute IM group, the recovery IM group, the control group and the AURI group using SELDI-TOF-MS.
Our results showed that a total of six significantly different protein peaks were screened out among the four groups. Six protein peaks differed significantly between the acute IM group and the control group (P < 0.05), of which one was over-expressed and the other five were down-expressed. Children with infectious mononucleosis have cellular immune dysfunction. The protein corresponding to the over-expressed peak in the initial stage may represent cytokines secreted by EBV-infected cells, including EBV early antibody components, whereas the proteins corresponding to the down-expressed peaks may reflect decreased secretion by normal cells infected with EBV. Further analysis and identification of these proteins would be useful for studying the pathogenesis and the early diagnosis of IM. One protein peak differed significantly between the acute IM group and the recovery IM group, and the M/Z value of this peak was 6421.5. The expression of this protein peak also decreased as the clinical symptoms were relieved after treatment. A previous study showed that patients with IM have secondary humoral immunosuppression, which continues for a long time after recovery from the disease (Wang et al., 2008). Many down-expressed protein peaks were still present in the recovery stage, which indicates that EBV persistently affects the secretory function of normal cells and leads to low cytokine levels.
Detecting and identifying these low-expression proteins would help to reveal the pathological mechanisms of the recovery stage, monitor the situation after treatment and provide a theoretical basis for later follow-up and medication. Robertson et al. (2003) reported that low-affinity anti-VCA-IgG antibodies could be detected within 10 days of the onset of clinical symptoms in more than 90% of primary acute EBV infection cases, and could still be detected in 50% of cases 30 days later.
Our results showed that the protein peak at 6421.5 was notably increased in the acute IM group compared with the recovery IM group. After screening, a diagnostic model for acute IM, a model for IM staging (acute and recovery stages) and a model for recovery IM were established, taking the protein peak at 6421.5 as the serum biomarker for IM; the models based on this peak could accurately discriminate among acute IM, recovery IM and the controls. The protein database search suggested that the protein corresponding to 6421.5 is very likely to be a new protein. Thus, further analysis and verification of the expression intensity of proteins at different IM stages, as well as in IM-related diseases, will be of great significance for early IM diagnosis, monitoring of the condition after treatment, follow-up and prognosis.
In summary, we discovered a protein peak that could discriminate pediatric IM from healthy controls. This marker is likely to be limited to distinguishing pediatric IM from healthy controls. Further studies with additional populations or using pre-diagnostic sera are needed to confirm the importance of these findings as diagnostic markers of pediatric IM.

Figure 1. The protein peak expressions at the M/Z value of 6421.5 in different groups. Black represents the acute IM group; wine red represents the recovery IM group; white represents the AURI group; light blue represents the control group.

Figure 2. The typical expression spectra of the protein peaks at 6421.5 in different groups. From top to bottom, the four spectra represent the control group, the AURI group, the acute IM group and the recovery IM group.

Table 1. The expressions of differentially expressed proteins in different groups based on serum protein spectra.

Table 2. The proteins corresponding to the M/Z values.
Note: M/Z, mass-to-charge ratio; Number, the number of proteins corresponding to one peak; MW, molecular weight of the protein; PI, isoelectric point of each protein.

Table 3. Prediction results of 18 blind test samples by BPS.

Table 4. Prediction results of 22 blind test samples by BPS.

Table 5. Prediction results of 14 blind test samples by BPS.