Microarray based comparative genome-wide expression profiling of major subtypes of leukemia

The uncontrolled proliferation of hematopoietic cells that have lost the capacity to differentiate into mature blood cells leads to leukemia. Although a considerable amount of work has been done on the molecular basis and gene expression profiles of hematologic malignancies, viz. chronic lymphocytic leukemia (CLL), chronic myelogenous leukemia (CML), acute lymphocytic leukemia (ALL) and acute myelogenous leukemia (AML), the roles of the underlying genes and the mechanisms predisposing to the disease are poorly understood. To develop early diagnostic, preventive and therapeutic strategies, population-specific novel mutations and candidate genes need to be identified. Microarray-based gene expression profiling was performed on a total of 18 samples (4 from each subtype of leukemia, that is, CLL, CML, ALL and AML, and 2 controls) from the Indian population using single-color hybridization. The expression of all genes, presented in terms of fold variation, was subjected to an F-test. The microarray data for genes showing differential regulation with respect to the control samples were obtained from a total of 50,238 probes covering 14,992 genes on Agilent's Human 8X60K Array. The experiment was conducted with the expectation of similar patterns of gene expression across subtypes, but it demonstrated a statistically significant relationship only between CML and ALL, which are of myeloid and lymphoid origin, respectively, in contrast to the other combinations. Gene expression profiles of the four subtypes of leukemia were compared with each other to ascertain the overall association and significance of genes in the occurrence of different types of leukemia, which would guide the development of common probable biomarkers for leukemias, followed by effective diagnosis, prognosis and treatment. The most highly upregulated genes found in this study, based on their geomean fold values, are listed.


INTRODUCTION
Leukemia is an uncontrolled proliferation of hematopoietic cells that have no capacity to differentiate normally into mature blood cells. It is generally classified into myeloid and lymphocytic categories based on the affected cell lineages (Sawyers et al., 1991).
Several external agents, such as chemical exposure, treatment with chemotherapeutic agents and radiation, as well as intrinsic factors such as heredity, have been implicated in the development of leukemia (Smith and Zhang, 1998). Human T-cell leukemia/lymphotropic virus type I (HTLV-1) is also a well-established cause of adult T-cell leukemia (Franchini, 1995). Microarray (MA) based gene expression analysis has proved to be an important part of clinical and biomedical research and furnishes vital information on the pathogenesis, diagnosis and prognosis of leukemias by increasing knowledge of the pathways deregulated in leukemia. The precisely positioned DNA probes of a microarray are designed to monitor the expression levels of specific genes in parallel (Dunphy, 2006). Large-scale analyses of differences in gene expression for cancer investigations can be performed with DNA microarray technology (Majeti et al., 2009).
There are several reports of gene expression profiles of chronic myelogenous leukemia (CML) (Nowicki et al., 2003; Cohen et al., 2001) and acute myelogenous leukemia (AML) (Bullinger et al., 2004; Valk et al., 2004) in bulk, whereas in a few studies individual types of leukemia have been directly compared with normal hematopoietic cells (Stirewalt et al., 2008). Here we report, for the first time from the Indian population, comparative gene expression profiles of the four major subtypes of leukemia, viz. CML, chronic lymphocytic leukemia (CLL), AML and acute lymphocytic leukemia (ALL), along with controls.
In leukemias, which are normally associated with a single gene abnormality, viz. a single gene mutation such as C/EBPα, NPMc or FLT3-ITD, or a fusion gene arising from a chromosomal translocation (for example, AML1-ETO, BCR-ABL), global gene expression analysis techniques provide a means of understanding the otherwise cryptic cellular consequences and the disease as a whole. These techniques have also been used extensively to identify prognostic determinants in leukemia patients, as well as to better understand the molecular basis of the response to therapeutic agents in AML (Goswami et al., 2009).
In microarray studies, leukemic thymocytes showed characteristic gene expression patterns that were strongly associated with specific oncogenic transcription factors. Closely related signatures were also found in several samples that lacked activation of known T-ALL oncogenes. This suggests that alternative oncogenic transcription factors are able to initiate similar gene expression patterns (Ferrando et al., 2002). Supervised and unsupervised approaches to microarray analysis showed a distinctive gene expression pattern of the CLL clone in regression (Haslinger et al., 2004). Wang et al. (2004) and Zent et al. (2003) reported that genes such as FGR, PTPN12, IL4R, FCER2 (CD23), TMEM1, TNFRSF1B, CHS1, CCR7 and FMOD, among others, were consistently differentially expressed in CLL when compared with tonsillar B lymphocytes and plasma cells. Comparative gene expression profiling by cDNA microarray analysis of 5315 genes in CML and normal donors revealed at least a 4-fold difference in the mean expression of 263 genes, of which 148 were up-regulated and 115 were down-regulated in the CMLs compared with the normal specimens (Nowicki et al., 2003).
Today, genome-wide gene expression profiling based on DNA microarrays represents one of the most powerful tools in the area of genomics (Liotta and Petricoin, 2000; Ramaswamy and Golub, 2002), since it has become economically feasible and widely accessible, thereby contributing significantly to our understanding of different types of cancers (Care et al., 2003; Kiyoi et al., 1999).
To gain insight into the molecular alterations that cause different types of leukemia, we performed genome-wide comparative gene expression profiling of sixteen cases representative of four different forms of leukemia (four each) and two normal blood samples as controls. The comparative analysis of leukemia genomes helps to reshape and deepen our knowledge of hemato-malignancies, which could have major implications for clinical translation (Hudson et al., 2010) and fulfils the founding concept of The International Cancer Genome Consortium. Our study of the genomes of sixteen leukemia patients emphasizes this evolutionary potential; nevertheless, in-depth studies will be required to translate these outcomes to the healthcare domain.

Selection of patients
Blood samples from clinically diagnosed cases of the four major subtypes of leukemia, viz. CML, AML, ALL and CLL, were collected through approved hospitals following specific protocols, after ethical clearance and with the informed consent of the patients. A total of 18 blood samples, comprising 4 samples of each of the four major types of leukemia and 2 controls, were selected for the analysis (Table 1). The leukemia blood samples and controls were not age and sex matched.

Sample collection
Peripheral blood samples (PBC) of 2.5 ml were collected in PAXgene Blood RNA tubes (Cat. No. 762165, Qiagen) to prevent degradation of the intracellular RNA and stored at -80°C for further experiments.

RNA extraction and target labeling
Total RNA was extracted from all the blood samples using the PAXgene Blood RNA kit (Qiagen, Cat. No. 762174) according to the procedure provided by the manufacturer. RNA integrity was measured using the RNA 6000 Nano LabChip on the 2100 Bioanalyzer. The RNA was considered to be of good quality when the 28S/18S rRNA ratio was greater than or equal to 1.5 and the rRNA contribution was 30% or more. Additionally, the RNA integrity number (RIN) had to be >7.0. Agilent's Quick-Amp Labeling Kit (p/n 5190-0442) was used for labeling. Briefly, both first and second strand cDNA were synthesized by incubating 500 ng of total RNA with 1.2 μl of oligo dT-T7 promoter primer in nuclease-free water at 65°C for 10 min, followed by incubation with 4.0 μl of 5X first strand buffer, 2 μl of 0.1 M dithiothreitol (DTT), 1 μl of 10 mM dNTP mix, 1 μl of 200 U/μl MMLV-RT and 0.5 μl of 40 U/μl RNaseOUT at 40°C for 2 h. Immediately following cDNA synthesis, the reaction mixture was incubated with 2.4 μl of 10 mM Cyanine 3-CTP (Perkin-Elmer, Boston, MA), 20 μl of 4X transcription buffer, 8 μl of NTP mixture, 6 μl of 0.1 M DTT, 0.5 μl of RNaseOUT, 0.6 μl of inorganic pyrophosphatase, 0.8 μl of T7 RNA polymerase and 15.3 μl of nuclease-free water at 40°C for 2 h. Qiagen RNeasy mini spin columns were used to purify the labelled product prior to hybridization. For hybridization, 825 ng of Cyanine 3 labelled cDNA in a volume of 41.8 μl was combined with 1.1 μl of 10X blocking reagent and 2.2 μl of 25X fragmentation buffer and incubated at 60°C for 30 min in the dark.
The fragmented cDNA was mixed with 5.5 μl of 2X hybridization buffer. About 110 μl of the resulting mixture was applied to the Human 8x15K Gene Expression Microarray covering 14,992 genes (AMADID: 035928; Agilent Technologies, USA) and hybridized at 65°C for 17 h in an Agilent Microarray Hybridization Chamber in a hybridization oven. After hybridization, the slides were washed with Agilent gene expression wash buffer I for 1 min at room temperature, followed by a 1 min wash with Agilent gene expression wash buffer II at 37°C. The slides were finally rinsed with acetonitrile for cleaning and drying.
*Corresponding author. E-mail: pramodbgai@gmail.com.
Abbreviations: HTLV-1, Human T-cell leukemia/lymphotropic virus type I; MA, microarray; CML, chronic myelogenous leukemia; AML, acute myelogenous leukemia; CLL, chronic lymphocytic leukemia; ALL, acute lymphocytic leukemia; RIN, RNA integrity number; DTT, dithiothreitol.

Hybridization, scanning, and feature extraction
Scanning of hybridized arrays was performed at a

cDNA microarray data analysis
Feature-extracted data were analyzed using GeneSpring GX version 11.5 software from Agilent. The data were normalized using per-spot, per-chip intensity-dependent lowess normalization. Further quality control of the normalized data was done using a correlation-based condition tree to eliminate bad experiments. Genes differentially regulated by one fold and above were filtered from the data. Differentially regulated genes were clustered using a gene tree to identify significant gene expression patterns (Figure 1).
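The fold-change filtering step described above can be illustrated with a minimal sketch. It is not the GeneSpring procedure itself: the function names, the probe identifiers, the use of a geometric mean across replicates and the reading of "one fold and above" as an absolute log2 fold change of at least 1 are all assumptions made for illustration.

```python
import math
from statistics import geometric_mean


def log2_fold_change(sample_intensities, control_intensities):
    """Log2 ratio of the geometric-mean sample signal to the geometric-mean control signal."""
    ratio = geometric_mean(sample_intensities) / geometric_mean(control_intensities)
    return math.log2(ratio)


def filter_differential(probes, threshold=1.0):
    """Keep probes whose absolute log2 fold change meets the threshold.

    probes maps a (hypothetical) probe id to a pair:
    (normalized intensities in leukemia samples, normalized intensities in controls).
    The threshold of 1.0 is an assumed interpretation of "one fold and above".
    """
    kept = {}
    for probe_id, (samples, controls) in probes.items():
        lfc = log2_fold_change(samples, controls)
        if abs(lfc) >= threshold:
            kept[probe_id] = lfc
    return kept


# Made-up intensities, for illustration only.
example_probes = {
    "probe_up": ([400.0, 400.0], [100.0, 100.0]),
    "probe_flat": ([110.0, 110.0], [100.0, 100.0]),
    "probe_down": ([25.0, 25.0], [100.0, 100.0]),
}
print(filter_differential(example_probes))
```

With these made-up values, the up- and down-regulated probes pass the filter and the nearly unchanged probe is dropped.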

Analysis of variance
To compare the gene expression profiles of normal blood with those of leukemia, the data were grouped. The data from all types of leukemia blood samples were subjected to an F-test in Microsoft Excel 2007.
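As a rough illustration of a two-sample F-test for variances applied to each pair of subtypes, the sketch below computes the variance ratio and compares it with a critical value supplied by the user. It is a simplified stand-in for the Excel tool, not a reproduction of it: the function names are hypothetical, the ratio is taken as the first group over the second, and the critical value is assumed to come from an F table or spreadsheet rather than being computed here.

```python
from itertools import combinations
from statistics import variance

SUBTYPES = ["CML", "AML", "ALL", "CLL"]


def f_statistic(group_1, group_2):
    """Ratio of sample variances (first group over second group)."""
    return variance(group_1) / variance(group_2)


def exceeds_critical(f_calculated, f_critical):
    """True when the calculated F is larger than the critical F at the chosen alpha."""
    return f_calculated > f_critical


# Four subtypes compared pairwise give six combinations.
pairs = list(combinations(SUBTYPES, 2))
print(pairs)

# Values reported in the text for CML vs. ALL.
print(exceeds_critical(1.466842, 1.014786))
```

In practice `f_statistic` would be applied to the log2 fold values of two subtypes, and the result compared with the critical F for the corresponding degrees of freedom at α = 0.05.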

RESULTS AND DISCUSSION
A whole-genome array comprising 50,238 probes was used for comparative gene expression profiling of the four major subtypes of leukemia, viz. CML, AML, CLL and ALL, using the Human 8X60K Array for 16 patient samples (4 for each type of leukemia) and 2 normal peripheral blood samples. The results are presented in Figure 1 and Table 2. Cluster analysis of the differentially regulated genes using a gene tree was performed to identify significant gene expression patterns (Figure 1). The clusters and sub-clusters contain the different leukemia samples in a mixed pattern. The cluster analysis of the normal samples was performed separately.
Fold variation for all probes was obtained as log base 2 values. The fold variation in gene expression for all probes was subjected to a two-sample F-test for variances, in which each type of leukemia was compared with every other type, giving six different combinations. The analysis of variance was carried out with 95% confidence and 5% α error (Table 2). The critical F-values for CML vs. AML, CML vs. CLL, AML vs. ALL, AML vs. CLL and ALL vs. CLL were greater than the calculated F-values, whereas for CML vs. ALL the critical F-value of 1.014786 was smaller than the calculated F-value of 1.466842. The F-test results revealed no significant variation between AML and the other types in terms of whole-genome expression profiling when all the detected probes were compared. Similarly, CLL did not exhibit any significant variation in expression profile with the other types of leukemia. CML and ALL showed no significant relationship with AML and CLL but had a strong relationship with each other, since in the two-sample F-test for variances the critical F-value was smaller than the calculated F-value. The results suggest that there is no significant variation between acute myeloid and chronic myeloid leukemia, both of which are reported to originate from the myeloid line of blood cells. The same result was found for acute lymphocytic and chronic lymphocytic leukemia, whose origin is known to be lymphocytes. When the lymphoid and myeloid lines of blood cells were compared with each other, the results were contrasting: AML vs. CLL showed no significant genome-wide variation in gene expression, whereas CML vs. ALL showed a statistically significant relationship. The experiment was designed with the expectation of similar patterns of results, but it showed strikingly different relationships among the subtypes. The study can be advanced further by targeting a few significantly involved genes in different types of leukemia to find any possible association among them.
The most highly upregulated genes found in our study were ENST00000376881, that is, ZFP57 (zinc finger protein) [Source: HGNC Symbol; Acc: 18791] in CML, LOC390413 (predicted to be similar to 60S ribosomal protein L7) in CLL, THC2585201 in AML and FOXC1 in ALL (Table 3). Any gene found to be significantly pathologically active in more than one type of leukemia could be used to design a common biomarker for early diagnosis.
The Leukemia and Lymphoma Society Facts (2011-2012) state that approximately 31 percent more males than females are living with leukemia, but this result is not universal: one of our previous reports (Modak et al., 2011) found an overall male-to-female ratio of 1.8:1 in leukemia cases, that is, almost twice as many male as female patients. In this particular study, samples with a male-to-female ratio of 1:1.3 were taken for microarray gene expression analysis.

Conclusions
To the best of our knowledge from an exhaustive literature survey, this study is the first of its kind in which the four major subtypes of leukemia were compared with each other for their individual gene expression and to determine the overall association among different types of leukemia. This would lead to the development of common probable biomarkers for leukemias, followed by effective diagnosis, prognosis and treatment. Firstly, the differences in probe expression levels among patients with different types of leukemia may be affected by the different treatments used according to origin or malignancy, because a number of published studies suggest that drugs affect the expression level of most of the distinguished probes or genes (Heuser et al., 2005; Chiaretti et al., 2004; McWeeney et al., 2010). Patients diagnosed with any type of leukemia cannot be left untreated, and drug-induced changes may therefore complicate investigations into the genetic cause of disease transformation that use blood samples from such patients. Nevertheless, no scientific data were found to prove this assumption.
Additionally, the genetic mechanisms and events responsible for disease transformation continue to operate while patients are undergoing medication. Secondly, we used peripheral blood samples from patients of different age and gender. This may give an impression of patient-specific alterations in gene expression. A systematic microarray study of a large number of samples would probably be able to overcome these shortcomings.

Figure 1. Clusters for intra-array quality control. GeneSpring GX 11.5 software was used for normalization. [Normalization used for QC: 75th percentile shift normalization. Percentile shift normalization is a global normalization in which the locations of all the spot intensities in an array are aligned. Each column of the experiment is taken independently, and the nth percentile of the expression values for that array is computed across all spots (n ranges from 0 to 100; n = 50 is the median). This value is then subtracted from the expression value of each entity.] ALL, Acute lymphocytic leukemia; CML, chronic myeloid leukemia; AML, acute myeloid leukemia; CLL, chronic lymphocytic leukemia.
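The percentile shift normalization described in the Figure 1 caption can be sketched as follows. This is an illustrative approximation rather than the GeneSpring implementation: the function names are hypothetical, and both the log2 transform before shifting and the linear-interpolation percentile rule are assumptions.

```python
import math


def percentile(values, p):
    """p-th percentile (0-100) using linear interpolation between sorted ranks."""
    ordered = sorted(values)
    rank = (p / 100) * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def percentile_shift(raw_intensities, p=75):
    """Normalize one array: log2-transform, then subtract the array's p-th percentile.

    Each array (column) is processed independently, as the caption describes.
    """
    log_values = [math.log2(x) for x in raw_intensities]
    shift = percentile(log_values, p)
    return [value - shift for value in log_values]


# Made-up raw intensities for a single array.
print(percentile_shift([2, 4, 8, 16, 32]))
```

After the shift, the chosen percentile of every array sits at zero, which is what aligns the intensity distributions across arrays.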

Table 1. Clinical details of 16 patients with 2 controls.

Table 2. F-test results of all the probes for all types of leukemia. The F-test was carried out with 95% confidence and 5% α error.

Table 3. Few significantly differentially upregulated genes in four leukemia subtypes with respect to clinically tested control blood samples.

The significant association between CML and ALL could be clinically very useful for the trial and administration of drugs. In our study, the origin of the cells, viz. myeloid or lymphoid, does not seem to be a useful parameter for the treatment of leukemia, as the results vary among the different combinations of CML vs. CLL, CML vs. AML, CML vs. ALL, AML vs. ALL, AML vs. CLL and ALL vs. CLL. Nevertheless, we are aware that the present microarray-based gene expression profiling study potentially has several drawbacks.