Identify potential chronic heart failure related transcription factors by network analysis

Chronic heart failure (CHF) is characterized by diminished cardiac output and pooling of blood in the venous system. Using transcriptome data, we constructed a regulation network to identify potential genes related to heart failure. Some transcription factors in the network, and their target genes, have been shown in previous studies to be related to heart failure. From the regulation network, we also identified transcription factors and target genes that have not previously been shown to be directly related to heart failure, and we found that MYC, TP53 and ETS2 regulate each other. Our work demonstrates that regulation network analysis is useful for identifying candidate genes in heart failure.

As a global approach, DNA microarray analysis has been applied to investigate physiological mechanisms in health and disease (Spies et al., 2002). High-throughput microarray experiments have been designed to analyze gene expression patterns and identify potential target genes for heart failure (Xu et al., 2010). Genomic expression profiling has evolved into a useful tool for identifying novel pathomechanisms in human cardiac disorders (Verducci et al., 2006).
To investigate the regulation mechanisms of CHF, we constructed a regulation network of DCM-induced transcription factors and their target genes. Based on this network, a pathway regulation network was constructed and analyzed. The regulation mechanism of CHF is then discussed at these different levels.

Transcriptome data
The transcription profile GSE9128 of ischemic cardiomyopathy,

Pathway data
The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a collection of online databases dealing with genomes, enzymatic pathways, and biological chemicals (Kanehisa, 2002). The "pathway" database records networks of molecular interactions in cells, and variants of them specific to particular organisms (http://www.genome.jp/kegg/). A total of 130 pathways involving 2287 genes were collected from KEGG.

Regulatory relationship data
The TRANSFAC database contains data on transcription factors, their experimentally proven binding sites, and regulated genes (Wingender, 2008). The Transcriptional Regulatory Element Database (TRED, http://rulai.cshl.edu/TRED/) was built in response to the increasing need for an integrated repository of both cis- and trans-regulatory elements in mammals (Jiang et al., 2007). TRED collects transcriptional regulation information, including transcription factor binding motifs and experimental evidence. Its curation currently focuses on target genes of 36 cancer-related TF families. A total of 774 regulatory relationships between 219 transcription factors (TFs) and 265 target genes were collected from TRANSFAC (http://www.gene-regulation.com/pub/databases.html), and 5722 regulatory relationships between 102 TFs and 2920 target genes were collected from TRED (http://rulai.cshl.edu/TRED/). Combining the two sets of regulation data yielded a total of 6328 regulatory relationships between 276 transcription factors and 3002 target genes (Table 1).
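The combination step can be pictured as a set union over (TF, target) pairs. The combined total (6328) is smaller than the sum of the two sources (774 + 5722 = 6496), which would be consistent with pairs present in both databases being counted once; that reading is an inference, not something stated above. The following is a minimal Python sketch under that assumption. The function names and the toy records are hypothetical and are not taken from TRANSFAC or TRED.

```python
def merge_regulatory_pairs(*sources):
    """Union several collections of (tf, target) pairs, counting duplicates once."""
    merged = set()
    for pairs in sources:
        for tf, target in pairs:
            # Normalise symbols so that "tf_b" and "TF_B" collapse to one entry.
            merged.add((tf.strip().upper(), target.strip().upper()))
    return merged


def summarize(pairs):
    """Count distinct pairs, transcription factors and target genes."""
    return {
        "pairs": len(pairs),
        "tfs": len({tf for tf, _ in pairs}),
        "targets": len({target for _, target in pairs}),
    }


# Toy placeholder records (hypothetical), not real database content.
source_a = [("TF_A", "GENE_1"), ("TF_B", "GENE_2")]
source_b = [("TF_A", "GENE_1"), ("TF_C", "TF_A"), ("tf_b", "GENE_3")]

combined = merge_regulatory_pairs(source_a, source_b)
print(summarize(combined))
```

Note that a gene can appear both as a TF and as a target (here `TF_A`), so the TF and target counts are tallied independently.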

Differentially expressed genes analysis
For the GSE9128 dataset, the Limma method (Smyth, 2004) was used to identify differentially expressed genes (DEGs). The original expression datasets from all conditions were preprocessed into expression estimates using the RMA method with the default settings implemented in Bioconductor, and a linear model was then fitted. Genes with a fold change larger than 2 and a p-value less than 0.05 were selected as DEGs.
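The selection rule can be sketched as a simple filter over per-gene statistics. The Python sketch below does not reimplement Limma; it assumes that log2 fold changes and p-values have already been exported from the fitted model, and it assumes that the fold-change cutoff applies in both directions (up- and down-regulation). The function name, the input layout and the toy values are hypothetical.

```python
import math


def select_degs(stats, fc_cutoff=2.0, p_cutoff=0.05):
    """Return genes passing both cutoffs.

    stats maps gene -> (log2_fold_change, p_value), e.g. as exported from a
    fitted linear model (assumed input format).
    """
    log2_cutoff = math.log2(fc_cutoff)
    return sorted(
        gene
        for gene, (log2_fc, p_value) in stats.items()
        # Assumption: the cutoff is symmetric, so |log2 FC| is compared.
        if abs(log2_fc) > log2_cutoff and p_value < p_cutoff
    )


# Toy values for illustration only.
toy_stats = {
    "GENE_1": (1.5, 0.01),   # up, significant
    "GENE_2": (-1.2, 0.03),  # down, significant
    "GENE_3": (0.4, 0.001),  # significant but small change
    "GENE_4": (2.0, 0.20),   # large change but not significant
}
degs = select_degs(toy_stats)
print(degs)
```

Passing `fc_cutoff=1.5` to the same function gives the looser fold-change threshold.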

Co-expression analysis
To find potential co-expression regulation, the Pearson correlation coefficient (PCC) was calculated for all pairwise comparisons of gene expression values between TFs and the DEGs. Regulatory relationships with an absolute PCC value larger than 0.6 were considered significant.
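As an illustration of the co-expression filter, the following Python sketch computes the PCC between each TF and each DEG across samples and keeps pairs with |PCC| > 0.6. It is a minimal sketch, assuming expression values are available as one list per gene with samples in the same order; the function names and the toy expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / math.sqrt(sxx * syy)


def coexpressed_pairs(expr, tfs, degs, cutoff=0.6):
    """Return {(tf, gene): pcc} for all TF-DEG pairs with |pcc| > cutoff."""
    hits = {}
    for tf in tfs:
        for gene in degs:
            if gene == tf:
                continue
            r = pearson(expr[tf], expr[gene])
            if abs(r) > cutoff:
                hits[(tf, gene)] = r
    return hits


# Toy expression profiles over four samples (illustrative only).
expr = {
    "TF_A": [1, 2, 3, 4],
    "GENE_1": [2, 4, 6, 8],  # rises with TF_A
    "GENE_2": [4, 3, 2, 1],  # falls as TF_A rises
    "GENE_3": [1, 3, 1, 3],  # weakly related
}
pairs = coexpressed_pairs(expr, ["TF_A"], ["GENE_1", "GENE_2", "GENE_3"])
print(sorted(pairs))
```

Using the absolute value keeps both positively and negatively correlated pairs, which matches the two-sided threshold in the text.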

Gene ontology analysis
BiNGO (Maere et al., 2005) was used to identify over-represented GO biological process categories.

Regulation network construction
Using the regulation data collected from the TRANSFAC and TRED databases, we matched the relationships between differentially expressed TFs and their differentially expressed target genes. Based on significant correlation (PCC > 0.6 or PCC < -0.6) between TFs and their target genes, 33 putative regulatory relationships were predicted between 7 TFs and 22 target genes. With these regulation datasets and the co-expression of TFs and their target genes, we built a regulation network using Cytoscape (Shannon et al., 2003).
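The matching step can be sketched as an intersection of three conditions: the pair is documented in the regulation data, both genes are differentially expressed, and their absolute PCC exceeds 0.6. The Python sketch below is a minimal illustration under those assumptions. The function names and toy inputs are hypothetical, and the `regulates` edge label in the SIF export (a simple "source, interaction, target" text format that Cytoscape can import) is an arbitrary choice.

```python
from collections import defaultdict


def build_network(known_pairs, degs, pcc, cutoff=0.6):
    """Keep documented TF-target pairs whose genes are both DEGs and co-expressed."""
    degs = set(degs)
    return [
        (tf, target)
        for tf, target in sorted(known_pairs)
        if tf in degs
        and target in degs
        and abs(pcc.get((tf, target), 0.0)) > cutoff
    ]


def node_degrees(edges):
    """Number of edges touching each node (incoming plus outgoing)."""
    degree = defaultdict(int)
    for tf, target in edges:
        degree[tf] += 1
        degree[target] += 1
    return dict(degree)


def to_sif(edges, interaction="regulates"):
    """Format edges as tab-delimited SIF lines for import into Cytoscape."""
    return [f"{tf}\t{interaction}\t{target}" for tf, target in edges]


# Toy inputs for illustration only.
known = {
    ("TF_A", "GENE_1"), ("TF_A", "TF_B"), ("TF_B", "TF_A"),
    ("TF_B", "GENE_2"), ("TF_C", "GENE_1"),
}
toy_degs = {"TF_A", "TF_B", "GENE_1", "GENE_2"}
toy_pcc = {
    ("TF_A", "GENE_1"): 0.9, ("TF_A", "TF_B"): -0.7,
    ("TF_B", "TF_A"): -0.7, ("TF_B", "GENE_2"): 0.3,
}

edges = build_network(known, toy_degs, toy_pcc)
print(edges)
print(node_degrees(edges))
```

In the toy data, `TF_A` and `TF_B` regulate each other, so both directed edges are kept; nodes with the highest degree are the candidates for hub TFs.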

Significance pathway analysis
We adopted an impact analysis that includes the statistical significance of the set of pathway genes but also considers other crucial factors, such as the magnitude of each gene's expression change, the topology of the signaling pathway, and the interactions among its genes (Draghici et al., 2007). In this model, the Impact Factor (IF) of a pathway Pi is calculated as the sum of two terms:

$$IF(P_i) = \log\left(\frac{1}{p_i}\right) + \frac{\sum_{g \in P_i} |PF(g)|}{|\overline{\Delta E}| \cdot N_{de}(P_i)} \qquad (1)$$

The first term is a probabilistic term that captures the significance of the given pathway Pi from the perspective of the set of genes contained in it. It is obtained using the hypergeometric model, in which pi is the probability of obtaining at least the observed number of differentially expressed genes, Nde, just by chance (Tavazoie et al., 1999; Draghici et al., 2003).
The second term is a functional term that depends on the identity of the specific genes that are differentially expressed as well as on the interactions described by the pathway (that is, its topology). It sums the absolute values of the perturbation factors (PFs) of all genes g on the given pathway Pi. The PF of a gene g is calculated as follows:

$$PF(g) = \Delta E(g) + \sum_{u \in US_g} \beta_{ug} \cdot \frac{PF(u)}{N_{ds}(u)} \qquad (2)$$

In this equation, the first term, ∆E(g), captures the quantitative information measured in the gene expression experiment; it represents the normalized measured expression change of the gene g. The second term is the sum of the PFs of all genes u directly upstream of the target gene g, each normalized by the number of downstream genes of that gene, Nds(u), and weighted by a factor βug that reflects the type of interaction: βug = 1 for induction and βug = −1 for repression (KEGG supplies this information about the type of interaction between two genes in the description of the pathway topology). USg is the set of all such genes upstream of g. To normalize with respect to the size of the pathway, the total perturbation is divided by the number of differentially expressed genes on the given pathway, Nde(Pi). To make the IFs as independent as possible of the technology, and comparable between problems, the second term in Equation 1 is also divided by the mean absolute fold change ∆E, calculated across all differentially expressed genes. The result of the pathway significance analysis is shown in Table 3.
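The impact factor described above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the study. It assumes that the probabilistic term is the natural logarithm of 1/pi, as in the cited impact analysis method, and that the pathway graph is acyclic so that PFs can be computed by simple recursion (pathways with feedback loops would need an iterative or linear-algebra solution). All function names and the toy pathway are hypothetical.

```python
import math


def hypergeom_tail(n_total, n_de_total, n_pathway, n_de_pathway):
    """P(X >= n_de_pathway): chance of at least the observed number of DE
    genes on a pathway of n_pathway genes, given n_de_total DE genes among
    n_total measured genes."""
    upper = min(n_pathway, n_de_total)
    hits = sum(
        math.comb(n_pathway, i) * math.comb(n_total - n_pathway, n_de_total - i)
        for i in range(n_de_pathway, upper + 1)
    )
    return hits / math.comb(n_total, n_de_total)


def perturbation_factors(genes, delta_e, upstream):
    """PF(g) = dE(g) + sum over upstream u of beta_ug * PF(u) / N_ds(u).

    upstream maps gene -> list of (u, beta_ug); every gene of the pathway
    should be listed in genes. Assumes an acyclic pathway.
    """
    n_ds = {}  # N_ds(u): number of genes directly downstream of u
    for g in genes:
        for u, _ in upstream.get(g, []):
            n_ds[u] = n_ds.get(u, 0) + 1

    pf = {}

    def visit(g):
        if g not in pf:
            pf[g] = delta_e.get(g, 0.0) + sum(
                beta * visit(u) / n_ds[u] for u, beta in upstream.get(g, [])
            )
        return pf[g]

    for g in genes:
        visit(g)
    return pf


def impact_factor(p_value, pf, n_de_pathway, mean_abs_fc):
    """Probabilistic term plus the normalized total perturbation."""
    total = sum(abs(v) for v in pf.values())
    return math.log(1.0 / p_value) + total / (mean_abs_fc * n_de_pathway)


# Toy pathway: A induces B and represses C (illustrative values only).
genes = ["A", "B", "C"]
delta_e = {"A": 2.0, "B": 1.0, "C": 0.0}
upstream = {"B": [("A", 1)], "C": [("A", -1)]}

pf = perturbation_factors(genes, delta_e, upstream)
p_i = hypergeom_tail(n_total=20, n_de_total=5, n_pathway=3, n_de_pathway=2)
score = impact_factor(p_i, pf, n_de_pathway=2, mean_abs_fc=1.5)
print(pf, round(p_i, 4), round(score, 3))
```

In the toy pathway, gene C has no measured change of its own but still receives a perturbation of -1 from its repressor A, which is the kind of topology effect the second term is meant to capture.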

Pathway regulation network
To further investigate the regulatory relationships between TFs and pathways, we mapped the DEGs to pathways and constructed a regulation network between TFs and pathways (Figure 2).
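One way to picture this mapping is to link a TF to a pathway whenever at least one of its differentially expressed targets is a member of that pathway. The text does not spell out the exact linking rule, so the Python sketch below is an assumption; the function name, pathway names and gene names are hypothetical placeholders.

```python
from collections import defaultdict


def tf_pathway_links(tf_target_edges, pathway_genes):
    """Return {(tf, pathway): [targets]} for targets that lie on the pathway."""
    # Invert the pathway membership table: gene -> pathways containing it.
    gene_to_pathways = defaultdict(set)
    for pathway, members in pathway_genes.items():
        for gene in members:
            gene_to_pathways[gene].add(pathway)

    links = defaultdict(set)
    for tf, target in tf_target_edges:
        for pathway in gene_to_pathways.get(target, ()):
            links[(tf, pathway)].add(target)
    return {key: sorted(targets) for key, targets in links.items()}


# Toy inputs for illustration only.
toy_edges = [("TF_A", "GENE_1"), ("TF_A", "GENE_2"), ("TF_B", "GENE_2")]
toy_pathways = {
    "Pathway X": {"GENE_1", "GENE_2"},
    "Pathway Y": {"GENE_2", "GENE_9"},
}

links = tf_pathway_links(toy_edges, toy_pathways)
for (tf, pathway), targets in sorted(links.items()):
    print(tf, pathway, targets)
```

Keeping the supporting targets on each link makes it possible to see which genes connect a hub TF to a given pathway.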

Regulation network construction in heart failure
To obtain pathway-related DEGs of heart failure, we collected the publicly available microarray data set GSE9128 from GEO. After microarray analysis, genes with an absolute fold change larger than 1.5 and a p-value less than 0.05 were selected, yielding 395 DEGs. To identify significant regulatory relationships, a co-expression value (PCC ≥ 0.6) between transcription factors and target genes was set as the threshold. Finally, we obtained 33 regulatory relationships between 7 differentially expressed TFs and their 22 differentially expressed target genes. Integrating these regulatory relationships, a regulation network of heart failure was built between TFs and their target genes (Figure 1). In this network, MYC, ETS2, TP53 and FLI1, which have higher degrees, form a subnetwork, suggesting that these TFs may play an important role in heart failure. In addition, mutual regulation among MYC, ETS2 and TP53 was observed in this network.

GO analysis of the regulation network in heart failure
Several gene ontology (GO) categories were enriched among the genes of the regulatory network, including developmental process, regulation of macromolecule biosynthetic process, positive regulation of biological process and regulation of cellular metabolic process (Table 1).

Significant pathways of heart failure
To identify the relevant pathways altered in heart failure, we used a statistical approach. Significance analysis at the single-gene level may suffer from the limited number of samples and from experimental noise, both of which can severely limit the power of the chosen statistical test. Pathway-level analysis provides an alternative that relaxes the significance threshold applied to single genes and may lead to a better biological interpretation. We therefore adopted a pathway-based impact analysis method, which takes into account the statistical significance of the set of differentially expressed genes in the pathway, the magnitude of each gene's expression change, the topology of the signaling pathway and the interactions among its genes. The impact analysis yielded many significant pathways, including Phosphatidylinositol signaling system, Apoptosis and Wnt signaling pathway (Table 2).

Pathway regulation network
To further investigate the regulatory relationships between TFs and pathways, we mapped DEGs to pathways and obtained a regulation network between TFs and pathways (Figure 2). In this network, MYC, ETS2 and TP53 appeared as hub nodes linked to many heart failure-related pathways.

DISCUSSION
According to the regulation network of heart failure, many TFs and pathways closely related to CHF were linked by our method. MYC, TP53, ETS2 and EGR1 appeared as hub nodes in our transcriptome regulation network, and some target genes have been shown in previous studies to be closely related to heart failure. Although the roles of ETS2 and FLI1 in heart failure have not been investigated to date, some evidence suggests that they may play an important role in the response to CHF. In addition, MYC, TP53 and ETS2 regulate each other in response to CHF. As a transcription factor, the MYC protein is a multifunctional nuclear phosphoprotein that plays a role in cell cycle progression, apoptosis and cellular transformation. A bitransgenic mouse inducibly expressing MYC under the control of the cardiomyocyte-specific MHC promoter was developed to address the causal relationship between increases in c-Myc (Myc) and cardiomyopathy. The induction of MYC expression in cardiomyocytes led to the development of severe hypertrophic cardiomyopathy, followed by ventricular dysfunction and ultimately death from congestive heart failure. MYC activation in cardiomyocytes is an important regulator of downstream pathological sequelae (Lee et al., 2009).
Tumor protein TP53 is a DNA-binding protein that responds to diverse cellular stresses. Expression of TP53 is associated with dysregulation of ubiquitin-proteasome system components and activation of downstream effectors of apoptosis in human dilated cardiomyopathy (Birks et al., 2008). Increased expression of TP53 is associated with progressive loss of myocytes by apoptosis in heart failure (Tsipis et al., 2010). Coherence of TP53 with myo plays an active role during the transition of cardiac hypertrophy (CH) to HF in a model of HF induced by myo overexpression. The transition from CH to HF can be prevented in the absence of TP53 in myo-induced hypertrophy. Therefore, deletion or inhibition of TP53 could be a therapeutic strategy to prevent CH from transitioning to HF (Das et al., 2010).
In general, MYC, ETS2, TP53 and the interaction with  (Xu et al., 2008). The EGR1 protein belongs to the EGR family of C2H2-type zinc-finger proteins. It is a nuclear protein and functions as a transcriptional regulator. Studies suggest that it is a cancer suppressor gene. EGR1 was found to be upregulated in CHF patients. Furthermore, EGR1 expression levels can discriminate between ischemic and non-ischemic dilated cardiomyopathy CHF patients (Cappuzzello et al., 2009).
Some target genes regulated by these transcription factors also appear in the network. JUN encodes a protein that is highly similar to the viral protein and that interacts directly with specific target DNA sequences to regulate gene expression. Its product is a component of the AP-1 transcription factor. AP-1 was significantly activated in chronic congestive heart failure due to ischemic or dilated cardiomyopathy. This finding suggests an important involvement of AP-1 in the cardiac remodeling process (Frantz et al., 2003).
Studies showed that ablation of Mtor (mechanistic target of rapamycin) in the adult mouse myocardium results in a fatal dilated cardiomyopathy that is characterized by accumulation of eukaryotic translation initiation factor 4E-binding protein 1 (eIF4E-BP1). When eIF4E-BP1 was ablated together with Mtor, marked improvements were observed in apoptosis, heart function, and survival, suggesting a role for 4E-BP1 in regulating cardiomyocyte viability and in HF (Zhang et al., 2010).
CD9, a member of the transmembrane 4 superfamily, functions in many cellular processes, including differentiation, adhesion, and signal transduction, and expression of this gene plays a critical role in the suppression of cancer cell motility and metastasis. Morphofunctional variants of the myocardium, such as LV diastolic dysfunction, showed suppression of the immune reaction characterized by high values of T-suppressors (CD3+, CD4+, CD9+), which might be related to compensated regeneration of the cardiac muscle. This points to a risk of pathological remodeling of the left ventricle and cardiac failure (Zemskov et al., 2008).
ETS2 is one of the ETS transcription factors, which regulate numerous genes and are involved in stem cell development, cell senescence and death, and tumorigenesis. Fetal cardiomyocytes have been proposed as a potential source of cell-based therapy for heart failure. The transcript level of ETS2 declined during the cellular senescence of fetal cardiomyocytes, suggesting a role for this gene in the maintenance of cardiomyocyte proliferative capacity (Ball et al., 2005). Fli-1, a member of the ETS family of DNA-binding transcription factors, is involved in cellular proliferation and tumorigenesis. Chronic heart failure resulting from ischemic and hypertrophic cardiomyopathy is often accompanied by cardiac fibrosis, and Fli-1 might be involved in this process. Fli-1-knockdown mice demonstrated greater cardiac collagen-1 expression and fibrosis compared with wild-type mice (Elkareh et al., 2009).
The Phosphatidylinositol signaling system, Apoptosis and Wnt signaling pathways were identified as relevant pathways altered in heart failure by a statistical approach at the pathway level. Numerous studies have demonstrated that regular exercise training improves exercise tolerance and quality of life in patients with chronic heart failure. This exercise-induced cardioprotection may be due to activation of the PI3K enzyme (Owen et al., 2009). Oxidative stress plays a pivotal role in chronic heart failure. SIRT1, an NAD-dependent histone/protein deacetylase, promotes cell survival under oxidative stress when it is expressed in the nucleus. PI3K is a candidate for the signaling that induces the nuclear localization of SIRT1 in failing hearts. PI3K is activated in the cardiomyocytes of hypertrophic and failing hearts, and enhanced PI3K activity prolongs the lifespan of mice with DCM, whereas diminished activity of this kinase shortens it by ~50%. Therefore, a PI3K-enhanced increase in nuclear SIRT1 levels may have contributed to the prognostic advantage of the mice (Tanno et al., 2010).
The progressive loss of cardiac myocytes is one of the most important pathogenic components in heart failure. During the past few years, there has been accumulating evidence in both human and animal models suggesting that apoptosis may be an important mode of cell death during heart failure. Transgenic mice with cardiac-specific overexpression of caspase-8 (fused with the FK506-binding protein) appeared normal at birth, but administration of the divalent dimerizer FK1012 activated caspase-8, resulting in overwhelming cardiac myocyte apoptosis and rapid death of the animal. Cardiac-specific knockout of gp130, a common subunit of the interleukin-6 family of cytokines, had been shown to promote cell survival in the presence of an apoptotic stimulus. The most common forms of chronic heart failure associated with apoptosis are dilated and ischemic cardiomyopathy (Kang et al., 2000).
The Wnt pathway is an evolutionarily conserved signaling mechanism with a critical function in tumor growth and cardiogenesis. In infarcted hearts, expression of Fz2 (a cell-surface receptor for Wnt proteins) is considerably enhanced, and in failing human ventricles, mRNA levels of secreted Fz-related proteins 3 and 4 are elevated, leading to attenuation of the Wnt/β-catenin pathway. Furthermore, Wnt3a is upregulated in ischemic cardiomyocytes (Malekar et al., 2010).
Given the high prevalence of this condition and its still unsatisfactory long-term outcome, searching for new therapeutic targets in heart failure is worthwhile. Identifying heart failure-associated genes and the related pathways is essential for the prevention of this disease. In this research, a transcriptome regulation network was used to identify heart failure-associated genes and pathways. We also used an impact factor analysis method to determine whether a pathway is significantly changed in a microarray experiment. This method proved well suited to microarray data and is therefore proposed as a powerful tool in the search for new, so far undiscovered pathways related to other diseases.
Further understanding of transcription factors and their target genes will remain an area of intense research activity. Our regulation network is useful for investigating the complex interacting mechanisms of transcription factors and their regulated genes in disease. However, further experiments are still needed to confirm these conclusions.

Figure 1. Regulation network of heart failure.

Table 1. GO biological process analysis.

IV CHF patients and (3) Age and gender matched controls (n=12) by Affymetrix microarrays. With age and gender matched, 7 pairs of ischemic cardiomyopathy and control data were used.