DEXseq and Cuffdiff approaches weighing differential spliced genes exons modulation in estrogen receptor (Er) breast cancer cells

While the alternative transcription and splicing mechanisms have long been known for some genes like oncogenes, their prevalence in almost all multi-exon genes has been recently realized with the increasing application of high-throughput experimental methods, named Next Generation RNA Sequencing (NGS). Henceforth, understanding the regulation of these processes in comparisons between cell types and cancer requests, sensitive and specific bioinformatics as well as bio-statistic approaches, that is, Cufflinks/Cuffdiff, DEXseq and RESEM, detecting gene transcript/isoforms and exons abundance is necessary. Isoforms and exons expression analysis by NGS is complicated by several sources of measurement variability causing numerous statistical defies. Here, with the purpose of minimizing this statistical challenge, we integrated both Cufflinks/Cuffdiff and DEXseq bioinformatics approaches assessing whole alternative splicing (AS) events, focusing on alternative transcripts regulation and their exons modulation respectively, by processing our previous prepared Estrogen Receptor β (Erβ+ and Erβ-) breast cancer (BC) cells, stimulated by estradiol (E2). Results showed that Estradiol (E2) induced Erβ+ BC (Erβ+E2), exhibited dissimilar reply as opposed to the other’s analyzed BC cell lines in terms of intragenic, exons and junction reads count ratio. Relationship analysis between expressed genes and transcript isoforms, suggested a substantial role of alternative promoters in AS event occurrence in Erβ+ BC as opposed to Erβ- BC. Indeed, merging Cufflinks/Cuffdiff and DEXseq approaches, 79 multi-exon genes were detected as statistically differentially modulated (spliced) in Erβ+ hormone induced BC cell line, and around 38% of these spliced genes claimed to be induced by alternative promoters. The present survey discriminated between several cancer specific alternative splicing genes like LIFR a BC metastasis suppressor, PBX1 a pioneer factor defining aggressive Erβ- BC and PHLPP2 a tumor suppressor, as exhibiting significant exon modulation in early AS occurrence in hormone responsive Erβ+ BC exclusively. Although, our findings supported dissimilar reply comparing both Cuflinks/Cuffdiff and DEXseq approaches called AS events, it is noteworthy to underline their relative agreement, evaluating spliced genes functional annotation as well as their complementarity performing whole AS survey. This study therefore proposed the integration between Cufflinks/Cuffdiff and DEXseq tools as a reasonable complementary methodology assessing full AS pattern in hormone responsive Erβ BC cells. 
 
   
 
 Key words: Cufflinks/Cuffdiff, DEXseq, RNA-Seq, alternative splicing (AS), exons, transcript isoforms, estrogen receptor β (ERβ), breast cancer (BC) cells.


INTRODUCTION
Alternative splicing is a central cellular process that produces different mRNA transcript isoforms from a single gene. The qualitative and quantitative identification of such transcript isoforms is more complex as well as essential for understanding the different roles of alternatively spliced genes occurrence in a cell. However, detection of disease-specific transcript isoforms is an important task because aberrant splicing is known to be responsible for various diseases (Kim et al., 2008) and associated with different cancer types (Christofk et al., 2008;Venables et al., 2009). Several studies provided an intriguing insight into the mechanism of cancer specific alternative transcription and alternative splicing, which have long been implicated in the development of cancer. Cancer-specific isoforms have enhanced proliferative, invasive, and migratory abilities and provide a survival advantage to the tumor cells, suggesting that there is specific manipulation of the alternative event regulation in cancer that is beyond the general lack of fidelity of the splicing or the alternative transcription regulatory machinery. It is conceivable that the balanced expression of isoforms, rather than just activation or inhibition of those genes, may hold the key to impeding tumor growth and accordingly it is important to target the disease associated genes at the isoform level rather than at the gene level. The well-known application of exon arrays (Moller-Levet et al., 2009) and the advent of massive parallel sequencing named Next Generation RNA Sequencing (NGS-RNA-Seq) are allowing whole cancer genomes and transcriptomes to be sequenced with extraordinary speed and accuracy, providing insight into the bewildering complexity of isoform-specific expression in cancer genomes (Cancer Genome Atlas, 2012). Detecting alternative isoform regulation is inherently difficult in RNA-Seq, as sequencer reads are often one or more orders of magnitude shorter than the transcripts themselves. While there are several utilities that attempt to de-convolute read data into isoform abundances, the accuracy and robustness of these methods is difficult to establish (Chandramohan et al., 2013;Zhang et al., 2014). In addition, transcript isoforms expression estimates seem to vary considerably between different tools, and generally depend on the quality and completeness of the transcript assembly (Kanitz et al., 2015;Rehrauer et al., 2013). Existing well-established methods detect alternative splicing process mainly by considering sequencing reads that map uniquely to single isoforms or by assembling transcripts and estimating the most likely isoform abundance levels according to the given sequencing reads (Jeffrey and Zhong, 2011). Short read assemblers have been developed for genome and transcriptome assembly, that is, Velvet (Zerbino and Birney, 2008), Scripture (Guttman et al., 2010), Cufflinks (Trapnell et al., 2010) and others (Jeffrey and Zhong, 2011). In literature, the term alternative splicing has been used to describe both alternative transcription and splicing events. However, the variation in the pattern of intron removal, exon joining, and the addition of a poly-A tail on a single pre-mRNA result in alternatively splice mature mRNAs. These various alternative events have been identified in different cells and tissues by application of RNA sequencing (RNA-Seq) next generation sequencing (NGS) based methods in genome wide studies (Sultan et al., 2008;Wang et al., 2008;Pal et al., 2011). Furthermore, it is estimated that there are 263,772 exons in the human genome and approximately, 22% of these exons participate in alternative splicing phenomena (Pal et al., 2012), suggesting that performance assessment of whole alternative splicing occurrence in a genomic survey needed strong bioinformatics/biostatistical tools characterizing statistical exon modulation. However, a systematic assessment of transcriptome assemblies is difficult because appropriate quality metrics have not been established yet and require a well-defined gold standard that is difficult to find (Jeffrey and Zhong, 2011). Recently,  published a bioinformatics tool called DEXseq, which process exon differential survey in RNA-Seq genomic and/or transcriptomic experimentation. Rather than using an assembly approach and comparing abundance levels of predicted transcripts, DEXseq avoids the assembly step and calculates probabilities values (p-values) for every annotated exon (statistical estimation of exons abundance). Then, the emergence of NGS provides an exciting new technology to analyse alternative splicing on a large scale including differential expression analysis of exons of multi-exon genes. Hence, we believe that combining transcript isoforms differential expression survey with gene and/or transcript isoform exons modulation, could increase researcher alternative splicing phenomena comprehension processing two or more analysed cells and/or tissues conditions. Here we investigated the relation-ship between significantly differentially spliced genes (DSGs) based on alternative transcript isoforms measurement of our previous developed Erβ + breast cancer (BC) cell line model and their corresponding exons abundance modulation merging both Cufflikns/Cuffdiff  and DEXseq  bioinformatics/bio-statistical approaches respectively, emphasizing alternative splicing manifestation in developed Erβ + and Erβ breast cancer cell line models (Grober et al., 2011;Paris et al., 2012; *Corresponding author. E-mail: dgnoel7@gmail.com Tel: (+39) 3381426596 or (+225) 48397811 or (+225) 44595981.
Author(s) agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License Nassa et al., 2011;Tarallo et al., 2011).

MATERIALS AND METHODS
Erβ + breast cancer cell model preparation and gene and transcript isoforms differential expression analysis MCF7 and 5B12 breast cancer (BC) cell line model have been prepared to mimic Erβ/Erβ; (Erβ + ) and Erβ/Erα; (Erβ -) breast cancer (BC). The preparation of Erβ + and Erβ -BC cell lines have been entirely described in Nassa et al. (2011), Grober et al. (2011) and Dago et al. (2015). Differential genes, transcript isoforms and exons expression analysis, assessing the effect of early stimulation of estradiol (E2) hormone on alternative splicing occurrence in breast tumor cell models, have been performed by Cufflinks/Cuffdiff  and DEXseq  bioinformatics/bio-statistic packages respectively. Next generation RNA sequencing (NGS RNA-Seq) data used for the present analysis have been deposited in the Gene Expression Omnibus genomics data public repository (http://www. ncbi.nlm.nih.gov/geo/) with Accession Number GSE64590.

RNA sequencing (RNA-Seq) data generation and gene and/or transcript isoforms expression measurement
1 µg micrograms of high-quality total mRNA was used as starting material for the Illumina mRNASeq library preparation kit and was prepared to manufacturer's directions (Illumina). Libraries were sequenced on the Illumina Genome Analyzer II as 101 base pair paired-end reads. Tophat v.2.0.8 (Kim et al., 2013) was used to align all reads including junction-spanning reads back to the human genome (Homo sapiens Ensembl GRCh37). The reads alignment quality and distribution were estimated using SAMtools. Cufflinks v2.1.1  was used to identify differential spliced genes, isoform transcript and gene expression changes between analyzed experimental groups and/or conditions. We defined statistical significance in expression q-value and/or adjusted pvalue for multiple testing ≤ 0.05, and Fragments per Kilobase of exon per Million reads mapped (FPKM)>0.5, since FPKM is a measure of expression used in high-throughput sequencing data that is normalized for both transcript length and total number of reads sequenced.

DEXseq measuring exons differential modulation usage by RNA-sequence
DEXseq is a package for the statistical programming language R (R Development Core Team, 2009) available as open source software via the Bioconductor project (Gentleman et al., 2004). For the preparation steps, namely the flattening of the transcriptome annotation to counting bins and the counting of the reads overlapping each counting bin, two Python scripts are provided, which are built on the HTSeq framework (Anders et al., 2015). The first script takes a GTF file with gene models and transforms it into a GFF file listing counting bins, the second takes such a GFF file and an alignment file in the SAM format and produces a list of counts. The R package is used to read these counts, estimate the size factors and dispersions, fit the dispersion-mean relation and test for differential exon usage. Then, exons with significant change and/or modulation at a false discovery rate lower or equal to 0.05 (FDR≤0.05) have been selected as involve in alternative splice events. All bioinformatics and biostatistics analyses and comparisons were implemented and performed using in-house scripts written in Unix and R.

Functional annotation gene ontology analysis by DAVID
We performed Database for Annotation, Visualization and Integrated Discovery (DAVID http://david.abcc.ncifcrf.gov/) analysis (Glynn et al., 2003) focusing exclusively on the set of differentially spliced gene transcripts and/or isoforms, discriminated by both Cufflinks/Cuffdiff and DEXseq approaches. Then, we performed a gene ontology (GO) survey (FDR ≤0.05 with at least 2 fold enrichment) by processing significantly differentially spliced genes, evaluating alternative transcript and exon change and/or modulation in multi-exon genes.

Reads sequences from both Erβ+ and Erβhuman BC cell line models high-throughput RNA sequencing (RNA-Seq) statistical analysis
mRNA sequencing experiment basing on illumina genome analyzer II (GAII) processing both Erβ+ and Erβhormone induced breast cancer (BC) cell lines, yielded approximately 94911602-99876796 million pair-end read (101 bp) sequences (Table 1). From these reads, low quality sequences, were eliminated, resulting in around 65-71 million reads corresponding to 68-70.2% of total reads for each processed replicate sample (Table 1). In total around 65-71 million reads were aligned to the Homo sapiens Ensembl GRCh37 reference genome ( Table 1). The number of reads per genes and transcript isoforms were further normalized to Fragment per Kilobases of exon per Million mapped reads (FPKM). Then, in order to include a maximum number of genes and transcript isoforms, reducing as possible statistical type I error in calling differentially modulated and spliced genes as well as exon modulation events, we adopted 0.5 FPKM, as genes and/or transcript isoforms expression level threshold in both processed Erβ + and Erβ -BC cell exemplars ( Figure 1). Student test analysis, based on both loaded junctions and found junction parameters from transcriptome and/or genome reconstruction through Cufflinks package, suggested a reasonable difference (pvalue <0.01) between Erβ + BC cell line under estradiol stimulus (Erβ + E 2 ) and the other's analyzed BC cell conditions (Table 1). In other words, the present result supported that Erβ + and Erβ -BC cell lines in normal conditions (Erβ + noE 2 , Erβ -noE 2 ) as well as Erβ -BC cell line induced by E 2 (Erβ -E 2 ), exhibited the same behaviors as opposed to Erβ + E 2 breast cancer cell reacting to E 2 induction (Table 1).
Since each analyzed condition have been processed in three technical replicates, Erβ + E 2 _Rep1 indicates the replicate 1 of Erβ + BC cell line condition under estradiol (E 2 ) treatment, while Erβ + noE 2 _Rep1 indicates replicate 1 of the same cell line with any E 2 treatment The same nomenclature has been adopted for Erβ -E 2 and Erβ + noE 2 analyzed samples conditions.   Genomic re-distribution of RNA sequencing (RNA-Seq) reads sequences evaluating breast cancer cell line reply to estradiol (E 2 ) hormone stimulus Next, we focused on genomic re-distribution of reads sequences, merging read count analysis results from HTSeq/DEXseq with those from Tophat2/Cufflinks software's for each analyzed hormone induced Erβ BC cell line conditions (Erβ + E 2 , Erβ + noE 2 , Erβ -E 2 and Erβ -noE 2 ) appraising early alternative splicing events monitoring differential spliced genes and exons change in breast tumor. The present survey highlighted a strong difference in read distributions between Erβ + E 2 BC cells and the other's analyzed conditions (Erβ + noE 2 , Erβ -E 2 and Erβ -noE 2 ) ( Figure 2). These results seem to be in agreement with above reported loaded and/or found junction analysis (Table 1), supporting subtly an agreement between HTSeq/DEXseq and Tophat2/ Cufflinks approaches, in genome reconstruction process and/or in genomic reads re-distribution analysis. Furthermore, the present survey exhibited a constancy rate, measuring exon read count as well as intragenic and intron reads values comparing Erβ + noE 2 , and both Erβ -E 2 and Erβ -noE 2 breast cancer cells (Figure 2), hypothesizing a similitude between the former's assessing alternative splicing pathway in the present analyzed BC cells, reinforcing the link between estrogen receptor beta (Erβ) and early transcription and mRNA splicing events in hormone responsive BC cells. Considering as a whole, the present results supported a dynamic molecular reply of Erβ + BC cells as opposed to Erβ -BC in terms of alternative splicing events ( Figure 2).

Assessment of expressed genes and their transcript isoforms proportion comparing hormone responsive Erβ + and Erβ breast cancer cell lines
We evaluated change in differential expressed gene and transcript isoforms including exclusively genes and transcript isoforms that exhibit a FPKM expression value ≥0.5 and significant statistical differential change at an adjusted p-value ≤0.05 by Cufflinks/Cuddiff approach in processed hormone responsive BC cell lines. Then, basing on previous analysis (Figure 1), we processed in total 6714 and 52499 genes and transcript/isoforms respectively ( Figure 3). As expected, change in gene and transcript isoforms strongly contrast between both analyzed estrogen hormone responsive Erβ + and Erβ -BC cell lines ( Figure 3A and B) since, estimated Pearson correlations values comparing log 2 fold change assessing measurement between expressed genes and transcript isoforms resulted to lower 0.5 (R 2 =0.23 and R 2 =0.31 for Erβ + and Erβ -BC cell respectively), advising and Erβ + or (B) BC cells. The x-coordinate is the gene expression fold change relative to Erβ -(A) or Erβ + (B) BC cells and the ycoordinate is the transcript isoform fold change relative to the same BC cells respectively. Blue dots represent the situation where both isoforms and genes, with an expression value FPKM ≥ 0.5, have been selected as statistically significantly differentially expressed at a p-adjusted value≤0.05. Red dots represent situation where only isoform expression is significantly altered. (C) Venn diagrams showing overlapping/divergence between hormone induced Erβ-and Erβ+ BC models analyzing cases where only transcript isoforms expression is significantly altered.
weak agreement between change in genes and their respective transcript isoforms, especially in Erβ + BC cells ( Figure 3). Hence, we focused on the cases where only transcript isoforms expression were significantly altered, alerting alternative promoter usage in alternative transcript isoforms modulation as well as in alternative splicing event occurrence in estrogen responsive BC cell line. Indeed, we showed that situations for which transcript isoform expression was significantly alerted as opposed to their corresponding genes resulted 2 fold more in hormone induced Erβ + BC cell ( Figure 3C). Taking together, the present results suspected a considerable involvement of alternative promoter usage in early alternative splicing occurrence in hormone responsive Erβ + BC cell lines as opposed to Erβ -BC cell lines ( Figure  3).

Identification of differentially spliced genes (DSGs) in hormone responsive Erβ + breast cancer cell lines by Cufflinks/Cuffdiff approach
To identify the differences in splice ratios between Erβ + E 2 and Erβ + noE 2 BC cell line (Erβ + E 2 vs. Erβ + noE 2 ), we employed Cufflinks/Cuffdiff v2.1.1 package, which calculates the changes in the relative splice abundances by quantifying the square root of the Jensen Shannon divergence on all the primary transcripts that produce two or more isoforms. It is essential to note that the distributions of genes, and primary transcripts, and isoform expression level (FPKM) are comparable between the samples that are taken for the differential splicing test. Then, 213 genes randomly distributed on human chromosomes have been detected as significantly  Table 1). Furthermore, basing on this result as well as on previous one (data non shown), we showed that Erβ + BC cell lines exhibited 3 fold more differentially spliced genes with respect to Erβ-BC cells replying to estradiol hormone stimulus (Supplementary Tables 1 and 2) confirming previous results and observations (Figures 2 and 3). Therefore, we were interested in investigating DSGs exons change monitoring early AS pathway in hormone responsive Erβ+ BC, since exons modulation have been recognized as suitable process understating mechanism of transcript isoforms expression, proposing the former's as potential tumors biomarkers as well as therapeutic target.

Assessment of differential spliced multi-exonic genes (DSGs), exons modulation by DEXseq in Erβ breast cancer cell lines
Here, we analyzed through the DEXseq approach, differential exons expression change, referring to previous detected differential spliced multi-exonic genes (Cufflinks/Cuffdiff) in hormone induced Erβ + BC cell line. This analysis looks for difference across conditions (in Erβ + E 2 vs. in Erβ + noE 2 ) between quantities that are directly observable from shotgun data (read count data), such as the relative usage of each exon. Then, performing the test for differential exons used considering exon with at least 10 reads count in at least one analyzed condition and controlling false discovery rate (FDR) with the Benjamini Hochberg method, at 5% threshold (statistical stringency), 3682 and 122 exons from 1438 and 75 multi-exonic genes, claimed to be significantly differentially modulated in hormone responsive Erβ + and Erβ -BC cell lines respectively (Supplementary Materials 1 and 2). Also, this result suggested that exons change ratio in analyzed differentially spliced multi-exonic genes was 1.6 fold more higher in hormone responsive Erβ + BC cell lines when compared to Erβ -BC cell line, supporting a potential high number of significantly alternatively modulated transcript isoforms involvement characterizing processed hormone responsive Erβ + BC cell line, highlighting the links between Erβ and early AS occurrence in E 2 hormone induced breast cancer. However, significantly modulated exons proportion survey by processing differentially spliced multiexonic genes revealed that exons change in the analyzed hormone responsive BC cell lines, meanly regard the first 10 gene transcripts exons (more than 25%), advising and/or suspecting weakly solicitation of ending transcripts exons yielding alternative transcript isoforms in hormone responsive BC cells (Table 2).

DAVID analysis assessing functional annotation of differentially spliced genes discriminated by both Cufflinks/Cuffdiff and DEXseq tools
68 genes out the 73 processed differentially spliced genes (DSGs) discriminated by Cufflinks/ Cuffdiff approach at 5% false discovery rate in hormone responsive Erβ + BC (Supplementary Table 2) were converted in a new list for functional annotation analysis by DAVID package showing at Benjamini correction referring to a p-adjusted value at 5%, that (i) 75% of differentially alternatively spliced genes called by Cufflinks/ Cuffdiff were genes that code for at least two isoforms due to pre-mRNA splicing event and that (ii) 74% of these genes were recognized as alternative splice variant. Furthermore, 42.64% of alternative spliced genes categorized by Cufflinks/ Cuffdiff approach have been discriminated to be localized in the nucleus ( Figure 5A). In the same tendency and focusing on alternative spliced exons events called by DEXseq methodology at 5% false discovery rate threshold with 2 fold exon change (high alternative splicing evidence), we showed that 55.35% of analyzed alternative spliced genes, coded for at least two isoforms due to pre-mRNA splicing event and that 54.91% of their transcript isoform variants were recognized as alternative splice variant. We also showed that 37.84% of all analyzed alternative spliced genes called by DEXseq approach have been discriminated to be localized in the nucleus ( Figure 5B). These results, delicately suggested a relative agreement between the two considered and/or analyzed bioinformatics tools (Cufflinks/ Cuffdiff and DEXseq) assessing the performance of alternative splicing pathway in breast cancer cell line, since mRNA maturation in eukaryotic cells happened in cell's nucleus. Finally, even if the present results suggested DEXseq approach as discriminating highest number of potential alternative splicing event in term of functional annotation as opposed to Cufflinks/Cuffdiff ( Figure  5), it is noteworthy to underline their similitude evaluating (i) splice and isoform variant, (ii) alternative spliced genes localization in the nucleus and (iii) nucleolus in the present functional annotation survey ( Figure 5).

Integration between Cufflinks/Cuffdiff and DEXseq data assessing alternative splicing event in Erβ breast cancer cell line
More than 34.74% of DSGs (multi-exonic genes) discriminated by Cufflinks/Cuffdiff approach in the hormone responsive Erβ + BC cell, exhibited at least one significant exons change as DEXseq exon differential analysis ( Figure 4). Indeed, LIFR gene, knows as a breast cancer metastasis suppressor, MATA2 gene involves in apoptosis process in human hepatoma, NBPF10 gene associated with several types of cancer, PBX1 gene that results engaged in the progression of breast cancer and PHLPP2 gene that inhibits cancer cell proliferation acting as tumor suppressor, detected as significantly differentially spliced by Cufflinks/Cuffdiff approach, displayed a significant exons change in the present analyzed hormone responsive BC cell line (Figure 4), proposing both Cufflinks/Cuffdiff and DEXseq data merging process as a valid methodology monitoring AS pathway in breast cancer. As DEXseq provides exons splicing visualization, we reported in Figure 6 an example of exons modulation evaluating alternative splicing happening in two fully differentially spliced genes MATA2 and PBX1 in E 2 hormone induced BC exclusively (Erβ + BC). Moreover, several generic tumor biomarkers such as SEPT9, known as a candidate for the ovarian tumor suppressor, PPP1R13B and NOTCH2 genes affecting cells differentiation implementation, proliferation and apoptotic programs and NBPF10 gene which results associated with several type of cancer, were recognized as exhibiting significant exons change in the hormone responsive Erβ + BC cell line (Figure 4). We also showed that 36.11% of detected spliced genes merging both Cuffdiff and  DEXseq approaches were regulated by significant alternative promoter use ( Figure 4). Interestingly, wellnoted biomarker genes like MATA2, SEPT9, NOTCH2 and PPP1R13B claimed to be under alternative promoter usage (p-value≤0.05) (Figure 4 and Supplementary Table  3), assessing alternative splicing pathway in Erβ + BC cells. So, the present results suspecting the involvement of some remarkable cancer biomarkers controlling AS path in hormone responsive BC cell lines, proposed Cufflinks/Cuffdiff and DEXseq approaches data integration as reasonable and complementary scheme weighing whole AS pathways in breast cancer cells.

DISCUSSION
Alternative splicing (AS) is a means of expressing several or many different transcripts from the same genomic DNA and results from the inclusion of a subset of the available exons for a particular protein. By excluding one or more exons, certain protein domains may be lost from the encoded protein, which can result in protein function damage or gain. Several types of alternative splicing have been described resulting in exon skipping; alternative 5′ or 3′ splice sites; mutually exclusive exons and much more rarely and intron retention, approving the complexity of AS study and the needed of accurate bioinformatics or bio-statistics systems evaluating this phenomena in eukaryotic cells. However, many alternative splicing events have been noted in human development, especially in the brain and the testes (Grabowski and Black, 2001;Venables, 2002) as well as in cancer, including the use of alternative individual splice sites, alternative exons, and alternative introns. Therefore, whole alternative splicing survey must include capable sensitive and specific bioinformatics tools, able to measure accurately alternative transcript isoforms as well as multi-exonic genes transcripts exons abundance, since recent Next Generation RNA Sequencing (NGS RNA-Seq) analysis provides innovative platform exploring in detail cell transcriptome and/or genome. However, a recurrent challenge in RNA-Seq experimentation regards application of adequate bioinformatics tools analyzing huge quantity and complex yielded data. Also, RNA-Seq coupled with well-established bioinformatics approaches; that is, Cufflinks/Cuffdiff  and DEXseq , allowed a suitable whole genome and transcriptome reconstruction measuring gene transcript isoforms as well as exons expression level, performing differential expression analysis between cells types. In such analysis, annotated regions of the considered genome can be expressed (that is, exons), describing how the pre-mRNAs are spliced into transcripts. While there are several utilities that attempt to de-convolute read data into isoform abundances, the accuracy and robustness of these methods is difficult to establish (Chandramohan et al., 2013;Zhang et al., 2014). Isoform expression estimates seem to vary considerably between different tools, and generally depend on the quality and completeness of the transcript assembly (Kanitz et al., 2015;Rehrauer et al., 2013). Also, Cufflinks/Cuffdiff methodology is not more informative regarding exons modulation measuring alternative splice transcript isoforms pattern. Simon  demonstrate the versatility of DEXseq package by applying it to several data sets facilitating the study of regulation and function of alternative exon usage on a genome-wide scale. Guided by these observations, we proposed merging between both Cufflinks/Cuffdiff and DEXseq investigating meticulously alternative transcript isoforms exons abundance in our previous studied hormone responsive Erβ+ breast cancer cells, since Erβ significantly affects estrogen-induced early transcription and mRNA splicing in hormone-responsive BC cells (Dago et al., 2015). Then, genomic re-distribution of RNA-Seq reads sequences from hormone responsive Erβ BC cells by HTSeq/DEXseq and TopHat/Cufflinks, suggested strong difference (p-value <0.05) between estrogen induced Erβ + BC cells (Erβ + E 2 ) and the other's analyzed breast cancer cells conditions (Erβ + with any E 2 stimulus; Erβ + noE 2, Erβ induced by E 2 ; Erβ -E 2 and Erβ without any E 2 stimulus; Erβ -noE 2 ), since exhibiting dissimilar attitude and/or behaviors considering loaded and fund reads junction as well as exon intragenic and intron reads distribution ( Figure 2 and Table 1), subtly evoking agreement between both Cufflinks and DEXseq approaches in genomic and transcriptomic reconstruction analysis. In addition, this analysis suggested high variability between Erβ + E 2 and Erβ + noE 2 conditions suspecting a high number of differentially expressed genes and/or transcript isoforms in the latter's as opposed to hormone induced Erβ -BC cells. However, comparative analysis assessing the relationship between change in expressed genes and their respective transcript isoforms, showed a strong evidence of alternative promoter used in alternative splicing pattern processing hormone responsive Erβ + BC cell line, as the ratio between expressed transcript isoforms as oppose to expressed genes, was 2 fold more higher in the latter's (Erβ + BC) when compared to Erβ -BC cell line (Figure 3). Taking together, these results reinforced alternative splicing evidence in hormone responsive Erβ + BC as opposite to Erβ -BC cell lines alerting a significant participation of alternative promoter regulating AS events occurrence in breast cancer process. Several studies have shown that the occurrence of alternative transcriptional termination and splicing is higher in genes with alternative promoters, and the choice of alternative promoter and transcription termination can influence the alternative splicing pattern of the pre-mRNA (Winter et al., 2007;Albulescu et al., 2012). Nevertheless, it is estimated that participate to the alternative splicing phenomena proposing genes transcript isoforms exons modulation as a suitable approach understanding alternative transcript isoform expression in human genome. Hence, detailed alternative splicing analysis by NGS RNA-Seq need innovative bioinformatics scheme capable to integrate accurately transcript isoforms and exon expression analysis. Based on this, we combined differentially spliced genes and transcript isoforms exons modulation assessing early AS occurrence in estradiol hormone induced Erβ+ BC cells, emphasizing strong involvement of transcript isoforms exons change in alternative splicing mechanism in breast tumor. Then, integrating Cufflinks/ Cuffdiff and DEXseq approaches, our findings proposed alternative promoter's usage as well as significant exon change as key molecular events favoring AS pattern in hormone responsive breast cancer cells (Figure 4). Also, it is noteworthy to underline the weak involvement of ended exons of transcript isoforms evaluating alternative splicing occurrence in the present hormone induced Erβ BC cells (Table 2). Furthermore, around 34.74% of significantly differentially spliced genes by Cufflinks/ Cuffdiff have shown significant exons change in DEXseq analysis testing for differential usage of exon regions as a proxy for alternative isoform regulation as well as providing a powerful suite of visualization tools ( Figure 6) (Love et al., 2014). Interestingly, the present analysis identified several remarkable cancer biomarkers, like MDM2 and NF1 genes, known as cancer specific alternative splicing genes (Venables, 2004), PBX1 gene, revealed as a novel pioneer factor defining aggressive Erβ breast tumors, as it guides Erα genomic activity to unique genomic regions promoting a transcriptional program favorable to breast cancer progression (Magnani et al., 2011), LIFR gene, knows as a breast cancer metastasis suppressor (Chen et al., 2012), MATA2 gene, involved in human colon cancer progression (Chen et al., 2007) and PHLPP2 tumor suppressor (Liu et al., 2011) and AIB1/NCOA3 hormone signaling in breast cancer as exhibiting significant exon change in hormone responsive Erβ BC cells, demonstrating the key role of exon modulation in early estrogen induced breast cancer alternative splicing pattern (Figure 4, Supplementary Materials 1 and 2). Furthermore, 36.11% of detected differentially spliced genes by merging Cuffdiff and DEXseq approaches were recognized as including alternative promoter, and some well-noted cancer biomarker (MATA2, SEPT9, NOTCH2 and PPP1R13B) claimed to be under alternative promoter usage (p-value≤0.05) (Figure 4 and Supplementary Table 3), monitoring early alternative splicing pathway in our processed hormone responsive Erβ + BC cells (Figure 4). So, the present results by evoking the involvement of some remarkable cancer biomarkers, controlling AS pattern in the present hormone responsive BC cell lines, proposed Cufflinks/Cuffdiff and DEXseq genomic data integration, as a reasonable complementary scheme assessing whole AS occurrence in hormone responsive breast cancer cell. Moreover, concordance between Cufflinks/ Cuffdiff and DEXseq has been supported in part by DAVID functional annotation analysis since 42.64 and 37.84% of alternative spliced genes categorized by previous mentioned approaches respectively, have been discriminated to be localized in cell's nucleus, suggesting a strong contribution of the latter's (alternative sliced genes) regulating AS pattern occurrence in hormone responsive BC cell line ( Figure 5). Also, even if the present study admitted conflicting results between Cufflinks/Cuffdiff and DEXseq approaches in alternative splicing analysis (Figure 5), we showed that an adequate integration between these bioinformatics/biostatistics tools can help to wholly investigate alternative splicing occurrence in breast cancer disease reducing false discovery event rate. Furthermore, accuracy in statistical analysis processing alternative transcript isoforms regulation in genomic and transcriptomic studies, has been supported by recent work introducing a new method that builds on the statistical techniques used by the wellestablished DEXseq package to detect differential usage of both exonic regions and splice junctions helping differential usage of novel splice junctions without the need for an additional isoform assembly step (Stephen and James, 2016).

Conclusion
RNA-Seq providing more than simple measurements of gene and/or transcript-level expression, can be used to study more complex regulatory phenomena at the isoform level, even when the isoforms in question are unannotated. Numerous tools have been developed to detect alternative isoform regulation exhibiting in some cases conflicting results. In contrast to this tendency, the present study proposed integration between wellestablished Cufflinks/Cuffdiff and DEXseq approaches as reasonable system assessing alternative splicing events in hormone responsive Erβ breast cancer cells. Finally, our findings exhibited significant exon modulation of multi-exonic gene transcripts regulated by alternative promoters, as recurrently solicited in early AS pattern in estrogen induced BC cells.

CONFLICT OF INTERESTS
The authors have not declared any conflict of interests.

ACKNOWLEDGEMENT
Thanks to all members of the Laboratory of Molecular Medicine and Genomics of the Department of Medicine and Surgery, University of Salerno (SA) Italy, for providing information regarding RNA sequencing data for the present study.