Journal of Computational Biology and Bioinformatics Research

  • Abbreviation: J. Comput. Biol. Bioinform. Res.
  • Language: English
  • ISSN: 2141-2227
  • DOI: 10.5897/JCBBR
  • Start Year: 2009
  • Published Articles: 41

Full Length Research Paper

Querying formal concepts containing transcription factors: A case study using multiple databases

Mathilde Pellerin1,2 and Olivier Gandrillon1*
1Université de Lyon, Université Lyon 1, Centre de Génétique et de Physiologie Moléculaire et Cellulaire (CGPHIMC), CNRS UMR5534, F-69622 Lyon, France.
2Statlife, Espace Maurice Tubiana, 39 rue Camille Desmoulins, 94805 Villejuif, France.
Email: [email protected]

  •  Accepted: 26 October 2011
  •  Published: 30 November 2011

Abstract

Querying large databases returns more information than can readily be analyzed, so new approaches are needed to reduce it. We present here a new way to query our SQUAT database. SQUAT contains formal concepts, each representing an association between a set of genes that are simultaneously overexpressed and the biological situations in which those genes are overexpressed. We explored the relevance of querying “self-explaining” formal concepts obeying a double constraint: (1) the concept should contain, among its genes, at least one transcription factor (TF), and (2) at least one gene in the concept should contain in its promoter a transcription factor binding site (TFBS) for the identified TF. The present work demonstrated that: (1) such “self-explaining” formal concepts exist in SQUAT; (2) mining only those “self-explaining” formal concepts severely reduces the number of concepts that have to be analyzed; and (3) two such “self-explaining” concepts were analyzed further, and their biological relevance was demonstrated.
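The double constraint can be illustrated with a minimal sketch. It is not the SQUAT implementation: the `Concept` type, the function names, the gene and situation labels, and the `tfbs_by_gene` lookup (gene mapped to the set of TFs with a binding site in its promoter) are all hypothetical, and the data are toy values chosen only to exercise both conditions.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Set


class Concept(NamedTuple):
    """Hypothetical stand-in for a formal concept: genes x situations."""
    genes: FrozenSet[str]
    situations: FrozenSet[str]


def is_self_explaining(
    concept: Concept,
    tf_genes: Set[str],
    tfbs_by_gene: Dict[str, Set[str]],
) -> bool:
    """Return True if the concept satisfies both constraints.

    (1) At least one gene of the concept is a transcription factor (TF).
    (2) At least one gene of the concept carries, in its promoter, a
        binding site (TFBS) for one of those TFs.
    Assumption: the gene carrying the site may be the TF itself, since
    the constraint only says "at least one gene in the concept".
    """
    tfs_in_concept = concept.genes & tf_genes  # constraint (1)
    for tf in tfs_in_concept:
        for gene in concept.genes:
            if tf in tfbs_by_gene.get(gene, set()):  # constraint (2)
                return True
    return False


def filter_self_explaining(
    concepts: List[Concept],
    tf_genes: Set[str],
    tfbs_by_gene: Dict[str, Set[str]],
) -> List[Concept]:
    """Keep only the concepts that pass the double constraint."""
    return [c for c in concepts if is_self_explaining(c, tf_genes, tfbs_by_gene)]


# Toy data (entirely made up for illustration).
tf_genes = {"TF_A", "TF_B"}
tfbs_by_gene = {"G1": {"TF_A"}, "G3": {"TF_A"}}
concepts = [
    # Contains TF_A, and G1 has a site for TF_A: kept.
    Concept(frozenset({"TF_A", "G1", "G2"}), frozenset({"S1", "S2"})),
    # Contains TF_B, but no gene here has a site for TF_B: dropped.
    Concept(frozenset({"TF_B", "G3"}), frozenset({"S3"})),
    # Contains no TF at all: dropped.
    Concept(frozenset({"G1", "G4"}), frozenset({"S1"})),
]

kept = filter_self_explaining(concepts, tf_genes, tfbs_by_gene)
```

In this toy set only the first concept survives, which mirrors how the filter reduces the number of concepts left to analyze.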

Key words: Data mining, gene expression, large database, formal concepts.