Integrating gene expression data to infer how biological changes drive transcriptional responses

2016 
The work presented in this Ph.D. thesis has two parts. The first describes a series of tools for integrating gene expression data; the second describes how to model those data mathematically. The first part explains the methodology used to integrate publicly available transcriptomic data, the software tools that implement this methodology, and their application to build collections of gene expression data (compendia) for several prokaryotic species and one eukaryote (the crop plant Vitis vinifera). Compendia are gene expression matrices in which each row is a gene of the species of interest and each column represents a condition in which the genes have been measured. They provide a rich source of information for systems biology applications. Besides being the result of the first part of this Ph.D. project, gene expression compendia are the starting point for the second part, which aims to facilitate biological knowledge discovery by drawing inferences from mathematical models. We develop and discuss two complementary models. The first uses a Bayesian approach, in which we model a probability distribution over the underlying true change in expression of a given gene in response to a given condition. The second uses Boolean networks to encode structural information about the known genetic mechanisms of response to stimuli; the Boolean networks are used to fit a distribution over the steady states of cells in the measured samples. These models may be used for various types of statistical inference and decision making. They can serve to formulate statistically sound hypotheses about the stimuli or signals that better explain observed changes in gene expression or about the inherent variability of a gene (independently of the conditions in which it is measured), or to find complex patterns of co-expression.
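To make the notion of a Boolean network steady state concrete, the following is a minimal Python sketch. It is not the model developed in the thesis: the three-gene network, the gene names (`stimulus`, `activator`, `target`), the update rules, and the choice of a synchronous update are all assumptions made purely for illustration. A steady state is taken here to be a state that the update rule maps to itself.

```python
from itertools import product

# Hypothetical toy network; gene names and rules are illustrative assumptions,
# not taken from the thesis.
GENES = ("stimulus", "activator", "target")


def step(state):
    """Apply one synchronous update to a (stimulus, activator, target) state."""
    stimulus, activator, target = state
    return (
        stimulus,   # external input, held constant
        stimulus,   # the activator follows the stimulus
        activator,  # the target follows the activator
    )


def steady_states():
    """Enumerate all 2**3 states and keep those unchanged by one update."""
    return [s for s in product((False, True), repeat=len(GENES)) if step(s) == s]


if __name__ == "__main__":
    for state in steady_states():
        print(dict(zip(GENES, state)))
```

Exhaustive enumeration is only feasible for very small networks, since the number of states doubles with every gene added. In this toy network the only states left unchanged by the update are the one with all three genes off and the one with all three genes on.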