Computational studies of gene expression and sequence data in yeast

2006 
Advances in genome sequencing and gene expression measurement technologies in recent years have made large amounts of sequence and expression data available for computational analysis. In the present study, the candidate uses computational and statistical methods to analyze sequence and expression data from different yeast species in order to infer and test various hypotheses of biological interest. The thesis is presented in four parts. In the first part, a new model of gene conservation for yeast species is proposed, which we call the Correction Model. In the second part, the cell-cycle regulated genes in fission yeast are identified by bioinformatic analysis of newly generated genome-wide microarray expression data. In the third part, we present a novel algorithm for the meta-analysis of microarray data sets. In the fourth part, we describe SpikeChart, a computational tool that allows visualization and statistical analysis of spatial patterns of DNA motif distributions.
Members of the yeast genus Saccharomyces are descendants of an ancient whole-genome duplication event. The lack of divergence for many paralogous gene pairs is too strong to be explained solely by the currently known model of gene conservation. We propose the new Correction Model of conservation, which is based on gene expression and is validated with a detailed analysis involving data from six yeast species.
Based on newly generated microarray data for three fission yeast cell-cycle experiments, we analyzed the expression of all genes with clustering and motif search. The top 750 genes with oscillating transcripts were identified by meta-analysis of the experiments.
High-throughput microarray technology measures the expression of thousands of features under various experimental conditions. We present a meta-analysis method that jointly addresses two issues: how to combine significance values across independent data sets with high power, and how to control the proportion of false positives.
Finally, we present SpikeChart, a tool that allows visualization and statistical analysis of spatial patterns in the distributions of transcription factor binding motifs in promoters, which can be used to test hypotheses about possible modes of transcription as well as to detect phylogenetic footprints.
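The abstract does not specify the meta-analysis algorithm itself, so the following is only a generic illustration of the two issues it names, not the method developed in the thesis. As an assumed stand-in, the sketch combines per-gene p-values from independent experiments with Fisher's method and then applies the Benjamini-Hochberg procedure to control the false discovery rate. The function names, gene names, and p-values are all hypothetical.

```python
import math


def fisher_combine(pvalues):
    """Combine independent p-values for one gene with Fisher's method."""
    k = len(pvalues)
    # Under the null, X = -2 * sum(ln p) is chi-square with 2k degrees of freedom.
    half_x = -sum(math.log(max(p, 1e-300)) for p in pvalues)
    # Closed-form chi-square survival function for an even number (2k) of d.o.f.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return min(1.0, math.exp(-half_x) * total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def meta_analysis(pvalue_table, fdr=0.05):
    """Map each selected gene to its adjusted p-value at the given FDR level."""
    genes = list(pvalue_table)
    combined = [fisher_combine(pvalue_table[g]) for g in genes]
    adjusted = benjamini_hochberg(combined)
    return {g: q for g, q in zip(genes, adjusted) if q <= fdr}


# Hypothetical p-values from three independent experiments (illustration only).
example = {
    "geneA": [0.001, 0.002, 0.01],
    "geneB": [0.6, 0.4, 0.9],
}
selected = meta_analysis(example)
```

Fisher's method assumes the experiments are independent, which matches the setting described in the abstract; other combination rules and other error-rate criteria would slot into the same two-step structure.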