We describe a prototype database of sequence alignments and experimental results for the β-like globin gene cluster of mammals. This data repository is intended to help the international community of globin gene biologists to plan experiments, to design models and, ultimately, to understand regulation of those genes. Moreover, the approaches and software developed for this project will be applicable to other sequence-analysis problems. Our first steps were to develop a program that can simultaneously align a few very long sequences and to establish an e-mail server that formats any requested portion of the alignment, annotated to indicate highly conserved regions and known run sequence features. We are currently developing a World-Wide Web server that will provide access to the wealth of pertinent experimental data, in register with the annotated alignment, through a variety of query languages and graphical presentations.
Pairwise comparison of long stretches of genomic DNA sequence can identify regions conserved across species, which often indicate functional significance. However, the novel insights frequently must be winnowed from a flood of information; for instance, running an alignment program on two 50-kilobase sequences might yield over a hundred pages of alignments. Direct inspection of such a volume of printed output is infeasible, or at best highly undesirable, and computer tools are needed to summarize the information, to assist in its analysis, and to report the findings. This paper describes two such software tools. One tool prepares publication-quality pictorial representations of alignments, while another facilitates interactive browsing of pairwise alignment data. Their effectiveness is illustrated by comparing the β-like globin gene clusters between humans and rabbits. A second example compares the chloroplast genomes of tobacco and liverwort.
Background: The majority of eukaryotic promoters utilize multiple transcription start sites (TSSs). How multiple TSSs are specified at individual promoters across eukaryotes is not understood for most species. In S. cerevisiae, a preinitiation complex (PIC) composed of Pol II and conserved general transcription factors (GTFs) assembles and opens DNA upstream of TSSs. Evidence from model promoters indicates that the PIC scans from upstream to downstream to identify TSSs. Prior results suggest that TSS distributions at promoters where scanning occurs shift in a polar fashion upon alteration in Pol II catalytic activity or GTF function. Results: To determine the extent of promoter scanning across promoter classes in S. cerevisiae, we perturbed Pol II catalytic activity and GTF function and analyzed their effects on TSS usage genome-wide. We find that alterations to Pol II, TFIIB, or TFIIF function widely alter the initiation landscape, consistent with promoter scanning operating at all yeast promoters, regardless of promoter class. Promoter architecture, however, can determine the extent of promoter sensitivity to altered Pol II activity in ways that are predicted by a scanning model. Conclusions: Our observations, coupled with previous data, validate key predictions of the scanning model for Pol II initiation in yeast, which we term the “shooting gallery”. In this model, Pol II catalytic activity and the rate and processivity of Pol II scanning, together with promoter sequence, determine the distribution of TSSs and their usage.
Williams syndrome is a complex developmental disorder that results from the heterozygous deletion of a ∼1.6-Mb segment of human chromosome 7q11.23. These deletions are mediated by large (∼300 kb) duplicated blocks of DNA of near-identical sequence. Previously, we showed that the orthologous region of the mouse genome is devoid of such duplicated segments. Here, we extend our studies to include the generation of ∼3.3 Mb of genomic sequence from the mouse Williams syndrome region, of which just over 1.4 Mb is finished to high accuracy. Comparative analyses of the mouse and human sequences within and immediately flanking the interval commonly deleted in Williams syndrome have facilitated the identification of nine previously unreported genes, provided detailed sequence-based information regarding 30 genes residing in the region, and revealed a number of potentially interesting conserved noncoding sequences. Finally, to facilitate comparative sequence analysis, we implemented several enhancements to the program PipMaker, including the addition of links from annotated features within a generated percent-identity plot to specific records in public databases. Taken together, the results reported here provide an important comparative sequence resource that should catalyze additional studies of Williams syndrome, including those that aim to characterize genes within the commonly deleted interval and to develop mouse models of the disorder. [The sequence data described in this paper have been submitted to GenBank under accession nos. AF267747, AF289666, AF289667, AF289664, AF289665, AC091250, AC079938, AC084109, AC024607, AC074359, AC024608, AC083858, AC083948, AC084162, AC087420, AC083890, AC080158, AC084402, AC083889, AC083857, and AC079872.]
Photoperiod is a key environmental cue affecting flowering and biomass traits in plants. Key components of the photoperiodic flowering pathway have been identified in many species, but surprisingly few studies have globally examined the diurnal rhythm of gene expression with changes in day length. Using a cost-effective 3'-Tag RNA sequencing strategy, we characterize 9,010 photoperiod-responsive genes with strict statistical testing across a diurnal time series in the C4 perennial grass, Panicum hallii. We show that the vast majority of photoperiod responses are driven by complex interactions between day length and sampling periods. A fine-scale contrast analysis at each sampling time revealed a detailed picture of the temporal reprogramming of cis-regulatory elements and biological processes under short- and long-day conditions. Phase shift analysis reveals quantitative variation among genes with photoperiod-dependent diurnal patterns. In addition, we identify three photoperiod-enriched transcription factor families with key genes involved in photoperiod flowering regulatory networks. Finally, coexpression network analysis of the GIGANTEA homolog predicted 1,668 potential coincidence partners, including five well-known GI-interacting proteins. Our results not only provide a resource for understanding the mechanisms of photoperiod regulation in perennial grasses but also lay a foundation to increase biomass yield in biofuel crops.