Early embryonic developmental programs are guided by the coordinated interplay between maternally inherited and zygotically manufactured RNAs and proteins. Although these processes happen concomitantly and affecting gene function during this period is bound to affect both pools of mRNAs, it has been challenging to study their expression dynamics separately.
Although traditional genetic assays have characterized the pattern of crossing over across the genome in Drosophila melanogaster, these assays could not precisely define the location of crossovers. Even less is known about the frequency and distribution of noncrossover gene conversion events. To assess the specific number and positions of both meiotic gene conversion and crossover events, we sequenced the genomes of male progeny from females heterozygous for 93,538 X chromosomal single-nucleotide and InDel polymorphisms. From the analysis of the 30 F1 hemizygous X chromosomes, we detected 15 crossover and 5 noncrossover gene conversion events. Taking into account the nonuniform distribution of polymorphism along the chromosome arm, we estimate that most oocytes experience 1 crossover event and 1.6 gene conversion events per X chromosome pair per meiosis. An extrapolation to the entire genome would predict approximately 5 crossover events and 8.6 conversion events per meiosis. Mean gene conversion tract lengths were estimated to be 476 base pairs, yielding a per nucleotide conversion rate of 0.86 × 10⁻⁵ per meiosis. Both of these values are consistent with estimates of conversion frequency and tract length obtained from studies of rosy, the only gene for which gene conversion has been studied extensively in Drosophila. Motif-enrichment analysis revealed a GTGGAAA motif that was enriched near crossovers but not near gene conversions. The low complexity and frequent occurrence of this motif may in part explain why, in contrast to mammalian systems, no meiotic crossover hotspots have been found in Drosophila.
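The abstract above does not state how the per nucleotide conversion rate was derived from the event count and tract length. One plausible reading, offered only as a sketch, is that the number of converted bases per meiosis is divided by the total sequence at risk across the four chromatids of the X chromosome pair. The function name, the four-chromatid divisor, and the X chromosome length of roughly 22.4 Mb are all assumptions made for illustration; none of them is given in the text.

```python
def per_nucleotide_conversion_rate(events_per_meiosis, mean_tract_bp,
                                   region_bp, chromatids=4):
    """Rough per-nucleotide, per-meiosis conversion rate (illustrative only).

    events_per_meiosis: conversion events per chromosome pair per meiosis
    mean_tract_bp:      mean conversion tract length in base pairs
    region_bp:          length of the region surveyed (ASSUMED value below)
    chromatids:         chromatids present at meiosis (ASSUMED to be 4)
    """
    if region_bp <= 0 or chromatids <= 0:
        raise ValueError("region_bp and chromatids must be positive")
    converted_bases = events_per_meiosis * mean_tract_bp
    return converted_bases / (chromatids * region_bp)


# ASSUMPTION: approximate length of the X chromosome arm, not stated in the abstract.
ASSUMED_X_LENGTH_BP = 22_400_000

# 1.6 events and 476 bp are the estimates reported in the abstract.
rate = per_nucleotide_conversion_rate(1.6, 476, ASSUMED_X_LENGTH_BP)
print(f"{rate:.2e}")
```

Under these assumptions the result has the same order of magnitude as the reported 0.86 × 10⁻⁵. The exact figure depends on the chromosome length used and on the authors' correction for uneven polymorphism density, which this sketch does not model.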
Accurate and comprehensive characterization of genetic variation is essential for deciphering the genetic basis of diseases and other phenotypes. A vast amount of genetic variation stems from large-scale sequence changes arising from the duplication, deletion, inversion, and translocation of sequences. In the past 10 years, high-throughput short reads have greatly expanded our ability to assay sequence variation due to single nucleotide polymorphisms. However, a recent de novo assembly of a second Drosophila melanogaster reference genome has revealed that short read genotyping methods miss hundreds of structural variants, including those affecting phenotypes. While genomes assembled using high-coverage long reads can achieve high levels of contiguity and completeness, concerns about cost, errors, and low yield have limited widespread adoption of such sequencing approaches. Here we resequenced the reference strain of D. melanogaster (ISO1) on a single Oxford Nanopore MinION flow cell run for 24 hr. Using only reads longer than 1 kb or with at least 30x coverage, we assembled a highly contiguous de novo genome. The addition of inexpensive paired reads and subsequent scaffolding using an optical map technology achieved an assembly with completeness and contiguity comparable to the D. melanogaster reference assembly. Comparison of our assembly to the reference assembly of ISO1 uncovered a number of structural variants (SVs), including novel LTR transposable element insertions and duplications affecting genes with developmental, behavioral, and metabolic functions. Collectively, these SVs provide a snapshot of the dynamics of genome evolution. Furthermore, our assembly and its comparison to the D. melanogaster reference genome demonstrate that high-quality de novo assembly of reference genomes and comprehensive variant discovery using such assemblies are now possible by a single lab for under $1,000 (USD).
Proteins of the bone morphogenetic protein (BMP) family are known to have a role in ocular and skeletal development; however, because of their widespread expression and functional redundancy, less progress has been made identifying the roles of individual BMPs in human disease. We identified seven heterozygous mutations in growth differentiation factor 6 (GDF6), a member of the BMP family, in patients with both ocular and vertebral anomalies, characterized their effects with a SOX9-reporter assay and western analysis, and demonstrated comparable phenotypes in model organisms with reduced Gdf6 function. We observed a spectrum of ocular and skeletal anomalies in morphant zebrafish, the latter encompassing defective tail formation and altered expression of somite markers noggin1 and noggin2. Gdf6+/− mice exhibited variable ocular phenotypes compatible with phenotypes observed in patients and zebrafish. Key differences evident between patients and animal models included pleiotropic effects, variable expressivity and incomplete penetrance. These data establish the important role of this determinant in ocular and vertebral development, demonstrate the complex genetic inheritance of these phenotypes, and further understanding of BMP function and its contributions to human disease.
This article includes supplemental data. Please visit http://www.fasebj.org to obtain this information. Multiple recent publications on RNA sequencing (RNA-seq) have demonstrated the power of next-generation sequencing technologies in whole-transcriptome analysis. Vendor-specific protocols used for RNA library construction often require at least 100 ng total RNA. However, under certain conditions, much less RNA is available for library construction. In these cases, effective transcriptome profiling requires amplification of subnanogram amounts of RNA. Several commercial RNA amplification kits are available for amplification prior to library construction for next-generation sequencing, but these kits have not been comprehensively field evaluated for accuracy and performance of RNA-seq for picogram amounts of RNA. To address this, 4 types of amplification kits were tested with 3 different concentrations, from 5 ng to 50 pg, of a commercially available RNA. Kits were tested at multiple sites to assess reproducibility and ease of use. The human total reference RNA used was spiked with a control pool of RNA molecules in order to further evaluate quantitative recovery of input material. Additional control data sets were generated from libraries constructed following polyA selection or ribosomal depletion using established kits and protocols. cDNA was collected from the different sites, and libraries were synthesized at a single site using established protocols. Sequencing runs were carried out on the Illumina platform. Numerous metrics were compared among the kits and dilutions used. Overall, no single kit appeared to meet all the challenges of small input material. However, it is encouraging that excellent data can be recovered with even the 50 pg input total RNA.
It is well recognized that metagenomics is becoming a critical tool for studying previously inaccessible population dynamics at both the species-identification level and the functional or transcriptional level. Because the power to resolve microbial information is so important for identifying the components of a mixed sample, metagenomics can be used to study nearly any environment or system, including clinical, environmental, and industrial settings. Clinically, it may be used to determine the sub-populations colonizing regions of the body or to identify a rare infection to assist in treatment strategies. Environmentally, it may be used to identify microbial populations within a soil, water, or air sample, or within a bioreactor to characterize a population-based functional process. The possibilities are endless.
However, the accuracy of a metagenomics dataset relies on three important “gatekeepers”: 1) the ability to effectively extract all DNA or RNA from every cell within a sample, 2) the reliability of the methods used for deep or high-throughput sequencing, and 3) the software used to analyze the data.
Because DNA extraction is the first step in the technical process of metagenomics, the Nucleic Acid Research Group (NARG) conducted a study to evaluate extraction methods using a synthetic microbial sample. The synthetic microbial sample was prepared from 10 known bacteria that ranged in diversity, each at a specific concentration. Samples were extracted in duplicate using various popular kit-based methods as well as several homebrew protocols, then analyzed by next-generation sequencing on an Illumina HiSeq. The results of the study include the percent recovery of each organism, determined by comparison with its known quantity in the original synthetic mix.
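The percent-recovery comparison described above can be sketched as follows. This is a minimal illustration rather than the NARG analysis itself: it assumes that recovery is computed on relative abundances (observed fraction of reads divided by expected fraction in the mix), and the function name and organism labels are hypothetical placeholders.

```python
def percent_recovery(expected, observed):
    """Percent recovery per organism, based on relative abundance.

    expected: dict of organism -> known quantity in the synthetic mix
    observed: dict of organism -> read count (or other measured quantity)

    ASSUMPTION: both inputs are normalized to fractions before comparison,
    so 100 means the organism was recovered in its expected proportion.
    """
    expected_total = sum(expected.values())
    observed_total = sum(observed.values())
    if expected_total <= 0 or observed_total <= 0:
        raise ValueError("expected and observed totals must be positive")

    recovery = {}
    for organism, known_quantity in expected.items():
        if known_quantity <= 0:
            raise ValueError(f"expected quantity for {organism} must be positive")
        expected_fraction = known_quantity / expected_total
        # An organism missing from the sequencing results counts as zero.
        observed_fraction = observed.get(organism, 0) / observed_total
        recovery[organism] = 100.0 * observed_fraction / expected_fraction
    return recovery


# Hypothetical two-organism mix prepared at equal abundance.
known_mix = {"organism_A": 50, "organism_B": 50}
read_counts = {"organism_A": 75, "organism_B": 25}
print(percent_recovery(known_mix, read_counts))
```

Values above 100 indicate over-representation relative to the mix and values below 100 indicate under-recovery, which is one way to compare extraction protocols. Normalizing to fractions keeps the comparison independent of sequencing depth, but it also means that a loss in one organism inflates the apparent recovery of the others.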
As part of the DNA Sequencing Research Group of the Association of Biomolecular Resource Facilities, we have tested the reproducibility of the Roche/454 GS-FLX Titanium System at five core facilities. Experience with the Roche/454 system ranged from <10 to >340 sequencing runs performed. All participating sites were supplied with an aliquot of a common DNA preparation and were requested to conduct sequencing at a common loading condition. Sequencing yield and accuracy metrics were evaluated at a single site. The study was conducted using a laboratory strain of the Dutch elm disease fungus Ophiostoma novo-ulmi strain H327, an ascomycete, vegetatively haploid fungus with an estimated genome size of 30–50 Mb. We show that the Titanium System is reproducible, with some variation detected in loading conditions, sequencing yield, and homopolymer length accuracy. We demonstrate that reads shorter than the theoretical minimum length are of lower overall quality and not simply truncated reads. The O. novo-ulmi H327 genome assembly is 31.8 Mb and comprises eight chromosome-length linear scaffolds, a circular mitochondrial contig of 66.4 kb, and a putative 4.2-kb linear plasmid. We estimate that the nuclear genome encodes 8613 protein coding genes, and the mitochondrion encodes 15 genes and 26 tRNAs.
Aneuploidy and epigenetic alterations have long been associated with carcinogenesis, but it was unknown whether aneuploidy could disrupt the epigenetic states required for cellular differentiation. In this study, we found that ~3% of random aneuploid karyotypes in yeast disrupt the stable inheritance of silenced chromatin during cell proliferation. Karyotype analysis revealed that this phenotype was significantly correlated with gains of chromosomes III and X. Chromosome X disomy alone was sufficient to disrupt chromatin silencing and yeast mating-type identity as indicated by a lack of growth response to pheromone. The silencing defect was not limited to cryptic mating type loci and was associated with broad changes in histone modifications and chromatin localization of Sir2 histone deacetylase. The chromatin-silencing defect of disome X can be partially recapitulated by an extra copy of several genes on chromosome X. These results suggest that aneuploidy can directly cause epigenetic instability and disrupt cellular differentiation.