Large-scale identification of chemically induced mutations in Drosophila melanogaster

2014 
Systematically defining the function of genes remains one of the most challenging endeavors in biological sciences. Several large forward and reverse genetic efforts have been initiated in mice to address this issue (Justice et al. 1999; Clark et al. 2004; Bradley et al. 2012; White et al. 2013). However, Caenorhabditis elegans and Drosophila melanogaster are still the most coveted systems for systematic functional annotation of genes required for development, nervous system function, organogenesis, metabolism, etc. (Venken et al. 2011). To this end, three main approaches are typically used: RNA interference (RNAi), transposon hopping, and chemical mutagenesis. Each of these methods has advantages as well as drawbacks (Mohr et al. 2010; Venken et al. 2011). Chemical mutagens like EMS (ethyl methanesulfonate) have the major advantage of being unbiased and often permit the generation of allelic series. However, mapping the causative mutations using traditional techniques is tedious and time-consuming (Venken et al. 2011). These limitations can be partially overcome by using low concentrations of mutagen to reduce the mutagenic load, thereby reducing the generation of second-site mutations that can modify the phenotype of interest and confound mapping efforts. Moreover, if a method could be developed to efficiently map hundreds of mutations in a relatively short time, a major hurdle would be overcome. Currently, mutations are mapped in Drosophila using duplications (Cook et al. 2010; Venken et al. 2010), deficiencies (Parks et al. 2004; Cook et al. 2012), recombination mapping based on visible markers, single-nucleotide variations (SNVs) (Berger et al. 2001; Hoskins et al. 2001), and/or P-elements (Zhai et al. 2003). The process typically takes several months, depending on the availability of genetic tools, and the methods are not easily scalable to large sets.
Hence, a majority of mutations generated in prior forward genetic EMS-mutagenesis screens remain unassigned to a gene, even though cursory phenotypic studies have been carried out. High-throughput strategies that facilitate identification of the causative mutations are therefore highly desirable. With the advent of whole-genome sequencing (WGS) (Sarin et al. 2008, 2010; Blumenstiel et al. 2009) and the reduction in cost of sequencing an entire genome (less than $500 per Drosophila genome at 30× coverage) (Hobert 2010), it is now possible to sequence an entire collection of mutant strains. In principle, comparing mutant and wild-type genome sequences should allow for the unambiguous identification of phenotype-causing mutations. However, natural sequence variation between chromosomes from different strains makes it challenging to determine causative mutations. For example, sequencing of 120 wild-type flies revealed one SNV per 25 nucleotides (Mackay et al. 2012). This corresponds to nearly 1 million polymorphisms for the X chromosome, which is 22.4 Mb and contains 2194 genes (http://www.ncbi.nlm.nih.gov/mapview/stats/BuildStats.cgi?taxid=7227b; Keightley et al. 2009). Hence, WGS does not provide a direct solution to the problem of mapping causative mutations. Thus far, several proof-of-principle studies, each applying a different approach to successfully map a chemically induced mutation using WGS, have been documented in the literature (Sarin et al. 2008, 2010; Blumenstiel et al. 2009; Zhang et al. 2009; Earley and Jones 2011; Fairfield et al. 2011; Andrews et al. 2012; Bull et al. 2013). In general, a subset of SNVs is first removed based on assay-specific criteria, after which some form of mapping is performed to reduce the number of candidate mutations. For instance, Leshchiner et al. (2012) designed an algorithm that identifies SNVs in regions of homozygosity when meiotic mapping is combined with WGS.
Here, every mutant strain is allowed to recombine for several generations with a wild-type strain of a different genetic background, after which a number of individual progeny are pooled and sequenced. This method has allowed the successful identification of mutations in a handful of mutants in flies, worms, zebrafish, and mice, but the number of complementation groups mapped per report is limited. It thus remains unclear how scalable this approach is or what its success rate is when WGS is applied to identify a mutant of interest (Doitsidou et al. 2010; Earley and Jones 2011; Andrews et al. 2012; Leshchiner et al. 2012; Bull et al. 2013; Henke et al. 2013). The drawbacks of combining meiotic mapping and WGS are that (1) recombination mapping requires several generations of back-crossing and is less straightforward when recessive lethal mutations are being mapped, and (2) in order to sequence multiple animals per genotype, animals are typically pooled and sequenced on one lane of the Illumina sequencer at a low coverage (4×–5×), which fails to identify many SNVs that are present in the genome. In addition, it was recently found that, apart from slightly reducing the mutational load, outcrossing mutant strains to wild-type strains also introduces a significant number of variants and may hence complicate mutation identification (Sarin et al. 2010). Alternatively, independent variants that are found in the same gene can lead to gene identification when mutants that are part of the same complementation group and exhibit similar phenotypes are sequenced (Sarin et al. 2008; Gonzalez et al. 2012). However, none of the strategies used thus far have been scaled effectively to map numerous causative mutations, and it remains to be determined what the optimal filters are, what fraction of mutations can be identified, and what fraction of multiple versus single alleles can be mapped effectively using the current technologies.
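The complementation-group idea mentioned above (independent variants in the same gene across alleles) can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the procedure used in the cited studies: it assumes variants have already been called and annotated to genes for each strain, and the helper name `genes_hit_in_all_alleles`, the strain names, and the gene symbols are all hypothetical.

```python
from collections import Counter

def genes_hit_in_all_alleles(variants_by_strain):
    """Return genes that carry a strain-specific variant in every allele.

    variants_by_strain maps a strain name to a set of (gene, position)
    tuples. A variant seen at the same site in more than one strain is
    treated as likely background and ignored (an illustrative assumption).
    """
    # Count in how many strains each exact variant site occurs.
    site_counts = Counter(
        site for sites in variants_by_strain.values() for site in set(sites)
    )
    candidates = None
    for sites in variants_by_strain.values():
        # Genes hit by a variant that is unique to this strain.
        genes = {gene for (gene, pos) in sites if site_counts[(gene, pos)] == 1}
        candidates = genes if candidates is None else candidates & genes
    return candidates or set()

# Hypothetical complementation group with three alleles.
group = {
    "alleleA": {("geneX", 1000), ("geneY", 5000), ("geneZ", 42)},
    "alleleB": {("geneX", 1200), ("geneZ", 42), ("geneW", 77)},
    "alleleC": {("geneX", 1350), ("geneY", 5100)},
}
shared_genes = genes_hit_in_all_alleles(group)
print(shared_genes)
```

In this toy input, only `geneX` carries a different variant in each of the three alleles; `geneZ` is discarded because the identical site appears in two strains. A strain with a single allele in its group gains nothing from this comparison, which is one reason an independent mapping step is still needed.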
Finally, one needs to demonstrate unambiguously that a given mutation, among hundreds of mutations, is causal. Here, we describe our large-scale effort to map 394 EMS-induced mutations. We performed WGS on mutant lines that were generated in a forward genetic screen for essential genes on the X chromosome (Yamamoto et al. 2014) and have developed a set of filters to reduce the number of SNVs to a manageable level. By combining WGS with a rough mapping strategy (to ∼1.4 Mb), we were able to map 274 (70%) of the mutations. The mutations were shown to be causative by rescuing the lethality with small, molecularly defined P[acman] duplications (Venken et al. 2009, 2010). In summary, we show that WGS can be successfully applied to map EMS-induced mutations on a large scale, permitting forward genetic screens to annotate the function of numerous genes at a much greater pace than is possible with currently available methodologies.
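To make the benefit of combining rough mapping with variant filtering concrete, the following Python sketch may help. It is not the filter set developed in this study; it assumes SNV calls are available per strain as (position, reference base, alternate base) tuples on the X chromosome, and the function name `candidate_snvs`, the toy coordinates, and the "discard variants shared with other strains" criterion are illustrative assumptions. The arithmetic at the top uses only the figures quoted in the text (22.4 Mb, one SNV per 25 nucleotides, a ∼1.4 Mb interval) and treats the population-level SNV density as a rough scale, not as a per-strain count.

```python
X_LENGTH_BP = 22_400_000   # X chromosome size quoted in the text
SNV_SPACING_BP = 25        # one SNV per 25 nucleotides (population-level)
INTERVAL_BP = 1_400_000    # approximate rough-mapping resolution

# Rough scale of natural variation: whole X versus one mapped interval.
expected_x = X_LENGTH_BP // SNV_SPACING_BP         # 896,000 ("nearly 1 million")
expected_interval = INTERVAL_BP // SNV_SPACING_BP  # 56,000

def candidate_snvs(strain_snvs, other_strains_snvs, interval):
    """Keep SNVs inside the mapped interval that no other strain shares.

    strain_snvs: set of (position, ref, alt) tuples for the mutant.
    other_strains_snvs: list of such sets from other strains, used here
        as a stand-in for background variation (an assumption).
    interval: (start, end) coordinates, inclusive.
    """
    start, end = interval
    background = set().union(*other_strains_snvs)
    return sorted(
        snv for snv in strain_snvs
        if start <= snv[0] <= end and snv not in background
    )

# Hypothetical calls for one mutant and two other strains.
mutant = {(5_100_000, "G", "A"), (5_600_000, "C", "T"), (9_000_000, "G", "A")}
others = [{(5_100_000, "G", "A")}, {(12_000_000, "C", "T")}]
remaining = candidate_snvs(mutant, others, (5_000_000, 6_400_000))
print(expected_x, expected_interval, remaining)
```

In the toy data, the variant at 9,000,000 falls outside the interval and the one at 5,100,000 is shared with another strain, so a single candidate remains. Restricting attention to a ∼1.4 Mb interval reduces the search space by a factor of 16 before any variant-level filter is applied.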