Abstract

Dynamic post-translational modification of RNA polymerase II (RNAPII) coordinates the co-transcriptional recruitment of enzymatic complexes that regulate chromatin states and processing of nascent RNA. Extensive phosphorylation of serine residues at the largest RNAPII subunit occurs at its structurally-disordered C-terminal domain (CTD), which is composed of multiple heptapeptide repeats with consensus sequence Y1-S2-P3-T4-S5-P6-S7. Serine-5 and Serine-7 phosphorylation mark transcription initiation, whereas Serine-2 phosphorylation coincides with productive elongation. In vertebrates, the CTD has eight non-canonical substitutions of Serine-7 into Lysine-7, which can be acetylated (K7ac). Here, we describe mono- and di-methylation of CTD Lysine-7 residues (K7me1 and K7me2). K7me1 and K7me2 are observed during the earliest transcription stages and precede or accompany Serine-5 and Serine-7 phosphorylation. In contrast, K7ac is associated with RNAPII elongation, Serine-2 phosphorylation and mRNA expression. We identify an unexpected balance between RNAPII K7 methylation and acetylation at gene promoters, which fine-tunes gene expression levels.

https://doi.org/10.7554/eLife.11215.001

eLife digest

Genes are sections of DNA that encode the instructions to make proteins. When a gene is switched on, the section of DNA is copied to make molecules of messenger ribonucleic acid (RNA) in a process called transcription. These messenger RNAs are then used as templates for protein production. In animals, plants and other eukaryotic organisms, an enzyme called RNA polymerase II is responsible for making messenger RNA molecules during transcription. This enzyme is made up of several proteins, the largest of which contains a long tail, called the carboxy-terminal domain.
This domain is crucial for transcription because it serves as a landing platform for other enzymes that help to make the RNA molecules. The carboxy-terminal domain contains multiple repeats of a string of seven amino acids (the building blocks of proteins). Normally, each repeat contains three amino acids called serines. However, in humans and other mammals, one of these serines is often substituted with another amino acid called lysine instead. This lysine (referred to as Lysine-7) was known to be modified by the addition of a chemical group called an 'acetyl' tag, but it was not clear how this tag affected transcription. Dias, Rito, Torlai Triglia et al. carried out an in-depth study into how Lysine-7 is modified in mouse cells, and what effects these modifications have on transcription. The experiments show that Lysine-7 can also be modified by the addition of a different chemical group, called a 'methyl' tag. This new modification is also found in flies, worms and human cells, which suggests that it is generally important for transcription. Next, Dias, Rito, Torlai Triglia et al. found that in mouse stem cells, methyl tags are added to Lysine-7 during the earliest steps of transcription, before the acetyl tags are added. Further experiments show that a balance between the addition of methyl tags and acetyl tags to Lysine-7 fine-tunes the activity of RNA polymerase II. These findings add to our understanding of how cells control the activity of RNA polymerase II at different genes. Future challenges are to find out which enzymes are responsible for adding and removing these chemical tags, and how the balance between the methyl and acetyl modifications is controlled.

https://doi.org/10.7554/eLife.11215.002

Introduction

Transcription of protein-coding genes is a complex process involving a sequence of RNA processing events that occur at different stages of the transcription cycle.
Co-transcriptional recruitment of chromatin modifiers and RNA processing machinery is modulated through a complex array of post-translational modifications at the C-terminal domain (CTD) of RPB1, the largest subunit of RNAPII. This unique domain constitutes a docking platform for protein complexes that cap, splice and polyadenylate newly-made RNAs (Bentley, 2014; Buratowski, 2009; Egloff et al., 2012; Eick and Geyer, 2013; Hsin and Manley, 2012). The CTD also integrates signaling cascades that, for example, coordinate the DNA damage response and chromatin remodeling with gene expression (Munoz et al., 2009; Winsor et al., 2013). The CTD is a large, structurally disordered domain composed of a tandem heptapeptide repeat structure with the canonical sequence Y1-S2-P3-T4-S5-P6-S7. Extensive remodeling of the CTD occurs during distinct steps of the transcription cycle (Buratowski, 2003, 2009). RNAPII binds to promoter regions in a hypophosphorylated state, before the CTD becomes phosphorylated at Serine-5 (S5p) and Serine-7 (S7p), marking the earliest stages of transcription (Akhtar et al., 2009; Chapman et al., 2007; Tietjen et al., 2010). Productive elongation is characterized by an increase in phosphorylation of Serine-2 (S2p) throughout gene bodies, with the highest levels found around transcription end sites (TES). S5p is important for recruitment of the capping machinery, while S2p is involved in the recruitment of splicing and polyadenylation factors (Corden, 2013; Ghosh et al., 2011; Gu et al., 2013; Lunde et al., 2010). Although the tandem repeat structure of the CTD was acquired very early in eukaryotic evolution, and general features of serine phosphorylation are fairly conserved from yeast to mammals, the number of repeats is highly variable among different taxa (Chapman et al., 2008; Yang and Stiller, 2014). The most complex multicellular organisms, such as vertebrates, generally have longer CTDs (e.g. 
52 heptad repeats in mammals), whereas Drosophila melanogaster, Caenorhabditis elegans and unicellular yeast have 44, 42 and 26–29 copies, respectively. The mammalian CTD retains a core of 21 consensus repeats, but has accumulated a diversity of non-consensus repeats, particularly at its most C-terminal region (Figure 1a). In vertebrates, non-canonical amino-acid residues occur most frequently at the seventh position of the heptapeptide repeat, and the most frequent substitution replaces the canonical S7 residue with a lysine (K7; Figure 1a). The number of non-canonical K7-containing repeats increases from zero in yeast to one, three and eight repeats in C. elegans, D. melanogaster and vertebrates, respectively (Figure 1b). Previous work has shown that non-canonical CTD-K7 residues can be acetylated, and that CTD-K7ac is associated with transcriptional pausing at epidermal growth factor (EGF)-inducible genes in mouse fibroblasts (Schroeder et al., 2013). Evolutionary analyses also suggest that CTD-K7ac played a role in the origin of complex Metazoan lineages (Simonti et al., 2015).

Figure 1. Structure and evolutionary conservation of the C-terminal domain of RPB1. (a) Mouse RPB1 CTD is composed of 52 heptapeptide repeats with consensus amino-acid sequence YSPTSPS, which is represented 21 times at the most proximal CTD region. Non-consensus amino acids are enriched at the distal region. The most abundant non-consensus residues are lysines, all found at heptad position 7 (K7; represented in red). Other non-consensus residues are represented in blue. (b) Amino-acid sequence alignment of the most distal part of the CTD containing K7 residues across different species: Mus musculus (M. mus); Xenopus tropicalis (X. tro); Danio rerio (D. rer); Drosophila melanogaster (D. mel); and Caenorhabditis elegans (C. ele). Conservation of CTD K7 residues is highlighted in yellow.
CTD repeat numbering was done according to the mouse CTD sequence between repeats 35 and 49, and aligned to the CTDs of the other species from the position of the repeat containing the first lysine in each organism. RPB1, RNA polymerase II large subunit; CTD, C-terminal domain.

https://doi.org/10.7554/eLife.11215.003

To further explore the increasing complexity of CTD modifications over evolution, their temporal sequence, and how they interplay with each other, we have investigated the possibility of additional modification of non-canonical CTD residues. We identify mono- and di-methylation of CTD-K7 residues in both vertebrates and invertebrates. We produce new antibodies specific to CTD-K7me1 and CTD-K7me2 and show that these novel modifications precede or accompany phosphorylation of S5 and S7, upstream of S2 phosphorylation. Using biochemical and genome-wide approaches, we show that CTD-K7 methylation is present at the promoters of genes that are productively transcribed into mature RNA, but defines the earliest stages of the transcription cycle. Through detailed analysis of abundance and distribution of different CTD modifications at gene promoters in embryonic stem (ES) cells, we show that gene expression levels depend on the balance between CTD-K7 methylation and acetylation.

Results

Mutation of CTD-K7 residues is compatible with cell viability

To study the importance of non-consensus CTD-K7 residues on cell viability and their potential for post-translational methylation, we generated stable mouse NIH-3T3 cell lines expressing α-amanitin-resistant RPB1 bearing CTD-K7 mutations (Figure 2a). In this system, the endogenous α-amanitin-sensitive RPB1 is continually depleted and functionally replaced by the resistant variant (Nguyen et al., 1996). CTD-K7 residues were mutated into serine (S7) residues to restore the consensus sequence of the CTD heptapeptide.
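As a rough illustration of the K7-to-S7 substitution described here, the sketch below splits a CTD-like sequence into heptads, counts the residues at position 7, and restores the consensus serine. It is a minimal sketch, not the authors' cloning strategy: the sequence `toy_ctd` is an invented six-repeat toy (not the 52-repeat mouse CTD), and the function names and the `keep` parameter are hypothetical.

```python
from collections import Counter

CONSENSUS = "YSPTSPS"  # canonical heptad Y1-S2-P3-T4-S5-P6-S7


def split_heptads(ctd):
    """Split a CTD-like sequence into consecutive 7-residue repeats."""
    if len(ctd) % 7 != 0:
        raise ValueError("sequence length must be a multiple of 7")
    return [ctd[i:i + 7] for i in range(0, len(ctd), 7)]


def position7_counts(ctd):
    """Tally the residue found at heptad position 7 in each repeat."""
    return Counter(heptad[6] for heptad in split_heptads(ctd))


def mutate_k7_to_s7(ctd, keep=frozenset()):
    """Replace K7 with S7, except in the 1-based repeat numbers in `keep`."""
    mutated = []
    for number, heptad in enumerate(split_heptads(ctd), start=1):
        if heptad[6] == "K" and number not in keep:
            heptad = heptad[:6] + "S"
        mutated.append(heptad)
    return "".join(mutated)


# Toy sequence (assumption): four consensus repeats and two K7 repeats.
toy_ctd = "YSPTSPS" * 2 + "YSPTSPK" + "YSPTSPS" * 2 + "YSPTSPK"

all_serine = mutate_k7_to_s7(toy_ctd)            # analogous to a 0K construct
one_lysine = mutate_k7_to_s7(toy_ctd, keep={3})  # retains K7 in repeat 3 only
```

Passing a set of repeat numbers to `keep` mimics constructs that retain only some lysines, in the spirit of the 1K and 3K lines; the repeat numbers used here refer to the toy sequence only.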
We avoided the more traditional lysine to arginine substitution, as a non-canonical arginine residue is present at the CTD in position 7 of repeat 31 and undergoes methylation in vivo (Sims et al., 2011). Therefore, artificial expansion of R7 residues in the CTD could confound our investigation of CTD-K7 methylation.

Figure 2 (with 1 supplement). Mutation of CTD-K7 to -S7 residues does not interfere with RPB1 stability, phosphorylation or subcellular localization. (a) Outline of strategy used to generate mouse cell lines bearing K7-to-S7 mutations and study CTD-K7 methylation. Red bars represent CTD repeats with K7 residues. The nomenclature of the cell lines is indicative of the number of K7 residues retained in each α-amanitin-resistant Rpb1 construct. (b) Expression and phosphorylation levels of RPB1 in cell lines expressing wild-type and mutant YFP-Rpb1 constructs. Levels of total RPB1, YFP, S5p, S7p and S2p were analyzed by western blotting in total cell extracts from NIH-3T3 (3T3) and from NIH-3T3 cell lines expressing wild-type (8K) and mutant (0K) RPB1. Hypo- (IIa) and hyperphosphorylated (II0) isoforms of YFP-Rpb1 constructs migrate more slowly than the wild-type protein detected in 3T3 because of the YFP tag (Y-IIa and Y-II0, respectively). Total RPB1 was detected with an antibody to the N-terminus of RPB1. α-tubulin was used as loading control. For each blot/antibody, samples were run in the same gel, and re-ordered to improve clarity. Complete western blots are shown in Figure 2—figure supplement 1. (c) Whole-cell detection of RPB1 expression in wild-type (8K), mutant (0K) and untransfected NIH-3T3 fibroblasts. Expression and distribution of total RNAPII (YFP, green) is similar in 8K and 0K cell lines. Immunofluorescence of S5p (pseudo-colored red) shows similar pattern and distribution in the three cell lines. DNA (pseudo-colored blue) was counterstained with TOTO-3. Scale bar, 10 µm.
CTD, C-terminal domain; RPB1, RNA polymerase II large subunit; YFP, yellow fluorescent protein.

https://doi.org/10.7554/eLife.11215.004

To explore the effect of the number and position of different K7 residues in the mouse CTD, we generated α-amanitin-resistant YFP-Rpb1 (YFP fusion at the N-terminus of the Rpb1 gene) constructs containing different numbers of K7-to-S7 mutations (Figure 2a). Mutant 0K does not have any K7 residues and therefore resembles a yeast-like CTD, but with 52 repeats. Mutant 1K retains only one K7, on repeat 35, which is conserved in D. melanogaster (aligning from C-terminus of CTD) and is the only K7 residue present in C. elegans. Mutant construct 3K has three K7 residues present at repeats 35, 40 and 47, all of which are conserved in D. melanogaster. Finally, we used a wild-type murine Rpb1 construct (8K), which contains all eight vertebrate-conserved K7 residues, as a control for expression and α-amanitin selection. We produced viable mouse NIH-3T3 fibroblast lines that express each of the four constructs and show stable YFP-RPB1 expression for more than one month in culture and after several passages under α-amanitin selection. Viability of cells expressing α-amanitin-resistant RPB1 was previously shown for K7-to-R7 mutations (Schroeder et al., 2013) or for other CTD constructs without all lysines, where Rpb1 contained only consensus heptapeptide repeats (Chapman et al., 2005; Hintermair et al., 2012).

Mutation of CTD-K7 residues is compatible with CTD phosphorylation

To determine whether non-canonical CTD-K7 residues are important for CTD phosphorylation, we performed western blotting using total protein extracts from stable NIH-3T3 clones expressing 8K (wild-type) or 0K constructs (Figure 2b, Figure 2—figure supplement 1); extracts from untransfected NIH-3T3 fibroblasts were analyzed as an additional control.
Total expression levels of YFP-RPB1 fusion proteins, detected using an antibody against the N-terminus of RPB1, were similar to the levels of endogenous RPB1 in the parental NIH-3T3 cell line. As expected, YFP-RPB1 fusion proteins migrate at a higher molecular weight than endogenous RPB1, confirmed using antibodies that detect the YFP tag (Figure 2b, Figure 2—figure supplement 1). Western blot analyses of CTD phosphorylation using highly specific antibodies against S5p, S7p and S2p (Brookes et al., 2012; Stock et al., 2007) detect hyperphosphorylated (II0) RPB1 in untransfected NIH-3T3, wild-type 8K and 0K mutant cells, showing that mutation of K7-to-S7 residues is compatible with normal global levels of serine phosphorylation. To examine the effect of K7-to-S7 mutation on the subcellular localization of RPB1, we used confocal microscopy and YFP fluorescence to detect YFP-RPB1 fusion proteins and found that the typical RNAPII nucleoplasmic distribution is unaffected by K7-to-S7 mutations (Figure 2c; e.g. Xie et al., 2006). Immunofluorescence using S5p antibodies also shows similar distribution and levels of S5p in 8K and 0K cells (Figure 2c). These observations show that mutation of CTD-K7 residues is compatible with viability of mouse fibroblasts, and suggest that global serine phosphorylation and RNAPII localization are independent of the presence or absence of K7 residues.

CTD-K7 residues are methylated in vivo

Acetylation of CTD-K7 residues was recently identified and found to be associated with inducible gene expression (Schroeder et al., 2013). Lysine acetylation has been extensively studied in the context of histone proteins, where it is often counter-balanced by methylation, with clear roles in regulation of gene expression and repression (Bannister and Kouzarides, 2011; Wozniak and Strahl, 2014). To investigate whether K7 methylation could counteract K7 acetylation of RNAPII, we developed specific monoclonal antibodies using CTD peptides methylated on K7.
With the aim of raising antibodies that could potentially detect methylation in several or all of the K7-containing CTD repeats, we chose the peptide sequence centered on the K7 residue in repeat 35 of the CTD (Figure 3a). This heptad has the most represented K7-repeat sequence (YSPTSPK), and is the K7 residue with the most conserved distance to the C-terminal end of RPB1 in vertebrates and invertebrates (Figure 1b). Peptides modified by mono-, di- and tri-methylation of K7 residues were used for immunization. The supernatants of hybridoma clones were screened by enzyme-linked immunosorbent assay (ELISA) to test for specificity to CTD-K7 methylation (Figure 3b). We identified several antibody clones specific for CTD-K7 mono- or di-methylation (Figure 3c). Clone CMA611 is specific for CTD-K7me1 and does not recognize unmodified CTD-K7, CTD-K7me2, CTD-K7me3 or CTD-K7ac. Clone CMA612 is specific for CTD-K7me2 and does not bind to the other peptides tested (Figure 3c). Although ELISA analyses identified clones that recognize CTD-K7me3, or both CTD-K7me3 and me1/2 forms, these clones showed reactivity towards other proteins (not shown). We therefore did not perform further analyses using CTD-K7me3 antibody clones.

Figure 3 (with 1 supplement). RPB1 is mono- and di-methylated at CTD-K7 residues. (a) Amino-acid sequence of CTD-K7-methyl peptides used for immunization, designed based on the sequence of mouse CTD repeats 35 and 36. (b) Schematic representation of strategy used for production and screening of specific CTD-K7-methyl antibodies. Antibody clones that specifically recognize K7 or its modifications should bind strongly to the wild-type band, the 8K slower-migrating band, but not to the mutant 0K band. (c) Specificity of CTD-K7 methyl antibodies was assessed by ELISA using unmodified (K7), mono- (K7me1), di- (K7me2), tri-methylated (K7me3) and acetylated (K7ac) CTD peptides (Table 1).
Clones CMA611 and CMA612 are specific for K7me1 and K7me2, respectively. (d) K7me1 and K7me2 mark hypophosphorylated RPB1 in mouse cells with migration similar to forms detected using 8WG16 antibody. Western blotting was performed using total protein extracts from NIH-3T3 (3T3), and from NIH-3T3 cells stably expressing wild-type 8K (8K) or mutant 0K (0K). K7me1 and K7me2 are detected in 3T3 and 8K, but not in 0K cell lines. CTD methylation migrates at the level of hypophosphorylated RNAPII (IIa and Y-IIa). Low levels of methylation of endogenous RPB1 are also detected in 8K and 0K cell lines, due to expression from the endogenous Rpb1 locus. α-Tubulin was used as loading control. For 8WG16 blot, samples were run in the same gel, and re-ordered to improve clarity. Original western blots are shown in Figure 3—figure supplement 1a. (e) CTD K7 residues are mono- and di-methylated in 3T3 cells at levels that increase with K7 number. K7me1 and K7me2 were detected by western blotting using whole-cell extracts from 3T3 lines expressing 8K, 3K, 1K or 0K Rpb1 constructs; untransfected 3T3 cell extracts were used as an additional control. RPB1 levels were measured by immunoblotting of YFP and using 8WG16 antibody with specificity for unmodified S2. α-Tubulin was used as loading control. Samples were run in the same gel, and re-ordered to improve clarity. Original western blots are shown in Figure 3—figure supplement 1c. (f) K7me2 and K7me1 are detected in invertebrates, mouse and human cells. Western blotting of K7me1 and K7me2 was performed using C. elegans whole worm extract (C. ele), D. melanogaster embryo extract (D. mel), total cell extracts from NIH-3T3 cells (M. mus) and from human HEK-293T cells (H. sap). α-Tubulin was used as loading control. RPB1, RNA polymerase II large subunit; YFP, yellow fluorescent protein. Original western blots are shown in Figure 3—figure supplement 1d.
https://doi.org/10.7554/eLife.11215.006

To test whether mono- or di-methylation of the CTD could be identified in vivo, we performed western blotting on total extracts from NIH-3T3, 8K and 0K cell lines using the novel antibodies against CTD-K7me1 and CTD-K7me2 (Figure 3d, Figure 3—figure supplement 1a). K7me1 and K7me2 are detected in NIH-3T3 and in 8K cell lines, both of which express a CTD with eight K7 residues, but not in the 0K mutant cell line, where all K7 residues are mutated to S7. These results confirm the specificity of the K7me1 and K7me2 antibodies to CTD-K7 modifications. The single band in NIH-3T3 and in 8K cell lines shows lack of cross-reactivity to other NIH-3T3 proteins (Figure 3—figure supplement 1a). These results demonstrate the existence of in vivo methylation of non-canonical K7 residues of the CTD in mouse fibroblasts. RPB1 migrates in two major forms, a fast migrating hypophosphorylated (IIa) state and a slower migrating hyperphosphorylated (II0) state, as well as intermediate phosphorylation forms. Interestingly, we found that both the K7me1 and K7me2 antibodies detect the hypophosphorylated RPB1 and YFP-RPB1 bands (Figure 3d). This band is also detected by the antibody 8WG16, which preferentially recognizes unmodified S2 residues (reviewed in Brookes and Pombo, 2009). To confirm the presence of K7me1 and K7me2 within hypophosphorylated RPB1, we repeated the K7me1 and K7me2 western blotting in mouse ES cells, confirming immunoreactivity to hypophosphorylated RPB1 (Figure 3—figure supplement 1b).

Multiple CTD-K7 residues are methylated

To explore the extent of methylation of the eight mammalian K7 residues, we performed western blots using the NIH-3T3 cell lines engineered to express YFP-RPB1 fusion proteins bearing different numbers of K7 residues (0, 1, 3 or 8 lysines; Figure 3e, Figure 3—figure supplement 1c). Mono- and di-methylation were identified in total cell extracts from the 1K cell line.
The intensities of mono- and di-methylation increase in the 3K line, indicating that several lysine residues are mono- and di-methylated in the same CTD. The level of mono-methylation increases further in the 8K cell line, showing abundant mono-methylation of the CTD in vivo. In contrast, the di-methylation levels remain similar between the 8K and 3K lines, suggesting that not all eight CTD lysine-7 residues are simultaneously di-methylated, and reflecting a possible preference for di-methylation of the K residues conserved between mammals and invertebrates, which are present in the 3K construct. Similar expression levels of YFP-RPB1 were confirmed in the four cell lines using western blots for YFP and 8WG16 (Figure 3e).

CTD-K7 residues are also methylated in human cells, D. melanogaster and C. elegans

We next tested whether CTD-K7 methylation is conserved across species. K7me2 is also detected by western blotting in whole protein extracts from adult Caenorhabditis elegans worms, Drosophila melanogaster embryos, and human HEK293 cells (Figure 3f, Figure 3—figure supplement 1d). These observations reveal, for the first time, conservation of a non-consensus CTD modification between vertebrates and invertebrates. K7me1 also occurs in human cells and C. elegans, but is not easily detected in extracts from D. melanogaster embryos, suggesting that di-methylation may be a more prevalent methylation mark. Detection of mono- and di-methylation at the single C. elegans K7 residue, with antibodies produced using peptides based on the mammalian repeats 34–35, also suggests that the recognition of K7 methylation is in general robust to small differences in the amino-acid sequences that flank the modified K7 residues (compare C. elegans sequence, -S4-S5-P6-K7-Y1-S2-P3-, with immunizing mammalian peptide sequence, -T4-S5-P6-K7-Y1-T2-P3-; Figure 1a, b).
This conclusion is also supported by the increased detection of K7me1 and K7me2 with increased number of K7 residues in the mammalian CTD (cell lines 1K, 3K and 8K; Figure 3e), each flanked by slightly different amino acids (Figure 1a).

CTD-K7 methylation occurs early during the transcriptional cycle

The observation that mono- and di-methylation of CTD-K7 residues is detected primarily in the hypophosphorylated (faster-migrating) forms of RPB1 (Figure 3d) suggests that CTD methylation is associated with early stages of the transcription cycle. However, it could also result from steric hindrance of K7me1 and K7me2 antibody binding by CTD phosphorylation. To test whether CTD phosphorylation interferes with immunodetection of K7 methylation, we performed western blots from total protein extracts obtained from mouse ES cells, and pre-treated the blots with alkaline phosphatase to remove phosphoepitopes prior to immunoblotting (Figure 4a, Figure 4—figure supplement 1a). We find that the detection of K7 methylation remains specific to the hypophosphorylated RPB1 after treatment of immunoblots with alkaline phosphatase, showing only a minor increase in the detection of K7me1 and K7me2 at intermediately phosphorylated forms, in conditions that fully abrogate detection of phosphorylated epitopes (e.g. S5p; see also Stock et al., 2007). Therefore, immunodetection of K7me1 and K7me2 is only minimally affected by CTD phosphorylation, suggesting that K7me1 and K7me2 modifications are depleted from elongation-competent hyperphosphorylated RPB1 complexes. Interestingly, the association of K7me1 and K7me2 with the hypophosphorylated form of RPB1 differs from K7ac, previously shown to occur at both hypo- and hyperphosphorylated RPB1 forms (Schroeder et al., 2013), suggesting that K7 methylation may precede K7 acetylation during the transcription cycle.

Figure 4 (with 1 supplement). Interplay between K7me1 and K7me2 with RPB1 phosphorylation.
(a) CTD K7me1 and K7me2 mark hypophosphorylated and intermediately phosphorylated Rpb1 forms, but not the hyperphosphorylated (II0) form. Western blotting using the indicated antibodies was performed after treatment of nitrocellulose membranes in the presence (+) or absence (–) of alkaline phosphatase (AP). Hypo- (IIa) and hyperphosphorylated (II0) RPB1 forms are indicated. α-Tubulin was used as loading control. Lanes were re-ordered to improve clarity. Original western blots are shown in Figure 4—figure supplement 1a. (b) K7me1 and K7me2 abundance is insensitive to CDK9 inhibition with the inhibitor flavopiridol. Mouse ES cells were treated with flavopiridol (10 µM, 1 hr), before western blotting using antibodies specific for S5p, S2p, K7me1 or K7me2. Hypo- (IIa) and hyperphosphorylated (II0) RPB1 forms are indicated. α-Tubulin was used as loading control. Lanes were re-ordered to improve clarity. Original western blots are shown in Figure 4—figure supplement 1b. (c) K7me1 and K7me2 are localized in the nucleoplasm with a more restricted distribution than S5p. Whole-cell immunofluorescence of S5p, K7me1 and K7me2 was performed using mouse NIH-3T3 fibroblasts. Nucleic acids were counterstained with TOTO-3. Scale bar, 10 µm. RPB1, RNA polymerase II large subunit.

https://doi.org/10.7554/eLife.11215.008

To further investigate whether K7 mono- and di-methylation occur upstream of elongation, we treated ES cells with flavopiridol, an inhibitor of RNAPII elongation (Chao and Price, 2001). Depletion of elongation-competent complexes can be achieved by short treatment of ES cells with flavopiridol (10 µM, 1 hr), as shown by loss of S2p detection and lower mobility of S5p forms in western blots (Stock et al., 2007; Figure 4b, Figure 4—figure supplement 1b). We find that K7me1 and K7me2 levels are only minimally increased by flavopiridol treatment (Figure 4b), consistent with both modifications being associated with pre-elongation stages of transcription.
The minor increase in K7me1 and K7me2 levels agrees with the slightly increased detection of K7 methylation upon CTD dephosphorylation. We then tested whether K7me1 and K7me2 are localized within the nucleus using immunofluorescence in mouse NIH-3T3 cells (Figure 4c). We find K7me1 and K7me2 concentrated in punctate nucleoplasmic domains, absent from nucleoli and regions of heterochromatin. The K7me1 and K7me2 foci are sparser than the discrete nucleoplasmic domains containing total YFP-RPB1 or S5p (see Figure 2c), consistent with their presence at RNAPII complexes involved in more restricted transcription events, rather than extensively marking chromatin-free RPB1.

CTD-K7 mono- and di-methylation mark promoters of active genes

To explore the role of K7me1 and K7me2 in the transcription cycle, we mapped their chromatin occupancy genome-wide using chromatin immunoprecipitation coupled to next-generation sequencing (ChIP-seq) in mouse ES cells (Figure 5). The chromatin occupancy of K7me1 and K7me2 was compared with RNAPII phosphorylation (S5p, S7p, S2p and unmodified S2 detected with antibody 8WG16), with K7 acetylation, and with mRNA-seq, using published datasets from mouse ES cells (Figure 5a; Brookes et al., 2012; Schroeder et al., 2013). S5p, S7p and 8WG16 are primarily enriched at gene promoters and downstream of polyadenylation sites; S2p is detected along gene bodies and is most highly enriched immediately after polyadenylation sites (Brookes et al., 2012). CTD-K7ac occupies gene promoters and extends into gene bodies, as previously described (Schroeder et al., 2013).

Figure 5 (with 2 supplements). K7me1 and K7me2 mark promoters of expressed genes. (a) K7me1 and K7me2 are enriched at promoters of active genes. ChIP-seq profiles for K7me1, K7me2, K7ac, 8WG16, S5p, S7p and S2p, and mRNA-seq profiles are represented for the inactive gene Myf5, and active genes Eed, Rpll13 and Tuba1a.
Images were obtained from the UCSC Genome Browser using the mean as the windowing function. (b) Methylation and acetylation of CTD-K7 residues coincide at most genes. Gene promoters positive for K7me1 (6962), K7me2 (8265) and K7ac (8312) were identified using a peak finder approach (see Materials and methods). The overlap between the three CTD-K7 modifications is represented using a Venn diagram. (c) CTD-K7 methylation and acetylation extensively co-occur with other CTD modifications. The percentages of genes positiv
A tab-delimited flat file (SDRF file) describing the experimental details of human timecourse samples processed with the standard HeliScopeCAGE protocol.
List of DRA (http://trace.ddbj.nig.ac.jp/dra/index_e.html) accession numbers of the FANTOM5 samples, sequences and genomic coordinates. Files are in tab-delimited format and include the following fields:
* Library ID
* FFID
* BioSamples accession number
* DRA experiment accession number
* DRA run accession numbers
* DRA analysis accession number for genomic coordinates (BAM file)
* DRA analysis accession number for CTSS (BED file)
* Experiment method (CAGE/RNA-Seq/sRNA-Seq)
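A file with this layout can be read with Python's standard `csv` module. The snippet below is a minimal sketch under stated assumptions: the column headers (`library_id`, `ffid`, `method`, `bam_analysis`) and every value in `toy_table` are invented placeholders, since the actual header names and accession numbers are not given here.

```python
import csv
import io

# Invented stand-in for a tab-delimited accession table (assumption):
# real files may use different header names and contain more columns.
toy_table = (
    "library_id\tffid\tmethod\tbam_analysis\n"
    "LIB_A\tFF_0001\tCAGE\tANALYSIS_X\n"
    "LIB_B\tFF_0002\tRNA-Seq\tANALYSIS_Y\n"
    "LIB_C\tFF_0003\tCAGE\tANALYSIS_Z\n"
)


def rows_by_method(handle, method):
    """Return the rows whose experiment method equals `method`."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if row["method"] == method]


cage_rows = rows_by_method(io.StringIO(toy_table), "CAGE")
cage_ids = [row["library_id"] for row in cage_rows]
```

For a file on disk, the `io.StringIO` object would be replaced by a handle from `open(path, newline="")`, with the field names adjusted to match the real header line.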
A tab-delimited flat file (SDRF file) describing the experimental details of human primary cell samples processed with the standard HeliScopeCAGE protocol.
Article, 16 October 2017, Open Access, Transparent process

RNA polymerase II primes Polycomb-repressed developmental genes throughout terminal neuronal differentiation

Author Information
Carmelo Ferrai *,1,2,3,‡ (orcid.org/0000-0002-8088-2757), Elena Torlai Triglia 1,‡ (orcid.org/0000-0002-6059-0116), Jessica R Risner-Janiczek 3,4,5, Tiago Rito 1, Owen JL Rackham 6, Inês de Santiago 2,3,10, Alexander Kukalev 1, Mario Nicodemi 7, Altuna Akalin 8, Meng Li 3,4,11, Mark A Ungless *,3,5 (orcid.org/0000-0002-1730-3353) and Ana Pombo *,1,2,3,9 (orcid.org/0000-0002-7493-6288)

1 Epigenetic Regulation and Chromatin Architecture, Max Delbrück Center for Molecular Medicine, Berlin, Germany
2 Genome Function, MRC London Institute of Medical Sciences (previously MRC Clinical Sciences Centre), London, UK
3 Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College London, London, UK
4 Stem Cell Neurogenesis, MRC London Institute of Medical Sciences (previously MRC Clinical Sciences Centre), London, UK
5 Neurophysiology Group, MRC London Institute of Medical Sciences (previously MRC Clinical Sciences Centre), London, UK
6 Duke-NUS Medical School, Singapore, Singapore
7 Dipartimento di Fisica, Università di Napoli Federico II and INFN Napoli, Complesso Universitario di Monte Sant'Angelo, Naples, Italy
8 Scientific Bioinformatics Platform, Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany
9 Institute for Biology, Humboldt-Universität zu Berlin, Berlin, Germany
10 Present address: Seven Bridges Genomics UK Ltd, London, UK
11 Present address: Neuroscience and Mental Health Research Institute, School of Medicine and School of Biosciences, Cardiff, UK
‡ These authors contributed equally to this work
* Corresponding author. Tel: +49 3094061755
* Corresponding author. Tel: +44 2083838299
* Corresponding author (lead contact). Tel: +49 3094061760

Molecular Systems Biology (2017) 13: 946 https://doi.org/10.15252/msb.20177754

Abstract Polycomb repression in mouse embryonic stem cells (ESCs) is tightly associated with promoter co-occupancy of RNA polymerase II (RNAPII) which is thought to prime genes for activation during early development. However, it is unknown whether RNAPII poising is a general feature of Polycomb repression, or is lost during differentiation. Here, we map the genome-wide occupancy of RNAPII and Polycomb from pluripotent ESCs to non-dividing functional dopaminergic neurons. We find that poised RNAPII complexes are ubiquitously present at Polycomb-repressed genes at all stages of neuronal differentiation. We observe both loss and acquisition of RNAPII and Polycomb at specific groups of genes reflecting their silencing or activation. Strikingly, RNAPII remains poised at transcription factor genes which are silenced in neurons through Polycomb repression, and have major roles in specifying other, non-neuronal lineages.
We conclude that RNAPII poising is intrinsically associated with Polycomb repression throughout differentiation. Our work suggests that the tight interplay between RNAPII poising and Polycomb repression not only instructs promoter state transitions, but also may enable promoter plasticity in differentiated cells. Synopsis Poised RNAPII-S5p is present at Polycomb-repressed genes from embryonic stem cells to terminally differentiated neurons. The tight interplay between RNAPII poising and Polycomb repression enables promoter plasticity in differentiated cells and increased potential for reactivation. Poised RNAPII-S5p primes Polycomb-repressed promoters throughout terminal differentiation to functional dopaminergic neurons. Poised RNAPII-S5p associates with increased potential for reactivation upon loss of Polycomb repression. DNA methylation valleys coincide with broad occupancy of poised RNAPII-S5p and Polycomb repression. Key non-neuronal transcription factor genes that co-associate with Polycomb and RNAPII-S5p in neurons have potential roles in transdifferentiation. Introduction Embryonic differentiation starts from a totipotent cell and culminates with the production of highly specialized cells. In ESCs, many genes important for early development are repressed in a state that is poised for subsequent activation (Azuara et al, 2006; Bernstein et al, 2006; Stock et al, 2007; Brookes et al, 2012). These genes are mostly GC-rich (Deaton & Bird, 2011), and their silencing in pluripotent cells is mediated by Polycomb repressive complexes (PRCs). Genes with more specialized cell type-specific functions are neither active nor Polycomb repressed in ESCs, have AT-rich promoter sequences, and their activation is associated with specific transcription factors (Sandelin et al, 2007; Brookes et al, 2012). Polycomb repressive complex proteins have major roles in modulating gene expression during differentiation and in disease (Prezioso & Orlando, 2011; Richly et al, 2011). 
They assemble in two major complexes, PRC1 and PRC2, which catalyze H2AK119 monoubiquitination and H3K27 methylation, respectively (Simon & Kingston, 2013). Both PRC-mediated histone marks are important for chromatin repression, and synergize in a tight feedback loop to recruit each other's modifying enzymes (Blackledge et al, 2015). Although imaging studies suggest that PRC-repressed chromatin has a compact conformation (Francis et al, 2004; Eskeland et al, 2010; Boettiger et al, 2016), molecular and cell biology approaches show that PRC repression in ESCs coincides with the occupancy of poised RNAPII complexes and active histone marks in the vast majority of PRC-repressed promoters (Azuara et al, 2006; Bernstein et al, 2006; Stock et al, 2007; Brookes et al, 2012; Tee et al, 2014; Kinkley et al, 2016; Weiner et al, 2016). The co-occurrence of RNAPII and PRC enzymatic activities, RING1B and EZH2, on chromatin has been confirmed in ESCs by sequential ChIP (Brookes et al, 2012) and is mirrored by the simultaneous presence of H3K4me3 and H3K27me3 marks, called chromatin bivalency (Azuara et al, 2006; Bernstein et al, 2006; Voigt et al, 2012; Kinkley et al, 2016; Sen et al, 2016; Weiner et al, 2016). RNAPII function is regulated through complex post-translational modifications at the C-terminal domain (CTD) of its largest subunit, RPB1, which coordinate the co-transcriptional recruitment of chromatin modifiers and RNA processing machinery to chromatin, leading to productive transcription events and mRNA expression (Brookes & Pombo, 2009; Zaborowska et al, 2016). In mammals, the CTD comprises 52 repeats of the heptapeptide sequence N-Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7-C. At active genes, RNAPII is phosphorylated on Ser5 (S5p) to mark transcription initiation, on Ser7 (S7p) during the transition to productive transcription, and on Ser2 (S2p) during elongation. S5p and S7p are mediated by CDK7, while S2p is mediated by CDK9. 
RNAPII also exists in a paused state of activation characterized by short transcription events at promoter regions, followed by promoter-proximal termination and re-initiation events (Adelman & Lis, 2012). RNAPII pausing is identified at genes that produce mRNA at lower levels and is often measured as the amount of RNAPII at gene promoters relative to its occupancy throughout coding regions. Paused states of RNAPII are therefore a feature of genes that are active to a lower extent, are characterized by the presence of S5p and S7p at gene promoters, low abundance of S2p throughout the coding regions, and they are recognized by 8WG16, an antibody which has a preference for unphosphorylated Ser2 residues. The paused RNAPII complex is also characterized by methylation and acetylation of the non-canonical Lys7 residues at the CTD (Schröder et al, 2013; Dias et al, 2015; Voss et al, 2015). The RNAPII complex that primes PRC-repressed genes in mouse ESCs has a unique configuration of post-translational modifications of the CTD, which is different from the paused RNAPII, and was originally referred to as poised RNAPII (Stock et al, 2007; Brookes & Pombo, 2009). The poised RNAPII is characterized by exclusive phosphorylation of S5p in the absence of S7p, S2p, K7me1/2, K7ac, or recognition by 8WG16 (Brookes et al, 2012; Dias et al, 2015). Poised RNAPII-S5p, in the absence of 8WG16, has not been described in Drosophila (Gaertner et al, 2012), consistent with lack of chromatin bivalency (Vastenhouw & Schier, 2012; Voigt et al, 2013). Importantly, Ser5 phosphorylation of poised RNAPII complexes at Polycomb-repressed genes in ESCs is mediated by different kinases, ERK1/2 (Tee et al, 2014; Ma et al, 2016), instead of CDK7 which phosphorylates both S5p and S7p at active genes, irrespectively of pausing ratio (Akhtar et al, 2009; Glover-Cutter et al, 2009). 
Loss of ERK1/2 activity in ESCs results in the loss of poised RNAPII-S5p and decreased occupancy of PRC2 at Polycomb-repressed developmental genes (Tee et al, 2014), suggesting a tight functional link between the presence of poised RNAPII-S5p at Polycomb-target genes and the recruitment of Polycomb. While histone bivalency has been studied to some extent during mammalian cell differentiation and found to be present at a smaller proportion of genes (Mohn et al, 2008; Lien et al, 2011; Wamstad et al, 2012; Xie et al, 2013), it remains unexplored whether the co-occupancy of poised RNAPII-S5p at PRC targets is a property of ESCs or extends beyond pluripotency. The poised RNAPII-S5p state was observed at Polycomb-repressed genes in ESCs grown in the presence of serum and leukemia inhibitory factor (LIF; Stock et al, 2007; Brookes et al, 2012; Tee et al, 2014; Ma et al, 2016). Other studies grow ESCs in 2i conditions to simulate a more naïve pluripotent state, through inhibition of GSK3 and MEK signaling, which in turn inhibits ERK signaling. In these conditions, the occupancy of poised RNAPII complexes is reduced at Polycomb-target genes (Marks et al, 2012; Williams et al, 2015), consistent with the effects of ERK1/2 inhibition (Tee et al, 2014). Interestingly, the decreased occupancy of poised RNAPII-S5p at PRC-repressed genes in 2i conditions is accompanied by reduced occupancy of the PRC2 catalytic subunit EZH2 and of the H3K27me3 modification, suggesting a tight interplay between the presence of poised RNAPII-S5p and Polycomb occupancy at Polycomb-repressed genes in ESCs, which is disrupted in 2i conditions.
Interestingly, prolonged 2i treatment was shown to impair ESC developmental potential and cause widespread loss of DNA methylation (Choi et al, 2017; Yagi et al, 2017), leading to renewed interest in understanding the regulation of developmental genes, and in particular whether poised RNAPII complexes are a more general feature of Polycomb repression mechanisms in the early and late stages of differentiation. Recent studies of DNA methylation in differentiated tissues show that many silent developmental regulator genes remain hypomethylated in wide genomic regions (also called DNA methylation valleys; or DMVs) in differentiated tissues (Xie et al, 2013), raising the question of whether poised RNAPII complexes might prime developmental regulator genes in cell lineages irrespectively of future activation. In this scenario, Polycomb repression may represent the universal mode of repression of this group of CpG-rich genes that recruit poised RNAPII-S5p and which are not targeted by DNA methylation. Silencing of developmental regulator genes through Polycomb repression mechanisms in fully differentiated cells, especially in the presence of poised RNAPII complexes, may nevertheless have roles in the remodeling of cell function, for example in response to specific stimuli such as tissue injury, or in disease such as in cancers associated with Polycomb dysfunction. To investigate RNAPII poising at Polycomb-repressed genes, from pluripotency to terminal differentiation, we mapped H3K27me3 (a marker of Polycomb repression), RNAPII-S5p (present at both active and poised RNAPII complexes), and RNAPII-S7p (a marker of productive gene expression) and produced matched mRNA-seq datasets in ESCs and in four stages of differentiation of functionally mature dopaminergic neurons. We show compelling evidence that the presence of poised RNAPII at H3K27me3-marked chromatin is not a specific feature of ESCs, but a general property common to differentiating and post-mitotic cells. 
We also observe de novo Polycomb repression during neuronal cell commitment and neuronal maturation that promotes waves of transient downregulation of gene expression. We discover a group of genes that maintain poised RNAPII-S5p and Polycomb silencing throughout neuronal differentiation, and which are developmental transcription factors important for cell specification toward non-neuronal lineages. Although these genes are unlikely to be subsequently reactivated in the neuronal lineage, their silencing in neuronal precursors and mature neurons is sensitive to Polycomb inhibition or knockout. We also show that the presence of poised RNAPII-S5p at specific subsets of Polycomb-repressed genes in terminally differentiated neurons coincides with their wide hypomethylation in mouse brain. Our study reveals the interplay between RNAPII poising and Polycomb repression in the control of regulatory networks and cell plasticity throughout cell differentiation. Results Capturing distinct stages of differentiation from ESCs to dopaminergic neurons To study the dynamic changes in Polycomb and RNAPII occupancy at gene promoters during differentiation, we optimized neuronal differentiation protocols to obtain large quantities of pure cell populations required for mapping chromatin-associated histone marks and RNAPII at five states of neuronal differentiation that leads to the production of functional dopaminergic neurons (ESC, days 1, 3, 16, and 30; Fig 1A). To capture the early exit from pluripotency, we adopted an approach that starts from mouse ESCs grown in serum-free and 2i-free conditions and which within 3 days achieves synchronous exit from pluripotency toward the production of neuronal progenitors (Abranches et al, 2009; Fig EV1A, top row). To obtain terminally differentiated dopaminergic neurons, we used an approach that commits ESCs to a midbrain neuron phenotype (Jaeger et al, 2011; Fig EV1A, bottom row). Figure 1. 
Model of differentiation from pluripotent stem cells to terminal dopaminergic neurons. (A) Schematic representation of the differentiation system used and the temporal expression dynamics of differentiation stage markers. (B) RNA levels of differentiation stage markers were measured by qRT–PCR. Relative levels are normalized to Actb internal control, and values are plotted relative to the highest expressed time point. Mean and standard deviation (SD) are from three biological replicates. (C) Indirect immunofluorescence confirms expression of stage-specific markers at the single cell level. OCT4 is a marker of pluripotent ESCs. Tuj1 is an antibody that detects neuronal marker TUBB3 at day 16 and day 30 neurons. The cycling activity of ESCs, day 16, and day 30 neurons was assessed by BrdU incorporation (24 h) into replicating BrDNA. Nuclei are counterstained with DAPI. Scale bar, 100 μm. (D) Tyrosine hydroxylase (TH; in red) is a marker of dopaminergic neurons. It is not expressed in ESCs and detected weakly in day 16 and broadly in day 30 neurons. Nuclei are counterstained with DAPI. Scale bar, 100 μm. (E) Gene expression dynamics across the differentiation time line for genes whose expression peaks in a single time point (z-score > 1.75; from mRNA-seq). Representative enriched GO terms were calculated using as background all genes expressed (> 1 TPM) in at least one time point. n, number of genes per group. Permute P-value (GO-Elite) is shown.

Figure EV1. Neuronal differentiation protocol starting from ESCs gives cultures enriched for ventral midbrain dopaminergic neurons. Related to Fig 1. (A) Schematic representation of the protocols used to differentiate ESCs to dopaminergic neurons. Time points selected for ChIP-seq and mRNA-seq are boxed. (B) Total RNA qRT–PCR shows the differential expression of specific markers during exit from pluripotency. Relative levels are normalized to Actb and plotted as a ratio to the expression at the most highly expressed time point. Mean and standard deviation (SD) are from three biological replicates. (C) Left, indirect immunofluorescence of LMX1A (green) and FOXA2 (red) in day 16 neurons. Nuclei were counterstained with DAPI (blue). Scale bar, 100 μm. Right, percentage of cells positive for FOXA2, LMX1A, and both. Mean and SD are from five fields of view.

The expression of pluripotency markers Nanog and Oct4 decreases dramatically at days 1 and 3 of differentiation, respectively (Fig 1B). The early differentiation marker Fgf5 is transiently expressed in days 1–3, whereas neuronal markers Blbp, Hes5, and Mash1/Ascl1 are increasingly expressed from day 2 (Figs 1B and EV1B). The expression of Sox2, which encodes a transcription factor expressed in ESCs and by most central nervous system progenitors (Graham et al, 2003), is detected from ESC to day 4, as expected (Fig EV1B; Abranches et al, 2009). After 16 days, we obtained neurons that no longer express OCT4 protein, are positive for the neuronal marker TUBB3 (detected by Tuj1 antibody), and no longer divide, as assessed by lack of BrdU incorporation into newly replicated DNA (Fig 1C). To confirm the dopaminergic phenotype, we performed immunofluorescence for tyrosine hydroxylase (TH), the rate-limiting enzyme in dopamine synthesis (Fig 1D). The neuronal populations obtained expressed TH from day 16, reaching close to ubiquitous expression at day 30 (Fig 1D). Moreover, 70% of cells co-express LMX1A and FOXA2 at day 16 (Fig EV1C), two markers specific to the dopaminergic ventral midbrain, confirming that the neuronal populations produced are highly enriched for the dopaminergic lineage (Hegarty et al, 2013). Taken together, these results indicate that day 16 neurons represent an immature stage of differentiation committed to the ventral midbrain lineage, and that they mature further until day 30.
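The relative qRT–PCR levels described for the stage markers (normalized to Actb, then scaled to the most highly expressed time point) can be illustrated with a minimal Python sketch. The source does not state the quantification method, so the 2^-ΔCt calculation with perfect primer efficiency is an assumption, as are the function name `relative_levels` and the Ct values, which are hypothetical and not data from the study.

```python
def relative_levels(target_ct, reference_ct):
    """Return per-time-point expression scaled to the highest time point.

    target_ct / reference_ct: dicts mapping time point -> mean Ct value.
    Assumes 2**-(delta Ct) quantification with perfect primer efficiency
    (an assumption; the source does not state the quantification method).
    """
    # Normalize the target gene to the reference gene at each time point.
    normalized = {
        tp: 2.0 ** -(target_ct[tp] - reference_ct[tp]) for tp in target_ct
    }
    # Scale so that the most highly expressed time point equals 1.0.
    top = max(normalized.values())
    return {tp: value / top for tp, value in normalized.items()}


# Hypothetical Ct values, for illustration only (not data from the study).
nanog_ct = {"ESC": 22.0, "day1": 24.0, "day3": 27.0}
actb_ct = {"ESC": 16.0, "day1": 16.0, "day3": 16.0}
levels = relative_levels(nanog_ct, actb_ct)
```

Scaling to the maximum places every marker on a common 0 to 1 axis, so markers with very different absolute abundance can be compared by the timing of their expression rather than by its magnitude.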
To characterize the changes in gene expression that accompany neuronal differentiation, we produced mRNA-seq datasets for ESCs, day 1, day 3, day 16, and day 30. We identified 4656 genes whose expression levels peaked at one specific time point (Fig 1E). Genes peaking in ESCs are enriched in Gene Ontology (GO) terms typical of pluripotency, such as “stem cell maintenance”, “regulation of gene silencing”, and “sugar utilization”, and include Nanog, Tet1, and Hk2. Genes with highest expression on day 1 have roles in the exit from pluripotency, such as Wt1, Foxd3, and Dnmt3b, and are enriched in GO terms “cell morphogenesis”, “pattern specification process”, and “gene silencing”. On day 3, the expression of Fgf8, Gli3 and HoxA1 peaked, reflecting an early stage of neuronal commitment, highlighted by enrichment in GO terms such as “cellular developmental process”, “multicellular organismal process”, and “neuronal nucleus development”. Day 16 coincided with highest expression of genes associated with GO terms such as “nervous system development”, “axon guidance”, and “neuron migration” (including Nova1, Sema3f, Ascl1, Neurog2), and day 30 with genes important for dopaminergic synaptic transmission, for the “G-protein coupled receptor protein signaling pathway” and “response to alkaloid” (such as Lpar3, Th, Park2, Chrnb4). The complete list of enriched GO terms is presented in Dataset EV1. These expression profiles show that each time point captures a specific stage of neuronal development, and suggest that days 16 and 30 reflect early and late stages, respectively, of maturation of dopaminergic neurons. To further confirm the quality of our samples, we also explored the expression profiles of specific single genes (Fig EV2). 
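The selection of genes that peak at one time point (z-score > 1.75 across the five mRNA-seq time points, as stated in the legend of Fig 1E) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function name `peak_time_point`, the use of the sample standard deviation, the calculation on untransformed TPM values, and the reuse of the > 1 TPM expression filter (given in the source for the GO background) are all assumptions, and the example values are hypothetical.

```python
from statistics import mean, stdev

TIME_POINTS = ["ESC", "day1", "day3", "day16", "day30"]


def peak_time_point(tpm, z_cutoff=1.75, min_tpm=1.0):
    """Return the single time point at which a gene peaks, or None.

    tpm: list of TPM values, one per entry in TIME_POINTS.
    """
    # Skip genes not expressed above min_tpm in at least one time point.
    if max(tpm) <= min_tpm:
        return None
    sd = stdev(tpm)  # sample SD (ddof = 1); an assumption
    if sd == 0:
        return None  # flat profile, no peak
    mu = mean(tpm)
    z_scores = [(value - mu) / sd for value in tpm]
    # Keep the gene only if exactly one time point passes the cutoff.
    peaks = [tp for tp, z in zip(TIME_POINTS, z_scores) if z > z_cutoff]
    return peaks[0] if len(peaks) == 1 else None


# Hypothetical expression profiles, for illustration only.
esc_specific = peak_time_point([50, 1, 1, 1, 1])
day16_specific = peak_time_point([1, 1, 2, 40, 1])
broad = peak_time_point([10, 12, 11, 9, 10])
silent = peak_time_point([0.5, 0.2, 0.1, 0.3, 0.9])
```

With five time points and the sample standard deviation, the largest attainable z-score is 4/√5 ≈ 1.79, so under these assumptions a cutoff of 1.75 retains only genes whose expression is concentrated almost entirely in one time point.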
In addition to confirming the expression of the differentiation markers studied by quantitative PCR and immunofluorescence (Fig EV2A), we also found that the proneural gene Ngn2 (expressed in immature, but not in mature, dopaminergic neurons) peaks at day 16 and drops by day 30, while Nurr1 (required for maintenance of dopaminergic neurons) is upregulated at day 16 but remains expressed at day 30 (Fig EV2B; Ang, 2006). Other markers of dopaminergic neurons, such as Pitx3, Aadc, Vmat, and Dat, are most highly expressed at day 30 (Fig EV2B).

Figure EV2. mRNA-seq profiles highlight single-gene expression changes at different stages of differentiation. Related to Fig 1. (A) UCSC Genome Browser snapshots of mRNA-seq tracks confirm the expression of specific markers for differentiation. ESCs progressively downregulate pluripotency genes, such as Nanog and Oct4, and express Fgf5 at the onset of differentiation. FoxA2 and Lmx1a are expressed in developing neurons, while Th is expressed in mature dopaminergic neurons. (B) UCSC Genome Browser snapshots of mRNA-seq tracks for additional markers of developing (Ngn2, Nurr1, Pitx3) and mature (Aadc, Vmat, Dat) dopaminergic neurons, which are sequentially activated in terminally differentiated neurons. Data information: Arrows represent the position of promoter and directionality of the gene. Y-axis scales are kept constant per gene and adjusted to its maximum expression.

Electrophysiological measurements demonstrate distinct stages of neuronal maturation on days 16 and 30 of differentiation

To directly investigate the state of maturation of neurons upon prolonged culture, we measured their action potential activity and synaptic connectivity by conducting targeted whole-cell electrophysiological recordings (Fig 2A) at four different time points (days 14–16, 20–25, 26–30, and > 30).
We find that days 14–16 neurons are largely silent, whereas at days > 30 they exhibit robust spontaneous action potential activity (Figs 2B and EV3A), similar to the activity of midbrain dopaminergic neurons from ex vivo slice preparations and in vivo (Marinelli & McCutcheon, 2014). During maturation, neurons also exhibit a hyperpolarization-activated inward (Ih) current (Fig 2C), an electrophysiological feature commonly used to identify dopaminergic neurons (Ungless & Grace, 2012). Strikingly, after prolonged culture many cells exhibit burst-like events (Fig EV3B and C), often seen in vivo in midbrain dopaminergic neurons (Grace & Bunney, 1984), which are thought to be driven in part by synaptic inputs (Paladini & Roeper, 2014). We also observed maturation of functional synaptic connectivity. Spontaneous synaptic events were rare at days 14–16, but were pr
Eukaryotic gene expression is an intricate multistep process, regulated within the cell nucleus through the activation or repression of RNA synthesis, processing, cytoplasmic export, and translation into protein. The major regulators of gene expression are chromatin remodeling and transcription machineries that are locally recruited to genes. However, enzymatic activities that act on genes are not ubiquitously distributed throughout the nucleoplasm, but limited to specific and spatially defined foci that promote preferred higher-order chromatin arrangements. The positioning of genes within the nuclear landscape relative to specific functional landmarks plays an important role in gene regulation and disease.
A tab-delimited flat file (SDRF file) describing the experimental details for mouse tissue samples for the low-quantity version of the HeliScopeCAGE protocol