A-to-I RNA editing occurs at over a hundred million genomic sites, located in a majority of human genes

2014 
Consistency of genomic information flow is a basic concept in biology. In general, it is believed that the processed content of a gene (RNA) has the exact same sequence as its original DNA template. However, adenosine deaminases acting on RNA (ADARs), an essential family of RNA-modifying enzymes, can edit nucleotides in the RNA (Savva et al. 2012). Specifically, these enzymes convert a genomically encoded adenosine (A) into an inosine (I) in double-stranded RNA (dsRNA) structures. The inosine is read by the cellular machinery as a guanosine (G) (Bass 2002; Nishikura 2010). Thus, sequencing of inosine-containing RNAs results in G where the corresponding genomic DNA reads A.
The progress in sequencing techniques in recent years has brought about many reports of A-to-I editing in the human genome (Li et al. 2009; Bahn et al. 2012; Park et al. 2012; Peng et al. 2012; Ramaswami et al. 2012, 2013). These studies have identified a growing number of A-to-G mismatches in mRNA-sequencing data aligned to the genome, and used various algorithmic techniques to identify those mismatches originating from A-to-I editing. Analyses of various data sets have resulted in identification of thousands, and up to hundreds of thousands, of editing sites. However, the overlap between the many reported sets is quite low (see Supplemental Tables 1, 2; Ramaswami et al. 2012), suggesting that the reported sites do not reflect the full scope of the A-to-I editing phenomenon.
The primate-specific Alu sequences are the dominant short interspersed nuclear elements (SINEs) in primate genomes (International Human Genome Sequencing Consortium 2001; Cordaux and Batzer 2009). Humans have about a million copies of Alu, roughly 300 bp long each, accounting for ∼10% of the genome.
Since these repeats are so common, especially in gene-rich regions (Korenberg and Rykowski 1988), pairing of two oppositely oriented Alus located in the same pre-mRNA structure is likely. Such pairing produces a long and stable dsRNA structure, an ideal target for the ADARs. Indeed, recent studies have shown that Alu repeats account for >99% of editing events found so far in humans (Athanasiadis et al. 2004; Blow et al. 2004; Kim et al. 2004; Levanon et al. 2004; Ramaswami et al. 2012, 2013). Edited Alu sequences typically include a number of clustered edited sites (Athanasiadis et al. 2004; Blow et al. 2004; Kim et al. 2004; Levanon et al. 2004). This feature may be utilized to distinguish bona fide editing events from sequencing errors or misalignments due to duplication or genomic variability. Here, we refined the detection approach, focusing only on Alu editing, which allowed us to exploit the clustering property of editing sites. Analysis of large-scale RNA-seq data supplemented by targeted sequencing of Alu elements revealed that the majority of Alu elements form editable dsRNA structures, and nearly all adenosines expressed in such Alu repeats undergo A-to-I editing.
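The clustering idea described above can be illustrated with a minimal sketch. It is not the detection pipeline used in this work: the function names, the per-element input layout, and the `min_sites` threshold are all hypothetical, and the sketch assumes sense-strand, gap-free alignments (an antisense Alu would show T-to-C mismatches instead).

```python
def a_to_g_mismatches(genomic, rna):
    """Return 0-based positions where the genome reads A and the aligned RNA reads G.

    Assumes both strings are the same length and already aligned without gaps.
    """
    return [
        i
        for i, (g, r) in enumerate(zip(genomic.upper(), rna.upper()))
        if g == "A" and r == "G"
    ]


def clustered_elements(mismatches_by_element, min_sites=3):
    """Keep elements (e.g. Alu repeats) with at least `min_sites` distinct A-to-G positions.

    `min_sites` is an illustrative threshold, not a value taken from the text.
    An isolated mismatch is more likely a sequencing error or misalignment;
    several mismatches in one element are more consistent with editing.
    """
    kept = {}
    for name, positions in mismatches_by_element.items():
        distinct = sorted(set(positions))
        if len(distinct) >= min_sites:
            kept[name] = distinct
    return kept


# Toy example with made-up sequences and hypothetical element names.
example = {
    "Alu_1": a_to_g_mismatches("ACAGATTAACA", "GCGGATTAGCA"),
    "Alu_2": a_to_g_mismatches("ACAGATTAACA", "ACAGATTAGCA"),
}
candidates = clustered_elements(example, min_sites=3)
```

In this toy input, only the element carrying several A-to-G mismatches passes the filter; the element with a single mismatch is discarded as a likely error.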