Quality-controlled R-loop meta-analysis reveals the characteristics of R-loop consensus regions

2021 
R-loops are three-stranded nucleic acid structures formed from the hybridization of RNA and DNA during transcription. While the pathological consequences of R-loops have been well studied, the locations, classes, and dynamics of physiological R-loops remain poorly understood. R-loop mapping studies provide insight into R-loop dynamics, but their findings are challenging to generalize because of the narrow biological scope of individual studies, the limitations of each mapping modality, and, in some cases, poor data quality. In this study, we reprocessed 693 R-loop mapping datasets from a wide array of biological conditions and mapping modalities. From this data resource, we developed an accurate method for R-loop data quality control and revealed the extent of poor-quality data within previously published studies. We then identified a set of high-confidence R-loop mapping samples and used them to define consensus R-loop sites called "R-loop regions" (RL regions). In the process, we revealed the stark divergence between S9.6-based and dRNH-based R-loop mapping methods and identified biologically meaningful subtypes of both constitutive and variable R-loops. Taken together, this work provides a much-needed method to assess R-loop data quality and reveals intriguing aspects of R-loop biology.