SoupX removes ambient RNA contamination from droplet based single cell RNA sequencing data

2018 
Droplet-based single-cell RNA sequencing analyses assume that all acquired RNAs are endogenous to cells. However, any cell-free RNAs contained within the input solution are also captured by these assays. This sequencing of cell-free RNA constitutes a background contamination that can confound the biological interpretation of single-cell transcriptomic data. Here, we demonstrate that contamination from this "soup" of cell-free RNAs is ubiquitous, is experiment-specific in its composition and magnitude, and can lead to erroneous biological conclusions. We present a method, SoupX, for quantifying the extent of the contamination and estimating "background-corrected" cell expression profiles that can be integrated with existing downstream analysis tools. We apply this method to two datasets and show that it reduces batch effects, strengthens cell-specific quality control, and improves biological interpretation.