CaMelia: imputation in single-cell methylomes based on local similarities between cells.

2021 
MOTIVATION Single-cell DNA methylation sequencing detects methylation levels with single-cell resolution, while this technology is upgrading our understanding of the regulation of gene expression through epigenetic modifications. Meanwhile, almost all current technologies suffer from the inherent problem of detecting low coverage of the number of CpGs. Therefore, addressing the inherent sparsity of raw data is essential for quantitative analysis of the whole genome. RESULTS Here, we reported CaMelia, a CatBoost gradient boosting method for predicting the missing methylation states based on the locally paired similarity of intercellular methylation patterns. On real single-cell methylation data sets, CaMelia yielded significant imputation performance gains over previous methods. Furthermore, applying the imputed data to the downstream analysis of cell-type identification, we found that CaMelia helped to discover more intercellular differentially methylated loci that were masked by the sparsity in raw data, and the clustering results demonstrated that CaMelia could preserve cell-cell relationships and improve the identification of cell types and cell subpopulations. AVAILABILITY Python code is available at https://github.com/JxTang-bioinformatics/CaMelia. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    18
    References
    4
    Citations
    NaN
    KQI
    []