singleCellHaystack: Finding surprising genes in 2-dimensional representations of single cell transcriptome data

2019 
Summary: A typical analysis of single-cell sequencing data includes dimensionality reduction using principal component analysis (PCA) and visualization in 2-dimensional plots using t-SNE or UMAP. However, identifying differentially expressed genes or extracting biological knowledge remains challenging, even after reducing dimensionality. Here we introduce singleCellHaystack, a methodology that uses Kullback-Leibler divergence to find genes that are expressed in subsets of cells that are non-randomly positioned in a multi-dimensional space. Critically, singleCellHaystack does not rely on arbitrary clustering of cells. We illustrate the usage of singleCellHaystack through applications to several single-cell datasets. Genes with highly biased expression profiles often include known cell type marker genes. singleCellHaystack is implemented as an R package and includes additional functions for clustering and visualization of genes with interesting expression patterns.

Availability and implementation: https://github.com/alexisvdb/singleCellHaystack
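
The core idea, scoring a gene by how far the spatial distribution of the cells expressing it diverges from the distribution of all cells, can be illustrated with a small sketch. This is not the singleCellHaystack implementation or its API. It is a simplified Python illustration that assumes a 2-dimensional embedding, a binary "detected / not detected" call per cell, and a regular histogram grid. The function name `kl_score` and its parameters `n_bins` and `pseudo` are hypothetical.

```python
import numpy as np


def kl_score(coords, detected, n_bins=5, pseudo=1e-9):
    """Kullback-Leibler divergence between the positions of cells in which a
    gene is detected (P) and the positions of all cells (Q), on a 2D grid.

    coords   : (n_cells, 2) array of embedding coordinates (e.g. t-SNE/UMAP)
    detected : (n_cells,) boolean array, True where the gene is detected
    """
    coords = np.asarray(coords, dtype=float)
    detected = np.asarray(detected, dtype=bool)

    # Use the same bin edges for both histograms so the bins are comparable.
    x_edges = np.linspace(coords[:, 0].min(), coords[:, 0].max(), n_bins + 1)
    y_edges = np.linspace(coords[:, 1].min(), coords[:, 1].max(), n_bins + 1)
    bins = [x_edges, y_edges]

    ref, _, _ = np.histogram2d(coords[:, 0], coords[:, 1], bins=bins)
    obs, _, _ = np.histogram2d(coords[detected, 0], coords[detected, 1], bins=bins)

    # A small pseudocount avoids log(0) and division by zero in empty bins.
    q = (ref + pseudo) / (ref + pseudo).sum()
    p = (obs + pseudo) / (obs + pseudo).sum()

    return float(np.sum(p * np.log(p / q)))


# Usage on synthetic data: one gene confined to a corner of the plot,
# one gene detected in a random subset of cells.
rng = np.random.default_rng(0)
cells = rng.random((1000, 2))
localized_gene = (cells[:, 0] < 0.2) & (cells[:, 1] < 0.2)
scattered_gene = rng.random(1000) < 0.04

print("localized:", kl_score(cells, localized_gene))
print("scattered:", kl_score(cells, scattered_gene))
```

In this toy setting the gene restricted to one corner should score higher than the randomly scattered gene. No cell clusters are defined at any point, which is the property the summary emphasizes.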