CytoSet: Predicting clinical outcomes via set-modeling of cytometry data

2021 
Abstract

Single-cell flow and mass cytometry technologies are increasingly applied in clinical settings, as they enable the simultaneous measurement of multiple proteins across millions of cells within a multi-patient cohort. In this work, we introduce CytoSet, a deep learning model that directly predicts a patient's clinical outcome from a collection of cells obtained through a blood or tissue sample. Unlike previous work, CytoSet explicitly models the cells profiled in each patient sample as a set, allowing for the use of recently developed permutation-invariant architectures. We show that CytoSet achieves state-of-the-art classification performance across a variety of flow and mass cytometry benchmark datasets. Specifically, CytoSet outperforms two baseline models by 20.6% on a large multi-sample clinical flow cytometry dataset. The strong classification performance is complemented by demonstrated robustness to the number of sub-sampled cells per patient, enabling CytoSet to scale to hundreds of patient samples. We also conducted an ablation study with networks of varying depths to demonstrate that much of the representation power of CytoSet comes from the permutation-equivalent architectures. The superior performance achieved by the set-based architectures used in CytoSet suggests that clinical cytometry data can be appropriately interpreted and studied as sets. The code is publicly available at https://github.com/CompCy-lab/cytoset.
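To make the idea of treating a patient sample as a set concrete, the following is a minimal sketch of a permutation-invariant classifier: a shared per-cell transform, mean pooling over cells, and a logistic output. It is not the CytoSet architecture itself. The function names (`embed_cell`, `predict`), the layer sizes, and the random weights and random cell measurements are all assumptions made for illustration only.

```python
import math
import random


def embed_cell(cell, weights, bias):
    """Apply one shared linear layer + ReLU to a single cell's marker vector."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, cell)) + b)
        for row, b in zip(weights, bias)
    ]


def predict(cells, weights, bias, out_weights, out_bias):
    """Score a patient sample given as an unordered collection of cells."""
    embedded = [embed_cell(c, weights, bias) for c in cells]
    # Mean-pool each hidden feature over all cells. math.fsum returns a
    # correctly rounded sum, so the pooled vector does not depend on cell order.
    pooled = [math.fsum(col) / len(embedded) for col in zip(*embedded)]
    logit = sum(w * p for w, p in zip(out_weights, pooled)) + out_bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical sizes and randomly drawn (untrained) parameters.
rng = random.Random(0)
n_markers, n_hidden, n_cells = 4, 3, 100
weights = [[rng.uniform(-1, 1) for _ in range(n_markers)] for _ in range(n_hidden)]
bias = [rng.uniform(-1, 1) for _ in range(n_hidden)]
out_weights = [rng.uniform(-1, 1) for _ in range(n_hidden)]
out_bias = 0.0

# Synthetic "patient sample": one row per cell, one column per protein marker.
cells = [[rng.gauss(0, 1) for _ in range(n_markers)] for _ in range(n_cells)]

# Reordering the cells should leave the prediction unchanged.
shuffled = cells[:]
rng.shuffle(shuffled)
p_original = predict(cells, weights, bias, out_weights, out_bias)
p_shuffled = predict(shuffled, weights, bias, out_weights, out_bias)
print(abs(p_original - p_shuffled) < 1e-12)
```

Because the pooling step averages over cells, the same function accepts samples with any number of cells, which is one reason a set-based model can be applied to sub-sampled cell collections of different sizes.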