De novo identification and visualization of important cell populations for classic Hodgkin lymphoma using flow cytometry and machine learning

2020 
Objectives: Automated classification of flow cytometry data has the potential to reduce errors and accelerate flow cytometry interpretation. We desired a machine learning approach that is accurate, intuitively easy to understand, and highlights the cells that are most important in the algorithm9s prediction for a given case. Methods: We developed an ensemble of convolutional neural networks (CNNs) for classification and visualization of impactful cell populations in detecting classic Hodgkin lymphoma, using two-dimensional (2D) histograms. Data from 977 and 245 clinical flow cytometry cases were used for training and testing, respectively. 78 non-gated 2D histograms were created per flow cytometry file. SHAP values were calculated to determine the most impactful 2D histograms and regions within the histograms. The SHAP values from all 78 histograms were then projected back to the original cells data for gating and visualization using standard flow cytometry software. Results: The algorithm achieved 67.7% recall (sensitivity), 82.4 % precision, and 0.92 AUROC. Visualization of the important cell populations in making individual predictions demonstrated correlations with known biology. Conclusions: The method presented enables model explainability while highlighting important cell populations in individual flow cytometry specimens, with potential applications in both diagnosis and discovery of previously overlooked key cell populations.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    26
    References
    0
    Citations
    NaN
    KQI
    []