Learning from Heterogeneous Data via Contrastive Learning: An Application in Multi-Source COVID-19 Radiography

2021 
The ongoing COVID-19 pandemic has overloaded healthcare systems, including radiology departments. Machine learning-based medical imaging diagnosis plays an important role in tracking the spread of the virus, identifying high-risk patients, and controlling infections in real time. To compensate for the shortage of COVID-19 samples at individual hospitals, especially in the early stage of the disease, researchers aggregate radiographic samples from different data sources into a multi-source learning scheme. However, data heterogeneity across clinical centers with different imaging conditions is considered a significant limitation on model performance. This paper proposes a contrastive learning scheme for the automatic diagnosis of COVID-19 that mitigates data heterogeneity in multi-source data and learns a robust, generalizable model. Inspired by advances in domain adaptation, we employ contrastive training objectives to promote intra-class cohesion across different data sources and inter-class separation of infected and non-infected cases. Extensive experiments on two public COVID-19 CT datasets demonstrate that the proposed method tackles data heterogeneity and improves diagnostic performance. Moreover, because it builds on a contrastive learning framework, our method can be generalized to data heterogeneity problems in broader multi-source learning settings.
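
To illustrate the idea of intra-class cohesion across sources and inter-class separation, the following is a minimal sketch of a supervised contrastive loss. It is an assumption for illustration only, not the exact objective used in the paper: the function names, the temperature value, and the toy embeddings are all hypothetical. Positives are defined by diagnosis label alone, so samples of the same class are pulled together regardless of which data source they came from.

```python
import math


def l2_normalize(v):
    # Scale a vector to unit length; an all-zero vector is returned unchanged.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Illustrative supervised contrastive loss (assumed form, not the paper's).

    embeddings: list of feature vectors pooled from all data sources.
    labels: class label per sample (e.g. 1 = infected, 0 = non-infected).
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives: every other sample with the same label, from any source.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        # Average negative log-probability of the positives for this anchor.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy 2-D embeddings (hypothetical values): two tight groups.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supervised_contrastive_loss(features, [1, 1, 0, 0])
misaligned = supervised_contrastive_loss(features, [1, 0, 1, 0])
print(aligned < misaligned)
```

When the labels agree with the geometric grouping of the embeddings, the loss is lower than when same-class samples sit far apart, which is the behavior a contrastive objective of this kind rewards during training.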