Multimodal Multitask Deep Learning for X-Ray Image Retrieval

2021 
Content-based image retrieval (CBIR) is of increasing interest for clinical applications spanning differential diagnosis, prognostication, and indexing of electronic radiology databases. However, meaningful CBIR for radiology applications requires the ability to bridge the semantic gap and to assess similarity based on fine-grained image features. We observe that images in radiology databases are often accompanied by free-text radiologist reports containing rich semantic information. Therefore, we propose a Multimodal Multitask Deep Learning (MMDL) approach for CBIR on radiology images. Our proposed approach employs multimodal database inputs for training, learns semantic feature representations for each modality, and maps these representations into a common subspace. During testing, we use representations from the common subspace to rank similarities between the query and the database. To enhance our framework for fine-grained image retrieval, we provide extensions employing deep descriptors and ranking loss optimization. We performed extensive evaluations on the MIMIC Chest X-ray (MIMIC-CXR) dataset with images and reports from 227,835 studies. Our results demonstrate performance gains over a typical unimodal CBIR strategy. Further, we show that the performance gains of our approach are robust even in scenarios where only a subset of database images is paired with free-text radiologist reports. Our work has implications for next-generation medical image indexing and retrieval systems.
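The test-time step described in the abstract, ranking database entries against a query using representations from the common subspace, can be illustrated with a minimal sketch. The abstract does not state which similarity measure is used, so cosine similarity is an assumption here. The function names, the study identifiers, and the three-dimensional embeddings are hypothetical placeholders, not outputs of the MMDL model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_database(query_embedding, database_embeddings, top_k=3):
    """Return the top_k (study_id, score) pairs, most similar first."""
    scored = [
        (study_id, cosine_similarity(query_embedding, embedding))
        for study_id, embedding in database_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical common-subspace embeddings; a trained model would produce these.
database = {
    "study_a": [0.9, 0.1, 0.0],
    "study_b": [0.0, 1.0, 0.0],
    "study_c": [0.7, 0.7, 0.1],
}
query = [1.0, 0.0, 0.0]

ranking = rank_database(query, database, top_k=2)
for study_id, score in ranking:
    print(f"{study_id}: {score:.3f}")
```

In this toy setup the query embedding stands in for an encoded query image, and the database embeddings stand in for entries already mapped into the common subspace. Any other similarity or distance measure could be substituted in `cosine_similarity` without changing the ranking logic.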