Crowdsourced Annotations as an Additional Form of Data Augmentation for CAD Development

2017 
Annotations are critical for machine learning and for developing Computer Aided Detection (CAD) algorithms. However, most medical data are either unlabeled or annotated only at the image level. This is a particular problem for deep learning based approaches to CAD development, which require large amounts of annotated data for training. Data augmentation is a popular solution to address this need. We explore crowdsourcing as a solution for training a deep neural network (DNN) for lesion detection. Our solution employs a strategy to overcome the noisy nature of crowdsourced annotations by (i) assigning each member of the crowd a reliability factor based on their performance (at global and local levels) and experience, and (ii) requiring region-of-interest (ROI) markings rather than pixel-level markings from the crowd. We present a solution for training the DNN with data drawn from a heterogeneous mixture of annotations, namely, a very limited number of pixel-level markings by experts and crowdsourced ROI markings. Experimental results for hard exudate detection in color fundus images show that training with processed/refined crowdsourced data is effective: detection performance improves by 25% over training with only expert markings and by 11% over training with annotations derived by majority voting among the crowd.
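
The following is a minimal sketch of the general idea of reliability-weighted fusion of crowdsourced ROI markings, as opposed to plain majority voting. The reliability weights, the linear combination rule, and all function names here are hypothetical illustrations, not the authors' exact method.

```python
import numpy as np

def reliability(global_score, local_score, experience, w=(0.4, 0.4, 0.2)):
    """Combine global performance, local performance, and experience into
    one reliability factor. Inputs are assumed normalized to [0, 1];
    the weights `w` are illustrative, not taken from the paper."""
    return w[0] * global_score + w[1] * local_score + w[2] * experience

def aggregate_rois(roi_masks, reliabilities, threshold=0.5):
    """Fuse binary ROI masks from several annotators into one consensus mask.

    roi_masks:      list of HxW boolean arrays, one per annotator.
    reliabilities:  one reliability factor per annotator.
    With uniform reliabilities this reduces to majority voting, the
    baseline the abstract compares against.
    """
    r = np.asarray(reliabilities, dtype=float)
    stacked = np.stack([m.astype(float) for m in roi_masks])  # (K, H, W)
    weighted = np.tensordot(r / r.sum(), stacked, axes=1)     # (H, W)
    return weighted >= threshold

# Example: three annotators marking a 4x4 image patch.
masks = [np.zeros((4, 4), dtype=bool) for _ in range(3)]
masks[0][1:3, 1:3] = True   # two reliable annotators agree
masks[1][1:3, 1:3] = True
masks[2][0:2, 0:2] = True   # one unreliable annotator disagrees
rel = [reliability(0.9, 0.8, 0.7),
       reliability(0.8, 0.9, 0.5),
       reliability(0.3, 0.4, 0.1)]
print(aggregate_rois(masks, rel).astype(int))
```

In this toy example the low-reliability annotator's dissenting marking is down-weighted, so the consensus mask follows the two reliable annotators; under unweighted majority voting every annotator would have contributed equally.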