Computational Recovery of Information From Low-quality and Missing Labels

2021 
This chapter addresses another type of missing data: classification when data labels are missing. Missing labels are a common problem in large-scale data analysis and challenge traditional supervised learning methods. The problem is especially hard to avoid in life science research, where labeling large-scale biological data requires a strong professional research background and expert experience. This chapter presents a robust information-theoretic (RIT) model that reduces label uncertainty, i.e., missing and noisy labels, in general discriminative data representation tasks. The model simultaneously learns a transformation function and a discriminative classifier that maximize the mutual information between data and their labels in the latent space. Within this general paradigm, we discuss three RIT implementations, based respectively on linear subspace embedding, deep transformation, and structured sparse learning. In practice, RIT and deep RIT are applied to image categorization, and their performance is verified on various benchmark datasets. The structured sparse RIT is further applied to a medical image analysis task, brain MRI segmentation, where it enables group-level feature selection on brain tissues.
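To make the mutual-information objective concrete, the following minimal sketch estimates the mutual information between inputs and predicted labels from a classifier's posterior probabilities, using the standard decomposition I(X; Y) = H(Y) - H(Y | X). This is an illustrative assumption, not the chapter's actual RIT formulation: the function names `entropy` and `mutual_information`, the `eps` smoothing constant, and the use of an empirical average over samples are all hypothetical choices made for the example.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy in nats along `axis`; eps guards against log(0)."""
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log(p), axis=axis)

def mutual_information(posteriors):
    """Empirical estimate of I(X; Y) from an (n_samples, n_classes) array.

    Each row holds p(y | x_i), e.g. softmax outputs of a classifier applied
    to transformed (latent) features. The estimate is
        H(mean_i p(y | x_i)) - mean_i H(p(y | x_i)),
    i.e. marginal label entropy minus average conditional entropy.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    marginal = posteriors.mean(axis=0)               # estimate of p(y)
    conditional = entropy(posteriors, axis=1).mean() # estimate of H(Y | X)
    return entropy(marginal) - conditional

# Confident, balanced predictions carry more information than uniform ones.
confident = np.array([[1.0, 0.0], [0.0, 1.0]])
uniform = np.array([[0.5, 0.5], [0.5, 0.5]])
print(mutual_information(confident), mutual_information(uniform))
```

Because this estimate needs only predicted posteriors, it can be evaluated on unlabeled samples as well, which is one reason objectives of this kind are attractive when labels are missing. A training procedure would maximize it with respect to the parameters of both the transformation and the classifier.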