Large-Scale Datasets for Going Deeper in Image Understanding

2019 
Recently, extensive efforts have been devoted to computer vision and machine learning, exploiting big data to explore many practical applications. However, these research fields are still limited not only by the sheer volume but also by the versatility and diversity of the available datasets. In this paper, we target four challenging yet important computer vision tasks, namely human-centered scene classification, attribute-based zero-shot learning (recognition), human keypoint detection, and Chinese image captioning. Four novel large-scale datasets are collected and annotated to facilitate these tasks of deeper image understanding. Labels, bounding boxes, attributes, keypoints, and captions are annotated in the corresponding datasets. These rich annotations bridge the semantic gap between low-level images and high-level concepts. Extensive experiments with baseline methods have been conducted and compared, showing that these learning tasks on our datasets remain challenging.