Inferring finer-grained human information with multi-modal cross-granularity learning: PhD forum abstract.

Existing machine learning algorithms for human information inference are typically data-driven models trained on carefully labeled datasets. Given the significant labeling effort required, purely data-driven approaches are hard to apply to emerging smart applications that need long-term, finer-grained information. Taking activities of daily living (ADL) tracking for elders as an example, prior work mostly focused on learning context-level information such as cooking and cleaning [8]. However, new applications, such as evaluating the progression of elders' cognitive impairment by tracking their ADL engagement, require finer-grained, i.e., action-level, information [7]. In practice, labeling day-length data at such granularity is expensive and labor-intensive [9].

My research focuses on inference problems in the scope of human physical condition monitoring and activity recognition with limited labeled data. To alleviate the effort of labeling large amounts of data, prior work on semi-supervised learning combines a small amount of labeled data with a large amount of unlabeled data to train the model. However, as the label granularity (number of classes) increases, the difficulty of distinguishing the nuanced differences between finer-grained classes escalates as well. This makes training a robust semi-supervised model for finer-grained classification with fewer labels difficult, if not impossible. Fortunately, coarse-grained (context-level) labels are usually available or cheaper to obtain in practice. In this case, the multi-granularity hierarchy between finer and coarser labels follows the aggregation relation defined in [5], and this hierarchical relation can be leveraged when inferring finer-grained information. In addition, previous work has shown that co-located multi-modality sensing systems capture complementary aspects of the same event [6].

The research question I focus on is: how can finer-grained human information be inferred from coarse-grained labeled data by leveraging complementary multi-modal sensing? I target three directions, each illustrated by a sketch below:
1) A cross-granularity semi-supervised setting: how to utilize coarse-grained labeled data together with a small amount of finer-grained labeled data to infer finer-grained human information.
2) Cross-granularity relationship learning: how to learn the multi-granularity class hierarchy from data and use it to further aid finer-grained human information acquisition.
3) Enhancing inference granularity with multi-modal sensing: how to leverage complementary, co-located sensing modalities to accurately infer finer-grained human information.
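For direction 1, the sketch below shows one way the aggregation relation could tie the two granularities together during training: fine-class probabilities are summed into coarse-class probabilities through a fine-to-coarse mapping, so cheap coarse labels supervise the fine-grained classifier alongside a few fine labels. The class counts, the hierarchy, and the model are hypothetical placeholders, not the abstract's actual system.

```python
# Minimal sketch (assumed setup, not the author's implementation) of
# cross-granularity semi-supervised training with an aggregation relation.
import torch
import torch.nn.functional as F

NUM_FINE = 8                                              # hypothetical action-level classes
NUM_COARSE = 3                                            # e.g., cooking / cleaning / resting
FINE_TO_COARSE = torch.tensor([0, 0, 0, 1, 1, 2, 2, 2])   # hypothetical hierarchy

# Aggregation matrix A: a coarse probability is the sum of its child
# fine-class probabilities (the aggregation relation between granularities).
A = F.one_hot(FINE_TO_COARSE, NUM_COARSE).float()          # (NUM_FINE, NUM_COARSE)

model = torch.nn.Sequential(                               # stand-in feature extractor + classifier
    torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, NUM_FINE))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def training_step(x_fine, y_fine, x_coarse, y_coarse, alpha=0.5):
    """One step mixing a small fine-labeled batch with a large coarse-labeled one."""
    loss_fine = F.cross_entropy(model(x_fine), y_fine)

    # Coarse supervision: aggregate predicted fine-class probabilities into
    # coarse-class probabilities, then apply NLL against the coarse labels.
    p_fine = F.softmax(model(x_coarse), dim=-1)
    p_coarse = p_fine @ A                                  # (batch, NUM_COARSE)
    loss_coarse = F.nll_loss(torch.log(p_coarse + 1e-8), y_coarse)

    loss = loss_fine + alpha * loss_coarse
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Toy usage with random tensors standing in for sensor features.
xf, yf = torch.randn(4, 16), torch.randint(0, NUM_FINE, (4,))
xc, yc = torch.randn(32, 16), torch.randint(0, NUM_COARSE, (32,))
print(training_step(xf, yf, xc, yc))
```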
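For direction 2, one plausible formulation (my assumption, not a method stated in the abstract) is to make the fine-to-coarse mapping itself trainable: a soft assignment matrix is fit from coarse labels rather than fixed a priori, and can be hardened into a discrete hierarchy afterwards.

```python
# Minimal sketch of learning the multi-granularity hierarchy from data via
# a trainable soft assignment matrix (hypothetical formulation).
import torch
import torch.nn.functional as F

NUM_FINE, NUM_COARSE = 8, 3
assign_logits = torch.nn.Parameter(torch.zeros(NUM_FINE, NUM_COARSE))

def coarse_from_fine(p_fine):
    # Row-wise softmax gives each fine class a learned soft coarse parent;
    # aggregated coarse distributions remain valid probability vectors.
    A = F.softmax(assign_logits, dim=-1)       # (NUM_FINE, NUM_COARSE)
    return p_fine @ A

# Training would backpropagate a coarse-label loss through both the fine
# classifier and assign_logits; argmax over rows of A afterwards yields a
# discrete class hierarchy.
p_fine = F.softmax(torch.randn(4, NUM_FINE), dim=-1)
print(coarse_from_fine(p_fine).sum(dim=-1))    # each row sums to 1
```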
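For direction 3, the sketch below shows a simple feature-level fusion of two co-located modalities, in line with the observation from [6] that they capture complementary aspects of the same event. The modality dimensions and encoder shapes are illustrative assumptions.

```python
# Minimal sketch (assumed architecture) of feature-level fusion of two
# co-located sensing modalities for fine-grained classification.
import torch

class TwoModalFusion(torch.nn.Module):
    def __init__(self, dim_a=16, dim_b=24, hidden=32, num_fine=8):
        super().__init__()
        self.enc_a = torch.nn.Linear(dim_a, hidden)        # modality-A encoder
        self.enc_b = torch.nn.Linear(dim_b, hidden)        # modality-B encoder
        self.head = torch.nn.Linear(2 * hidden, num_fine)  # fused fine-grained head

    def forward(self, xa, xb):
        za = torch.relu(self.enc_a(xa))
        zb = torch.relu(self.enc_b(xb))
        # Concatenating the per-modality embeddings lets the classifier use
        # the complementary views of the same event jointly.
        return self.head(torch.cat([za, zb], dim=-1))

logits = TwoModalFusion()(torch.randn(4, 16), torch.randn(4, 24))
print(logits.shape)  # torch.Size([4, 8])
```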