Online Rare Category Identification and Data Diversification for Edge Computing

2021 
Identifying rare categories is an important data management problem in many application fields, including video surveillance, ecological environment monitoring, and precision medicine. Previous solutions in the literature require all data instances to be delivered to the server first. The rare category identification algorithms are then executed on the pooled data to find informative instances for human annotators to label. This incurs large bandwidth consumption and high latency. To address these problems, we propose a lightweight rare category identification framework. On the sensor side, an online algorithm filters less informative data instances out of the data stream and sends only the informative ones to the server for annotation. After labeling, the server sends back only the labels of the corresponding data instances. The sensor-side algorithm is extended to enable cooperation between embedded devices for cases where data is collected in a distributed manner. To enhance the diversity of the selected data, a representative selection algorithm is proposed that runs during the system's idle time or after the rare category identification algorithm has executed. Experiments show that our framework dramatically outperforms the baseline; network traffic is reduced by 75% on average.
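The abstract does not specify how the sensor-side filter scores an instance, so the following is only a minimal sketch of the general idea of online filtering, not the paper's algorithm. It assumes a simple stand-in rule: an instance counts as informative when it lies farther than a hypothetical `threshold` from every instance already sent to the server. The names `make_filter`, `filter_stream`, and `threshold`, and the toy stream, are illustrative assumptions.

```python
import math


def make_filter(threshold):
    """Return a stateful online filter (assumed distance-based novelty rule)."""
    prototypes = []  # instances already sent to the server

    def is_informative(x):
        # Distance to the closest instance sent so far (infinite if none yet).
        nearest = min((math.dist(x, p) for p in prototypes), default=math.inf)
        if nearest > threshold:
            prototypes.append(x)  # remember it so near-duplicates are dropped
            return True
        return False

    return is_informative


def filter_stream(stream, threshold):
    """Process instances one at a time and keep only the ones to transmit."""
    keep = make_filter(threshold)
    return [x for x in stream if keep(x)]


# Toy 2-D stream: a dense cluster near the origin plus a rare group near (5, 5).
stream = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (0.05, 0.05), (5.1, 5.0)]
sent = filter_stream(stream, threshold=1.0)

# Fraction of instances that never leave the sensor.
saved = 1 - len(sent) / len(stream)
```

In this toy run only one instance per region is transmitted, which illustrates why filtering at the sensor can cut traffic; the saving depends entirely on the assumed threshold and data, and says nothing about the figure reported in the abstract.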