Leveraging the Twitch Platform and Gamification to Generate Home Audio Datasets

2021 
Training AI systems requires large datasets. While there is a range of existing methods for collecting such data, including paid work on crowdsourcing platforms, the strengths and weaknesses of each method lead us to believe that new, complementary methods are needed. The Polyphonic project contributes a novel method for collecting real-world data by piggybacking on game streaming communities such as Twitch, which capture over a trillion minutes of viewer attention a year. By embedding activities within the sociotechnical context of the stream, we can leverage some of this attention for data collection and processing. In this paper, we describe the design and implementation of a proof-of-concept system for collecting home audio data. We conducted a field study in four live streams and found that our proof-of-concept effectively supports data capture. We also contribute further design insights about stream-based data collection systems.