AutoTag: visual domain adaptation for autonomous retail stores through multi-modal sensing

2019 
Autonomous checkout at retail stores could bring a large array of benefits to both consumers (no lines, a better user experience) and retailers (lower operational cost, detailed insight into customer behavior). Existing approaches include self-checkout stations, on-item sensing (e.g., RFID), and infrastructure-based sensing (e.g., vision and weight). While each has its own pros and cons, infrastructure-based sensing offers a good tradeoff between information richness and operational cost. However, several challenges currently limit its accuracy. In particular, visual item recognition is constrained by the huge amount of training data required and by the domain adaptation gap that usually exists between the training distribution (e.g., a well-lit environment) and the testing distribution (each store may have different lighting conditions, camera angles, etc.). In this preliminary work we explore different ways to leverage multi-modal sensing (e.g., weight load cells on shelves) to automatically label frames of customers picking up or putting down items on the shelf. These annotated frames could then be used to continuously expand the initial visual model and tailor it to each store's conditions.
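As a rough sketch of the idea (not the authors' implementation), the load-cell signal could be turned into weak labels for camera frames: a step change in shelf weight is matched against a catalog of known unit weights, and the nearest frame is tagged with the inferred item and action. All names below (Product, match_weight_delta, autolabel, the tolerance value) are hypothetical and only illustrate the general approach described in the abstract.

```python
# Illustrative sketch (assumptions, not the paper's code): auto-labeling camera
# frames from shelf load-cell events, assuming each product has a known unit weight.
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Product:
    sku: str
    unit_weight_g: float

@dataclass
class LabeledFrame:
    frame_id: int   # camera frame nearest to the weight event
    sku: str        # product label inferred from the weight change
    action: str     # "pick" (weight drop) or "putdown" (weight gain)

def match_weight_delta(delta_g: float, catalog: List[Product],
                       tolerance_g: float = 5.0) -> Optional[Product]:
    """Return the product whose unit weight best explains the observed
    shelf weight change, or None if nothing is within tolerance."""
    best, best_err = None, tolerance_g
    for p in catalog:
        err = abs(abs(delta_g) - p.unit_weight_g)
        if err < best_err:
            best, best_err = p, err
    return best

def autolabel(weight_events: List[Tuple[int, float]],
              catalog: List[Product]) -> List[LabeledFrame]:
    """weight_events: (frame_id, delta_g) pairs, one per detected step change
    in the load-cell signal. Ambiguous deltas are skipped so that only
    confidently labeled frames feed the store-specific fine-tuning set."""
    labeled = []
    for frame_id, delta_g in weight_events:
        product = match_weight_delta(delta_g, catalog)
        if product is None:
            continue  # unexplained event: do not generate a training label
        action = "pick" if delta_g < 0 else "putdown"
        labeled.append(LabeledFrame(frame_id, product.sku, action))
    return labeled

if __name__ == "__main__":
    catalog = [Product("cola-330ml", 350.0), Product("chips-150g", 160.0)]
    events = [(1042, -348.5), (2310, 161.2), (2990, -90.0)]  # last one unmatched
    for lf in autolabel(events, catalog):
        print(lf)
```

The frames labeled this way would form a continuously growing, store-specific dataset on which the initial visual model could be fine-tuned, narrowing the domain gap described above.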