GrokNet: Unified Computer Vision Model Trunk and Embeddings For Commerce

Sean Bell, Yiqun Liu, Sami Alsheikh, Yina Tang, Ed Pizzi, Michael Henning, Karun Singh, Omkar Parkhi, Fedor Vladimirovich Borisyuk

2020

In this paper, we present GrokNet, a deployed image recognition system for commerce applications. GrokNet leverages a multi-task learning approach to train a single computer vision trunk. We achieve a 2.1x improvement in exact product match accuracy when compared to the previous state-of-the-art Facebook product recognition system. We achieve this by training on 7 datasets across several commerce verticals, using 80 categorical loss functions and 3 embedding losses. We share our experience of combining diverse sources with wide-ranging label semantics and image statistics, including learning from human annotations, user-generated tags, and noisy search engine interaction data. GrokNet has demonstrated gains in production applications and operates at Facebook scale.
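The idea of one trunk shared by many categorical losses can be illustrated with a minimal sketch. It is not the authors' implementation: the linear trunk, the head names (`category`, `color`), the dimensions, and the loss weights are all hypothetical, and the three embedding losses mentioned in the abstract are omitted for brevity.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())


class SharedTrunkModel:
    """Toy stand-in: one shared trunk feeding several categorical heads."""

    def __init__(self, in_dim, embed_dim, head_sizes, seed=0):
        rng = np.random.default_rng(seed)
        # Hypothetical trunk: a single linear layer followed by ReLU.
        self.trunk = rng.normal(scale=0.1, size=(in_dim, embed_dim))
        # One classifier head per task, all reading the same embedding.
        self.heads = {
            name: rng.normal(scale=0.1, size=(embed_dim, n_classes))
            for name, n_classes in head_sizes.items()
        }

    def embed(self, x):
        """Shared embedding used by every head."""
        return np.maximum(x @ self.trunk, 0.0)

    def multi_task_loss(self, x, labels_by_task, weights):
        """Weighted sum of per-task categorical losses on one shared embedding."""
        emb = self.embed(x)
        total = 0.0
        for task, labels in labels_by_task.items():
            logits = emb @ self.heads[task]
            total += weights[task] * softmax_cross_entropy(logits, labels)
        return total


# Usage with made-up tasks and random inputs.
model = SharedTrunkModel(in_dim=16, embed_dim=8,
                         head_sizes={"category": 5, "color": 3})
x = np.random.default_rng(1).normal(size=(4, 16))
labels = {"category": np.array([0, 1, 2, 3]), "color": np.array([0, 1, 2, 0])}
loss = model.multi_task_loss(x, labels, weights={"category": 1.0, "color": 0.5})
```

Because every head reads the same embedding, gradients from all tasks would update the one trunk during training, which is what lets a single model serve several commerce verticals.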

Keywords:

Search engine
Semantics
Embedding
Artificial intelligence
Categorical variable
Recognition system
Computer vision
Trunk
Computer science
