Showcasing Deeply Supervised Multimodal Attentional Translation Embeddings: a Demo for Visual Relationship Detection

2019 
We address the task of Visual Relationship Detection, i.e. the detection of <subject, predicate, object> triplets in an image, by introducing Multimodal Attentional Translation Embeddings (ICIP 2019 paper, id 3642). Motivated by the need for visualization and interpretation of the results, as well as the lack of other tools for online prediction on this task, we design and implement the first architecture for live inference of visual relationships on video streams and in-the-wild images, including research and engineering extensions, ablation models and a lightweight CPU version. The code is available at https://bitbucket.org/deeplabai/vrd.