Showcasing Deeply Supervised Multimodal Attentional Translation Embeddings: a Demo for Visual Relationship Detection

Nikolaos Gkanatsios, Vassilis Pitsikalis, Petros Koutras, Athanasia Zlatintsi, Petros Maragos

2019

We address the task of Visual Relationship Detection, i.e., the detection of (subject, predicate, object) triplets in an image, introducing Multimodal Attentional Translation Embeddings (ICIP 2019 paper, id 3642). Motivated by the need to visualize and interpret the results, and by the lack of other tools for online prediction on this task, we design and implement the first architecture for live inference of visual relationships on video streams and images in the wild. The system includes research and engineering extensions, ablation models, and a lightweight CPU version. The code is available at https://bitbucket.org/deeplabai/vrd.

Keywords:

Human–computer interaction
Computer vision
Computer science
Artificial intelligence
Visualization
Task analysis
Inference
Architecture
Server
