Aligned Dual Channel Graph Convolutional Network for Visual Question Answering

Qingbao Huang, Jielong Wei, Yi Cai, Changmeng Zheng, Junying Chen, Ho Fung Leung (梁浩豐), Qing Li

2020

Visual question answering aims to answer a natural language question about a given image. Existing graph-based methods focus only on the relations between objects in an image and neglect the syntactic dependency relations between words in a question. To capture both kinds of relations simultaneously, we propose a novel dual channel graph convolutional network (DC-GCN) to better combine visual and textual advantages. The DC-GCN model consists of three parts: an I-GCN module that captures the relations between objects in an image, a Q-GCN module that captures the syntactic dependency relations between words in a question, and an attention alignment module that aligns image representations with question representations. Experimental results show that our model achieves performance comparable to state-of-the-art approaches.
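
The three-part structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify layer definitions, so the code assumes a standard graph convolution of the form ReLU(ÂHW) with a row-normalized adjacency matrix, and a simple dot-product attention standing in for the alignment module. All function names, the toy graphs, and the identity weight matrix are hypothetical; in practice the object graph would come from a detector and the word graph from a dependency parser.

```python
import math

def matmul(a, b):
    # Plain matrix product on nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    # Add self-loops, then row-normalize so each node averages itself and its neighbors.
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_loops]

def gcn_layer(adj, feats, weight):
    # One assumed graph convolution step: ReLU(A_norm @ H @ W).
    hidden = matmul(matmul(normalize_adjacency(adj), feats), weight)
    return [[max(0.0, v) for v in row] for row in hidden]

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def align(question_vec, object_feats):
    # Assumed alignment: weight each object by its dot-product similarity to the question.
    scores = [sum(q * o for q, o in zip(question_vec, obj)) for obj in object_feats]
    weights = softmax(scores)
    dim = len(object_feats[0])
    attended = [sum(w * obj[d] for w, obj in zip(weights, object_feats)) for d in range(dim)]
    return weights, attended

# Toy inputs (hypothetical): 3 image objects in a chain, 2 question words with one dependency edge.
obj_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
obj_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
word_adj = [[0, 1], [1, 0]]
word_feats = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned weights

img_nodes = gcn_layer(obj_adj, obj_feats, identity)   # I-GCN channel
q_nodes = gcn_layer(word_adj, word_feats, identity)   # Q-GCN channel
question_vec = [sum(col) / len(q_nodes) for col in zip(*q_nodes)]  # mean-pooled question
weights, attended = align(question_vec, img_nodes)    # alignment step
```

Here `weights` holds one attention score per image object and `attended` is the question-weighted image representation that a downstream answer classifier could consume.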
