Dense Captioning Using Abstract Meaning Representation

2020 
The world around us is composed of images that often need to be translated into words. This translation can take place in parts, converting regions of an image into textual descriptions, a task known as dense captioning. In doing so, the information present in each region is converted into words that express how the objects relate to each other. Computational models, mainly based on deep neural networks, have been proposed to perform this task in a way similar to humans. Since the same image region can be described in several different ways, this study proposes using Abstract Meaning Representation (AMR) to generate descriptions for a given region. We hypothesize that, by using AMR, it is possible to capture the meaning of the text and, as a consequence, improve the quality of the sentences produced by the models. AMR was investigated as a semantic representation formalism, extending previously proposed models that rely only on purely natural-language sentences. The results show that models trained with sentences in AMR form produced better descriptions, and their performance was superior in almost all evaluations.
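The abstract does not show what an AMR-encoded caption looks like. As a minimal illustrative sketch (not taken from the paper), the snippet below pairs a hypothetical region caption with a hand-written AMR graph and parses it with the open-source `penman` library; the caption, graph, and variable names are all assumptions for illustration only.

```python
# Minimal sketch: representing a region caption as an AMR graph.
# Assumes the "penman" AMR library is installed (pip install penman).
import penman

# Hypothetical caption for an image region and a hand-written AMR for it.
caption = "a brown dog sleeping on the couch"
amr_string = """
(s / sleep-01
   :ARG0 (d / dog
            :mod (b / brown))
   :location (c / couch))
"""

graph = penman.decode(amr_string)
print(graph.top)                  # root variable of the graph, e.g. "s"
for source, role, target in graph.triples:
    # Each triple encodes a concept or a semantic relation,
    # e.g. ('s', ':instance', 'sleep-01') or ('s', ':ARG0', 'd').
    print(source, role, target)
```

In a setup like the one the abstract describes, such graphs (rather than the raw caption strings) would serve as the training targets for the captioning model.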