Finding Entities and Related Facts in Newspaper

Jaimel de Oliveira Lima, Cristiano da Silveira Colombo, Flavio Izo, Elias Oliveira, Claudine Badue

2020

The volume of information being produced is growing rapidly. Most of it is free text, from which extracting meaningful knowledge is difficult. Many techniques can help process large amounts of data and the relations within it. One of them is Relation Extraction (RE), a Natural Language Processing (NLP) task that can be defined as the extraction of relations between two or more entities. Areas such as semantic relation extraction, sentiment analysis, opinion mining, and question answering may apply RE to ease their processing. In this work, we propose using RE to find entities and related facts in newspaper articles. To carry out this task, we segment the text into sentences. Within each sentence, we tokenize the terms and extract their dependencies using the spaCy tool. We also apply Named Entity Recognition (NER) to extract some of the entity classes. Finally, we use an inductive logic programming-based model to capture some of the logical relations we find within sentences. To train the model, we split the newspaper corpus into training and test portions, and we evaluate our solution by comparing the relations it annotates against those annotated by a human on the same dataset. The results show a competitive model for Relation Extraction in Portuguese.
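The evaluation step described above, comparing extracted relations against human annotations on the same data, can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not name its metrics, so the use of precision, recall, and F1 is an assumption, and the `Relation` type, the `score` function, and the example triples are all hypothetical.

```python
from __future__ import annotations

from typing import NamedTuple


class Relation(NamedTuple):
    """A (subject, predicate, object) triple. Hypothetical representation."""
    subject: str
    predicate: str
    obj: str


def score(predicted: set[Relation], gold: set[Relation]) -> dict[str, float]:
    """Compare system relations with human-annotated ones by exact match.

    Assumption: precision/recall/F1 over exact triple matches; the source
    does not state which metrics were used.
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical human annotations for a handful of test sentences.
gold = {
    Relation("Maria Silva", "works_at", "UFES"),
    Relation("UFES", "located_in", "Vitória"),
    Relation("João Souza", "born_in", "Vitória"),
}

# Hypothetical system output for the same sentences.
predicted = {
    Relation("Maria Silva", "works_at", "UFES"),
    Relation("UFES", "located_in", "Vitória"),
    Relation("João Souza", "works_at", "UFES"),
}

result = score(predicted, gold)
print(result)
```

Exact matching on whole triples is the strictest choice; a real evaluation might also credit partial matches, such as a correct entity pair with a wrong relation label.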
