UMCC_DLSI_SemSim: Multilingual System for Measuring Semantic Textual Similarity

2014 
In this paper we describe the specifications and results of the UMCC_DLSI system, which participated in SemEval-2014, addressing two subtasks of Semantic Textual Similarity (STS, Task 10, for English and Spanish) and one subtask of Cross-Level Semantic Similarity (Task 3). As a supervised system, it was provided with different types of lexical and semantic features to train a classifier, which was used to decide the correct answers for the distinct subtasks. These features were obtained by applying the Hungarian algorithm over a semantic network to create semantic alignments among words. For the Spanish subtask of Task 10, two runs were submitted, and our Run2 was the best ranked, with a general correlation of 0.807. For the English subtask, however, our best run (Run1 of our 3 runs) reached 16th place out of 38 in the official ranking, obtaining a general correlation of 0.682. In Task 3, where we addressed only the Paragraph to Sentence subtask, our best run (Run1 of 2 runs) obtained a correlation of 0.760, reaching 3rd place out of 34.
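To make the alignment step concrete, the following is a minimal sketch of word alignment with the Hungarian algorithm, not the system's actual implementation. It assumes a word-to-word distance function is available; here a small hand-written table (`TOY_DISTANCES`, hypothetical) stands in for distances that would be derived from a semantic network, and the names `hungarian`, `toy_distance`, and `align` are illustrative only.

```python
def hungarian(cost):
    """Minimum-cost assignment for a matrix with rows <= columns.

    Returns a list of (row, column) pairs, one per row.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (1-based)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]


# Hypothetical distances; a real system would compute these from a semantic network.
TOY_DISTANCES = {("a", "the"): 0.2, ("dog", "hound"): 0.1}


def toy_distance(w1, w2):
    """Assumed distance: 0 for identical words, table lookup, else 1."""
    if w1 == w2:
        return 0.0
    return TOY_DISTANCES.get((w1, w2), TOY_DISTANCES.get((w2, w1), 1.0))


def align(words_a, words_b, distance=toy_distance):
    """Align two token lists; returns (word_a, word_b, cost) triples."""
    swapped = len(words_a) > len(words_b)
    rows, cols = (words_b, words_a) if swapped else (words_a, words_b)
    cost = [[distance(r, c) for c in cols] for r in rows]
    triples = []
    for i, j in hungarian(cost):
        a, b = (cols[j], rows[i]) if swapped else (rows[i], cols[j])
        triples.append((a, b, cost[i][j]))
    return triples


pairs = align(["a", "dog", "barks"], ["the", "hound", "barks", "loudly"])
total_cost = sum(c for _, _, c in pairs)
for a, b, c in pairs:
    print(a, b, c)
```

In a supervised setup like the one described, quantities such as the total or average alignment cost could then serve as features for the classifier; which features the system actually derives from its alignments is not specified here.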