Classification of Arithmetic Sentences Expressed in Natural Language using HMM

2019 
This paper aims to classify mathematical sentences extracted from natural language. The sentences are collected from speech and written documents and stored as a linguistic corpus. Here, a mathematical sentence is one that expresses any of the four basic arithmetic operations: addition, subtraction, multiplication, or division. Hidden Markov Models (HMMs) are used to identify and classify these basic operations. Before training the HMMs, the corpus was cleaned using the Levenshtein distance. Two procedures are applied to obtain the observable symbols. First, a symbol (label) is assigned to each word in a sentence, so that the length of the sequence of observable symbols depends directly on the number of words in the sentence. Second, the entropy of each word in a sentence is computed, with the aim of obtaining sequences of observable symbols of equal length. After training, the HMMs were evaluated on test data using the F1-score, accuracy, recall, and precision metrics.
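The abstract gives no implementation details, so the following is only a minimal sketch of three of the steps it names: Levenshtein distance (usable for spotting near-duplicate or misspelled words during corpus cleanup), per-word symbol labeling, and classification by choosing the HMM with the highest forward-algorithm likelihood. All function names, the single-state toy models, and their probabilities are hypothetical illustrations, not the authors' method or parameters, and the entropy-based encoding is not shown.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def encode(sentence: str, vocab: dict) -> list:
    """Assign one integer symbol per word, growing the vocabulary as needed."""
    return [vocab.setdefault(w, len(vocab)) for w in sentence.lower().split()]


def forward_log_likelihood(obs, start, trans, emit) -> float:
    """Scaled forward algorithm: log P(obs | HMM) for a non-empty sequence."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                     for s in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def classify(obs, models) -> str:
    """Return the name of the model under which obs is most likely."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Toy setup (made-up numbers): symbols 0="two", 1="plus", 2="minus".
vocab = {"two": 0, "plus": 1, "minus": 2}
models = {
    # name: (start probabilities, transition matrix, emission matrix)
    "addition":    ([1.0], [[1.0]], [[0.5, 0.4, 0.1]]),
    "subtraction": ([1.0], [[1.0]], [[0.5, 0.1, 0.4]]),
}
label = classify(encode("two plus two", vocab), models)
```

In practice each operation would have its own multi-state HMM trained on labeled sentences; the one-state models above only keep the arithmetic easy to follow.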