Towards Math Terms Disambiguation Using Machine Learning.

2021 
Word disambiguation is an important task in natural language processing, but it remains under-explored in mathematical text. As in natural language, some math terms do not have a unique interpretation. Because math text is an important part of the scientific literature, an accurate and efficient method for disambiguating math terms would be a significant contribution. In this paper, we present investigations into math-term disambiguation using machine learning. All experimental data are selected from the DLMF dataset. Our experiments consist of three steps: (1) create a labeled dataset of math equations (from the DLMF) in which the instances are (math token, token meaning) pairs, grouped by equation; (2) build machine learning models and train them on our labeled dataset; and (3) evaluate and compare the performance of the models using different evaluation metrics. Our results show that machine learning is an effective approach to math-term disambiguation. The accuracy of our models ranges from 70% to 85%, with potential for considerable improvement once much larger labeled datasets with more balanced classes are available.
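The abstract does not include code, but a minimal sketch of steps (2) and (3) is given below. The toy dataset, the character n-gram features over a token and its equation context, and the logistic-regression classifier are all illustrative assumptions standing in for the DLMF-derived (math token, token meaning) pairs and the authors' actual models.

```python
# Hedged sketch (not the authors' code): classify a math token plus its
# surrounding equation context into a meaning label, then report accuracy.
# The samples below are hypothetical stand-ins for DLMF-derived instances.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Each instance: "<token> || <equation context>", labeled with the token's meaning.
samples = [
    ("pi || \\Gamma(z)\\Gamma(1-z) = \\pi / \\sin(\\pi z)", "circle-constant"),
    ("pi || \\pi(x) \\sim x / \\log x",                     "prime-counting-function"),
    ("e  || e^{i\\theta} = \\cos\\theta + i\\sin\\theta",   "euler-number"),
    ("e  || F = q E + e \\phi",                             "elementary-charge"),
] * 25  # repeat so the toy train/test split has enough data

texts, labels = zip(*samples)
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=0, stratify=labels
)

# Character n-grams capture LaTeX commands and nearby symbols around the token.
model = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

In practice one would replace the toy samples with the equation-grouped labeled instances described in step (1) and compare several model families and metrics, as step (3) indicates.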