Building an Open Morphological Lexicon and Lemmatizing Old French Texts with the TXM Platform

2017 
This paper reports on the lemmatization of Medieval French texts (9th–15th centuries) with the TXM platform (http://textometrie.org). The project draws on available lexical resources to compile an open morphological lexicon of Medieval French (FROLEX), which is in turn used to perform automatic lemmatization. In the final stage, a human expert verifies and corrects the lemmas. The methodological solutions proposed, and the tools developed for TXM to manage lexicons and apply lemmatization, may be used to process other languages, especially those with high variation in spelling and word segmentation practices.
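
The workflow summarized above (merge lexical resources into one lexicon, look up lemmas automatically, pass unresolved cases to a human expert) can be illustrated with a minimal Python sketch. It is not the TXM implementation or the FROLEX format: the function names, the (form, tag, lemma) triple layout, the tag labels "NOUN" and "VERB", and the sample entries are all assumptions made for illustration.

```python
from collections import defaultdict


def build_lexicon(entries):
    """Merge (form, tag, lemma) triples from several sources into one lookup table.

    Hypothetical layout: the key is (lowercased form, tag), the value is the
    set of lemmas attested for that pair across all sources.
    """
    lexicon = defaultdict(set)
    for form, tag, lemma in entries:
        lexicon[(form.lower(), tag)].add(lemma)
    return lexicon


def lemmatize(tokens, lexicon):
    """Return (form, tag, lemma, needs_review) for each (form, tag) token.

    A token is resolved automatically only when the lexicon offers exactly
    one lemma; unknown or ambiguous tokens are flagged for expert review.
    """
    results = []
    for form, tag in tokens:
        candidates = sorted(lexicon.get((form.lower(), tag), ()))
        if len(candidates) == 1:
            results.append((form, tag, candidates[0], False))
        else:
            # No candidate, or several: keep what is known and flag the token.
            lemma = "|".join(candidates) if candidates else None
            results.append((form, tag, lemma, True))
    return results


# Illustrative entries only; spelling variants map to a single lemma.
sample_entries = [
    ("rois", "NOUN", "roi"),
    ("reis", "NOUN", "roi"),
    ("est", "VERB", "estre"),
    ("fu", "VERB", "estre"),
    ("fu", "NOUN", "feu"),
]

if __name__ == "__main__":
    lexicon = build_lexicon(sample_entries)
    for row in lemmatize([("Reis", "NOUN"), ("fu", "VERB"), ("xyz", "NOUN")], lexicon):
        print(row)
```

Keying the table on both the form and its tag is one simple way to separate homographs such as the two readings of "fu"; a lookup on the form alone would return several lemmas and send the token to review.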