Using Annotation Projection for Semantic Role Labeling of Low-Resourced Language: Sinhala

2020 
We present SinSRL, the first semantic role labeller (SRL) for Sinhala, an Indo-European language spoken mainly in Sri Lanka. SinSRL takes parallel text in English (or any other language for which a suitable SRL exists) and Sinhala, and outputs semantically annotated Sinhala text. We have enhanced existing tools to address several issues specific to the target language; these enhancements should also be useful for labeling other Indic languages. In addition, we have manually annotated a small Sinhala-English parallel dataset with semantic roles. The accuracy of our system is similar to that of manually labeled data. Our implementation can be used to generate an SRL dataset, which may in turn be used to train a direct semantic role labeller. SinSRL may be easily modified to annotate other low-resource languages for which parallel corpora are available.
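The general idea of annotation projection can be illustrated with a minimal sketch. This is not the SinSRL implementation: the function name `project_roles`, the label scheme (`A0`, `A1`, `V`, `O`), the placeholder target tokens, and the hand-written word alignment are all assumptions made for illustration. The sketch assumes that role labels for the source sentence and a source-to-target word alignment are already available, and copies each labeled source token's role onto the target token it aligns to.

```python
from typing import List, Tuple


def project_roles(
    source_labels: List[str],
    alignment: List[Tuple[int, int]],
    target_len: int,
) -> List[str]:
    """Project per-token role labels from a source sentence onto a target sentence.

    source_labels: one role label per source token ("O" means no role).
    alignment: (source_index, target_index) pairs from a word aligner.
    target_len: number of tokens in the target sentence.
    """
    target_labels = ["O"] * target_len
    for src_idx, tgt_idx in alignment:
        label = source_labels[src_idx]
        # Keep the first role that reaches a target token; unaligned
        # target tokens stay unlabeled ("O").
        if label != "O" and target_labels[tgt_idx] == "O":
            target_labels[tgt_idx] = label
    return target_labels


# Hypothetical example: an English sentence and a three-token target
# sentence in subject-object-verb order (placeholder tokens, not real Sinhala).
source_tokens = ["The", "boy", "ate", "rice"]
source_labels = ["A0", "A0", "V", "A1"]
target_tokens = ["tgt_boy", "tgt_rice", "tgt_ate"]
alignment = [(1, 0), (3, 1), (2, 2)]  # "The" has no aligned target token

projected = project_roles(source_labels, alignment, len(target_tokens))
print(list(zip(target_tokens, projected)))
```

In this toy case the roles follow the alignment across the change in word order, so the object and verb swap positions in the target labeling. A real pipeline would also have to handle unaligned tokens, one-to-many alignments, and alignment errors, none of which this sketch attempts.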