Speeding up corpus development for linguistic research: language documentation and acquisition in Romansh Tuatschin

Géraldine Walther, Benoît Sagot

2017

In this paper, we present ongoing work on developing language resources and basic NLP tools for an undocumented variety of Romansh, in the context of a language documentation and language acquisition project. Our tools are designed to improve the speed and reliability of corpus annotation for noisy data involving large amounts of code-switching, child speech, and orthographic noise. Increasing the efficiency of language resource development for language documentation and acquisition research is also a step towards solving the data sparsity issues with which researchers have been struggling.

Keywords:

Language documentation
Language acquisition
Natural language processing
Noisy data
Speech recognition
Linguistics
Artificial intelligence
Computer science
Resource development
