Language Identification in Code-Switching Scenario
2014
This paper describes a CRF based token level language identification system entry to Language Identification in CodeSwitched (CS) Data task of CodeSwitch 2014. Our system hinges on using conditional posterior probabilities for the individual codes (words) in code-switched data to solve the language identification task. We also experiment with other linguistically motivated language specific as well as generic features to train the CRF based sequence labeling algorithm achieving reasonable results.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
10
References
14
Citations
NaN
KQI