Language Identification in Code-Switching Scenario

Naman Jain, Riyaz Ahmad Bhat

2014

This paper describes a CRF-based token-level language identification system, our entry to the Language Identification in Code-Switched (CS) Data task of CodeSwitch 2014. Our system hinges on using conditional posterior probabilities for the individual codes (words) in code-switched data to solve the language identification task. We also experiment with other linguistically motivated language-specific and generic features to train the CRF-based sequence labeling algorithm, achieving reasonable results.

Keywords:

Speech recognition
Posterior probability
Sequence labeling
Language identification
Code-switching
Security token
Pattern recognition
Computer science
Artificial intelligence
Natural language processing
