Normalization of Unabbreviated Non-Standard Words Using Edit Distance (Normalisasi Kata Tidak Baku yang Tidak Disingkat dengan Jarak Perubahan)

I Gusti Bagus Baskara Nugraha, Rafi Dwi Rizqullah

2019

Voice assistant technology is growing rapidly, and it is increasingly part of everyday life. However, voice assistants are still largely limited to standard, formal language, whereas Indonesian speakers are accustomed to using informal language in daily conversation. This research proposes a solution to the problem voice assistants face with informal words, that is, words not found in a dictionary of formal words. We propose text normalization using Levenshtein distance. Test results show that normalization using Levenshtein distance outperforms normalization using Longest Common Subsequence (LCS) distance, with an accuracy difference of 8.34%.
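
To make the comparison in the abstract concrete, the following is a minimal sketch of dictionary-based normalization under both distance measures. It is not the authors' implementation. The function names, the tiny word list, and the definition of LCS distance as len(a) + len(b) - 2 * LCS(a, b) are all assumptions chosen for illustration; the abstract does not specify them.

```python
from typing import Callable, Iterable


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_distance(a: str, b: str) -> int:
    """Assumed definition: characters of a and b left outside the LCS
    (an edit distance that allows only insertions and deletions)."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


def normalize(word: str,
              dictionary: Iterable[str],
              distance: Callable[[str, str], int] = levenshtein) -> str:
    """Return the dictionary entry closest to word under distance.
    Ties are broken alphabetically so the result is deterministic."""
    candidates = sorted(dictionary)
    if word in candidates:
        return word
    return min(candidates, key=lambda cand: distance(word, cand))


# Hypothetical formal-word list, for illustration only.
FORMAL_WORDS = ["kalau", "pakai", "sudah", "tidak"]

print(normalize("kalo", FORMAL_WORDS))                # Levenshtein-based
print(normalize("kalo", FORMAL_WORDS, lcs_distance))  # LCS-based
```

In this sketch the informal word "kalo" is mapped to its nearest formal entry under either measure. A real system would need a full formal-word dictionary and a rule for rejecting candidates that are too distant, neither of which is described in the abstract.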

Keywords:

Text normalization
Natural language processing
Indonesian
Levenshtein distance
Artificial intelligence
Computer science
Longest common subsequence problem
Normalization (statistics)
Conversation
Voice assistant
