Heuristic Methods for Filtering Newly Coined Profanities Using Phylogenetic Analysis
2010
We have proposed a smart filtering system for newly coined profanities that uses approximate string searching and sequence alignment. However, the number of coined profanities is large. For example, the game portal Nexon maintains a forbidden word list of 60,000 words, so even our system requires too much computation time to be applied to a real-time chat system. We therefore need to manage the profanity database by discarding redundant entries and dividing the remaining elements into several groups by priority. In this paper, we propose a management algorithm for a profanity database. We apply phylogenetic analysis to build evolution trees and classify the profanities. An input word is then compared with the root of a group. With this approach, we reduce the number of elements in the database from 6302 to 2229.
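The idea of collapsing variant spellings into groups and comparing an input word only with each group's root can be illustrated with a minimal sketch. This is not the paper's phylogenetic algorithm: greedy grouping by Levenshtein distance stands in for tree construction, and the function names (`edit_distance`, `group_by_root`, `matches_root`), the `max_dist` threshold, and the mild placeholder words are all assumptions made for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def group_by_root(words, max_dist=1):
    """Greedy stand-in for tree building (assumption, not the paper's method):
    a word joins the first root within max_dist, otherwise it becomes a new root."""
    groups = {}
    for w in words:
        for root in groups:
            if edit_distance(w, root) <= max_dist:
                groups[root].append(w)
                break
        else:
            groups[w] = [w]
    return groups


def matches_root(word, roots, max_dist=1):
    """Compare the input word against group roots only, not every variant."""
    return any(edit_distance(word, r) <= max_dist for r in roots)


# Placeholder vocabulary; a real database would hold the forbidden words.
database = ["darn", "d4rn", "darnn", "heck", "h3ck"]
groups = group_by_root(database)
print(len(database), "entries reduced to", len(groups), "roots")
print(matches_root("dorn", groups))
```

In this sketch the filter stores only the roots at lookup time, which is the kind of database reduction the abstract reports. A real system would choose roots from the evolution tree rather than by insertion order.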