Efficient Text Processing via Context Triggered Piecewise Hashing Algorithm for Spam Detection
2020
In our research, we examined millions of spam messages and developed a technology called Spam Term Generator. It combines CTPH (Context Triggered Piecewise Hashing), DBSCAN (Density-Based Spatial Clustering of Applications with Noise), and the LCS (Longest Common Substring) algorithm to automatically identify near-duplicate spam messages and extract recurring text fragments from large collections of spam texts.
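As an illustration of the final extraction step only, the following is a minimal sketch of a longest common substring computation between two near-duplicate messages, using a standard dynamic-programming formulation. The function name `longest_common_substring` and the sample messages are hypothetical and are not taken from the paper; the CTPH hashing and DBSCAN clustering stages, which would select the messages to compare, are not shown.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous substring shared by a and b.

    Hypothetical helper for illustration; not the paper's implementation.
    Runs in O(len(a) * len(b)) time and O(len(b)) memory.
    """
    best_len = 0
    best_end = 0  # end index (exclusive) of the best match within a
    prev = [0] * (len(b) + 1)  # match lengths for the previous row of a
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


# Two made-up messages that differ only in greeting and ending.
msg_a = "Hi John, claim your free prize now at our site"
msg_b = "Dear Mary, claim your free prize now before Friday"
common = longest_common_substring(msg_a, msg_b)
```

In a pipeline of the kind the abstract describes, such a comparison would presumably be applied within each cluster of similar messages, so that the shared fragment serves as a candidate spam term.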