Efficient Text Processing via Context Triggered Piecewise Hashing Algorithm for Spam Detection

2020 
In our research, we examined millions of spam messages and developed a technology called Spam Term Generator. It combines CTPH (Context Triggered Piecewise Hashing), DBSCAN (Density-Based Spatial Clustering of Applications with Noise), and the LCS (Longest Common Substring) algorithm to automatically identify near-duplicate spam messages and extract recurring text fragments from large collections of spam texts.
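
To illustrate the final extraction step, the following is a minimal sketch of longest-common-substring extraction over a group of near-duplicate messages. It is not the authors' implementation: the function names `longest_common_substring` and `common_term` and the sample messages are hypothetical, and the CTPH hashing and DBSCAN clustering stages are assumed to have already grouped the messages.

```python
from functools import reduce


def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous substring shared by a and b.

    Classic dynamic programming over common-suffix lengths, O(len(a) * len(b)).
    On ties, the match that ends earliest in `a` is returned.
    """
    best_len = 0
    best_end = 0  # exclusive end index of the best match within `a`
    prev = [0] * (len(b) + 1)  # common-suffix lengths for the previous row
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


def common_term(messages: list[str]) -> str:
    """Fold the pairwise LCS over one cluster of messages.

    Assumption: the cluster was produced by an earlier similarity and
    clustering stage. The fold is a greedy approximation: the result is
    always common to every message, but it is not guaranteed to be the
    longest such substring.
    """
    return reduce(longest_common_substring, messages)


# Hypothetical cluster of near-duplicate spam messages.
cluster = [
    "Hi Anna, claim your FREE prize now at our site",
    "Dear Bob: claim your FREE prize now!!!",
    "URGENT claim your FREE prize now, limited offer",
]
print(repr(common_term(cluster).strip()))
```

The fold keeps the sketch short. A generalized suffix structure would find the exact longest substring common to all messages, at the cost of more code.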