GANgster: A Fraud Review Detector based on Regulated GAN with Data Augmentation

2020 
The financial impact of written reviews gives businesses a strong incentive to pay fraudsters to write fraudulent reviews, or to use bots to generate them. The promising performance of Deep Neural Networks (DNNs) in text classification has attracted research into using them for fraud review detection. However, the lack of trusted labeled data has limited the performance of current solutions. Unsupervised and semi-supervised methods are among the most applicable ways to deal with this data scarcity problem. The Generative Adversarial Network (GAN), used as a semi-supervised method, has been shown to be effective for data augmentation. The state-of-the-art solution uses a GAN to overcome the data limitation problem. However, it fails to incorporate behavioral clues in either fraud generation or fraud detection. In addition, it suffers from a common limitation in the training convergence of GANs, which slows down the training procedure. In this work, we propose a regulated GAN for fraud review detection that makes use of both review text and review rating scores. Scores are incorporated into the loss function through Information Gain Maximization for two reasons: first, to generate near-authentic, more human-like reviews that are correlated with their scores; second, to improve the stability of the GAN. Experimental results show better convergence of the regulated GAN. The scores are also combined with the word embeddings of the review text as input to the discriminators for better performance. Results show that the proposed framework outperforms the existing state-of-the-art framework, FakeGAN, with a relative improvement in AP of 7% and 5% on the Yelp and TripAdvisor datasets, respectively.
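The two uses of the rating score described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the one-hot encoding of a 1-to-5 score, the auxiliary score predictor whose logits are passed in as `q_logits`, and the weight `lam` are all assumptions made for illustration. The information term follows the common InfoGAN-style practice of using the cross-entropy of an auxiliary predictor as a stand-in for a mutual-information lower bound, which may differ from the exact formulation in the paper.

```python
import math


def one_hot(score, num_scores=5):
    """Encode a 1-based rating score as a one-hot list (assumed encoding)."""
    if not 1 <= score <= num_scores:
        raise ValueError("score out of range")
    return [1.0 if i == score - 1 else 0.0 for i in range(num_scores)]


def attach_score(embeddings, score, num_scores=5):
    """Append the one-hot score to every word embedding of a review.

    `embeddings` is a list of per-word vectors; the result is what a
    discriminator would receive as input in this sketch.
    """
    code = one_hot(score, num_scores)
    return [list(vec) + code for vec in embeddings]


def info_penalty(q_logits, score):
    """Cross-entropy of a hypothetical auxiliary predictor on the intended score.

    Minimizing this value maximizes a variational lower bound on the mutual
    information between the score and the generated review, up to a constant.
    """
    m = max(q_logits)  # subtract the max for a numerically stable log-sum-exp
    log_z = m + math.log(sum(math.exp(x - m) for x in q_logits))
    return log_z - q_logits[score - 1]


def regularized_generator_loss(adv_loss, q_logits, score, lam=1.0):
    """Adversarial loss plus the weighted information penalty (lam is assumed)."""
    return adv_loss + lam * info_penalty(q_logits, score)
```

For example, `attach_score([[0.1, 0.2]], 4)` extends a two-dimensional word vector to seven dimensions, and a predictor that is uniform over five scores yields a penalty of `log(5)`, the largest value a calibrated uninformative predictor would produce.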