Modeling the behavior of web crawlers on a web resource

2020 
In this paper, we present a study of how web crawler behavior depends on the web resource. We provide a simulation model of a web crawler, which is needed both for web robot detection techniques and for generating datasets for methods based on machine learning. We analyze the differences in behavior among humans, common crawlers, and malicious crawlers, and demonstrate that their models can be used for behavior analysis. We show that malicious crawlers behave similarly to common crawlers and that their behavior can be simulated to obtain the datasets and traffic patterns needed for further detection of, and protection against, unethical crawling. Our results and observations can serve as a basis for developing a comprehensive intrusion detection and prevention system.