Solving Cyber-Alert Allocation Markov Games with Deep Reinforcement Learning

2019 
Companies and organizations typically employ different forms of intrusion detection (and prevention) systems on their computer and network resources (e.g., servers, routers) that monitor and flag suspicious and/or abnormal activities. When a possible malicious activity is detected, one or more cyber-alerts are generated with varying levels of significance (e.g., high, medium, or low). Some subset of these alerts may then be assigned to cyber-security analysts on staff for further investigation. Due to the wide range of potential attacks and their high degree of sophistication, identifying what constitutes a true attack is a challenging problem. In this paper, we present a framework that allows us to derive game-theoretic strategies for assigning security alerts to security analysts. Our approach considers a series of sub-games between the attacker and defender, with a state maintained between sub-games. Due to the large sizes of the action and state spaces, we present a technique that uses deep neural networks in conjunction with Q-learning to derive near-optimal Nash strategies for both attacker and defender. We assess the effectiveness of these policies by comparing them to optimal policies obtained from brute-force value iteration methods, as well as other sensible heuristics (e.g., random and myopic). Our results show that we consistently obtain policies whose utility is comparable to that of the optimal solution, while drastically reducing the run times needed to achieve such policies.
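To illustrate the brute-force baseline the abstract refers to, the sketch below runs value-iteration-style backups on a toy zero-sum Markov game. Everything here is an assumption for illustration, not taken from the paper: the states, payoffs, and transitions are random placeholders, and the mixed Nash value of each sub-game is approximated by a pure-strategy maximin (the defender maximizes, the attacker minimizes) rather than solved exactly via linear programming.

```python
import numpy as np

# Hypothetical toy instance (not from the paper): a few alert-backlog states,
# the defender chooses which alert class an analyst inspects, the attacker
# chooses which class to exploit. Rewards and transitions are random.
rng = np.random.default_rng(0)
n_states, n_def, n_att = 3, 2, 2
R = rng.uniform(-1.0, 1.0, size=(n_states, n_def, n_att))            # defender reward
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_def, n_att))  # P[s, d, a, s']

gamma = 0.9                                   # discount between sub-games
Q = np.zeros((n_states, n_def, n_att))
for _ in range(500):
    # Pure-strategy maximin as a stand-in for the mixed Nash value per state.
    V = Q.min(axis=2).max(axis=1)
    # Bellman backup: immediate reward plus discounted expected next-state value.
    Q = R + gamma * np.einsum('sdat,t->sda', P, V)

defender_policy = Q.min(axis=2).argmax(axis=1)  # maximin action per state
```

In the paper's setting, the state and action spaces are too large for such exhaustive sweeps, which is what motivates approximating Q with a deep neural network trained by Q-learning instead.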