Unsupervised Machine Learning by Graph Analytics on Heterogeneous Network Device Data

2018 
Abstract We explored unsupervised machine learning algorithms, specifically graph analytics, applied to behaviors observed in heterogeneous network sensor data in order to discover anomalous behavior that could include novel attacks. We also examined the difficulties of applying unsupervised machine learning to anomaly detection in a network-defense context, to understand how inherently imperfect anomaly-detection approaches can be integrated into the workflow of a cyber defense infrastructure. Two general approaches can be used to discover anomalies: (1) detecting rarity, i.e., finding the activities that are observed least frequently in a set of observations, and (2) detecting novelty, i.e., finding the activities with the lowest estimated probability of observation based on prior observations of baseline (presumably “normal”) data. This paper addresses the case of detecting rarity. We describe the entire pipeline: the data used, the data ingest, the quantization of features, the application of graph analytics to the data, post-processing to reduce the results, and performance measurement. A network-penetration experiment was set up to conduct the network attacks and generate the data used as input to this work. Baseline methods are proposed and compared to the main method described in this paper.
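To make the notion of rarity concrete, the following is a minimal sketch of frequency-based rarity ranking. It is not the graph-analytics method of the paper. It only illustrates "finding the activities observed least frequently." The function name `rarest_behaviors`, the host names, and the quantized service labels are all hypothetical and chosen for illustration.

```python
from collections import Counter


def rarest_behaviors(observations, k=3):
    """Return the k least frequently observed behaviors with their counts.

    `observations` is any iterable of hashable, already-quantized
    behavior descriptors (an assumption made for this sketch).
    """
    counts = Counter(observations)
    # Sort ascending by count; break ties on repr() so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (item[1], repr(item[0])))
    return ranked[:k]


# Hypothetical quantized observations: (source host, service bucket).
observations = [
    ("host-a", "web"), ("host-a", "web"), ("host-a", "web"),
    ("host-b", "web"), ("host-b", "web"),
    ("host-b", "dns"), ("host-b", "dns"),
    ("host-c", "ssh"),
]

for behavior, count in rarest_behaviors(observations, k=2):
    print(behavior, count)
```

A novelty detector would differ in that the counts (or a probability model) would be estimated from a separate baseline data set, and new observations would then be scored against that baseline rather than against each other.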