    Machine learning based Insider Threat Modelling and Detection
    The threat of malicious insider activity continues to be of paramount concern in both the public and private sectors. Though there is great interest in advancing the state of the art in predicting and stopping these threats, the difficulty of obtaining suitable data for research, development, and testing remains a significant hindrance. We outline the use of synthetic data to enable progress in one research program, while discussing the benefits and limitations of synthetic insider threat data, the meaning of realism in this context, as well as future research directions.
    Insider threat
    Citations (255)
    Despite extensive research within the security community, the need for scalable and accurate solutions to the problem posed by corporate insider threats has never been greater. Corporations and government entities face continuous and growing challenges in protecting their environments from malicious and inadvertent exposure of sensitive data from within their own organisations. In this research paper, we examine the use of machine learning algorithms, building advanced feature definitions and analysing and classifying data patterns in user activity, to put forward a conceptual model to mitigate the risk of data loss due to insider threats. This work extends existing research on anomaly detection and classification learning algorithms. We evaluate supervised learning, crossing threat features for classification and building data profiles for employees. We use existing industry CERT datasets on insider threats, along with synthetic injected data, in the Apache Spark Machine Learning library. Our objective is to determine, by developing advanced feature extraction, whether the accuracy levels support the detection of suspicious activity. This will be evaluated by building an application to analyse streaming data and designing a feature classifier that merges real-time activity with historic patterns for an individual employee. The research puts forward a method for developing activity feature extraction and shows an average accuracy of 99.984%.
    Insider threat
    Feature (linguistics)
    Citations (0)
    Insider threat represents a major cybersecurity challenge to companies and government agencies. The challenges in insider threat detection include unbalanced data, limited ground truth, and possible user behaviour changes. This research presents an unsupervised machine learning (ML) based anomaly detection approach for insider threat detection. We employ two ML methods with different working principles, specifically auto-encoder and isolation forest, and explore various representations of data with temporal information. Evaluation results show that the approach allows learning from unlabelled data under adversarial conditions for insider threat detection with a high detection rate and a low false positive rate. For example, 60% of malicious insiders are detected under a 0.1% investigation budget. Furthermore, we explore the robustness of the proposed approach, i.e. its ability to generalize to detecting unseen anomalous behaviours in different datasets. Comparisons with other work in the literature confirm the effectiveness of the proposed approach.
    Insider threat
    Robustness
    Isolation
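The "investigation budget" figure above can be made concrete with a small sketch. It assumes the budget is the share of users an analyst can review, ranked by anomaly score; the abstract does not spell out its exact definition, so the function name, the toy scores, and the rounding rule here are all illustrative assumptions rather than the paper's method.

```python
import math


def detection_rate_at_budget(scores, malicious, budget):
    """Share of malicious users found in the top `budget` fraction of users.

    scores:    dict mapping user id -> anomaly score (higher = more suspicious)
    malicious: set of user ids known to be malicious (ground truth)
    budget:    fraction of users that can be investigated, e.g. 0.001 for 0.1%
    """
    if not malicious:
        raise ValueError("need at least one malicious user")
    # Number of users that fit in the budget; review at least one user.
    k = max(1, math.floor(budget * len(scores)))
    # Rank users from most to least anomalous and flag the top k.
    ranked = sorted(scores, key=scores.get, reverse=True)
    flagged = set(ranked[:k])
    return len(flagged & malicious) / len(malicious)


# Hypothetical toy data: ten users, two of them malicious.
toy_scores = [0.1, 0.9, 0.2, 0.3, 0.8, 0.15, 0.05, 0.4, 0.25, 0.35]
scores = {f"u{i}": s for i, s in enumerate(toy_scores)}
malicious = {"u1", "u7"}

rate_small = detection_rate_at_budget(scores, malicious, budget=0.2)
rate_large = detection_rate_at_budget(scores, malicious, budget=0.4)
```

With these toy values a 20% budget flags two users and catches one of the two malicious ones, while a 40% budget catches both. The anomaly scores themselves could come from any detector, such as the auto-encoder or isolation forest named in the abstract.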
    Analysis of an organization's computer network activity is a key component of early detection and mitigation of insider threat, a growing concern for many organizations. Raw system logs are a prototypical example of streaming data that can quickly scale beyond the cognitive power of a human analyst. As a prospective filter for the human analyst, we present an online unsupervised deep learning approach to detect anomalous network activity from system logs in real time. Our models decompose anomaly scores into the contributions of individual user behavior features for increased interpretability to aid analysts reviewing potential cases of insider threat. Using the CERT Insider Threat Dataset v6.2 and threat detection recall as our performance metric, our novel deep and recurrent neural network models outperform Principal Component Analysis, Support Vector Machine and Isolation Forest based anomaly detection baselines. For our best model, the events labeled as insider threat activity in our dataset had an average anomaly score in the 95.53 percentile, demonstrating our approach's potential to greatly reduce analyst workloads.
    Insider threat
    Interpretability
    Component (thermodynamics)
    Citations (107)
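The percentile figure reported above can be illustrated with a short sketch. It assumes a percentile rank defined as the share of all events scoring strictly below a given event, averaged over the events labeled as threats; the paper may compute its percentile differently (for example per day), so the helper names and toy scores are assumptions.

```python
from bisect import bisect_left


def percentile_rank(sorted_scores, value):
    """Percentage of scores that are strictly below `value`."""
    return 100.0 * bisect_left(sorted_scores, value) / len(sorted_scores)


def mean_threat_percentile(scores, is_threat):
    """Average percentile rank of the events flagged as threats."""
    ordered = sorted(scores)
    ranks = [percentile_rank(ordered, s) for s, t in zip(scores, is_threat) if t]
    if not ranks:
        raise ValueError("no threat events to average over")
    return sum(ranks) / len(ranks)


# Hypothetical toy data: ten events, the two highest-scoring ones are threats.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
is_threat = [False] * 8 + [True, True]

avg = mean_threat_percentile(scores, is_threat)
```

On this toy data the two threat events rank at the 80th and 90th percentiles, giving an average of 85. A high average means an analyst who reviews events from the top of the ranking downwards reaches the labeled threats early.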
    Insider threat is a severe security risk that tends to cause enormous financial losses and damages for organizations. Many approaches have been proposed to detect and mitigate insider threat. However, implementing an effective detection system is still a challenging task. In this paper, we propose a hybrid intelligent system for insider threat detection that aims to realize more effective detection of security incidents by incorporating multiple complementary detection techniques, such as entity portrait, rule matching and iterative attention. The system takes as input multi-domain heterogeneous event logs, psychological data and functional information that are available in the targeted organization. Considering both subjective and objective factors, the proposed system captures comprehensive information about events by building entity portraits. Subsequently, we perform insider threat detection by rule matching and iterative attention, which can not only quickly detect known attacks but also identify stealthy malicious activities at an early stage. We evaluated the proposed system using the CERT r4.2 insider threat dataset. Experimental results show that the hybrid intelligent system achieves a significant improvement compared with the state-of-the-art detection approach in terms of AUC and early detection scores.
    Insider threat
    Citations (7)
    A malicious insider is one of the most damaging threats to any organization, from industry to government agencies. Many challenges in insider threat detection stem from the fact that the ground truth is very limited and costly to acquire. This paper presents a semi-supervised learning approach to insider threat detection. We employ three machine learning methods under different real-world conditions. These include obtaining the initial ground truth training data randomly, via a certain type of malicious insider behavior, or by anomaly detection system scores. Evaluation results show that the approach allows learning from very limited data for insider threat detection at high precision. 90% of malicious data instances are detected under a 1% false positive rate.
    Insider threat
    Supervised Learning
    Labeled data
    Citations (10)
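A result of the form "X% detected under a Y% false positive rate" can be read as follows: choose the lowest score threshold that keeps false alarms on normal data within Y%, then measure the share of malicious instances above it. The sketch below is one way to compute that, assuming higher scores mean more suspicious; the function name and toy data are hypothetical and not taken from the paper.

```python
def detection_rate_at_fpr(scores, labels, max_fpr):
    """Detection rate (recall) when at most `max_fpr` of normal data is flagged.

    scores:  classifier or anomaly scores, higher = more suspicious
    labels:  1 for malicious instances, 0 for normal ones
    max_fpr: tolerated false positive rate, e.g. 0.01 for 1%
    """
    normal = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    malicious = [s for s, y in zip(scores, labels) if y == 1]
    if not normal or not malicious:
        raise ValueError("need both normal and malicious instances")
    # Number of normal instances we are allowed to flag by mistake.
    allowed = int(max_fpr * len(normal))
    # Flag everything strictly above the (allowed + 1)-th highest normal score,
    # so at most `allowed` normal instances end up flagged.
    threshold = normal[allowed] if allowed < len(normal) else float("-inf")
    detected = sum(s > threshold for s in malicious)
    return detected / len(malicious)


# Hypothetical toy data: ten normal instances followed by four malicious ones.
normal_scores = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.7]
malicious_scores = [0.5, 0.6, 0.42, 0.9]
scores = normal_scores + malicious_scores
labels = [0] * len(normal_scores) + [1] * len(malicious_scores)

rate_10pct = detection_rate_at_fpr(scores, labels, max_fpr=0.1)
rate_zero = detection_rate_at_fpr(scores, labels, max_fpr=0.0)
```

On the toy data, tolerating one false alarm in ten normal instances detects three of the four malicious ones, while tolerating none detects only one.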
    Insider threat is one of the most damaging cyber security attacks to companies and organizations. In this paper, we explore different techniques to leverage spatial and temporal characteristics of user behaviours for insider threat detection. In particular, feature normalization (scaling) techniques and a scheme for representing explicit temporal information are explored to improve the performance of machine learning based insider threat detection. The results show that these data characteristics have different effects on different classifiers, with Standard Scaler combined with a Random Forest classifier producing the best performance.
    Insider threat
    Normalization
    Leverage (statistics)
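For readers unfamiliar with the normalization step named above, standard scaling rescales each feature to zero mean and unit variance using statistics fitted on the training data. The sketch below is a plain-Python illustration of that idea, not the paper's pipeline; the feature columns (logons per day, bytes copied) are hypothetical.

```python
from statistics import fmean, pstdev


def fit_standard_scaler(rows):
    """Compute the per-column mean and population standard deviation."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A constant column has zero spread; use 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Apply z = (x - mean) / std to every value, column by column."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical user-day features: [logons per day, bytes copied to USB].
train = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]

means, stds = fit_standard_scaler(train)
scaled = transform(train, means, stds)
```

After scaling, both columns take the same values despite their very different raw ranges, so neither dominates by magnitude alone. The statistics are fitted once on training data and then reused for new data, so that test-time values do not influence the scaling.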