Evaluating data distribution and drift vulnerabilities of machine learning algorithms in secure and adversarial environments
2014
Machine learning is continuing to gain popularity due to its ability to solve problems that are difficult to model using
conventional computer programming logic. Much of the current and past work has focused on algorithm development,
data processing, and optimization. Lately, a subset of research has emerged which explores issues related to security.
This research is gaining traction as systems employing these methods are being applied to both secure and adversarial
environments. One of machine learning’s biggest benefits, its data-driven rather than logic-driven approach, is also a
weakness if the data on which the models rely are corrupted. Adversaries could maliciously influence systems that rely
on re-training and online learning to cope with drift and changes in the data distribution. Our work explores the
resilience of various machine learning algorithms to these data-driven attacks. In this paper, we present our initial
findings, using Monte Carlo simulations and statistical analysis to explore the maximal achievable shift in a
classification model, as well as the amount of control over the data an attacker requires.
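The attack setting the abstract describes can be illustrated with a minimal sketch. The learner, class distributions, and poisoning strategy below are illustrative assumptions, not the paper's actual experimental setup: an online learner maintains a running class centroid via incremental mean updates, an attacker injects mislabeled points drawn from a shifted distribution, and Monte Carlo trials average the induced shift in the learned model as a function of the fraction of the data stream the attacker controls.

```python
import random
import statistics

def run_trial(n_clean=500, poison_frac=0.2, seed=0):
    """One Monte Carlo trial: an online learner tracks a running class
    centroid; an attacker mixes in points from a shifted distribution,
    labeled as the target class. Returns the resulting centroid, whose
    distance from 0 is the induced model shift (clean data has mean 0)."""
    rng = random.Random(seed)
    centroid, count = 0.0, 0
    # Number of poison points needed for the attacker to own poison_frac
    # of the combined stream.
    n_poison = int(n_clean * poison_frac / (1.0 - poison_frac))
    stream = [rng.gauss(0.0, 1.0) for _ in range(n_clean)]    # clean samples
    stream += [rng.gauss(4.0, 0.5) for _ in range(n_poison)]  # poisoned samples
    rng.shuffle(stream)
    for x in stream:
        # Online update: incremental mean, as in naive re-training schemes.
        count += 1
        centroid += (x - centroid) / count
    return centroid

def monte_carlo_shift(poison_frac, trials=200):
    """Average induced centroid shift over repeated randomized trials."""
    return statistics.mean(
        run_trial(poison_frac=poison_frac, seed=t) for t in range(trials)
    )

# The induced shift grows with the attacker's share of the data stream:
for frac in (0.0, 0.1, 0.3):
    print(f"poison fraction {frac:.1f}: mean shift {monte_carlo_shift(frac):.2f}")
```

Under these assumptions the expected shift is roughly the poisoning fraction times the offset of the poison distribution, which makes the abstract's trade-off concrete: achieving a larger model shift requires controlling a proportionally larger share of the data.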
Keywords:
- Active learning (machine learning)
- Online machine learning
- Intrusion detection system
- Adversarial machine learning
- Computational learning theory
- Machine learning
- Stability (learning theory)
- Monte Carlo method
- Algorithm
- Artificial intelligence
- Computer science
- Computer programming
- Theoretical computer science
- Hidden Markov model