Latent Semantics Approach for Network Log Analysis: Modeling and its application
2021
Network log analysis helps network operators troubleshoot their networks. Many mathematical analysis methods take as input a set of time series, one per log type (log template) per device. However, these methods do not take full advantage of the meaning of logs, even though log messages contain semantic information written in free format. In this paper, we emphasize the use of this rich semantic information for network log analysis. More specifically, we propose an unsupervised, latent semantics-based approach to network log analysis. The key idea is to build a latent semantics model of unobservable network functionalities (e.g., routing protocols, hardware) from the generated logs, instead of relying on a network operator's knowledge to infer from log messages what is happening in the network. This approach enables us to numerically cluster and compare the meaning of logs because each log message obtains a numerical distributed representation (a topic distribution). We discuss the validity of our approach using a set of logs collected over one year from a nationwide academic network. We first show, by evaluating the quality of the distributed representation of logs, that our approach outperforms a popular data-driven approach (i.e., word2vec) that does not require any assumptions about the data. Furthermore, through two network analysis scenarios, we demonstrate several benefits of our approach, including intuitive interpretability of analysis results and the ability to bridge the gap between log messages from multiple vendors.
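To make the idea of numerically comparing log meanings concrete, the following is a minimal sketch, not the paper's implementation. It assumes each log template has already been assigned a topic distribution by some latent semantics model, and it compares templates with the Jensen-Shannon divergence, which is our choice of measure for illustration. The template strings, the three topics, and all probability values are hypothetical.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Returns 0.0 for identical distributions and 1.0 for distributions
    with disjoint support.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same topics")
    # Mixture distribution used as the common reference point.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing, by convention.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def nearest_template(name, topic_table):
    """Return the other template whose topic distribution is closest to `name`."""
    target = topic_table[name]
    others = (k for k in topic_table if k != name)
    return min(others, key=lambda k: js_divergence(target, topic_table[k]))


# Hypothetical topic distributions over three made-up topics:
# (routing, hardware, authentication). Values are illustrative only.
topic_table = {
    "vendorA: BGP peer <ip> down": [0.90, 0.05, 0.05],
    "vendorB: bgp neighbor <ip> state changed": [0.85, 0.10, 0.05],
    "vendorA: fan <n> failure detected": [0.05, 0.90, 0.05],
}

closest = nearest_template("vendorA: BGP peer <ip> down", topic_table)
print(closest)
```

Under these assumed values, the two routing-related templates from different vendors end up closest to each other even though their wording differs, which is the kind of cross-vendor comparison the abstract describes.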