Towards Unknown Traffic Identification via Embeddings and Deep Autoencoders

Shuyuan Zhao, Yongzheng Zhang, Yafei Sang

2019

Traffic classification, a fundamental tool for network management and security, suffers from a critical problem known as “unknown traffic”. Unknown traffic is network traffic generated by applications that a traffic classification system has not seen before (i.e., zero-day applications). The key to solving this problem is the ability to divide mixed unknown traffic into clusters, each of which contains, as far as possible, the traffic of only one application. This paper reports our recent exploration of an n-gram embedding strategy, deep neural networks, and clustering algorithms for constructing an unsupervised scheme for unknown network traffic identification. Experimental results on real-world traces show that our method achieves an average clustering purity of about 97.35% when DNS, DHCP, BitTorrent, SSH, HTTP, IMAP, MySQL, and Github traffic is used to simulate unknown traffic.
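
The abstract reports results as clustering purity but does not define it. As a minimal sketch, assuming the standard definition (each cluster is credited with its most frequent true application, and those majority counts are summed and divided by the total number of flows), purity could be computed as below. The function name `clustering_purity` and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def clustering_purity(cluster_ids, true_labels):
    """Standard clustering purity (assumed definition, not confirmed by the paper).

    cluster_ids: cluster assignment for each flow.
    true_labels: ground-truth application for each flow.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if len(cluster_ids) == 0:
        raise ValueError("at least one sample is required")

    # Count how often each application appears inside each cluster.
    per_cluster = {}
    for cluster, label in zip(cluster_ids, true_labels):
        per_cluster.setdefault(cluster, Counter())[label] += 1

    # Credit each cluster with the size of its majority application.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(cluster_ids)


if __name__ == "__main__":
    # Hypothetical toy data: two clusters, each containing one stray flow.
    clusters = [0, 0, 0, 1, 1, 1]
    labels = ["DNS", "DNS", "SSH", "SSH", "SSH", "HTTP"]
    print(clustering_purity(clusters, labels))
```

A purity of 1.0 means every cluster contains traffic from a single application. Purity tends to rise as the number of clusters grows, so it is most informative when the cluster count is reported alongside it.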
