IP2Vec: Learning Similarities Between IP Addresses

2017 
IP addresses are a central part of packet- and flow-based network data. However, visualizing IP addresses and computing similarities between them is challenging because they lack a natural order. This paper presents IP2Vec, a novel similarity measure for IP addresses that builds on ideas from Word2Vec, a popular approach in text mining. The key idea is to learn similarities by extracting the context information available in network data: IP addresses are similar if they appear in similar contexts. IP2Vec is therefore derived automatically from the given network data set. The proposed approach is evaluated experimentally on two public flow-based data sets. In particular, we demonstrate its effectiveness for clustering IP addresses within a botnet data set. In addition, we use visualization methods to analyse the learned similarities in more detail. These experiments indicate that IP2Vec is well suited to capturing the similarity of IP addresses based on their network communications.
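The idea that IP addresses are similar when they appear in similar contexts can be illustrated with a minimal sketch. It is not the paper's method: instead of training Word2Vec-style embeddings, it simply counts context tokens per source IP and compares the count vectors with cosine similarity. The flow records, the choice of context fields (destination IP, destination port, protocol), and the helper names `context_counts` and `cosine` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical flow records: (source IP, destination IP, destination port, protocol).
flows = [
    ("192.168.0.10", "10.0.0.1", 80, "TCP"),
    ("192.168.0.11", "10.0.0.1", 80, "TCP"),
    ("192.168.0.10", "10.0.0.2", 53, "UDP"),
    ("192.168.0.11", "10.0.0.2", 53, "UDP"),
    ("192.168.0.99", "10.0.0.9", 6667, "TCP"),
]


def context_counts(flows):
    """Count how often each source IP co-occurs with each context token."""
    counts = defaultdict(Counter)
    for src, dst, port, proto in flows:
        # Assumed context: the remaining fields of the same flow.
        for token in (f"dst={dst}", f"port={port}", f"proto={proto}"):
            counts[src][token] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


vectors = context_counts(flows)
# Two hosts that contact the same servers on the same ports share all contexts.
sim_same = cosine(vectors["192.168.0.10"], vectors["192.168.0.11"])
# A host with different communication partners shares only the protocol token.
sim_diff = cosine(vectors["192.168.0.10"], vectors["192.168.0.99"])
print(f"similar hosts: {sim_same:.3f}, dissimilar hosts: {sim_diff:.3f}")
```

Count vectors grow with the number of distinct context tokens and treat every token as unrelated to every other. A learned embedding in the spirit of Word2Vec would instead map each IP address to a dense, low-dimensional vector, which is what makes clustering and visualization practical on larger data sets.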