Cluster Analysis of Passive DNS Features for Identifying Domain Shadowing Infrastructure

2020 
Illicitly registered subdomains are often trusted by users because they typically inherit the trust and reputation of their parent (apex) domains. Attackers can leverage this trust to deliver malware, obtain credentials, or redirect users to exploit kits. Classifiers have been used successfully to identify shadowed domains; however, these approaches stop short of identifying associations among the domains that may correlate with a specific activity group or exploit kit. Identifying adversary infrastructure, and the tools common to that infrastructure, helps in developing possible mitigation strategies.

This paper investigates how cluster analysis can be used to distinguish shadowed from non-shadowed domains and to group shadowed domains into meaningful clusters for tracking adversaries. Features are engineered primarily from passive DNS (PDNS) data and used to build a probabilistic model that identifies sub-populations within the overall population of shadowed domains. Cluster labeling is then performed, using known exploit kit associations with the shadowed domains as a proxy for an activity group. Experimental results using PDNS information from three different exploit kit campaigns and a randomly selected set of non-shadowed domains show that non-shadowed domains generally cluster separately from shadowed domains. Furthermore, when only shadowed domains are considered, the clusters largely consist of domains associated with a single exploit kit, illustrating that features engineered from PDNS data are useful not only for identifying shadowed domains but also for grouping infrastructure that is likely related.
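The cluster labeling step described above can be illustrated with a minimal sketch. It assumes that some clustering model has already assigned each shadowed domain to a cluster, and it labels each cluster with its most common known exploit kit while reporting the fraction of members that share that label. The function name `label_clusters`, the placeholder kit names, and the toy assignments are hypothetical and are not taken from the paper; the paper's actual labeling procedure may differ.

```python
from collections import Counter, defaultdict


def label_clusters(cluster_ids, kit_labels):
    """Label each cluster with its majority exploit kit.

    Returns a dict mapping cluster id -> (majority kit, purity), where
    purity is the fraction of the cluster's domains carrying that kit label.
    """
    if len(cluster_ids) != len(kit_labels):
        raise ValueError("cluster_ids and kit_labels must have equal length")

    # Collect the known exploit kit association of every domain per cluster.
    members = defaultdict(list)
    for cluster_id, kit in zip(cluster_ids, kit_labels):
        members[cluster_id].append(kit)

    summary = {}
    for cluster_id, kits in members.items():
        majority_kit, count = Counter(kits).most_common(1)[0]
        summary[cluster_id] = (majority_kit, count / len(kits))
    return summary


# Hypothetical toy data: one entry per shadowed domain.
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2, 2]
kit_labels = ["kit_A", "kit_A", "kit_B", "kit_B", "kit_B",
              "kit_C", "kit_C", "kit_C", "kit_A"]

summary = label_clusters(cluster_ids, kit_labels)
for cluster_id in sorted(summary):
    kit, purity = summary[cluster_id]
    print(f"cluster {cluster_id}: {kit} (purity {purity:.2f})")
```

A purity close to 1.0 for most clusters would correspond to the observation that clusters largely consist of domains associated with a single exploit kit.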