Toward practical defense against traffic analysis attacks on encrypted DNS traffic

2023 
The primary goal of the DNS-over-HTTPS (DoH) protocol is to address users’ privacy concerns regarding on-path adversaries, including local ISPs, who observe DNS traffic to learn web browsing activities even when the web traffic itself is not observable. To achieve this goal, the DoH protocol channels DNS traffic between a DNS client and its local DNS resolver through an encrypted HTTPS tunnel. However, as shown in previous studies, adversaries can still infer users’ web browsing activities from encrypted DoH traffic through machine-learning-based traffic analysis (TA) attacks. These attacks rely on features of a website’s DNS footprint that encryption does not hide, including query/response lengths, query/response counts, and the delays between subsequent queries (timing), to identify which website is being visited. To defend against such TA attacks, existing DoH clients and resolvers pad DNS queries and responses with null bytes to specific block sizes before encrypting them, as recommended in RFC 8467. However, as shown by prior research and our current work, this padding alone is not effective in defeating TA attacks: even with this padding, attackers can still achieve over 95% accuracy.
In this paper, we propose a novel client-side obfuscation approach to defeat TA attacks on the DoH footprint of websites. Using a combination of (1) compression-aware padding, (2) fake query injection, and (3) random delaying of queries, our approach obfuscates query/response lengths, counts, and timings. Our approach does not require changes to existing protocols and can be deployed incrementally on client machines as a local proxy.
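To make the padding-only baseline concrete, the following is a minimal sketch of block-length padding in the style recommended by RFC 8467 (queries padded to a multiple of 128 octets, responses to a multiple of 468 octets). The function names and the example trace are hypothetical illustrations, not the paper's implementation, and the sketch ignores the EDNS(0) option overhead that a real padder must account for.

```python
# Block sizes recommended by RFC 8467 for the padding-only baseline.
QUERY_BLOCK = 128     # queries: pad to the next multiple of 128 octets
RESPONSE_BLOCK = 468  # responses: pad to the next multiple of 468 octets


def padded_length(length: int, block: int) -> int:
    """Round a message length up to the next multiple of the block size."""
    if length <= 0 or block <= 0:
        raise ValueError("length and block must be positive")
    return -(-length // block) * block  # ceiling division, then scale back up


def observable_trace(messages):
    """Map (direction, length) pairs to what an on-path observer would see.

    `messages` is a hypothetical list of ("query" | "response", length)
    tuples describing one page load's DNS footprint.
    """
    blocks = {"query": QUERY_BLOCK, "response": RESPONSE_BLOCK}
    return [(d, padded_length(n, blocks[d])) for d, n in messages]


# Hypothetical footprint: one query and two responses of different sizes.
trace = observable_trace([("query", 40), ("response", 300), ("response", 700)])
```

In this sketch the two responses still fall into different size buckets (468 and 936 octets), and the number of messages and their timing are left untouched, which is consistent with the observation above that padding alone leaves counts, timing, and coarse lengths available to a classifier.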
We evaluate the approach using a robust classifier with a comprehensive set of over 150 features derived from recently proposed traffic analysis attacks on encrypted DNS traffic, experimenting with the Cloudflare and Google resolvers and 200 target websites. We show that, while the TA attacker can achieve over 95.58% accuracy against the existing padding-only obfuscation, our obfuscation algorithm can degrade this accuracy to below 9% with reasonable overhead.