Software Dynamic Translation (SDT) is used for instrumentation, optimization, security, and many other purposes. A major source of SDT overhead is the execution of code to translate an indirect branch's target address into the translated destination block's address. This article discusses sources of Indirect Branch (IB) overhead in SDT systems and evaluates techniques for reducing it. Measurements using SPEC CPU2000 show that the appropriate choice and configuration of IB translation mechanisms can significantly reduce this overhead. Further, cross-architecture evaluation of these mechanisms reveals that the most efficient mechanism and its configuration can depend heavily on the underlying architecture implementation.
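To make the source of this overhead concrete, the sketch below shows the fast-path/slow-path structure typical of IB translation: a small, direct-mapped translation cache is consulted before falling back to the translator's full lookup. The table size, the hash, and all names (IbtcEntry, lookup_indirect_target, translate_slow) are illustrative assumptions on our part, not the specific mechanisms evaluated in the article.

    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <unordered_map>

    // Illustrative sketch only: a direct-mapped indirect-branch translation
    // cache (fast path) backed by the translator's full map (slow path).
    constexpr std::size_t kIbtcEntries = 512;   // power of two for cheap masking

    struct IbtcEntry {
        uintptr_t app_target;    // untranslated target (application address)
        uintptr_t cache_target;  // translated destination block in the code cache
    };

    static IbtcEntry g_ibtc[kIbtcEntries];      // zero-initialized (static storage)
    static std::unordered_map<uintptr_t, uintptr_t> g_full_map;  // stand-in slow path

    uintptr_t translate_slow(uintptr_t app_target) {
        // A real SDT system would translate the destination block on demand here.
        return g_full_map.at(app_target);
    }

    uintptr_t lookup_indirect_target(uintptr_t app_target) {
        IbtcEntry& e = g_ibtc[(app_target >> 2) & (kIbtcEntries - 1)];
        if (e.app_target == app_target)               // fast path: hit
            return e.cache_target;
        uintptr_t dest = translate_slow(app_target);  // slow path: full lookup
        e.app_target = app_target;                    // fill entry for next time
        e.cache_target = dest;
        return dest;
    }

    int main() {
        g_full_map[0x401000] = 0x70001000;            // fabricated addresses
        std::printf("%#lx\n", (unsigned long)lookup_indirect_target(0x401000));  // miss, then fill
        std::printf("%#lx\n", (unsigned long)lookup_indirect_target(0x401000));  // hit
    }

The work performed in lookup_indirect_target (or its inlined equivalent) runs on every indirect branch, which is why the choice of translation mechanism and its configuration has such a large effect on overall SDT overhead.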
Dynamic binary translators (DBT) have recently attracted much attention for embedded systems. The effective implementation of DBT in these systems is challenging due to tight constraints on memory and performance. A DBT uses a software-managed code cache to hold blocks of translated code. To minimize overhead, the code cache is usually made large enough that blocks are translated once and never discarded. However, an embedded system may lack the resources for a large code cache. This constraint leads to significant slowdowns due to the retranslation of blocks prematurely discarded from a small code cache. This paper addresses that problem and shows how to impose a tight size bound on the code cache without performance loss. We show that about 70% of the code cache is consumed by instructions that the DBT introduces for its own purposes. Based on this observation, we propose novel techniques that reduce the amount of space required by DBT-injected code, leaving more room for actual application code and lowering the miss ratio. We experimentally demonstrate that a bounded code cache can have performance on par with an unbounded one.
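As a concrete (and deliberately simplified) model of the space accounting involved, the sketch below treats each cached block as application bytes plus DBT-injected stub bytes; shrinking the stub share lets the same bounded capacity hold more application blocks and triggers fewer flushes and retranslations. The class, the flush-all-on-overflow policy, and the numbers are our own assumptions, not the techniques proposed in the paper.

    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <unordered_map>

    // Toy bounded code cache: a "block" is just a byte count, so the effect of
    // DBT-injected code (exit stubs, trampolines) on capacity is visible.
    class CodeCache {
    public:
        explicit CodeCache(std::size_t capacity) : capacity_(capacity) {}

        // Returns true if the block had to be (re)translated.
        bool fetch(uintptr_t app_pc, std::size_t app_bytes, std::size_t stub_bytes) {
            if (blocks_.count(app_pc)) return false;    // hit: reuse translation
            std::size_t need = app_bytes + stub_bytes;  // stubs are DBT overhead
            if (used_ + need > capacity_) flush();      // simplest policy: flush all
            blocks_[app_pc] = need;
            used_ += need;
            ++translations_;
            return true;
        }

        void flush() { blocks_.clear(); used_ = 0; ++flushes_; }
        std::size_t translations() const { return translations_; }
        std::size_t flushes() const { return flushes_; }

    private:
        std::size_t capacity_, used_ = 0, translations_ = 0, flushes_ = 0;
        std::unordered_map<uintptr_t, std::size_t> blocks_;
    };

    int main() {
        // Same block trace with two stub sizes: smaller DBT-injected stubs mean
        // fewer evictions of application code from the bounded cache.
        for (std::size_t stub : {64u, 16u}) {
            CodeCache cc(2048);
            for (int round = 0; round < 8; ++round)
                for (uintptr_t pc = 0x1000; pc < 0x1000 + 40 * 0x40; pc += 0x40)
                    cc.fetch(pc, /*app_bytes=*/32, stub);
            std::printf("stub=%zu bytes: %zu translations, %zu flushes\n",
                        stub, cc.translations(), cc.flushes());
        }
    }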
Research and development of techniques that detect or remediate malicious network activity require access to diverse, realistic, contemporary data sets containing labeled malicious connections. In the absence of such data, these techniques cannot be meaningfully trained, tested, and evaluated. Synthetically produced data containing fabricated or merged network traffic is of limited value, as it is easily distinguishable from real traffic by even simple machine learning (ML) algorithms. Real network data is preferable but, while ubiquitous, is generally both sensitive and lacking ground-truth labels, limiting its utility for ML research.
This paper presents a multi-faceted approach to generating a data set of labeled malicious connections embedded within anonymized network traffic collected from large production networks. Real-world malware is defanged and introduced into simulated, secured nodes within those networks to generate realistic traffic while maintaining sufficient isolation to protect real data and infrastructure. Network sensor data, including this embedded malware traffic, is collected at a network edge and anonymized for research use.
Network traffic was collected and produced in accordance with the aforementioned methods at two major educational institutions. The result is a highly realistic, long-term, multi-institution data set with embedded data labels, spanning over 1.5 trillion connections and over a petabyte of sensor log data. The value of this data set is demonstrated by its use in our artificial intelligence and machine learning (AI/ML) research program.
Current software development methodologies and practices, while enabling the production of large complex software systems, can have a serious negative impact on software quality. These negative impacts include excessive and unnecessary software complexity, higher probability of software vulnerabilities, diminished execution performance in both time and space, and the inability to easily and rapidly deploy even minor updates to deployed software, to name a few. Consequently, there has been growing interest in the capability to do late-stage software manipulation (i.e., at the binary level) to address these negative impacts. Unfortunately, highly robust, late-stage manipulation of arbitrary binaries is difficult due to complex implementation techniques and the corresponding software structures. Indeed, many binary rewriters have limitations that constrain their use. For example, to the best of our knowledge, no binary rewriters handle applications that include and use exception handlers, a feature found in programming languages such as C++, Ada, Common Lisp, and ML.
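For concreteness, the short C++ program below uses the exception-handling feature in question. Its correct behavior depends on compiler-emitted unwind tables, call-site ranges, and landing pads; a binary rewriter that relocates this code must also locate and fix up that metadata, which is the difficulty alluded to above. The program is an ordinary example written for illustration, not code from the paper.

    #include <cstdio>
    #include <stdexcept>

    // Correct behavior depends on EH metadata: the throw below must reach the
    // catch in main, and Logger's destructor must run during stack unwinding.
    struct Logger {
        ~Logger() { std::puts("Logger destroyed during unwinding"); }
    };

    int check(int x) {
        Logger log;                          // destroyed if an exception passes through
        if (x < 0)
            throw std::out_of_range("negative input");
        return x * 2;
    }

    int main() {
        try {
            std::printf("check(21) = %d\n", check(21));
            check(-1);                       // throws; the unwinder consults EH tables
        } catch (const std::exception& e) {
            std::printf("caught: %s\n", e.what());
        }
        return 0;
    }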
Modern computers have taken advantage of the instruction-level parallelism (ILP) available in programs with advances in both architecture and compiler design. Unfortunately, large amounts of ILP hardware and aggressive instruction scheduling techniques put great demands on a machine's register resources. With increasing ILP, it becomes difficult to maintain a single monolithic register bank and a high clock rate. To provide support for large amounts of ILP while retaining a high clock rate, registers can be partitioned among several different register banks. Each bank is directly accessible by only a subset of the functional units, with explicit inter-bank copies required to move data between banks. Therefore, a compiler must deal not only with achieving maximal parallelism via aggressive scheduling, but also with data placement to limit inter-bank copies. Our approach to code generation for ILP architectures with partitioned register resources provides flexibility by representing machine-dependent features as node and edge weights and by remaining independent of scheduling and register allocation methods. Experimentation with our framework has shown a degradation in execution performance of 10% on average when compared to an unrealizable monolithic-register-bank architecture with the same level of ILP.
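As a rough illustration of the weighted-graph representation (the greedy assignment heuristic below is our own stand-in, not the framework described above), values needing registers can be modeled as nodes, the estimated penalty of an inter-bank copy as an edge weight, and bank capacity as a placement constraint, so that placement keeps heavily weighted edges within a single bank.

    #include <cstdio>
    #include <vector>

    // Illustrative only: greedily place each value in the bank that minimizes
    // the inter-bank copy cost accumulated so far, subject to bank capacity.
    struct Edge { int a, b, copy_cost; };

    std::vector<int> assign_banks(int num_values, int num_banks, int bank_capacity,
                                  const std::vector<Edge>& edges) {
        std::vector<int> bank(num_values, -1);
        std::vector<int> used(num_banks, 0);
        for (int v = 0; v < num_values; ++v) {
            int best_bank = -1, best_cost = 1 << 30;
            for (int b = 0; b < num_banks; ++b) {
                if (used[b] >= bank_capacity) continue;    // bank is full
                int cost = 0;
                for (const Edge& e : edges) {
                    int other = (e.a == v) ? e.b : (e.b == v) ? e.a : -1;
                    if (other >= 0 && bank[other] != -1 && bank[other] != b)
                        cost += e.copy_cost;               // this placement forces a copy
                }
                if (cost < best_cost) { best_cost = cost; best_bank = b; }
            }
            bank[v] = best_bank;       // assumes total capacity covers all values
            ++used[best_bank];
        }
        return bank;
    }

    int main() {
        // Four values, two banks of capacity two; heavy edges (weight 5) pull
        // their endpoints into the same bank, leaving only cheap copies across banks.
        std::vector<Edge> edges = {{0, 1, 5}, {1, 2, 1}, {2, 3, 5}, {0, 3, 1}};
        std::vector<int> bank = assign_banks(4, 2, 2, edges);
        for (int v = 0; v < 4; ++v)
            std::printf("value %d -> bank %d\n", v, bank[v]);
    }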
Recently, coordinated attack campaigns have become more widespread on the Internet. In May 2017, WannaCry infected more than 300,000 machines in 150 countries in a few days and had a large impact on critical infrastructure. Existing threat sharing platforms cannot easily adapt to emerging attack patterns. At the same time, enterprises have started to adopt machine learning-based threat detection tools in their local networks. In this paper, we pose the question: \emph{What information can defenders share across multiple networks to help machine learning-based threat detection adapt to new coordinated attacks?} We propose three information sharing methods across two networks and show how the shared information can be used in a machine-learning network-traffic model to significantly improve its ability to detect evasive self-propagating malware.
Coverage-guided fuzzing is a powerful technique for finding security vulnerabilities and latent bugs in software. Such fuzzers usually store the coverage information in a small bitmap. Hash collision within this bitmap is a well-known issue and can reduce fuzzers' ability to discover potential bugs. Prior works noted that mitigating collisions by naïvely enlarging the hash space leads to an unacceptable runtime overhead. This paper describes BigMap, a two-level hashing scheme that enables using an arbitrarily large coverage bitmap with low overhead. The key observation is that the overhead stems from frequent operations performed on the full bitmap, although only a fraction of the map is actively used. BigMap condenses these scattered active regions onto a second bitmap and limits the operations to that condensed area. We implemented our approach on top of the popular fuzzer AFL and conducted experiments on 19 benchmarks from FuzzBench and OSS-Fuzz. The results indicate that BigMap does not suffer from increased runtime overhead even with large map sizes. Compared to AFL, BigMap achieved an average of 4.5x higher test case generation throughput for a 2MB map and 33.1x for an 8MB map. The throughput gain for the 2MB map increased further to 9.2x with parallel fuzzing sessions, indicating superior scalability of BigMap. More importantly, BigMap's compatibility with most coverage metrics, along with its efficiency on bigger maps, enabled exploring aggressive compositions of expensive coverage metrics and fuzzing algorithms, uncovering 33% more unique crashes. BigMap makes using large bitmaps practical and enables researchers to explore a wider design space of coverage metrics.
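The sketch below captures our reading of the two-level idea from this description (the structure and names are ours, not BigMap's or AFL's actual code): coverage hits land in a large primary map, while the indices actually touched during a run are appended to a compact active list, so post-run scanning and resetting cost scales with the active fraction rather than with the full map size.

    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Illustrative two-level coverage map: a large, mostly idle primary map plus
    // a condensed list of the slots that were actually hit during the run.
    class TwoLevelCoverage {
    public:
        explicit TwoLevelCoverage(std::size_t map_size) : primary_(map_size, 0) {}

        void record_edge(std::size_t edge_hash) {
            std::size_t idx = edge_hash % primary_.size();
            if (primary_[idx] == 0)              // first hit this run: remember the slot
                active_.push_back(idx);
            ++primary_[idx];
        }

        // Post-run processing and reset visit only the condensed active list.
        std::size_t count_new_coverage(std::vector<uint8_t>& global_seen) {
            std::size_t new_edges = 0;
            for (std::size_t idx : active_) {
                if (!global_seen[idx]) { global_seen[idx] = 1; ++new_edges; }
                primary_[idx] = 0;               // cheap reset: only touched slots
            }
            active_.clear();
            return new_edges;
        }

        std::size_t map_size() const { return primary_.size(); }

    private:
        std::vector<uint8_t> primary_;           // large, mostly idle
        std::vector<std::size_t> active_;        // condensed second level
    };

    int main() {
        TwoLevelCoverage cov(8u << 20);                      // e.g., an 8MB primary map
        std::vector<uint8_t> global_seen(cov.map_size(), 0);
        for (std::size_t h : {12345u, 99999u, 12345u})       // simulated edge hashes
            cov.record_edge(h);
        std::printf("new edges this run: %zu\n", cov.count_new_coverage(global_seen));
    }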
Randomizing software characteristics can help thwart cyberattacks by invalidating critical information about a target system that was previously known to an attacker.
N-variant systems protect software from attack by executing multiple variants of a single program in parallel, checking regularly that they are behaving consistently. The variants are designed to behave identically during normal operation and differently during an attack. When different behavior (divergence) is detected, N-variant systems self-heal by either rolling back to a safe state or restarting. Unfortunately, an attacker can create a denial-of-service (DoS) attack from a diverging input by using it to force an N-variant system into an endless diverge/restart cycle. This paper describes a defense, CRISPR-Inspired Program Resiliency (Crispy), that automatically protects N-variant systems from such DoS attacks. Crispy mitigates DoS attacks against N-variant systems using an automatic signature generation technique modeled on CRISPR/Cas, the bacterial adaptive immune system. Experiments on two webservers using exploits developed by an independent Red Team showed Crispy protected against 87.5% of DoS attacks with zero false positives. Overhead was minimal and varied according to the number of signatures maintained, which can be tailored to the threat model and performance requirements.
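The sketch below conveys the CRISPR-like workflow at a conceptual level only; the spacer length, extraction policy, and substring matching are arbitrary choices of ours and not Crispy's actual signature generation. The idea it illustrates is the one described above: on divergence, the monitor stores a short byte sequence ("spacer") drawn from the offending input, and later inputs containing a stored spacer are dropped before they reach the variants, breaking the diverge/restart cycle.

    #include <algorithm>
    #include <cstdio>
    #include <string>
    #include <vector>

    // Conceptual sketch of divergence-driven signature learning and filtering.
    class SignatureStore {
    public:
        // Called when the variants diverge on `input`: remember a spacer from it.
        void learn(const std::string& input, std::size_t spacer_len = 16) {
            spacers_.push_back(input.substr(0, std::min(spacer_len, input.size())));
        }

        // Called before forwarding a new input to the variants.
        bool should_drop(const std::string& input) const {
            for (const std::string& s : spacers_)
                if (!s.empty() && input.find(s) != std::string::npos)
                    return true;                 // matches a known diverging pattern
            return false;
        }

    private:
        std::vector<std::string> spacers_;       // analogous to CRISPR spacers
    };

    int main() {
        SignatureStore store;
        store.learn("EXPLOIT_PAYLOAD_AAAA that previously caused divergence");
        std::printf("drop replay: %d\n", (int)store.should_drop("EXPLOIT_PAYLOAD_AAAA again"));
        std::printf("drop benign: %d\n", (int)store.should_drop("GET /index.html"));
    }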
Modern cyber-physical systems, like automotive systems and aerial vehicles, are not built with cyber security in mind. Several techniques have been developed recently to overcome cyber-attacks on cyber-physical systems at both the software and the control levels. Adding such cyber security techniques to protect a system against malicious attacks, however, can incur runtime overheads that, in the case of autonomous systems, result in performance degradation and may lead the system into unsafe states. In this paper, we propose a framework for online control performance adaptation for secure and safe navigation of autonomous vehicles. Our approach leverages model predictive control (MPC) and knowledge about the system dynamics and the maximum performance degradation that cyber security techniques can impose at every time step to compute the control input that guarantees safe operation of the system at all times. We validate the proposed approach both with simulations and experiments for an unmanned ground vehicle (UGV) motion planning case study in a cluttered environment.
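One generic way to write down the kind of constraint-tightened MPC problem sketched here (the symbols below are standard MPC notation of our choosing, not the paper's formulation) is

    \begin{aligned}
    \min_{u_0,\dots,u_{N-1}} \quad & \sum_{k=0}^{N-1} \left( \|x_k - x_k^{\mathrm{ref}}\|_Q^2 + \|u_k\|_R^2 \right) \\
    \text{s.t.} \quad & x_{k+1} = f(x_k, u_k), \qquad x_0 = x(t), \\
    & u_k \in \mathcal{U}, \qquad x_k \in \mathcal{X}_{\mathrm{safe}} \ominus \mathcal{E}(\delta_{\max}),
    \end{aligned}

where $\delta_{\max}$ bounds the per-step performance degradation (for example, added computation delay) that the active cyber security techniques can impose, and $\mathcal{E}(\delta_{\max})$ is the corresponding error margin used to tighten the safe set $\mathcal{X}_{\mathrm{safe}}$, so that the computed control input keeps the vehicle safe even under that worst-case degradation.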