Dynamic binary translators (DBTs) have recently attracted much attention for embedded systems. Implementing a DBT effectively in these systems is challenging due to tight constraints on memory and performance. A DBT uses a software-managed code cache to hold blocks of translated code. To minimize overhead, the code cache is usually made large enough that blocks are translated once and never discarded. However, an embedded system may lack the resources for a large code cache, and the resulting retranslation of blocks prematurely discarded from a small code cache leads to significant slowdowns. This paper addresses the problem and shows how to impose a tight size bound on the code cache without performance loss. We show that about 70% of the code cache is consumed by instructions that the DBT introduces for its own purposes. Based on this observation, we propose novel techniques that reduce the space required by DBT-injected code, leaving more room for actual application code and improving the miss ratio. We experimentally demonstrate that a bounded code cache can perform on par with an unbounded one.
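As a rough illustration of the setting, the sketch below models a size-bounded, software-managed code cache in Python: on a miss the block is (re)translated and inserted, and older blocks are evicted once the bound is reached. The class name, the eviction policy, and the stand-in translate function are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (not the paper's system): a bounded, software-managed code
# cache for a DBT. Block translation, sizes, and the eviction policy are
# simplified placeholders.
from collections import OrderedDict

class CodeCache:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.blocks = OrderedDict()   # guest PC -> (translated code, size)

    def lookup(self, guest_pc, translate):
        if guest_pc in self.blocks:               # hit: reuse translated block
            return self.blocks[guest_pc][0]
        code = translate(guest_pc)                # miss: (re)translate the block
        size = len(code)
        while self.used + size > self.capacity and self.blocks:
            _, (_, old_size) = self.blocks.popitem(last=False)  # evict oldest block
            self.used -= old_size
        self.blocks[guest_pc] = (code, size)
        self.used += size
        return code

# Usage: a smaller cache forces premature eviction and costly retranslation.
cache = CodeCache(capacity_bytes=4096)
trace = [0x400000, 0x400040, 0x400000]            # hypothetical guest PCs
for pc in trace:
    cache.lookup(pc, translate=lambda pc: b"\x90" * 64)  # stand-in for real translation
```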
Software Dynamic Translation (SDT) is used for instrumentation, optimization, security, and many other purposes. A major source of SDT overhead is the execution of code to translate an indirect branch's target address into the address of the translated destination block. This article discusses sources of Indirect Branch (IB) overhead in SDT systems and evaluates techniques for reducing it. Measurements using SPEC CPU2000 show that the appropriate choice and configuration of IB translation mechanisms can significantly reduce the overhead. Further, cross-architecture evaluation of these mechanisms reveals that the most efficient implementation and configuration can be highly dependent on the underlying architecture implementation.
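To make the overhead source concrete, here is a minimal Python sketch of one common IB translation mechanism: a hash-indexed translation cache that maps an original branch target to its translated block and falls back to the translator on a miss. The table size, hash, and function names are assumptions for illustration; a real SDT emits this lookup as inline machine code.

```python
# Minimal sketch (assumed design, not a specific SDT's code): an indirect-branch
# translation cache mapping original target addresses to translated block addresses.
IBTC_SIZE = 512                                   # power of two; hypothetical size

ibtc = [None] * IBTC_SIZE                         # entries: (orig_target, translated_addr)

def ib_lookup(orig_target, slow_path_translate):
    idx = (orig_target >> 2) & (IBTC_SIZE - 1)    # simple hash of the target PC
    entry = ibtc[idx]
    if entry is not None and entry[0] == orig_target:
        return entry[1]                           # fast path: cached translation
    translated = slow_path_translate(orig_target) # slow path: translator resolves it
    ibtc[idx] = (orig_target, translated)
    return translated
```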
Research and development of techniques that detect or remediate malicious network activity require access to diverse, realistic, contemporary data sets containing labeled malicious connections. In the absence of such data, these techniques cannot be meaningfully trained, tested, and evaluated. Synthetically produced data containing fabricated or merged network traffic is of limited value, as it is easily distinguishable from real traffic by even simple machine learning (ML) algorithms. Real network data is preferable, but although it is ubiquitous, it is generally both sensitive and lacking in ground-truth labels, which limits its utility for ML research.
This paper presents a multi-faceted approach to generating a data set of labeled malicious connections embedded within anonymized network traffic collected from large production networks. Real-world malware is defanged and introduced to simulated, secured nodes within those networks to generate realistic traffic while maintaining sufficient isolation to protect real data and infrastructure. Network sensor data, including this embedded malware traffic, is collected at a network edge and anonymized for research use.
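A minimal sketch of the labeling-plus-anonymization step described above, assuming a keyed hash over IP addresses so that flows remain linkable without exposing real hosts; the field names, key handling, and malware-node list are hypothetical, not the paper's actual pipeline.

```python
# Minimal sketch under stated assumptions: consistent, keyed anonymization of
# connection records plus a label for traffic involving the embedded malware nodes.
import hmac, hashlib

SECRET_KEY = b"site-local-secret"                 # hypothetical; never shared with researchers
MALWARE_NODES = {"10.0.9.17", "10.0.9.18"}        # hypothetical instrumented hosts

def anonymize_ip(ip):
    # Same input always maps to the same token, so flows stay linkable after anonymization.
    return hmac.new(SECRET_KEY, ip.encode(), hashlib.sha256).hexdigest()[:16]

def anonymize_connection(conn):
    return {
        "src": anonymize_ip(conn["src"]),
        "dst": anonymize_ip(conn["dst"]),
        "bytes": conn["bytes"],
        "label": "malicious" if conn["src"] in MALWARE_NODES or conn["dst"] in MALWARE_NODES
                 else "unlabeled",
    }

print(anonymize_connection({"src": "10.0.9.17", "dst": "192.0.2.5", "bytes": 4210}))
```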
Network traffic was collected and produced in accordance with the aforementioned methods at two major educational institutions. The result is a highly realistic, long-term, multi-institution data set with embedded data labels, spanning over 1.5 trillion connections and over a petabyte of sensor log data. The usability of this data set is demonstrated by its utility to our artificial intelligence and machine learning (AI/ML) research program.
Column: Practices of PLDI. Hans Boehm, Jack Davidson, Kathleen Fisher, Cormac Flanagan, Jeremy Gibbons, Mary Hall, Graham Hutton, David Padua, Frank Tip, Jan Vitek, Philip Wadler. ACM SIGPLAN Notices, Volume 49, Issue 4S, April 2014, pp. 33–38. https://doi.org/10.1145/2641638.2641649. Published: 1 July 2014.
Building compilers that generate correct code is difficult. In this paper we present a compiler testing technique that closes the gap between actual compiler implementations and correct compilers. Using formal specifications of procedure calling conventions, we have built a target-sensitive test suite generator that builds test cases for a specific aspect of compiler code generators: the procedure calling sequence generator. By exercising compilers with these target-specific test suites, our automated testing tool has exposed bugs in every compiler tested, including compilers that have been in heavy use for many years. The detected bugs cause more than 14,000 test cases to fail.
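The idea can be illustrated with a small Python generator that emits C test cases stressing the procedure calling sequence: a callee with a mix of parameter types verifies that every argument arrives with its expected value. The parameter types, values, and generator structure are illustrative assumptions, not the authors' tool.

```python
# Minimal sketch of the idea (not the paper's generator): emit a C test case
# that exercises the procedure calling sequence with mixed parameter types.
PARAM_TYPES = ["char", "short", "int", "long", "double"]

def gen_test(n_params):
    params = [(PARAM_TYPES[i % len(PARAM_TYPES)], f"p{i}", i + 1) for i in range(n_params)]
    decl = ", ".join(f"{t} {name}" for t, name, _ in params)
    checks = "\n".join(f"  if ({name} != {val}) return 1;" for _, name, val in params)
    args = ", ".join(str(val) for _, _, val in params)
    return (f"int callee({decl}) {{\n{checks}\n  return 0;\n}}\n"
            f"int main(void) {{ return callee({args}); }}\n")

print(gen_test(7))   # compile with the compiler under test; a nonzero exit code signals a bug
```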
Current software development methodologies and practices, while enabling the production of large complex software systems, can have a serious negative impact on software quality. These negative impacts include excessive and unnecessary software complexity, a higher probability of software vulnerabilities, diminished execution performance in both time and space, and the inability to easily and rapidly deploy even minor updates to deployed software, to name a few. Consequently, there has been growing interest in the capability to perform late-stage (i.e., binary-level) software manipulation to address these negative impacts. Unfortunately, highly robust, late-stage manipulation of arbitrary binaries is difficult due to complex implementation techniques and the corresponding software structures. Indeed, many binary rewriters have limitations that constrain their use. For example, to the best of our knowledge, no binary rewriters handle applications that include and use exception handlers, a feature used in programming languages such as C++, Ada, Common Lisp, and ML.
This study presents the design and implementation of DES encryption in ECB (electronic codebook) mode under three different parallelization schemes. The first method we suggest is block based, where 64-bit blocks of text are distributed among the parallel processors. The second approach is pipeline based, where we parallelize each of the 16 internal rounds of DES. The third parallelization scheme is plaintext based, which divides the plaintext into equal partitions first and then assigns the 64-bit blocks of each partition to parallel processors. We obtained runtime performance results for each parallelization scheme. To observe whether the source language has an effect on overall performance, we ran our algorithm on text from five different languages: English, French, German, Spanish, and Turkish.
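A minimal sketch of the block-based scheme in Python: because ECB mode has no inter-block dependence, the 64-bit blocks can be encrypted independently and mapped across worker processes. The encrypt_block placeholder stands in for a real DES implementation, and the padding and key are illustrative only.

```python
# Minimal sketch of the block-based scheme: 8-byte (64-bit) plaintext blocks are
# encrypted independently and distributed across worker processes.
from concurrent.futures import ProcessPoolExecutor

BLOCK_SIZE = 8                                    # 64-bit DES block

def encrypt_block(block):
    key = 0x0123456789ABCDEF
    # Placeholder transformation standing in for DES; not an actual DES round.
    return (int.from_bytes(block, "big") ^ key).to_bytes(BLOCK_SIZE, "big")

def ecb_encrypt_parallel(plaintext, workers=4):
    pad = (-len(plaintext)) % BLOCK_SIZE
    plaintext += b"\x00" * pad                    # naive zero padding for the sketch
    blocks = [plaintext[i:i + BLOCK_SIZE] for i in range(0, len(plaintext), BLOCK_SIZE)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(encrypt_block, blocks))

if __name__ == "__main__":
    print(ecb_encrypt_parallel(b"parallel DES in ECB mode").hex())
```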
Peephole optimizers improve object code by replacing certain sequences of instructions with better sequences. This paper describes PO, a peephole optimizer that uses a symbolic machine description to simulate pairs of adjacent instructions, replacing them, where possible, with an equivalent single instruction. As a result of this organization, PO is machine independent and can be described formally and concisely: when PO is finished, no instruction, and no pair of adjacent instructions, can be replaced with a cheaper single instruction that has the same effect. This thoroughness allows PO to relieve code generators of much case analysis; for example, they might produce only load/add-register sequences and rely on PO to, where possible, discard them in favor of add-memory, add-immediate, or increment instructions. Experiments indicate that naive code generators can give good code if used with PO.
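The core replacement loop can be sketched as follows; this is only an illustration of the pairwise-replacement idea, not PO's machine-description-driven simulation, and the instruction encodings and rules are invented for the example.

```python
# Minimal sketch of pairwise peephole replacement: scan adjacent instruction
# pairs and replace known patterns, e.g. a load/add-register pair with a single
# add-memory instruction. A real optimizer like PO would also check that the
# loaded register is dead afterwards; that analysis is omitted here.
def try_combine(a, b):
    # Each rule checks operand compatibility before combining; rules are illustrative.
    if a[0] == "load" and b[0] == "add_reg" and b[2] == a[1]:
        return ("add_mem", b[1], a[2])            # add r, [mem]
    if a[0] == "load_const" and b[0] == "add_reg" and b[2] == a[1]:
        return ("add_imm", b[1], a[2])            # add r, #imm
    return None

def peephole(instrs):
    changed = True
    while changed:                                # rescan until no adjacent pair matches
        changed, out, i = False, [], 0
        while i < len(instrs):
            combined = try_combine(instrs[i], instrs[i + 1]) if i + 1 < len(instrs) else None
            if combined is not None:
                out.append(combined)
                i += 2
                changed = True
            else:
                out.append(instrs[i])
                i += 1
        instrs = out
    return instrs

# Naive code: load r1 <- [x]; add r2 <- r2 + r1  becomes  add_mem r2, [x]
print(peephole([("load", "r1", "[x]"), ("add_reg", "r2", "r1")]))
```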
The Communications Web site, http://cacm.acm.org, features more than a dozen bloggers in the BLOG@CACM community. In each issue of Communications, we'll publish selected posts or excerpts.
Recently, coordinated attack campaigns have become more widespread on the Internet. In May 2017, WannaCry infected more than 300,000 machines in 150 countries in a few days and had a large impact on critical infrastructure. Existing threat sharing platforms cannot easily adapt to emerging attack patterns. At the same time, enterprises have started to adopt machine learning-based threat detection tools in their local networks. In this paper, we pose the question: What information can defenders share across multiple networks to help machine learning-based threat detection adapt to new coordinated attacks? We propose three information sharing methods across two networks, and show how the shared information can be used in a machine-learning network-traffic model to significantly improve its ability to detect evasive self-propagating malware.
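As a purely illustrative example of cross-network sharing (the paper's three methods are not reproduced here), one network could share coarse indicators derived from connections its detector flagged, and a second network could fold an indicator-match signal into its own traffic features; all field names and attributes below are hypothetical.

```python
# Minimal sketch of one plausible sharing scheme: network A shares summary
# indicators from connections its detector flagged, and network B adds an
# "indicator match" feature to its own traffic model's inputs.
def extract_indicators(flagged_connections):
    # Share only coarse, shareable attributes, not raw packet data.
    return {(c["dst_port"], c["protocol"]) for c in flagged_connections}

def featurize(conn, shared_indicators):
    return [
        conn["bytes_out"],
        conn["duration"],
        1 if (conn["dst_port"], conn["protocol"]) in shared_indicators else 0,
    ]

# Network A shares indicators derived from its local detections ...
shared = extract_indicators([{"dst_port": 445, "protocol": "tcp"}])
# ... and network B uses them as an extra feature for its ML model.
print(featurize({"bytes_out": 120, "duration": 0.4, "dst_port": 445, "protocol": "tcp"}, shared))
```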