Computational models of cognition enable a better understanding of the human brain and behavior, of psychiatric and neurological illnesses, and of clinical interventions to treat them, and they offer a path toward human-like artificial intelligence. However, cognitive models are laborious to develop, requiring the composition of many types of computational tasks, and they suffer from poor performance because they are generally written in high-level languages like Python. In this work, we present Distill, a domain-specific compilation tool that accelerates cognitive models while continuing to let cognitive scientists develop their models in flexible high-level languages. Distill uses domain-specific knowledge to compile Python-based cognitive models into LLVM IR, carefully stripping away features like dynamic typing and memory management that add performance overheads but are not necessary for the models' underlying computation. The net effect is an average 27× performance improvement in model execution over state-of-the-art techniques based on Pyston and PyPy. Distill also repurposes classical compiler data flow analyses to reveal properties about data flow in cognitive models that are useful to cognitive scientists. Distill is publicly available, is integrated into the PsyNeuLink cognitive modeling environment, and is already being used by researchers in the brain sciences.
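To make the lowering concrete, here is a minimal sketch (our own illustration using llvmlite, not Distill's actual code generator) of how a simple linear transfer computation from a Python-level model can be emitted as statically typed LLVM IR, with no dynamic typing or reference counting involved; the function name and signature are hypothetical.

```python
# Minimal sketch (not Distill's code generator): emit statically typed LLVM IR
# for a simple linear transfer computation, y = slope * x + intercept.
from llvmlite import ir

module = ir.Module(name="cognitive_model_sketch")
double = ir.DoubleType()

# double linear_transfer(double x, double slope, double intercept)
fnty = ir.FunctionType(double, (double, double, double))
fn = ir.Function(module, fnty, name="linear_transfer")
x, slope, intercept = fn.args

builder = ir.IRBuilder(fn.append_basic_block(name="entry"))
scaled = builder.fmul(x, slope, name="scaled")
result = builder.fadd(scaled, intercept, name="result")
builder.ret(result)

print(module)  # textual LLVM IR that standard LLVM passes can analyze and optimize
```

Because the result is ordinary LLVM IR, it can be JIT-compiled and handed to existing LLVM optimization and analysis passes.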
This paper presents the design and implementation of Distill, a domain-specific compilation tool based on LLVM that accelerates cognitive models. Cognitive models explain the processes underlying cognitive function and offer a path to human-like artificial intelligence. However, cognitive modeling is laborious, requiring the composition of many types of computational tasks, and suffers from poor performance because it relies on high-level languages like Python. To retain the flexibility of Python while achieving high performance, Distill uses domain-specific knowledge to compile Python-based cognitive models into LLVM IR, carefully stripping away features like dynamic typing and memory management that add overheads beyond the model's actual computation. As we show, this permits significantly faster model execution. We also show that the generated code enables classical compiler data flow analysis passes to reveal properties about data flow in cognitive models that are useful to cognitive scientists. Distill is publicly available, is being used by researchers in cognitive science, and has led to patches that are currently being evaluated for integration into mainline LLVM.
Memory prefetching improves performance across many systems layers. However, achieving high prefetch accuracy with low overhead is challenging, as memory hierarchies and application memory access patterns become more complicated. Furthermore, a prefetcher's ability to adapt to new access patterns as they emerge is becoming more crucial than ever. Recent work has demonstrated the use of deep learning techniques to improve prefetching accuracy, albeit with impractical compute and storage overheads. This paper suggests taking inspiration from the learning mechanisms and memory architecture of the human brain---specifically, the hippocampus and neocortex---to build resource-efficient, accurate, and adaptable prefetchers.
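For context, a common formulation in this line of learned prefetching (our illustration, not this paper's proposed design) treats the problem as sequence prediction over memory address deltas: a recurrent model consumes a short history of deltas and predicts the next one, from which a prefetch address is derived.

```python
# Illustrative sketch (not this paper's design): prefetching as next-delta
# prediction. Address deltas are mapped to a small vocabulary of ids, and an
# LSTM predicts the id of the next delta from a history of recent deltas.
import torch
import torch.nn as nn

class DeltaLSTMPrefetcher(nn.Module):
    def __init__(self, num_delta_ids, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(num_delta_ids, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_delta_ids)

    def forward(self, delta_ids):            # (batch, seq_len) of delta ids
        out, _ = self.lstm(self.embed(delta_ids))
        return self.head(out[:, -1])         # logits over the next delta id

model = DeltaLSTMPrefetcher(num_delta_ids=1024)
history = torch.randint(0, 1024, (1, 16))    # hypothetical 16-delta history
next_delta_id = model(history).argmax(dim=-1)
```

Even a small model like this is costly to evaluate and update at memory-access timescales, which is what motivates the search for more resource-efficient, adaptable designs.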
Continual learning on sequential data is critical for many machine learning (ML) deployments. Unfortunately, LSTM networks, which are commonly used to learn on sequential data, suffer from catastrophic forgetting and are limited in their ability to learn multiple tasks continually. We discover that catastrophic forgetting in LSTM networks can be overcome in two novel and readily implementable ways -- separating the LSTM memory either for each task or for each target label. Our approach eschews the need for explicit regularization, hypernetworks, and other complex methods. We quantify the benefits of our approach on recently proposed LSTM networks for computer memory access prefetching, an important sequential learning problem in ML-based computer system optimization. Compared to state-of-the-art weight regularization methods for mitigating catastrophic forgetting, our approach is simple, effective, and enables faster learning. We also show that our proposal enables the use of small, non-regularized LSTM networks for complex natural language processing in the offline learning scenario, which was previously considered difficult.
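As a minimal sketch of one of the two separations described above (our own illustration, not the paper's implementation), the model below keeps a separate LSTM hidden/cell state per task while sharing weights, so training on one task does not overwrite the recurrent memory used by another; the class name and task-keying scheme are hypothetical.

```python
# Minimal sketch (illustrative): per-task LSTM memory. Each task id gets its own
# (h, c) state; weights are shared, but one task's updates do not clobber the
# recurrent memory accumulated for another task.
import torch
import torch.nn as nn

class PerTaskLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)
        self.states = {}  # task_id -> (h, c), detached between updates

    def forward(self, x, task_id):
        if task_id not in self.states:
            h = x.new_zeros(1, x.size(0), self.hidden_size)
            c = x.new_zeros(1, x.size(0), self.hidden_size)
        else:
            h, c = self.states[task_id]
        out, (h, c) = self.lstm(x, (h, c))
        self.states[task_id] = (h.detach(), c.detach())
        return self.head(out[:, -1])

# Usage: call model(batch, task_id=0) for one task and model(batch, task_id=1)
# for another; the per-label variant would key self.states by target label instead.
```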