OSDI 2018
Data
Paper Name Author(s)
Noria: Dynamic, Partially-stateful Data-flow For High-performance Web Applications

the authors introduce partially-stateful data-flow, a new streaming data-flow model that supports eviction and reconstruction of data-flow state on demand.

Jon Gjengset, Malte Schwarzkopf, Jonathan Behrens, Lara Araujo, Martin Ek, Eddie Kohler, M. Frans Kaashoek, Robert Morris
Deconstructing RDMA-enabled Transaction Processing: Hybrid Is Better!

There is currently an active debate on which RDMA primitive (i.e., one-sided or two-sided) is optimal for distributed transactions.In this paper, the authors perform a systematic comparison be- tween different RDMA primitives with a combination of various optimizations using representative OLTP workloads. Then we investigate the implementation of optimistic concurrency control (OCC) by comparing different RDMA primitives using a phase-by-phase approach with various transactions from TPC-C, SmallBank, and TPC-E. Our results show that no single primitive (one-sided or two-sided) wins over the other on all phases.We further conduct an end-to-end comparison of prior designs on the same codebase and find none of them is optimal.This number outperforms the pure one-sided and two- sided systems by up to 1.89X and 2.96X for TPC-C with over 49% and 65% latency reduction.

Xingda Wei, Zhiyuan Dong, Rong Chen, Haibo Chen
Dynamic Query Re-Planning Using QOOP

Modern data processing clusters are highly dynamic . To improve job performance, recent works have focused on optimizing the cluster scheduler and the jobs' query planner with a focus on picking the right query execution plan (QEP) .This paper argues for dynamic query re-planning, wherein we re-evaluate and re-plan a job's QEP during its execution.

Kshiteej Mahajan, Mosharaf Chowdhury, Aditya Akella, Shuchi Chawla
Focus: Querying Large Video Datasets With Low Latency And Low Cost

the authors present Focus, a system providing both low-cost and low-latency querying on large video datasets. Large volumes of videos are continuously recorded from cameras deployed for traffic control and surveillance with the goal of answering “after the fact” queries: identify video frames with objects of certain classes (cars, bags) from many days of recorded video.At ingest time (on live videos), Focus uses cheap convolutional network classifiers (CNNs) to construct an approximate index of all possible object classes in each frame (to handle queries for any class in the future).

Kevin Hsieh, Ganesh Ananthanarayanan, Peter Bodik, Shivaram Venkataraman, Paramvir Bahl, Matthai Philipose, Phillip B. Gibbons, Onur Mutlu
Debugging
Paper Name Author(s)
Orca: Differential Bug Localization In Large-Scale Services

Today, we depend on large-scale services for basic operations such as email.These services are extremely dynamic as developers continously commit code and introduce new features, fixes and new bugs.the authors have built Orca, a customized code search-engine that implements differential bug localization. We have built Orca, a customized code search-engine that implements differential bug localization.

Ranjita Bhagwan, Rahul Kumar, Chandra Sekhar Maddila, Adithya Abraham Philip
Differential Energy Profiling: Energy Optimization Via Diffing Similar Apps Abhilash Jindal, Y. Charlie Hu
WPerf: Generic Off-CPU Analysis To Identify Bottleneck Waiting Events

This paper tries to identify waiting events that limit the maximal throughput of a multi-threaded application.

Fang Zhou, Yifan Gan, Sixiang Ma, Yang Wang
Sledgehammer: Cluster-Fueled Debugging

Current debugging tools force developers to choose between power and interactivity.In this paper, he authors propose cluster-fueled debugging, which provides interactivity for powerful debugging tools by parallelizing their work across many cores in a cluster. Then, they re-run applications with this instrumentation.

Andrew Quinn, Jason Flinn, Michael J Cafarella
File Systems
Paper Name Author(s)
Pocket: Ephemeral Storage For Serverless Analytics

Serverless computing is becoming increasingly popular.However exchanging intermediate data between execution stages in an analytics job is a key challenge as direct communication between serverless tasks is difficult.the authors present Pocket, an elastic, distributed data store that automatically scales to provide applications with desired performance at low cost. We present Pocket, an elastic, distributed data store that automatically scales to provide applications with desired performance at low cost.

Ana Klimovic, Yawen Wang, Christos Kozyrakis, Patrick Stuedi, Animesh Trivedi, Jonas Pfefferle, Christos Kozyrakis
Sharding The Shards: Managing Datastore Locality At Scale With Akkio

Akkio is a locality management service layered between client applications and distributed datastore systems. It determines how and when to migrate data to reduce response times and resource usage.

Muthukaruppan Annamalai, Kaushik Ravichandran, Harish Srinivas, Igor Zinkovsky, Luning Pan, Tony Savor, David Nagle, Michael Stumm
Write-Optimized And High-Performance Hashing Index Scheme For Persistent Memory

Non-volatile memory (NVM) as persistent memory is expected to substitute or complement DRAM in memory hierarchy. However, due to the requirement of data consistency and hardware limitations of NVM, traditional indexing techniques originally designed for DRAM become inefficient in persistent memory. this paper proposes a write-optimized and high-performance hashing index scheme, called level hashing, with low-overhead consistency guarantee and cost-efficient resizing. Non-volatile memory (NVM) as persistent memory is expected to substitute or complement DRAM in memory hierarchy, due to the strengths of non-volatility, high density, and near-zero standby power.Level hashing provides a sharing-based two-level hash table, which achieves a constant-scale search/insertion/deletion/update time complexity in the worst case and rarely incurs extra NVM writes.Experimental results demonstrate that level hashing achieves $1.4 imes$$-$$3.0 imes$ speedup for insertions, $1.2 imes$$-$$2.1 imes$ speedup for updates, and over $4.3 imes$ speedup for resizing, while maintaining high search and deletion performance, compared with state-of-the-art hashing schemes.

Pengfei Zuo, Yu Hua, Jie Wu
FlashShare: Punching Through Server Storage Stack From Kernel To Firmware For Ultra-Low Latency SSDs

In this paper, the authors propose FlashShare to assist ULL SSDs to satisfy different levels of I/O service latency requirements for different co-running applications. We also enhance the NVMe controller and cache layer at the SSD firmware-level, by dynamically partitioning DRAM in the ULL SSD and adjusting its caching strategies to meet diverse user requirements.At the kernel-level, we extend the data structures of the storage stack to pass attributes of (co-running) applications through all the layers of the underlying storage stack spanning from the OS kernel to the SSD firmware.The evaluation results demonstrate that FlashShare can shorten the average and 99th-percentile turnaround response times of co-running applications by 22% and 31%, respectively.

Jie Zhang, Miryeong Kwon, Donghyun Gouk, Changlim Lee, Mohammad Alian, Myoungjun Chun, Mahmut Kandemir, Nam Sung Kim, Jihong Kim, Myoungsoo Jung
Graphs and Data
Paper Name Author(s)
ASAP: Fast, Approximate Pattern Mining At Scale

This paper presents ASAP, a fast, approximate computation engine for graph pattern mining.To enable the users to navigate the trade-off between the result accuracy and latency, the authors propose a novel approach to build the Error-Latency Profile (ELP) for a given computation. This paper presents ASAP, a fast, approximate computation engine for graph pattern mining.To enable the users to navigate the trade-off between the result accuracy and latency, we propose a novel approach to build the Error-Latency Profile (ELP) for a given computation.

Anand Padmanabha Iyer, Zaoxing Liu, Xin Jin, Shivaram Venkataraman, Vladimir Braverman, Ion Stoica
RStream: Marrying Relational Algebra With Streaming For Efficient Graph Mining On A Single Machine

Graph mining is an important category of graph algorithms that aim to discover structural patterns such as cliques and motifs in a graph.the authors built RStream, the first single-machine, out-of-core mining system that leverages disk support to store intermediate data. Graph mining is an important category of graph algorithms that aim to discover structural patterns such as cliques and motifs in a graph.

Kai Wang, Zhiqiang Zuo, John Thorpe, Tim Nguyen, Guoqing Harry Xu
Three Steps Is All You Need: Fast, Accurate, Automatic Scaling Decisions For Distributed Streaming Dataflows

Streaming computations are by nature long-running, and their workloads can change in unpredictable ways.the authors present DS2, an automatic scaling controller which combines a general performance model of streaming dataflows with lightweight instrumentation to estimate the true processing and output rates of individual dataflow operators

Vasiliki Kalavri, John Liagouris, Moritz Hoffmann, Desislava Dimitrova, Matthew Forshaw, Timothy Roscoe
Flare: Optimizing Apache Spark For Scale-Up Architectures And Medium-Size Data

In recent years, Apache Spark has become the de facto standard for big data processing.the authors present Flare, an accelerator module for Spark that delivers order of magnitude speedups on scale-up architectures for a large class of applications. Inspired by query compilation techniques from main-memory database systems, Flare incorporates a code generation strategy designed to match the unique aspects of Spark and the characteristics of scale-up architectures, in particular processing data directly from optimized file formats and combining SQL-style relational processing with external frameworks such as TensorFlow.

Gregory Essertel, Ruby Tahboub, James Decker, Kevin Brown, Kunle Olukotun, Tiark Rompf
Machine Learning
Paper Name Author(s)
Ray: A Distributed Framework For Emerging AI Applications

The next generation of AI applications will continuously interact with the environment and learn from these interactions.These applications impose demanding systems requirements.In this paper, the authors consider these requirements and present Ray — a distributed system to address them. Ray implements a unified interface that can express both task-parallel and actor-based computations, supported by a single dynamic execution engine.

Philipp Moritz, Robert Nishihara, Michael I. Jordan, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul
TVM: An Automated End-to-End Optimizing Compiler For Deep Learning

There is an increasing need to bring machine learning to a wide diversity of hardware devices.the authors propose TVM, a compiler that exposes graph-level and operator-level optimizations to provide performance portability to deep learning workloads across diverse hardware back-ends. Experimental results show that TVM delivers performance across hardware back-ends that are competitive with state-of-the-art, hand-tuned libraries for low-power CPU, mobile GPU, and server-class GPUs.

Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Gandiva: Introspective Cluster Scheduling For Deep Learning

the authors introduce Gandiva, a new cluster scheduling framework that utilizes domain-specific knowledge to improve latency and efficiency of training deep learning models in a GPU cluster.

Wencong Xiao, Romil Bhardwaj, Ramachandran Ramjee, Muthian Sivathanu, Nipun Kwatra, Zhenhua Han, Pratyush Patel, Xuan Peng, Hanyu Zhao, Quanlu Zhang, Fan Yang, Lidong Zhou
PRETZEL: Opening The Black Box Of Machine Learning Prediction Serving Systems

Machine Learning models are often composed of pipelines of transformations. prediction serving has different requirements. the authors present PRETZEL, a prediction serving system introducing a novel white box architecture enabling both end-to-end and multi-model optimizations. Using production-like model pipelines, our experiments show that PRETZEL is able to introduce performance improvements over different dimensions; compared to state-of-the-art approaches PRETZEL is on average able to reduce 99th percentile latency by 5.5× while reducing memory footprint by 25×, and increasing throughput by 4.7×.

Yunseong Lee, Alberto Scolari, Byung-Gon Chun, Marco Domenico Santambrogio, Markus Weimer, Matteo Interlandi
Networking
Paper Name Author(s)
Splinter: Bare-Metal Extensions For Multi-Tenant Low-Latency Storage

the authors present Splinter, a low-latency key-value store that clients extend by pushing code to it. Splinter is designed for modern multi-tenant data centers; it allows mutually distrusting tenants to write their own fine-grained extensions and push them to the store at runtime. The core of Splinter’s design relies on type- and memory-safe extension code to avoid conventional hardware isolation costs.

Chinmay Kulkarni, Sara Moore, Mazhar Naqvi, Tian Zhang, Robert Ricci, Ryan Stutsman
Neural Adaptive Content-aware Internet Video Delivery

the quality of existing video delivery critically depends on the bandwidth resource. the authors present a new video delivery framework that utilizes client computation and recent advances in deep neural networks (DNNs) to reduce the dependency for delivering high-quality video. We design a practical system that addresses several challenges, such as client heterogeneity, interaction with bitrate adaptation, and DNN transfer, in enabling the idea.Our evaluation using 3G and broadband network traces shows the proposed system outperforms the current state of the art, enhancing the average QoE by 43.08% using the same bandwidth budget or saving 17.13% of bandwidth while providing the same user QoE.

Hyunho Yeo, Youngmok Jung, Jaehong Kim, Jinwoo Shin, Dongsu Han
Floem: Programming System For NIC-Accelerated Network Applications

Developing server applications that offload computation and data to a NIC accelerator is laborious.the authors propose programming abstractions for NIC-accelerated applications, balancing the ease of developing a correct application and the ability to refactor it to explore different design choices. The design space includes semantic changes as well as variations on parallelization and program-to-resource mapping.We develop Floem, a programming system that provides these abstractions, and show that the system helps explore a space of NIC-offloading designs for real-world applications, including a key-value store and a distributed real-time data analytics system, improving throughput by 1.3--3.6x.

Phitchaya Mangpo Phothilimthana, Ming Liu, Antoine Kaufmann, Simon Peter, Rastislav Bodik, Thomas Anderson
Operating Systems
Paper Name Author(s)
LegoOS: A Disseminated, Distributed OS For Hardware Resource Disaggregation

The monolithic server model where a server is the unit of deployment, operation, and failure is meeting its limits in the face of several recent hardware and application trends.Using the splitkernel model, the authors built LegoOS, a new OS designed for hardware resource disaggregation.the authors implemented LegoOS from scratch and evaluated it by emulating hardware components using commodity servers. The monolithic server model where a server is the unit of deployment, operation, and failure is meeting its limits in the face of several recent hardware and application trends.Using the splitkernel model, the authors built LegoOS, a new OS designed for hardware resource disaggregation.the authors implemented LegoOS from scratch and evaluated it by emulating hardware components using commodity servers.

Yizhou Shan, Yutong Huang, Yilun Chen, Yiying Zhang
The Benefits And Costs Of Writing A POSIX Kernel In A High-level Language

This paper presents an evaluation of the use of a high-level language (HLL) with garbage collection to implement a monolithic POSIX-style kernel.

Cody Cutler, M. Frans Kaashoek, Robert T. Morris
Sharing, Protection And Compatibility For Reconfigurable Fabric With AmorphOS

At the same time, technology advancements such as 3D stacking, through-silicon vias (TSVs), and FinFETs have greatly increased FPGA density.Unlike software, where resource configurations are limited to simple dimensions of compute, memory, and I/O, FPGAs provide a multi-dimensional sea of resources known as the FPGA fabric: logic cells, floating point units, memories, and I/O can all be wired together, leading to spatial constraints on FPGA resources.Current stacks either support only a single application or statically partition the FPGA fabric into fixed-size slots.To build Morphlets, developers provide a parameterized hardware design that interfaces with AmorphOS, along with a mesh, which specifies external resource requirements.AmorphOS multiplexes Morphlets on the FPGA in both space and time to maximize FPGA utilization.We implement AmorphOS on Amazon F1 and Microsoft Catapult. We show that protected sharing and dynamic scalability support on workloads such as DNN inference and blockchain mining improves aggregate throughput up to 4x and 23x on Catapult and F1 respectively.

Ahmed Khawaja, Rohith Prakash, Michael Wei, Joshua Landgraf, Christopher J. Rossbach
Adaptive Dynamic Checkpointing For Safe Efficient Intermittent Computing

This paper presents Chinchilla, a compiler and runtime system that allows running unmodified C code efficiently on an energy-harvesting device with little additional programmer effort and no additional hardware support.C

Kiwan Maeng, Brandon Lucia
Reliability
Paper Name Author(s)
The FuzzyLog: A Partially Ordered Shared Log

the authors present Dapple, a distributed implementation of the FuzzyLog abstraction that stores the partial order compactly and supports efficient appends / playback via a new ordering protocol. The FuzzyLog is a partially ordered shared log abstraction.Our evaluation shows that these applications are compact, fast, and flexible: they retain the simplicity (100s of lines of code) and strong semantics (durability and failure atomicity) of a shared log design while exploiting the partial order of the Fuzzy-Log for linear scalability, flexible consistency guarantees (e.g., causal+ consistency), and network partition tolerance.

Joshua Lockerman, Jose Faleiro, Juno Kim, Soham Sankaran, Daniel J Abadi, James Aspnes, Siddhartha Sen, Mahesh Balakrishnan
Maelstrom: Mitigating Datacenter-level Disasters By Draining Interdependent Traffic Safely And Efficiently

the authors present Maelstrom, a new system for mitigating and recovering from datacenter-level disasters.Maelstrom exploits parallelism to drain and restore independent traffic sources efficiently.

Kaushik Veeraraghavan, Yee Jiun Song, Justin Meza, Tianyin Xu
Fault-Tolerance, Fast And Slow: Exploiting Failure Asynchrony In Distributed Systems

the authors introduce situation-aware updates and crash recovery (SAUCR), a new approach to performing replicated data updates in a distributed system.

Ramnatthan Alagappan, Aishwarya Ganesan, Jing Liu, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau
Taming Performance Variability

The performance of compute hardware varies: software run repeatedly on the same server (or a different server with supposedly identical parts) can produce performance results that differ with each execution.the authors conducted a large-scale study capturing nearly 900,000 data points from 835 servers.the authors examine this data from two perspectives: that of a service provider wishing to offer a consistent environment, and that of a systems researcher who must understand how variability impacts experimental results. The full dataset and our analysis tools are publicly available, and we have built a system to interactively explore the data and make recommendations for experiment parameters based on statistical analysis of historical data.

Aleksander Maricq and Dmitry Duplyakin, Ivo Jimenez, Carlos Maltzahn, Ryan Stutsman, Robert Ricci
Scheduling
Paper Name Author(s)
Arachne: Core-Aware Thread Management

Arachne is a new user-level implementation of threads that provides both low latency and high throughput for applications with extremely short-lived threads (only a few microseconds). A central core arbiter allocates cores between applications.

Henry Qin, Qian Li, Jacqueline Speiser, Peter Kraft, John Ousterhout
Principled Schedulability Analysis For Distributed Storage Systems Using Thread Architecture Models

In this paper, the authors present an approach to systematically examine the schedulability of distributed storage systems, identify their scheduling problems, and enable effective scheduling in these systems. We use Thread Architecture Models (TAMs) to describe the behavior and interactions of different threads in a system, and show both how to construct TAMs for existing systems and utilize TAMs to identify critical scheduling problems.

Suli Yang, Jing Liu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
μTune: Auto-tuned Threading For OLDI Microservices

the authors investigate how threading design critically impacts microservice tail latency by developing a taxonomy of threading models—a structured understanding of the implications of how microservices manage concurrency and interact with RPC interfaces under wide-ranging loads. We develop μTune, a system that has two features: (1) a novel framework that abstracts threading model implementation from application code, and (2) an automatic load adaptation system that curtails microservice tail latency by exploiting inherent latency trade-offs revealed in our taxonomy to transition among threading models.

Akshitha Sriraman, Thomas F. Wenisch
RobinHood: Tail Latency Aware Caching—Dynamic Reallocation From Cache-Rich To Cache-Poor

the authors propose a novel solution for maintaining low request tail latency: repurpose existing caches to mitigate the effects of backend latency variability, rather than just caching popular data. their solution, RobinHood, dynamically reallocates cache resources from the cache-rich to the cache-poor . We evaluate RobinHood with production traces on a 50-server cluster with 20 different backend systems.In the presence of load spikes, RobinHood meets a 150ms P99 goal 99.7% of the time, whereas the next best policy meets this goal only 70% of the time.

Daniel S. Berger, Benjamin Berg, Timothy Zhu, Siddhartha Sen, Mor Harchol-Balter
Security
Paper Name Author(s)
Graviton: Trusted Execution Environments On GPUs

the authors propose Graviton, an architecture for supporting trusted execution environments on GPUs. We also propose extensions to the CUDA runtime for securely copying data and executing kernels on the GPU.Our evaluation shows that overheads are low(17-33%)with encryption and decryption of traffic to and from the GPU being the main source of overheads.

Stavros Volos, Kapil Vaswani, Rodrigo Bruno
ZebRAM: Comprehensive And Compatible Software Protection Against Rowhammer Attacks

The Rowhammer vulnerability common to many modern DRAM chips allows attackers to trigger bit flips in a row of memory cells by accessing the adjacent rows at high frequencies.This paper introduces ZebRAM, a novel and comprehensive software-level protection against Rowhammer. ZebRAM isolates every DRAM row that contains data with guard rows that absorb any Rowhammer-induced bit flips; the only known method to protect against all forms of Rowhammer.

Radhesh Krishnan Konoth, Marco Oliverio, Andrei Tatar, Dennis Andriesse, Herbert Bos, Cristiano Giuffrida, Kaveh Razavi
Karaoke: Fast And Strong Metadata Privacy With Low Noise

Karaoke is a system for low-latency metadata-private communication.

David Lazar, Yossi Gilad, Nickolai Zeldovich
Obladi: Oblivious Serializable Transactions In The Cloud

This paper presents the design and implementation of Obladi, the first system to provide ACID transactions while also hiding access patterns.

Natacha Crooks, Matthew Burke, Sitar Harel, Ethan Cecchetti, Rachit Agarwal, Lorenzo Alvisi
Understanding Failures
Paper Name Author(s)
Capturing And Enhancing In Situ System Observability For Failure Detection

Real-world distributed systems suffer unavailability due to various types of failure.But, despite enormous effort, many failures, especially gray failures, still escape detection.Panorama incorporates techniques for making observations even when indirection exists between components.

Peng Huang, Chuanxiong Guo, Jacob R. Lorch, Lidong Zhou, Yingnong Dang
REPT: Reverse Debugging Of Failures In Deployed Software

In this paper, the authors present REPT, a practical system that enables reverse debugging of software failures in deployed systems.

Weidong Cui, Xinyang Ge, Baris Kasikci, Ben Niu, Upamanyu Sharma, Ruoyu Wang, Insu Yun
Finding Crash-Consistency Bugs With Bounded Black-Box Crash Testing

the authors present a new approach to testing file-system crash consistency: bounded black-box crash testing (B3).Since the space of possible workloads is infinite, B3 bounds this space based on parameters such as the number of file-system operations or which operations to include, and exhaustively generates workloads within this bounded space.the authors build two tools, CrashMonkey and Ace, to demonstrate the effectiveness of this approach. B3 tests the file system in a black-box manner using workloads of file-system operations.Each workload is tested on the target file system by simulating power-loss crashes while the workload is being executed, and checking if the file system recovers to a correct state after each crash.Our tools also revealed 10 new crash-consistency bugs in widely-used, mature Linux file systems, seven of which existed in the kernel since 2014.

Jayashree Mohan, Ashlie Martinez, Soujanya Ponnapalli, Pandian Raju, Vijay Chidambaram
An Analysis Of Network-Partitioning Failures In Cloud Systems

the authors present a comprehensive study of 136 system failures attributed to network-partitioning faults from 25 widely used distributed systems.the authors built NEAT, a testing framework that simplifies the coordination of multiple clients and can inject different types of network-partitioning faults. We used NEAT to test seven popular systems and found and reported 32 failures.We found that the majority of the failures led to catastrophic effects, such as data loss, reappearance of deleted data, broken locks, and system crashes.

Ahmed Alquraan, Hatem Takruri, Mohammed Alfatafta, Samer Al-Kiswany
Verification
Paper Name Author(s)
Nickel: A Framework For Design And Verification Of Information Flow Control Systems

Nickel is a framework that helps developers design and verify information flow control systems by systematically eliminating covert channels inherent in the interface, which can be exploited to circumvent the enforcement of information flow policies.

Helgi Sigurbjarnarson, Luke Nelson, Bruno Castro-Karney, James Bornholt, Emina Torlak, Xi Wang
Verifying A Concurrent Mail Server With CSPEC

Writing concurrent systems software is error-prone, because multiple processes or threads can interleave in many ways, and it is easy to forget about a subtle corner case. This paper introduces CSPEC, a framework for formal verification of concurrent software, which ensures that no corner cases are missed.

Tej Chajed, M. Frans Kaashoek, Butler Lampson, Nickolai Zeldovich
Proving Confidentiality In A File System Using DiskSec

SFSCQ is the first file system with a machine-checked proof of security. To develop, specify, and prove SFSCQ, this paper introduces DiskSec, a novel approach for reasoning about confidentiality of storage systems, such as a file system. SFSCQ is the first file system with a machine-checked proof of security.

Atalay Ileri, Tej Chajed, Adam Chlipala, M. Frans Kaashoek, Nickolai Zeldovich
Proving The Correct Execution Of Concurrent Services In Zero-knowledge

This paper introduces Spice, a system for building verifiable state machines (VSMs). A VSM is a request-processing service that produces proofs establishing that requests were executed correctly according to a specification.

Srinath Setty, Sebastian Angel, Trinabh Gupta, Jonathan Lee