Hardware-assisted enclaves with memory encryption have been widely adopted in prevailing architectures, e.g., Intel SGX/TDX, AMD SEV, and ARM CCA. However, existing enclave designs fall short in supporting efficient cooperation among cross-node enclaves (i.e., enclaves spanning multiple machines) because the range of hardware memory protection is limited to a single node. A naive approach is to apply cryptography at the application level and transfer data between nodes through secure channels (e.g., SSL). However, it incurs orders-of-magnitude higher costs due to expensive encryption/decryption, especially for distributed applications with large data transfers, e.g., MapReduce and graph computing. A secure and efficient mechanism for distributed secure memory is necessary but still missing. This paper proposes the Migratable Merkle Tree (MMT), a design enabling efficient distributed secure memory to support distributed confidential computing. MMT sets up an integrity forest for distributed memory on multiple nodes. It allows an enclave to securely delegate an MMT closure, which contains both the data and the metadata of a subtree, to a remote enclave. By reusing the memory encryption mechanisms of existing enclaves, our design achieves secure data transfer without software re-encryption. We have implemented a prototype of MMT along with a trusted firmware for management, and further applied MMT to real-world distributed applications. The evaluation results show that, compared with existing systems using the AES-NI instructions, MMT achieves up to a 13x speedup on data transfer and a 12%–58% improvement in the end-to-end performance of MapReduce and PageRank.
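To make the closure idea concrete, here is a toy sketch (not the authors' implementation; all names are invented for illustration) of delegating one leaf of a Merkle tree: the sender ships the leaf plus the sibling hashes on its path to the root, so the receiver can check integrity against a trusted root without re-hashing or re-encrypting the whole tree.

```python
# Toy Merkle-tree delegation sketch (illustrative only, names invented):
# a "closure" here = a leaf's sibling hashes up to the root, letting a
# remote party verify the leaf against a known root hash.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Return list of levels: level[0] = leaf hashes, last level = [root]."""
    level = [h(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def closure(levels, index):
    """Sibling hashes (the delegated metadata) for leaf `index`."""
    path = []
    for level in levels[:-1]:
        sib = index ^ 1
        path.append((sib & 1, level[sib]))  # (sibling_is_right, hash)
        index >>= 1
    return path

def verify(leaf, path, root):
    node = h(leaf)
    for sib_is_right, sib in path:
        node = h(node + sib) if sib_is_right else h(sib + node)
    return node == root

leaves = [b"page0", b"page1", b"page2", b"page3"]
levels = build_tree(leaves)
root = levels[-1][0]
# The receiving side re-checks the delegated page against the root.
assert verify(b"page1", closure(levels, 1), root)
assert not verify(b"tampered", closure(levels, 1), root)
```

The point of the sketch is that only the subtree's data and a logarithmic amount of metadata cross the node boundary, mirroring the abstract's claim that delegation avoids software re-encryption of the payload.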
Trusted Execution Environments (TEEs), like Intel SGX/TDX, AMD SEV-SNP, and ARM TrustZone/CCA, have been widely adopted in prevailing architectures. However, these TEEs typically do not treat I/O isolation (e.g., defending against malicious DMA requests) as a first-class citizen, and the fallback mechanisms degrade I/O performance. Traditional methods like using the IOMMU or software I/O can degrade throughput by at least 20% for I/O-intensive workloads. The main reason is that the isolation requirements for I/O devices differ from those for CPUs. This paper proposes a novel I/O isolation mechanism for TEEs, named sIOPMP (scalable I/O Physical Memory Protection), with three key features. First, we design a Multi-stage-Tree-based checker that supports more than 1,000 hardware regions. Second, we classify devices into hot and cold, and support an unlimited number of devices with mountable entries. Third, we propose a remapping mechanism to switch devices between hot and cold status for dynamic I/O workloads. Evaluation results show that sIOPMP introduces only negligible performance overhead for both benchmarks and real-world workloads, and improves network throughput by 20%–38% compared with the IOMMU-based mechanisms or software I/O adopted in TEEs.
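The hot/cold split can be pictured with a small cache-like model (purely illustrative; the class and method names below are invented, and the real design operates in hardware): a limited number of "hot" entries hold the region tables of active devices, while cold devices' entries stay in memory and are remapped in on demand.

```python
# Toy model of hot/cold device entries (invented names, software-only
# analogy): hot entries emulate scarce hardware slots; cold entries are
# "mountable" in-memory tables swapped in when a device becomes active.
class RegionChecker:
    def __init__(self, hot_slots):
        self.hot_slots = hot_slots
        self.hot = {}   # device -> region list ("in hardware")
        self.cold = {}  # device -> region list ("in memory")

    def register(self, dev, regions):
        self.cold[dev] = regions

    def check(self, dev, addr):
        if dev not in self.hot:                 # remap cold -> hot on demand
            if len(self.hot) >= self.hot_slots:
                victim, tbl = self.hot.popitem()
                self.cold[victim] = tbl         # evict some hot device
            self.hot[dev] = self.cold.pop(dev)
        # A DMA access is allowed only if it hits a permitted region.
        return any(lo <= addr < hi for lo, hi in self.hot[dev])

c = RegionChecker(hot_slots=2)
c.register("nic0", [(0x1000, 0x2000)])
c.register("ssd0", [(0x8000, 0x9000)])
assert c.check("nic0", 0x1800)       # permitted DMA
assert not c.check("nic0", 0x3000)   # blocked DMA outside the region
assert c.check("ssd0", 0x8000)
```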
Many cloud providers, including Amazon, Google, Microsoft, and Alibaba Cloud, offer blockchain cloud services that rely on a runtime environment, such as the Ethereum Virtual Machine (EVM), to execute smart contracts and ensure consistency between participants. However, existing runtime systems suffer from two main limitations. First, traditional runtime systems like the EVM cannot guarantee privacy protection, as all data uploaded to the blockchain is visible to every participant; this restricts blockchain to a limited set of scenarios. Second, each computation on the runtime system must be synchronized to all nodes in the network, resulting in significant computational overhead that is hard to bear for more complex applications. One approach to address these limitations is to utilize Trusted Execution Environments (TEEs) for the blockchain runtime, which can provide privacy protection and mitigate redundant synchronization. However, using TEEs for blockchain may significantly increase cloud costs. To overcome these challenges, this paper proposes PL-EVM, a new runtime environment for smart contracts that utilizes jointcloud collaboration. PL-EVM achieves strong security guarantees by using TEEs to protect privacy-sensitive data, and incorporates dynamic migration and splitting mechanisms to achieve high efficiency and low costs. Our evaluation results show that PL-EVM can improve performance and reduce costs by 4% to 32.22%.
To better support highly scalable and fine-grained computing paradigms such as microservices and serverless computing, modern hardware-assisted confidential computing systems, such as Intel TDX and ARM CCA, introduce permission tables to achieve fine-grained and scalable memory isolation among different domains. However, the permission table also adds an extra dimension to page walks besides page tables, leading to significantly more memory references (e.g., 4 → 12 for RISC-V Sv39). We observe that most of the costs (about 75%) caused by this extra dimension of page walks come from validating page-table pages. Based on this observation, this paper proposes HPMP (Hybrid Physical Memory Protection), a hardware-software co-design (on RISC-V) that protects page-table pages using segment registers and normal pages using permission tables, balancing scalability and performance. We have implemented HPMP and Penglai-HPMP (a TEE system based on HPMP) on FPGA with two RISC-V cores (one in-order and one out-of-order). Evaluation results show that HPMP reduces costs by 23.1%–73.1% on BOOM and significantly improves performance on real-world applications, including serverless computing (FunctionBench) and Redis.
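One plausible accounting for the 4 → 12 figure (an assumption of this sketch, not spelled out in the abstract) is that a plain Sv39 walk makes 4 memory references (3 page-table levels plus the data access), and each of those references must additionally be validated through a 2-level permission-table walk, tripling the count:

```python
# Back-of-the-envelope model (assumed structure, not taken from the
# paper): every native page-walk reference is itself validated through
# `perm_levels` extra permission-table references.
def walk_refs(pt_levels: int, perm_levels: int) -> int:
    native = pt_levels + 1             # page-table levels + final data access
    return native * (1 + perm_levels)  # each reference also walks the permission table

assert walk_refs(3, 0) == 4   # plain RISC-V Sv39: 3 levels + data
assert walk_refs(3, 2) == 12  # with a 2-level permission-table dimension
```

Under this model, exempting page-table pages from the permission-table check (as HPMP does via segment registers) removes most of the extra references, consistent with the abstract's ~75% observation.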
The Meltdown vulnerability, which exploits the inherent out-of-order execution in common processors like x86, ARM, and PowerPC, has been shown to break the fundamental isolation boundary between user and kernel space. This has stimulated a non-trivial patch to modern OSes that separates the page tables for user space and kernel space, namely KPTI (kernel page table isolation). While this patch stops kernel memory leakage to rogue user processes, it mandates that users patch their kernels (usually requiring a reboot), and it is currently only available in the latest versions of OS kernels. Further, it introduces non-trivial performance overhead due to page table switching during user/kernel crossings.
In this paper, we present EPTI, an alternative approach to defending against the Meltdown attack for unpatched VMs (virtual machines) in the cloud, yet with better performance than KPTI. Specifically, instead of using two guest page tables, we use two EPTs (extended page tables) to isolate user space and kernel space, and unmap all of the kernel space in the user's EPT to achieve the same effect as KPTI. The switching of EPTs is done through a hardware-supported feature called EPT switching, performed within guest VMs without hypervisor involvement. Meanwhile, EPT switching does not flush the TLB, since each EPT has its own TLB entries, which further reduces the overhead. We have implemented our design and evaluated it on an Intel Kaby Lake CPU with different versions of the Linux kernel. The results show that EPTI introduces at most 13% overhead, which is around 45% less than KPTI.
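The core isolation argument can be modeled in a few lines (a purely illustrative Python analogy with invented names; the real mechanism is the in-guest EPT switching described above): in the user view there is simply no translation for kernel addresses, so even a transiently executed rogue read has nothing to dereference.

```python
# Toy model of the two-EPT idea (illustrative only): the "user" view has
# no mapping for kernel pages, so a read of kernel data faults instead
# of returning it; the "kernel" view maps everything.
KERNEL_PAGES = {0xFFF0: "secret"}
USER_PAGES = {0x1000: "user-data"}

kernel_ept = {**USER_PAGES, **KERNEL_PAGES}  # kernel view: full mapping
user_ept = dict(USER_PAGES)                  # user view: kernel unmapped

def read(ept, addr):
    if addr not in ept:
        raise PermissionError("no translation in this view")
    return ept[addr]

assert read(kernel_ept, 0xFFF0) == "secret"  # kernel mode works as before
try:
    read(user_ept, 0xFFF0)                   # user view: nothing to leak
    leaked = True
except PermissionError:
    leaked = False
assert not leaked
```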
This paper presents a simple and effective approach to improving dependency parsing by exploiting multiple feature-sets. Traditionally, features are extracted by applying the feature templates to all word pairs (first-order features) and word tuples (second-order features). In this paper, we show that exploiting different feature templates for different word pairs and word tuples achieves significant improvement over baseline parsers. First, we train a text chunker using a freely available implementation of the first-order linear conditional random fields model. Then we build a clause-chunk tree for a given sentence based on the chunking information and punctuation marks. Finally, we extract features for dependency parsing according to multiple feature-sets. We extend the projective parsing algorithms of McDonald [20] and Carreras [1] for our case. Experimental results show that our approach significantly outperforms the baseline systems without increasing complexity. Given correct chunking information, we improve from baseline accuracies of 91.36% and 92.20% to 93.19% and 93.89%, respectively.
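For readers unfamiliar with template-based feature extraction, here is a minimal sketch of first-order features for one head-modifier word pair (the templates below are hypothetical simplifications, not the paper's actual feature-sets, which vary the templates per pair using chunk information):

```python
# Hypothetical first-order feature templates for a head-modifier pair;
# real parsers use many more templates (surrounding tags, in-between
# tokens, etc.), and the paper varies templates per pair via chunks.
def pair_features(words, tags, head, mod):
    d = head - mod
    dist = f"{'L' if d > 0 else 'R'}{min(abs(d), 5)}"  # bucketed signed distance
    return [
        f"hw={words[head]}",                         # head word
        f"mw={words[mod]}",                          # modifier word
        f"hw,mw={words[head]},{words[mod]}",         # word bigram
        f"ht,mt={tags[head]},{tags[mod]}",           # tag bigram
        f"ht,mt,dist={tags[head]},{tags[mod]},{dist}",
    ]

words = ["John", "saw", "Mary"]
tags = ["NNP", "VBD", "NNP"]
feats = pair_features(words, tags, head=1, mod=0)
assert "hw=saw" in feats
assert "ht,mt=VBD,NNP" in feats
```

The paper's contribution is, roughly, choosing *which* such template set applies to a given pair based on the clause-chunk tree, rather than applying one fixed set everywhere.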