Abstract
Translation Lookaside Buffers (TLBs) are critical for building performant virtual memory systems. However, TLB mappings must be kept synchronized for both security and correctness. Most processors do not provide coherence for TLB mappings, so a software mechanism known as a TLB shootdown is employed, which uses interprocessor interrupts (IPIs) to synchronize TLBs. TLB shootdowns are expensive, so recent work has aimed to reduce the frequency of shootdowns through techniques such as batching. In this paper, we show that aggressive batching can result in correctness issues, and that addressing them can negate the benefits of batching. Instead, our work takes a different approach that focuses on both improving the performance of TLB shootdowns and carefully selecting where to avoid shootdowns. Overall, we show that our approach results in significant speedups in both microbenchmarks and real-world applications without sacrificing safety or correctness.