VMware Research | SplinterDB and Maplets: Improving the Trade-Offs in LSM Compaction Policy

Abstract

A critical aspect of modern key-value stores is the interaction between compaction policy and filters. Aggressive compaction reduces the on-disk footprint of a key-value store and can improve query performance, but can reduce insertion throughput because it is I/O and CPU expensive. Filters can mitigate the query costs of lazy compaction, but only if they fit in RAM, limiting the scalability of queries with lazy compaction.

In this work, we extend SplinterDB, a state-of-the-art lazily compacted system, with a new key-value filter, which we call a maplet, yielding a key-value store that achieves excellent insertion performance, query performance, space efficiency, and scalability by compacting filters aggressively and data lazily. Maplets can accelerate queries even when they don’t fit in RAM, improving scalability to huge datasets. We further show how to use maplets to estimate when a compaction could resolve a high density of updates, enabling SplinterDB to perform targeted compactions for space recovery.

In our benchmarks, SplinterDB with maplets matches the insertion performance of plain SplinterDB and beats RocksDB by up to 9×. On queries, SplinterDB with maplets outperforms SplinterDB and RocksDB by up to 89% and 83%, respectively, and scales gracefully to huge datasets. Maplets also enable SplinterDB to dynamically trade update performance for space efficiency, resulting in space overheads on update-heavy workloads as low as 15-61%, whereas RocksDB had 80-117% and plain SplinterDB had up to 137%.

Date

2023

Authors

Alex Conway
Martin Farach-Colton
Rob Johnson

Related projects

Research Areas

Approximate Membership Query Data Structure
Write-Optimized Data Structures

Type

Inproceedings

Journal

SIGMOD