Introduction

Theory and practice of quotient filters and other Bloom-filter-like structures

Summary

Approximate Membership Query (AMQ) data structures, such as the Bloom filter, quotient filter, and cuckoo filter, have found numerous applications in databases, storage systems, networks, computational biology, and other domains. However, many applications must work around limitations in the capabilities or performance of current AMQs, making these applications more complex and less performant. For example, many current AMQs cannot delete or count the number of occurrences of each input item, take up large amounts of space, are slow, cannot be resized or merged, or have poor locality of reference and hence perform poorly when stored on SSD or disk.

This project is investigating alternatives to the Bloom filter, such as quotient filters, counting quotient filters, and cuckoo filters, that overcome these limitations. The project includes both theoretical analyses and practical applications of these data structures.

The counting quotient filter (CQF), a Bloom-filter replacement that is faster than a Bloom filter and supports counting and deletion in less space than a Bloom filter, is available on github.

News

  • Our SIGMOD 2021 paper shows how to combine power-of-two-choice hashing, Robin-Hood hashing, and new x86 vector instructions to create an extremely high-performance filter.

Researchers

External Researchers

  • Martin Farach-Colton
  • Michael A. Bender
  • Rob Patro