Abstract

Counters are the fundamental building block of many data sketching schemes, which hash items to a small number of counters and account for collisions to provide good approximations for frequencies and other measures. Most existing methods rely on fixed-size counters, which may be wasteful in terms of space, as counters must be large enough to eliminate any risk of overflow. Instead, some solutions use small, fixed-size counters that may overflow into secondary structures.This paper takes a different approach. We propose a simple and general method called SALSA for dynamic re-sizing of counters, and show its effectiveness. SALSA starts with small counters, and overflowing counters simply merge with their neighbors. SALSA can thereby allow more counters for a given space, expanding them as necessary to represent large numbers. Our evaluation demonstrates that, at the cost of a small overhead for its merging logic, SALSA significantly improves the accuracy of popular schemes (such as Count-Min Sketch and Count Sketch) over a variety of tasks. Our code is released as open source.

Date

2021

Authors

  • Shay Vargaftik
  • Ran Ben Basat
  • Gil Einziger
  • Michael Mitzenmacher
  • Shay Vargaftik

Research Areas

  • Network measurments
  • Streaming Algorithms

Type

Conference

Journal

Proceedings - 2021 IEEE 37th International Conference on Data Engineering, ICDE 2021