Introduction
This tutorial describes the Speculating Snapshot of Gafni and Malkhi as an algorithmic framework for dynamic distributed computing tasks.
Abstract
A key challenge for distributed systems is the problem of reconfiguration.
Any production storage system that provides data
reliability and availability over long periods must be
able to reconfigure in order to remove failed or old servers and
add healthy or new ones.
This is far from trivial, since we want reconfiguration
management neither to be centralized nor to cause a
system shutdown.
In this tutorial, we examine existing reconfigurable storage
algorithms and propose a common model and failure condition that capture their
guarantees.
We define a reconfiguration problem around which dynamic object solutions may be designed.
To demonstrate the strength of this problem definition, we use it to
implement dynamic atomic storage.
We present a generic framework for solving the
reconfiguration problem, show how to recast existing algorithms in terms of
this framework, and compare them.