# keyspace
Keyspace partitioning and re-balancing for distributed systems.
## Motivation
Implement a keyspace partitioning and re-balancing algorithm that is:
- Memory/space efficient: no virtual nodes, scalable to thousands of physical nodes.
- Fair: data is uniformly distributed across partitions.
- Compact: to compute the target node of a key, we only need to know the number of nodes n, and the operation is O(1) (see the sketch below).
- Adaptive: supports node addition and removal, with close to theoretically minimal data movement.
- Robust: supports replication out of the box.
- Heterogeneous: supports weighted nodes with different storage capacities.
The idea is to allow the system to grow and shrink easily, and to process millions of keys per second efficiently. Additionally, the library provides a simple API exposing the keyspace data movement details, so that the system can be re-balanced in a distributed fashion.
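To make the "Compact" property concrete, here is a minimal sketch of a stateless O(1) key-to-node mapping in the spirit of Jump Consistent Hash (Lamping & Veach, 2014). It only illustrates the general idea; it is not necessarily the algorithm used by this crate, and it covers neither weighted nodes nor replication.

```rust
// Stateless O(1) key -> node-index mapping, in the spirit of Jump Consistent
// Hash (Lamping & Veach, 2014). Illustrative only; assumes `num_buckets >= 1`.
fn jump_consistent_hash(mut key: u64, num_buckets: u32) -> u32 {
    let (mut b, mut j): (i64, i64) = (-1, 0);
    while j < num_buckets as i64 {
        b = j;
        key = key.wrapping_mul(2862933555777941757).wrapping_add(1);
        j = ((b + 1) as f64 * ((1u64 << 31) as f64 / (((key >> 33) + 1) as f64))) as i64;
    }
    b as u32
}

fn main() {
    // The mapping needs only the hashed key and the node count `n`:
    // no routing table, no virtual nodes.
    let key_hash: u64 = 0xDEAD_BEEF;
    println!("10 nodes -> node {}", jump_consistent_hash(key_hash, 10));
    // Growing to 11 nodes remaps only ~1/11 of all keys, the minimum needed
    // to keep the distribution uniform.
    println!("11 nodes -> node {}", jump_consistent_hash(key_hash, 11));
}
```

The point is that the target is computed from the key and the node count alone, with no per-node routing table or virtual nodes, which is what keeps lookups O(1) and the routing state compact.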
## Usage
The API is designed to be simple and easy to use. It provides a way to start a keyspace with some nodes, then add/remove nodes with minimal data movement (with migration plans calculated and returned), and finally query the keyspace for the target node of a key.
The purpose of the keyspace is to route keys to nodes. To do that, we need to define a node type that implements the `KeyspaceNode` trait.
```rust
use keyspace::{Keyspace, KeyspaceNode};

// Node type holds enough information about our physical node.
// For a node to be used in a keyspace, it must implement the `KeyspaceNode` trait.
// (`Node` here is a minimal illustrative type; a real one would usually also
// carry an address, capacity, etc.)
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Node {
    id: u64,
}

impl KeyspaceNode for Node {
    // Trait implementation elided; see the crate documentation.
}

// Each keyspace must start from a set of initial nodes.
// The node count must be at least the replication factor.
let init_nodes = vec![Node { id: 1 }, Node { id: 2 }, Node { id: 3 }];

// Create a keyspace with the (default) replication factor of 3.
let mut ks = Keyspace::new(init_nodes)
    .build()
    .expect("failed to build the keyspace");

// Check replicas for the key (`key1`/`key2` stand for application keys).
let replicas = ks.replicas(&key1);
assert_eq!(replicas.len(), 3);
assert!(replicas.contains(&Node { id: 1 }));

// Add another node, see the updated replica set.
ks.add_node(Node { id: 4 })
    .expect("failed to add the node");

// Check replicas for `key1` -- the replica set remained the same!
// This is expected: the whole point of the keyspace is that it is not totally
// rehashed on updates -- only part of the keyspace is updated.
let replicas = ks.replicas(&key1);
assert_eq!(replicas.len(), 3);
assert!(replicas.contains(&Node { id: 1 }));

// When it comes to `key2` (a key whose range moved), its replica set
// should include the new node.
let replicas = ks.replicas(&key2);
assert!(replicas.contains(&Node { id: 4 }));
```
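Calls like `add_node` above are also where data movement is reported: as mentioned earlier, a migration plan is calculated and returned for each topology change. The sketch below only illustrates how such a plan could be consumed by a re-balancing driver; the type and field names (`KeyRangeTransfer`, `source_node_id`, and so on) are hypothetical, and the actual types are described in the crate documentation.

```rust
// Hypothetical shape of a single data-movement entry, for illustration only;
// consult the crate documentation for the actual migration-plan types.
struct KeyRangeTransfer {
    // Portion of the (hashed) keyspace that has to move: [range_start, range_end).
    range_start: u64,
    range_end: u64,
    // Node currently holding the data.
    source_node_id: u64,
    // Node that must receive the data after the topology change.
    destination_node_id: u64,
}

// A re-balancing driver iterates over the plan and streams the affected keys
// between the physical nodes (ideally in parallel and rate-limited).
fn apply_plan(plan: &[KeyRangeTransfer]) {
    for t in plan {
        println!(
            "move keys in [{:#x}, {:#x}) from node {} to node {}",
            t.range_start, t.range_end, t.source_node_id, t.destination_node_id
        );
    }
}

fn main() {
    // A made-up plan with a single transfer, just to show the shape.
    let plan = vec![KeyRangeTransfer {
        range_start: 0x4000_0000_0000_0000,
        range_end: 0x5000_0000_0000_0000,
        source_node_id: 2,
        destination_node_id: 4,
    }];
    apply_plan(&plan);
}
```

Because the plan pinpoints exactly which key ranges moved, each node only pulls the data it became responsible for, keeping data movement close to the theoretical minimum.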
This is only a minimal use case; real-life scenarios would likely require:
- Nodes holding more information than just an ID.
- Heterogeneous cluster with nodes having different capacities.
- Full support for migrations and re-balancing, i.e. the ability to pull data from data-holding nodes on node addition/removal.
- For failure tolerance, keys may need to be replicated across multiple physical machines.
- Moreover, such replication should be flexible enough to allow custom replication strategies, e.g. a strategy that ensures that replicas of a key live in different availability zones or racks (sketched below).
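To illustrate the last point, here is a hedged sketch of the constraint a zone- or rack-aware replication strategy enforces: replicas of a key must land in pairwise distinct zones. The names used here (`NodeInfo`, `zone`, `pick_zone_aware_replicas`) are hypothetical and independent of this crate's actual strategy API.

```rust
use std::collections::HashSet;

// Hypothetical node descriptor carrying placement information.
#[derive(Clone, Debug)]
struct NodeInfo {
    id: u64,
    zone: &'static str, // availability zone or rack
}

// Illustrative only: walk nodes in keyspace-preference order and keep the
// first `replication_factor` whose zones are pairwise distinct.
fn pick_zone_aware_replicas(ordered: &[NodeInfo], replication_factor: usize) -> Vec<NodeInfo> {
    let mut seen_zones = HashSet::new();
    let mut replicas = Vec::new();
    for node in ordered {
        if replicas.len() == replication_factor {
            break;
        }
        if seen_zones.insert(node.zone) {
            replicas.push(node.clone());
        }
    }
    replicas
}

fn main() {
    let ordered = [
        NodeInfo { id: 1, zone: "eu-1" },
        NodeInfo { id: 2, zone: "eu-1" }, // same zone as node 1 -> skipped
        NodeInfo { id: 3, zone: "us-1" },
        NodeInfo { id: 4, zone: "ap-1" },
    ];
    let replicas = pick_zone_aware_replicas(&ordered, 3);
    assert_eq!(
        replicas.iter().map(|n| n.id).collect::<Vec<_>>(),
        vec![1u64, 3, 4]
    );
}
```

A strategy like this trades some placement freedom for failure isolation: losing a single zone then takes out at most one replica of any key.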
See the documentation for more details on such use cases.