Expand description
§DSON: A Delta-State CRDT for JSON-like Data Structures
This crate provides a Rust implementation of DSON, a space-efficient, delta-state Conflict-Free Replicated Datatype (CRDT) for JSON-like data structures. It is based on the research paper “DSON: JSON CRDT Using Delta-Mutations For Document Stores” and inspired by the original author’s JavaScript implementation.
The primary goal of this library is to enable robust, and efficient multi-writer collaboration in extremely constrained environments (high latency and low bandwidth; opportunistic networking).
Unlike other CRDT libraries that expose a single “Document” type, DSON provides a set of
composable primitives. This allows you to build the exact data structure you need. The most
common top-level structure is an OrMap
, which can contain other CRDTs, enabling nested,
JSON-like objects. The entire state is typically wrapped in a CausalDotStore
, which
tracks the causal history.
§Core Concepts
DSON provides three fundamental, composable CRDTs:
OrMap
: An Observed-Remove Map, mapping arbitrary keys to other CRDT values.OrArray
: An Observed-Remove Array, providing a list-like structure.MvReg
: A Multi-Value Register, for storing primitive values. When concurrent writes occur, the register holds all conflicting values. This is the only CRDT in this library that can represent value conflicts.
These primitives can be nested to create arbitrarily complex data structures, such as a map containing an array of other maps.
All modifications produce a delta. Instead of sending the entire state after each change, only this small delta needs to be transmitted to other replicas.
§Observed-Remove Semantics
DSON uses Observed-Remove (OR) semantics for its collections. This means an element can only be removed if its addition has been observed. If an element is updated concurrently with its removal, the update “wins,” and the element is preserved. OR-semantics are intuitive, and this is often the desired behavior. Consider a collaborative shopping list:
- Initial State: Both Alice and Bob see
["apples", "bananas"]
. - Alice’s Action: Alice updates “bananas” to “blueberries”.
- Bob’s Action: Concurrently, Bob removes “bananas”.
With OR-semantics, the final list will be ["apples", "blueberries"]
. Bob’s removal
is overridden by Alice’s concurrent update because the update implies the continued
existence of the item.
DSON can be extended with special CRDTs providing different semantics for specific use cases though.
§Causal CRDTs and Tombstone-Free Removals
DSON is a causal CRDT, meaning it uses causal history to resolve conflicts.
This history is tracked in a CausalContext
, which contains a set of “dots”—unique
identifiers for every operation.
§Dots
A dot is a globally unique identifier for an operation (for example, adding or updating a value). It is the fundamental unit for tracking causality.
A Dot
is a tuple (Identifier, Sequence)
:
Identifier
: A unique ID for the actor (a specific application instance on a specific node) that performed the operation. It is composed of aNodeId
and anApplicationId
. This structure allows multiple applications on the same machine to collaborate without their histories conflicting.Sequence
: A monotonically increasing number (effectively a Lamport timestamp) that is unique to that actor.
When a replica makes a change, it generates a new dot. This dot is broadcast to other replicas along with the delta describing the change.
The collection of all dots a replica has observed forms its CausalContext
. This
context represents the replica’s knowledge of the document’s history. By comparing its
local CausalContext
with the context from a received delta, a replica can determine
which operations are new, which are concurrent, and which have already been seen. This
allows DSON to merge changes correctly and guarantee convergence.
A key advantage of this model is the elimination of tombstones. In many other CRDTs, when an item is deleted, a “tombstone” marker is left behind to signify its removal. These tombstones are never garbage-collected and can cause unbounded metadata growth in long-lived documents.
DSON avoids this growth by tracking which operations are “live”. A removal is simply the absence of an operation’s dot from the causal context. When replicas sync, they can determine which items have been deleted by comparing their causal contexts, without needing explicit tombstone markers. This ensures that the metadata size remains proportional to the size of the live data, not the entire history of operations.
§Scope of this Crate
This crate provides the core data structures and algorithms for DSON. It is
responsible for generating deltas from mutations and merging them to ensure
eventual consistency. It is up to you to build your document structure by
composing the provided CRDT primitives, most commonly by wrapping an OrMap
in a CausalDotStore
.
Note that this is a low-level library. You will likely want to build a
typed abstraction layer on top of dson
rather than use it directly in your
application code.
It does not include any networking protocols.
You are responsible for implementing the transport layer to broadcast deltas
to other replicas. The correctness of this library, particularly its
causal consistency guarantees, relies on the transport layer delivering
deltas in an order that respects the causal history of events. This is typically
achieved with an anti-entropy algorithm that exchanges deltas and their
causal metadata (CausalContext
).
§Getting Started: A Simple Conflict
This section models a simple scenario where two users, Alice and Bob, concurrently edit the same piece of data, leading to a conflict that DSON resolves gracefully.
This example uses a top-level OrMap
keyed by a string, holding a MvReg
(Multi-Value Register)
for a string value. While OrMap
is generic over its key, it’s typically a String
to
model JSON-like objects.
use dson::{
crdts::{mvreg::MvRegValue, snapshot::ToValue, NoExtensionTypes},
CausalDotStore, Identifier, OrMap,
};
// 1. SETUP: TWO REPLICAS
// Create two replicas, Alice and Bob, each with a unique ID.
let alice_id = Identifier::new(0, 0);
let mut alice_store = CausalDotStore::<OrMap<String>>::default();
let bob_id = Identifier::new(1, 0);
let mut bob_store = CausalDotStore::<OrMap<String>>::default();
// 2. INITIAL STATE
// Alice creates an initial value. The key "123" holds a register.
let key = "123".to_string();
let delta_from_alice = dson::api::map::apply_to_register::<_, NoExtensionTypes, _>(
|reg, ctx, id| reg.write("initial value".to_string().into(), ctx, id),
key.clone(),
)(&alice_store.store, &alice_store.context, alice_id);
// Alice applies the change to her own store.
alice_store.join_or_replace_with(delta_from_alice.store.clone(), &delta_from_alice.context);
// 3. SYNC
// Bob receives Alice's initial change.
bob_store.join_or_replace_with(delta_from_alice.store, &delta_from_alice.context);
assert_eq!(alice_store, bob_store);
// 4. CONCURRENT EDITS
// Now, Alice and Bob make changes without syncing.
// Alice updates the value to "from Alice".
let delta_alice_edit = dson::api::map::apply_to_register::<_, NoExtensionTypes, _>(
|reg, ctx, id| reg.write("from Alice".to_string().into(), ctx, id),
key.clone(),
)(&alice_store.store, &alice_store.context, alice_id);
alice_store.join_or_replace_with(delta_alice_edit.store.clone(), &delta_alice_edit.context);
// Concurrently, Bob updates the value to "from Bob".
let delta_bob_edit = dson::api::map::apply_to_register::<_, NoExtensionTypes, _>(
|reg, ctx, id| reg.write("from Bob".to_string().into(), ctx, id),
key.clone(),
)(&bob_store.store, &bob_store.context, bob_id);
bob_store.join_or_replace_with(delta_bob_edit.store.clone(), &delta_bob_edit.context);
// 5. MERGE
// The replicas now exchange their changes.
// Alice merges Bob's edit.
alice_store.join_or_replace_with(delta_bob_edit.store, &delta_bob_edit.context);
// Bob merges Alice's edit.
bob_store.join_or_replace_with(delta_alice_edit.store, &delta_alice_edit.context);
// After merging, both stores are identical.
assert_eq!(alice_store, bob_store);
// 6. VERIFY CONFLICT
// The concurrent writes are preserved as a conflict in the MvReg.
let register_value = alice_store.store.get(&key).unwrap();
let values: Vec<_> = register_value.reg.values().into_iter().collect();
assert_eq!(values.len(), 2);
assert!(values.contains(&&MvRegValue::String("from Alice".to_string())));
assert!(values.contains(&&MvRegValue::String("from Bob".to_string())));
§Advanced Topics
§The Extension System
DSON includes an extension system that allows developers to define custom CRDTs by
implementing the ExtensionType
trait. This is for building domain-specific data
structures that go beyond the standard JSON-like primitives.
By implementing the ExtensionType
trait, you define how your custom type should be
serialized, deserialized, and merged. The system handles conflict resolution based on
the rules you define.
This can be used to implement custom data structures like counters, text objects, or more efficient state representation.
§Validation and Observation
DSON provides a Sentinel
trait that allows you to observe or
validate changes as they are applied during a merge. This can be used for implementing
authorization, logging, or triggering side effects.
§Network and Consistency
DSON’s delta-based approach minimizes the amount of data that needs to be transmitted between replicas, making it efficient for low-bandwidth or high-latency networks.
However, much of the complexity of using DSON in practice lies in the correct design and implementation of the gossip protocol used to exchange deltas between replicas. An efficient gossip protocol is not trivial to implement. For guidance, refer to the research on opportunistic networking (oppnet).
It is also important to understand that DSON’s causal consistency guarantees are provided on
a per-register basis. This means that while individual values are guaranteed to be causally
consistent, the relationships between different values are not. This can lead to very
unintuitive behavior.
For example, if you have two registers, x
and y
, you write to x
and then to y
,
another replica might see the write to y
before the write to x
.
§License
This project is licensed under either of
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
§Features
json
: Enables serialization and deserialization of DSON documents to and fromserde_json::Value
. This feature is enabled by default.serde
: Providesserde
support for all CRDT types.arbitrary
: Implementsquickcheck::Arbitrary
for CRDT types, useful for property-based testing.chrono
: Enableschrono
support forTimestamp
. This feature is enabled by default.ulid
: Enables registers to hold ulids. This feature is enabled by default.
Re-exports§
pub use causal_context::CausalContext;
pub use causal_context::Dot;
pub use causal_context::Identifier;
pub use causal_context::MAX_APPLICATION_ID;
pub use causal_context::NodeId;
pub use causal_context::Priority;
pub use causal_context::ROOT_APP_ID;
pub use crdts::mvreg::MvReg;
pub use crdts::orarray::OrArray;
pub use crdts::ormap::OrMap;
pub use chrono;
Modules§
- api
- causal_
context - Causal Context
- crdts
- Composable CRDTs for JSON-like Data
- datetime_
literal - macros
- Macros usable for tests and initialization
- sentinel
- Observe and validate changes to a CRDT.
Macros§
- causal_
context - This macro parses the debug-representation of a CausalContext, giving an easy way to instantiate a cc with a given set of dots:
- crdt_
literal - Convenience macro for creating a TypeVariantValue, of either map, array or register type.
- crdt_
map_ literal - Convenience macro for creating a OrMap instance.
- crdt_
map_ store - datetime
- Declarative macro to create a
chrono::DateTime<chrono::Utc>
suitable for const evaluation, as this is otherwise cumbersome. - dot
- Convenience macro for creating dot values.
Structs§
- Causal
DotStore - A
DotStore
paired with aCausalContext
. - Compute
Deletions Arg - DotFun
- A map from
Dot
toV
whose computed dots is its keyset. - DotFun
Map - A map from
Dot
toV: DotStore
, whose computed dots is the union of the dots of its values. - DotMap
- A map from an arbitrary key type to a
V: DotStore
, whose computed dots is the union of the dots of its values. - DryJoin
Output - The outcome of performing a dry-join.
- Dson
Random State - This is a small wrapper around the standard RandomState. This allows us to easily switch to a non-random RandomState for use in tests.
Enums§
- DotChange
- An observed change to a dot store.
Traits§
- DotStore
- A container for data-type specific information that stores the state of a 𝛿-based CRDT.
- DotStore
Join - A trait for dot stores that can be joined.
- Extension
Type - A type that extends
TypeVariantValue
and friends with additional value types.
Functions§
- compute_
deletions_ unknown_ to - Returns dots that
known_dots
has deleted (by virtue of not being inlive_dots
) that are still present inignorant
.