CRDT-Lite
[!WARNING] This project is in early development and not intended for production use.
A lightweight implementation of Conflict-free Replicated Data Types (CRDTs) in both Rust and C++. CRDT-Lite provides generic CRDT structures for building distributed systems requiring eventual consistency.
CRDT-Lite is currently being used in Formabble, a collaborative game engine, and will be integrated into a derived product that we will announce soon.
What's Inside
This library includes two CRDT implementations:
- Column-Based CRDT (Rust:
src/lib.rs, C++:crdt.hpp) - Generic key-value store with fine-grained column-level conflict resolution - Text CRDT (C++:
text_crdt.hpp) - Line-based collaborative text editor with fractional positioning
Both Rust and C++ implementations share the same core algorithms and maintain API compatibility.
Features
Column-Based CRDT
- ✅ Generic over key/value types - Use any type that meets the requirements
- ✅ Fine-grained conflict resolution - Column-level (field-level) versioning
- ✅ Last-Write-Wins semantics - Deterministic conflict resolution using logical clocks
- ✅ Tombstone-based deletion - Proper handling of deleted records across nodes
- ✅ Parent-child hierarchies - Support for layered CRDT structures
- ✅ Custom merge rules - Implement your own conflict resolution strategies
- ✅ Change compression - Optimized transmission by removing redundant changes
- ✅ Efficient sync - Track version boundaries to skip unchanged records
Text CRDT
- ✅ Line-based editing - Collaborative text with line granularity
- ✅ Fractional positioning - Infinite density between any two lines
- ✅ Multiple merge strategies - Last-Write-Wins and Both-Writes-Win support
- ✅ Dual indexing - Fast position-ordered iteration and ID-based lookup
- ✅ Conflict detection - Preserve or resolve concurrent edits
- ✅ Change streaming - Real-time collaboration support
Platform Support
- ✅
no_stdcompatible - Works in embedded systems and otherno_stdenvironments (requiresalloc) - ✅ Zero-cost abstractions - Uses hashbrown (same HashMap as std) in
no_stdmode - ✅ Optional std - std is enabled by default for backwards compatibility
Using in no_std Environments
The Rust implementation supports no_std environments with allocator support.
Requirements:
alloccrate required: This library needs an allocator forVec,HashMap,Arc, etc.- Example setup:
extern crate alloc; use CRDT; // ... use as normal
Cargo.toml configuration:
[]
# For no_std with basic CRDT functionality (requires alloc feature)
= { = "0.3", = false, = ["alloc"] }
# For no_std with JSON serialization
= { = "0.3", = false, = ["alloc", "json"] }
# For no_std with binary serialization (bincode)
= { = "0.3", = false, = ["alloc", "binary"] }
# For standard environments (default, uses std::collections::HashMap)
= { = "0.3", = ["json"] }
Implementation Notes:
no_stdmode useshashbrown::HashMap, which is the same underlying implementation thatstd::collections::HashMapuses (identical performance)- The
stdfeature is enabled by default for backwards compatibility - The
allocfeature is required forno_stdenvironments and pulls inhashbrownonly when needed - When using std (default),
hashbrownis not compiled, reducing dependency bloat - Binary serialization uses
bincode2.0 which has fullno_stdsupport
NodeId Type Configuration
By default, NodeId is u64. For applications using UUID-based node identifiers, enable the node-id-u128 feature:
[]
= { = "0.4", = ["node-id-u128"] }
Why u128?
u64provides 2^64 (~18 quintillion) unique IDs- With random UUID generation, birthday paradox makes collisions likely after ~4 billion IDs
u128provides 2^128 unique IDs, eliminating collision concerns for UUID-based systems- C++ implementation allows customizing
CrdtNodeIdvia preprocessor (seecrdt.hpp)
Quick Start
Rust Implementation
# or add to Cargo.toml: crdt-lite = "0.3"
use ;
// Create two CRDT nodes
let mut node1: = CRDTnew;
let mut node2: = CRDTnew;
// Node 1: Insert data
let changes1 = node1.insert_or_update;
// Node 2: Insert conflicting data
let changes2 = node2.insert_or_update;
// Merge changes bidirectionally
let merge_rule = DefaultMergeRule;
node1.merge_changes;
node2.merge_changes;
// Both nodes converge to same state (node2 wins due to higher node_id)
assert_eq!;
Run tests:
C++ Implementation
Compile and run:
# Column CRDT tests
&&
# Text CRDT tests
&&
Column CRDT example:
CRDT<std::string, std::string> ;
CRDT<std::string, std::string> ;
// Node 1: Insert data
std::vector<Change<std::string, std::string>> changes1;
node1.;
// Node 2: Insert conflicting data
std::vector<Change<std::string, std::string>> changes2;
node2.;
// Merge changes bidirectionally
node1.;
node2.;
// Both nodes converge
;
Text CRDT example:
TextCRDT<std::string> ;
TextCRDT<std::string> ;
// Both nodes insert lines concurrently
auto id1 = doc1.;
auto id2 = doc2.;
// Sync changes
uint64_t sync_version = 0;
auto changes1 = doc1.;
auto changes2 = doc2.;
doc2.;
doc1.;
// Both nodes have both lines
;
;
Core Concepts
Column-Based Design
Records are stored as maps of columns (field names) to values. Each column has independent version tracking:
// Rust
Why column-based?
- Conflicts resolved per-field, not per-record
- Only changed columns need syncing
- Natural fit for structured data (forms, database records)
Conflict Resolution
Conflicts are resolved deterministically using a three-tier comparison:
- Column Version (per-field edit counter) - Higher wins
- Database Version (global logical clock) - Higher wins
- Node ID (unique node identifier) - Higher wins (tie-breaker)
This ordering ensures:
- Field-level granularity: Each field resolves independently
- Causal ordering: Logical clocks prevent out-of-order updates
- Determinism: All nodes converge to identical state
Logical Clocks
Maintains causality using Lamport-style logical clocks:
Important: Always update clock on merge, even for rejected changes (prevents clock drift).
Tombstone-Based Deletion
Deleted records are marked with tombstones rather than immediately removed:
⚠️ Critical: Tombstone Management
Tombstones accumulate indefinitely unless compacted. To prevent memory exhaustion:
- Track which versions have been acknowledged by ALL nodes
- Call
compact_tombstones(min_acknowledged_version)periodically - Never compact early - deleted records will reappear on nodes that haven't seen the deletion yet (zombie records)
Field Deletion
Individual fields can be deleted from records without removing the entire record:
// Rust
let change = crdt.delete_field;
// C++
std::vector<Change<std::string, std::string>> changes;
bool success = crdt.;
How it works:
- Field is removed from
record.fieldsmap ColumnVersionentry remains inrecord.column_versions(acts as implicit field tombstone)- Syncs to other nodes as
Change { col_name: Some("field"), value: None } - Field deletion is versioned like field updates (increments
col_version)
Distinguished from null values:
- Field deletion: Field removed entirely from map
- Null value: Field exists in map with null/None value (if V = Option)
Fractional Positioning (Text CRDT)
Each line in the text CRDT has a position defined by a path of integers:
;
Properties:
- Total ordering with infinite density
- Can always insert between any two positions
- Automatically extends depth when space runs out
Synchronization
Basic Sync Protocol
// Rust
let changes = node1.get_changes_since;
node2.merge_changes;
// Optionally exclude changes from specific nodes
let excluding = from;
let changes = node1.get_changes_since_excluding;
// C++
auto changes = node1.;
node2.;
// Or use the helper function
uint64_t last_sync = 0;
;
Change Compression
When syncing with parent-child CRDTs or after accumulating many changes:
// Rust
CRDT::compress_changes;
// C++
;
This removes redundant operations (O(n log n)):
- Superseded field updates (same record+column, older version)
- Field updates replaced by record deletions
Advanced Features
Parent-Child Hierarchies
Create temporary overlays or transaction isolation:
// Rust
use Arc;
let parent = new;
let child = CRDTnew;
// Child sees parent data but maintains separate modifications
// C++
auto parent = std::make_shared<CRDT<K, V>>;
CRDT<K, V> ;
// Generate inverse changes to undo child's work
auto revert_changes = child.;
// Compute difference between two CRDTs
auto diff = child.;
Custom Merge Rules
// Rust
;
crdt.merge_changes;
// C++
;
crdt.;
Text CRDT Merge Strategies
Last-Write-Wins (default):
doc.;
Both-Writes-Win (preserve conflicts):
BothWritesWinMergeRule<std::string, std::string> bww;
doc.;
// Check for conflicts
auto line = doc.;
if
⚠️ Auto-Merge (EXPERIMENTAL - DO NOT USE):
The AutoMergingTextRule is currently broken and violates CRDT convergence guarantees. See CLAUDE.md for details.
Security and DoS Protection
Trust Model
⚠️ This is a data structure library, not a complete distributed system. Security must be implemented at higher layers:
- No authentication (accepts any changes)
- No authorization (no access control)
- Assumes all nodes are non-malicious
DoS Mitigation Strategies
-
Tombstone Accumulation
- Track tombstone count:
crdt.tombstone_count() - Set application-level limits
- Compact periodically after all nodes acknowledge a version
- Track tombstone count:
-
Resource Exhaustion
- Implement rate limiting on operations
- Validate and limit key/value sizes
- Set maximum records/columns per record
-
Clock Manipulation
- Malicious nodes can set high
db_versionto win all conflicts - Use authenticated logical clocks in production
- Malicious nodes can set high
Production Recommendations
- Network Layer: Use TLS/encryption for change transmission
- Authentication: Verify node identity (HMAC, digital signatures)
- Rate Limiting: Per-node operation limits
- Input Validation: Sanitize and limit sizes
- Monitoring: Track tombstone growth and memory usage
- Thread Safety: Use
Arc<Mutex<CRDT>>in Rust, external locks in C++
Performance Characteristics
Time Complexity
| Operation | Average Case | Notes |
|---|---|---|
insert_or_update |
O(n) | n = number of fields |
delete_field |
O(1) | HashMap field removal |
delete_record |
O(1) | HashMap record removal |
merge_changes |
O(c) | c = number of changes |
get_changes_since |
O(r × f) | r = records, f = fields (optimized with version bounds) |
compress_changes |
O(n log n) | Uses unstable sort for better performance |
compact_tombstones |
O(t) | t = number of tombstones |
Memory Efficiency
- HashMap-based storage: O(1) average case lookups
- Version boundaries: Skip unchanged records during sync
- Change compression: Remove redundant operations
- Tombstone compaction: Prevent unbounded growth
Limitations
- Thread Safety: Not thread-safe; external synchronization required
- Network Transport: Not included; implement your own sync protocol
- Text CRDT: Auto-merge feature is incomplete (use BWW instead)
- No Encryption: Implement at application/network layer
Migration Between Languages
| C++ | Rust |
|---|---|
CRDT<K, V> crdt(node_id); |
let mut crdt = CRDT::<K, C, V>::new(node_id, None); |
crdt.insert_or_update(id, changes, pair1, pair2); |
let changes = crdt.insert_or_update(&id, vec![pair1, pair2]); |
crdt.delete_field(id, "field", changes); |
if let Some(change) = crdt.delete_field(&id, &"field") { ... } |
crdt.delete_record(id, changes); |
if let Some(change) = crdt.delete_record(&id) { ... } |
crdt.merge_changes(std::move(changes)); |
crdt.merge_changes(changes, &DefaultMergeRule); |
auto* record = crdt.get_record(id); |
let record = crdt.get_record(&id); |
Documentation
CLAUDE.md- Comprehensive technical documentation for developers (and Claude!)- See inline documentation in source files
Future Enhancements
Planned
- Tombstone garbage collection improvements
- Custom merge functions for specialized use cases
- Text CRDT: Fix auto-merge algorithm
- Text CRDT: Implement move operations
- Text CRDT: Position rebalancing
Rust-Specific Possibilities
-
asyncsupport for network operations - Optional resource limits (max records, max tombstones)
Completed
-
serdeintegration for serialization (v0.2.0) - WebAssembly support via
no_std+alloc(v0.2.0)
Testing
# Rust
# C++ - Column CRDT
&&
# C++ - Text CRDT
&&
Contributing
Contributions are welcome! Please ensure:
Rust:
- All tests pass (
cargo test) - Code is formatted (
cargo fmt) - No clippy warnings (
cargo clippy) - Maintain feature parity with C++
C++:
- Requires C++20 compatible compiler
- All tests pass
- Maintain feature parity with Rust
License
This project is licensed under the MIT License - see the LICENSE file for details.
References
CRDT Theory
Text CRDTs
- LSEQ: Adaptive Structure for Sequences
- Logoot: Scalable Optimistic Replication
- Fractional Indexing - Figma
Logical Clocks
CRDT-Lite offers a streamlined approach to conflict-free replicated data types, balancing simplicity and efficiency. By focusing on fine-grained conflict resolution and deterministic merge semantics, CRDT-Lite is well-suited for applications requiring scalability and low overhead in distributed environments.