Skip to main content

Crate cardano_lsm

Crate cardano_lsm 

Source
Expand description

§Cardano LSM - Log-Structured Merge Tree for Blockchain Indexing

A pure Rust implementation of an LSM tree optimized for blockchain indexing workloads, particularly UTxO-based systems like Cardano. This crate provides fast snapshots, rollback capabilities, and efficient range queries without requiring a Haskell runtime.

§Quick Start

use cardano_lsm::{LsmTree, LsmConfig, Key, Value};

// Open or create an LSM tree
let config = LsmConfig::default();
let mut tree = LsmTree::open(db_path, config)?;

// Insert key-value pairs
let key = Key::from(b"utxo_123");
let value = Value::from(b"transaction_data");
tree.insert(&key, &value)?;

// Retrieve values
if let Some(v) = tree.get(&key)? {
    println!("Found: {:?}", v);
}

// Delete keys
tree.delete(&key)?;

§Common Use Cases

§Blockchain Indexing with Snapshots

The primary use case is maintaining blockchain state with the ability to quickly roll back during chain reorganizations:

// Take a snapshot before processing a new block
let snapshot = tree.snapshot();

// Process block transactions
tree.insert(&Key::from(b"utxo_new"), &Value::from(b"data"))?;
tree.delete(&Key::from(b"utxo_spent"))?;

// If block is invalid or reorg occurs, rollback
tree.rollback(snapshot)?;
// Tree is now back to the snapshot state

§Range Queries

Efficiently scan ranges of keys, useful for querying all UTxOs for an address:

let start = Key::from(b"addr_123_");
let end = Key::from(b"addr_124_");

for (key, value) in tree.range(&start, &end) {
    // Process each key-value pair in range
    println!("Key: {:?}, Value: {:?}", key, value);
}

§Batch Operations

For high-throughput scenarios, use batch operations to amortize I/O costs:

let batch = vec![
    (Key::from(b"key1"), Value::from(b"value1")),
    (Key::from(b"key2"), Value::from(b"value2")),
];
tree.insert_batch(batch)?;

// Deletions are done separately
tree.delete(&Key::from(b"key3"))?;

§Configuration

Customize the LSM tree behavior with LsmConfig:

use cardano_lsm::LsmConfig;

let config = LsmConfig {
    memtable_size: 16 * 1024 * 1024,  // 16 MB memtable
    bloom_filter_bits_per_key: 12,     // Bits per key for bloom filter
    ..Default::default()
};

§Persistent Snapshots

Save and restore snapshots for backup or testing:

// Save current state to disk
tree.save_snapshot("block_12345", "after block 12345")?;
drop(tree); // Close the tree

// Later, restore from the saved snapshot
let tree = LsmTree::open_snapshot(db_path, "block_12345")?;

§Performance Characteristics

  • Snapshot creation: < 10ms (reference counting, no data copy)
  • Rollback: < 1s (typically used for short-term reorgs)
  • Write throughput: Optimized for blockchain workloads with batch operations
  • Read latency: Bloom filters provide fast negative lookups
  • Compaction: LazyLevelling strategy balances write amp and space amp

§Thread Safety

LsmTree uses internal locking and can be safely shared across threads:

// Wrap in Arc to share across threads (reads are thread-safe)
let tree = Arc::new(tree);

let tree_a = tree.clone();
let tree_b = tree.clone();

let ha = thread::spawn(move || tree_a.get(&Key::from(b"key")));
let hb = thread::spawn(move || tree_b.get(&Key::from(b"key")));

ha.join().unwrap()?;
hb.join().unwrap()?;

§Testing

This implementation includes 10,000+ property-based conformance tests validated against the Haskell reference implementation with a 100% pass rate.

§Important Notes

  • No Write-Ahead Log (WAL): Writes are lost on crash until a snapshot is saved. This design choice trades crash recovery for simplicity and performance.
  • Session locking: Only one process can access a database directory at a time.
  • Snapshots are reference-counted: Immutable view of data at a point in time.

Structs§

Hash
IncrementalMerkleTree
Key
LsmConfig
LsmSnapshot
LsmStats
LsmTree
MerkleDiff
MerkleLeaf
MerkleProof
MerkleRoot
MerkleSnapshot
MonoidalLsmTree
LSM tree with monoidal value support for efficient aggregation
MonoidalSnapshot
Immutable snapshot of a MonoidalLsmTree
PersistentSnapshot
A persistent snapshot on disk
RangeIter
SnapshotMetadata
Metadata for a persistent snapshot (CBOR format matching Haskell)
SnapshotRun
Information about a single SSTable run in a snapshot
Value

Enums§

CompactionStrategy
Compaction strategy for the LSM tree
Direction
Error

Traits§

Monoidal
Values that can be combined with an associative operation (monoid)

Type Aliases§

Result