txn-db 1.0.0

MVCC transaction engine for Rust storage layers. It provides snapshot-isolated and serializable transactions with multi-version concurrency control, conflict detection, and a durable transaction log built on wal-db. It is the transaction layer for embedded databases and Hive DB.

Available now (1.0, feature-complete):

  • MVCC — each write creates a new version; readers see a consistent snapshot without blocking writers
  • Snapshot isolation — a transaction reads the database as of its start timestamp; its own writes are visible to itself before commit
  • Serializable (SSI) — opt-in read-set validation under the serializable feature, rejecting write skew and the read-only anomaly
  • Durable commit log — under the durability feature, Db::open logs each commit to a wal-db write-ahead log and syncs before acknowledging; the log is replayed on restart
  • Garbage collection — Db::collect_garbage reclaims versions no live transaction or snapshot can observe; an oldest-reader watermark guarantees a held snapshot's versions are never reclaimed
  • Write-write conflict detection — first-committer-wins at commit; the later writer is told to retry with a typed, retryable error
  • Sharded commit path — lock-free timestamp allocation and per-shard conflict checks, so commits to unrelated keys do not contend (loom-checked)
  • Pluggable backing store — the version store is the VersionStore trait; an in-memory store ships, and any backend (an LSM tree, a B-tree, a remote store) plugs in unchanged
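
To make the MVCC and snapshot bullets concrete, here is a minimal standalone sketch of version-per-write storage with snapshot reads. It uses only the standard library and is not txn-db's implementation; ToyStore, commit_put, snapshot, and get_at are hypothetical names invented for this illustration.

```rust
use std::collections::BTreeMap;

/// Toy multi-version store (hypothetical, not txn-db's internals):
/// each key maps to its versions, keyed by commit timestamp.
#[derive(Default)]
struct ToyStore {
    versions: BTreeMap<Vec<u8>, BTreeMap<u64, Vec<u8>>>,
    last_commit_ts: u64,
}

impl ToyStore {
    /// Commit a write by adding a new version; older versions stay in place.
    fn commit_put(&mut self, key: &[u8], value: &[u8]) -> u64 {
        self.last_commit_ts += 1;
        self.versions
            .entry(key.to_vec())
            .or_default()
            .insert(self.last_commit_ts, value.to_vec());
        self.last_commit_ts
    }

    /// In this sketch a snapshot is just the timestamp it was taken at.
    fn snapshot(&self) -> u64 {
        self.last_commit_ts
    }

    /// Read the newest version committed at or before the snapshot timestamp.
    fn get_at(&self, key: &[u8], snapshot_ts: u64) -> Option<&[u8]> {
        self.versions
            .get(key)?
            .range(..=snapshot_ts)
            .next_back()
            .map(|(_, value)| value.as_slice())
    }
}

/// Usage: a snapshot taken before the second write keeps seeing the first.
fn snapshot_demo() -> (Vec<u8>, Vec<u8>) {
    let mut store = ToyStore::default();
    store.commit_put(b"k", b"v1");
    let snap = store.snapshot();
    store.commit_put(b"k", b"v2"); // does not disturb the older snapshot
    let old = store.get_at(b"k", snap).unwrap().to_vec();
    let new = store.get_at(b"k", store.snapshot()).unwrap().to_vec();
    (old, new)
}
```

Because a write never overwrites an existing version, a reader holding an older timestamp needs no coordination with later writers in this model; it simply ignores versions newer than its snapshot.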

Installation

[dependencies]
txn-db = "1.0"

# Or, instead of the line above, opt into serializable isolation and/or a durable commit log:
txn-db = { version = "1.0", features = ["serializable", "durability"] }

Quick start

For a single read or write, skip the ceremony — get, put, and delete on the database run in their own transaction (writes retry on conflict):

use txn_db::Db;

let db = Db::new();
db.put(b"user:1:name".to_vec(), b"ada".to_vec())?;
assert_eq!(db.get(b"user:1:name")?.as_deref(), Some(&b"ada"[..]));
# Ok::<(), txn_db::TxnError>(())

When several operations must be atomic, open a transaction: begin, read and write through it, commit.

use txn_db::Db;

let db = Db::new();

// Write two keys in one atomic transaction.
let mut tx = db.begin();
tx.put(b"user:1:name".to_vec(), b"ada".to_vec());
tx.put(b"user:1:role".to_vec(), b"admin".to_vec());
tx.commit()?;

// A later transaction reads a consistent snapshot.
let tx = db.begin();
assert_eq!(tx.get(b"user:1:name")?.as_deref(), Some(&b"ada"[..]));
# Ok::<(), txn_db::TxnError>(())

When two transactions race to write the same key, the first to commit wins and the second is told to retry — that is what prevents lost updates:

use txn_db::Db;

let db = Db::new();
let mut a = db.begin();
let mut b = db.begin();
a.put(b"counter".to_vec(), b"1".to_vec());
b.put(b"counter".to_vec(), b"2".to_vec());

a.commit()?;                          // first committer wins
let err = b.commit().unwrap_err();    // second is rejected
assert!(err.is_retryable());          // retry against the fresh snapshot
# Ok::<(), txn_db::TxnError>(())

The retry loop is a few lines; see examples/concurrent_counter.rs for the contended read-modify-write pattern, examples/bank_transfer.rs for an atomic multi-key transfer, and examples/custom_store.rs for plugging in your own VersionStore.
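
As a rough standalone model of first-committer-wins plus a retry loop, the sketch below uses only the standard library. It is not the crate's code and not the contents of examples/concurrent_counter.rs; ToyDb, Conflict, begin_read, increment, and run_counter are hypothetical names, and a single Mutex stands in for the engine's real commit path.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a retryable conflict error.
#[derive(Debug)]
struct Conflict;

/// Toy database: per key, the timestamp of its latest commit and its value.
#[derive(Default)]
struct ToyDb {
    clock: u64,
    data: HashMap<String, (u64, u64)>,
}

impl ToyDb {
    /// Begin: return the start timestamp and the value visible at that point.
    fn begin_read(&self, key: &str) -> (u64, u64) {
        let value = self.data.get(key).map_or(0, |&(_, v)| v);
        (self.clock, value)
    }

    /// Commit one write; the first committer wins.
    fn commit(&mut self, start_ts: u64, key: &str, value: u64) -> Result<(), Conflict> {
        if let Some(&(committed_ts, _)) = self.data.get(key) {
            if committed_ts > start_ts {
                return Err(Conflict); // the key changed after this transaction began
            }
        }
        self.clock += 1;
        self.data.insert(key.to_string(), (self.clock, value));
        Ok(())
    }
}

/// Read-modify-write that retries until its commit is accepted.
/// Returns how many attempts were rejected along the way.
fn increment(db: &Mutex<ToyDb>, key: &str) -> u32 {
    let mut retries = 0;
    loop {
        let (start_ts, current) = db.lock().unwrap().begin_read(key);
        // The lock is released here, so other threads may commit in between.
        if db.lock().unwrap().commit(start_ts, key, current + 1).is_ok() {
            return retries;
        }
        retries += 1; // rejected: re-read and try again
    }
}

/// Many threads increment one key; returns the final counter value.
fn run_counter(threads: usize, per_thread: usize) -> u64 {
    let db = Arc::new(Mutex::new(ToyDb::default()));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let db = Arc::clone(&db);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    increment(&db, "counter");
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let final_value = db.lock().unwrap().begin_read("counter").1;
    final_value
}
```

In this model every accepted commit was computed from the latest committed value, so run_counter(4, 100) ends at 400 however the threads interleave; only the number of retries varies from run to run.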

Serializable isolation

Snapshot isolation still allows write skew: two transactions that read the same rows and write different ones can both commit, breaking an invariant that ties those rows together. With the serializable feature, begin_serializable validates a transaction's read set at commit and rejects exactly those cases.

# #[cfg(feature = "serializable")]
# {
use txn_db::Db;

let db = Db::new();
let mut seed = db.begin();
seed.put(b"on_call:alice".to_vec(), vec![1]);
seed.put(b"on_call:bob".to_vec(), vec![1]);
seed.commit()?;

// Both read the pair, then each takes one row off — classic write skew.
let mut t1 = db.begin_serializable();
let mut t2 = db.begin_serializable();
let _ = (t1.get(b"on_call:alice")?, t1.get(b"on_call:bob")?);
let _ = (t2.get(b"on_call:alice")?, t2.get(b"on_call:bob")?);
t1.put(b"on_call:alice".to_vec(), vec![0]);
t2.put(b"on_call:bob".to_vec(), vec![0]);

t1.commit()?;                          // first commits
assert!(t2.commit().is_err());         // second read a row t1 changed — rejected
# }
# Ok::<(), txn_db::TxnError>(())

See examples/serializable_doctors.rs for the full on-call-doctors demonstration, side by side under both isolation levels.
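
The difference between the two isolation levels can be modelled in a few lines. The sketch below is a simplified commit-time read-set check, not txn-db's SSI implementation, and it tracks only which keys were read and written, not their values. ToyDb, ToyTx, changed_since, and write_skew are hypothetical names for this illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Toy database that remembers only when each key was last committed.
#[derive(Default)]
struct ToyDb {
    clock: u64,
    last_commit: HashMap<&'static str, u64>,
}

/// A transaction records its start timestamp and the keys it read and wrote.
struct ToyTx {
    start_ts: u64,
    reads: HashSet<&'static str>,
    writes: HashSet<&'static str>,
}

impl ToyDb {
    fn begin(&self) -> ToyTx {
        ToyTx {
            start_ts: self.clock,
            reads: HashSet::new(),
            writes: HashSet::new(),
        }
    }

    /// True if `key` was committed after timestamp `ts`.
    fn changed_since(&self, key: &str, ts: u64) -> bool {
        self.last_commit.get(key).map_or(false, |&committed| committed > ts)
    }

    /// `validate_reads = false` models snapshot isolation (write set only);
    /// `validate_reads = true` also rejects a transaction whose reads went stale.
    fn commit(&mut self, tx: &ToyTx, validate_reads: bool) -> bool {
        let write_conflict = tx.writes.iter().any(|k| self.changed_since(k, tx.start_ts));
        let read_conflict =
            validate_reads && tx.reads.iter().any(|k| self.changed_since(k, tx.start_ts));
        if write_conflict || read_conflict {
            return false;
        }
        self.clock += 1;
        for key in &tx.writes {
            self.last_commit.insert(*key, self.clock);
        }
        true
    }
}

/// Two transactions read both rows, then each writes a different one.
/// Returns whether (t1, t2) were allowed to commit.
fn write_skew(validate_reads: bool) -> (bool, bool) {
    let mut db = ToyDb::default();
    let mut t1 = db.begin();
    let mut t2 = db.begin();
    for tx in [&mut t1, &mut t2] {
        tx.reads.insert("on_call:alice");
        tx.reads.insert("on_call:bob");
    }
    t1.writes.insert("on_call:alice");
    t2.writes.insert("on_call:bob");
    (db.commit(&t1, validate_reads), db.commit(&t2, validate_reads))
}
```

With write_skew(false) both commits are accepted because the write sets are disjoint; with write_skew(true) the second is rejected because it read a key the first one changed. This simple check can also reject some histories that are in fact serializable, a trade-off a production engine may handle differently.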

Durability

With the durability feature, Db::open backs the database with a wal-db write-ahead log. Each commit's record is appended and synced before commit returns, so an acknowledged commit survives a crash; on restart the log is replayed and uncommitted work leaves no trace.

# #[cfg(feature = "durability")]
# {
# let dir = tempfile::tempdir().unwrap();
# let path = dir.path().join("txn.wal");
use txn_db::Db;

// First run: commit, then the process exits.
{
    let db = Db::open(&path)?;
    let mut tx = db.begin();
    tx.put(b"k".to_vec(), b"v".to_vec());
    tx.commit()?;
}

// Restart: the log is replayed and the committed write is back.
let db = Db::open(&path)?;
assert_eq!(db.begin().get(b"k")?.as_deref(), Some(&b"v"[..]));
# }
# Ok::<(), txn_db::TxnError>(())

See examples/durable_store.rs for a commit / drop / reopen walkthrough.
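
The append, sync, replay cycle described in this section can be sketched with the standard library alone. This is an illustration under stated assumptions, not wal-db's API and not the format in docs/COMMIT_LOG_FORMAT.md: the length-prefixed record layout, the one-write-per-commit simplification, the policy of ignoring a truncated trailing record, and the names log_commit, read_field, and replay are all invented here.

```rust
use std::collections::HashMap;
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Append one commit record (length-prefixed key and value) and force it to
/// disk. The caller treats the commit as acknowledged only if this returns Ok.
fn log_commit(path: &Path, key: &[u8], value: &[u8]) -> io::Result<()> {
    let mut record = Vec::with_capacity(8 + key.len() + value.len());
    for field in [key, value] {
        record.extend_from_slice(&(field.len() as u32).to_le_bytes());
        record.extend_from_slice(field);
    }
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(&record)?;
    file.sync_data()
}

/// Read one length-prefixed field, advancing `pos`; None if the log ends early.
fn read_field<'a>(log: &'a [u8], pos: &mut usize) -> Option<&'a [u8]> {
    let header = log.get(*pos..*pos + 4)?;
    let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
    let start = *pos + 4;
    let field = log.get(start..start + len)?;
    *pos = start + len;
    Some(field)
}

/// Rebuild in-memory state by replaying every complete record in order.
/// A missing file is an empty log; a truncated final record is ignored
/// (an assumption of this sketch).
fn replay(path: &Path) -> io::Result<HashMap<Vec<u8>, Vec<u8>>> {
    let log = match fs::read(path) {
        Ok(bytes) => bytes,
        Err(e) if e.kind() == io::ErrorKind::NotFound => Vec::new(),
        Err(e) => return Err(e),
    };
    let mut state = HashMap::new();
    let mut pos = 0;
    loop {
        let key = match read_field(&log, &mut pos) {
            Some(k) => k,
            None => break,
        };
        let value = match read_field(&log, &mut pos) {
            Some(v) => v,
            None => break,
        };
        state.insert(key.to_vec(), value.to_vec());
    }
    Ok(state)
}
```

In this model a transaction that never reached log_commit has no record in the file, so replay cannot resurrect it; only work that was synced before being acknowledged comes back.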

Garbage collection

Every write keeps the previous version so that in-flight readers see a stable snapshot, which means versions accumulate. Db::collect_garbage reclaims the versions no live transaction or snapshot can still observe and returns how many it removed. A held snapshot pins the versions it can see, so collection never reclaims data a live reader depends on.

use txn_db::Db;

let db = Db::new();
for v in 0..100u8 {
    let mut tx = db.begin();
    tx.put(b"k".to_vec(), vec![v]);
    tx.commit()?;
}

// With no snapshot held, only the newest version need be kept.
let reclaimed = db.collect_garbage();
assert!(reclaimed > 0);
assert_eq!(db.begin().get(b"k")?.as_deref(), Some(&[99u8][..]));
# Ok::<(), txn_db::TxnError>(())

See examples/garbage_collection.rs for a demonstration of a held snapshot pinning versions against collection.
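
The watermark rule can be illustrated on a single key's version chain. The following is a standalone sketch, not the crate's collector: Chain, watermark, and collect_chain are hypothetical names, and timestamps are plain u64 values.

```rust
use std::collections::BTreeMap;

/// Versions of a single key, keyed by commit timestamp (toy model).
type Chain = BTreeMap<u64, Vec<u8>>;

/// The watermark is the oldest start timestamp among live readers, or the
/// latest commit timestamp when no reader is live.
fn watermark(live_readers: &[u64], latest_commit: u64) -> u64 {
    live_readers.iter().copied().min().unwrap_or(latest_commit)
}

/// Drop every version that no reader at or after `watermark` can observe.
/// Returns how many versions were removed.
fn collect_chain(chain: &mut Chain, watermark: u64) -> usize {
    // The newest version at or below the watermark is what the oldest reader
    // sees, so it must survive. Everything older is unreachable.
    let keep_from = match chain.range(..=watermark).next_back() {
        Some((&ts, _)) => ts,
        None => return 0, // every version is newer than the watermark
    };
    let before = chain.len();
    // split_off returns the entries with timestamp >= keep_from.
    let kept = chain.split_off(&keep_from);
    *chain = kept;
    before - chain.len()
}

/// Usage: 100 versions, one reader pinned at timestamp 40, then none.
fn gc_demo() -> (usize, usize) {
    let mut chain: Chain = (1..=100u64).map(|ts| (ts, vec![ts as u8])).collect();
    let with_reader = collect_chain(&mut chain, watermark(&[40], 100));
    let without_reader = collect_chain(&mut chain, watermark(&[], 100));
    (with_reader, without_reader)
}
```

While the reader at timestamp 40 is live, only versions 1 through 39 are reclaimed; once it is gone the watermark advances to 100 and everything but the newest version goes, so gc_demo returns (39, 60).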

Examples

Example               What it shows
quick_start           Shortest end-to-end: open, write, read back.
bank_transfer         Atomic multi-key update with conflict retries.
concurrent_counter    Many threads increment one key; no update is lost.
snapshot_reads        A snapshot stays stable as the database moves on.
custom_store          Backing the engine with a custom VersionStore.
serializable_doctors  Write skew under SI vs serializable (needs --features serializable).
durable_store         Commit, drop, reopen — recovery from the log (needs --features durability).
garbage_collection    Reclaiming old versions; a held snapshot pins what it can see.

cargo run --example quick_start
cargo run --example garbage_collection
cargo run --example serializable_doctors --features serializable
cargo run --example durable_store --features durability

Status

This is 1.0 stable. The engine is feature-complete (snapshot and serializable isolation, sharded lock-free commits, a durable wal-db commit log, watermark garbage collection), tuned, hardened against adversarial schedules, and benchmarked honestly. The public API is frozen until 2.0 and the durable commit-log format is frozen for the 1.x series (docs/COMMIT_LOG_FORMAT.md). See docs/API.md for the full surface and docs/PERFORMANCE.md for hot-path and comparison numbers.

Where It Fits

txn-db is the transaction layer. Its place in the stack:

  • wal-db — durable transaction commit log
  • lsm-db — a natural backing version store
  • Hive DB — the transaction orchestration layer (DISTRO) builds on these semantics

It stays foreign-compatible: it is usable standalone over any version store that implements the VersionStore trait.
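
To show what trait-based pluggability looks like in general, here is a small standalone sketch. The trait below is deliberately named ToyVersionStore because its methods are assumptions made for this illustration; txn-db's actual VersionStore trait may have a different shape (see docs/API.md and examples/custom_store.rs). MemStore and ToyEngine are likewise hypothetical.

```rust
use std::collections::BTreeMap;

/// Hypothetical storage trait for this sketch; not txn-db's VersionStore.
trait ToyVersionStore {
    fn put_version(&mut self, key: &[u8], commit_ts: u64, value: Vec<u8>);
    fn get_at(&self, key: &[u8], snapshot_ts: u64) -> Option<Vec<u8>>;
}

/// In-memory backend using a composite (key, timestamp) ordering, the same
/// layout an ordered on-disk structure could use.
#[derive(Default)]
struct MemStore {
    versions: BTreeMap<(Vec<u8>, u64), Vec<u8>>,
}

impl ToyVersionStore for MemStore {
    fn put_version(&mut self, key: &[u8], commit_ts: u64, value: Vec<u8>) {
        self.versions.insert((key.to_vec(), commit_ts), value);
    }

    fn get_at(&self, key: &[u8], snapshot_ts: u64) -> Option<Vec<u8>> {
        // Newest version of `key` with a timestamp at or below the snapshot.
        self.versions
            .range((key.to_vec(), 0)..=(key.to_vec(), snapshot_ts))
            .next_back()
            .map(|(_, value)| value.clone())
    }
}

/// The engine is generic over the store, so it never names a concrete backend.
struct ToyEngine<S: ToyVersionStore> {
    store: S,
    clock: u64,
}

impl<S: ToyVersionStore> ToyEngine<S> {
    fn new(store: S) -> Self {
        ToyEngine { store, clock: 0 }
    }

    /// Assign the next commit timestamp and hand the version to the backend.
    fn commit_put(&mut self, key: &[u8], value: &[u8]) -> u64 {
        self.clock += 1;
        self.store.put_version(key, self.clock, value.to_vec());
        self.clock
    }

    fn get_at(&self, key: &[u8], snapshot_ts: u64) -> Option<Vec<u8>> {
        self.store.get_at(key, snapshot_ts)
    }

    fn get_latest(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.store.get_at(key, self.clock)
    }
}
```

Any other type that implements the trait can be passed to ToyEngine::new in place of MemStore without touching the engine code, which is the property the standalone claim relies on.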

Contributing

Before opening a PR, cargo fmt --all, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-features must be clean. Hot-path changes require a criterion benchmark; correctness-critical paths require property and/or loom tests.