holt 0.5.2 - Docs.rs

# Holt 0.4: Multi-Tree DB, Verus Verification, and Crash Soak Testing

*May 2026*

[`holt`](https://github.com/feichai0017/holt) is an embedded
persistent Adaptive Radix Tree (ART) engine designed for
path-shaped metadata workloads — S3 object catalogs, POSIX
filesystem metadata, container image layer indexes, and AI
artifact registries. It packs variable-size ART nodes into
self-describing 512 KB blob frames with a cross-blob walker
abstraction, with LeanStore-style hybrid latching for
concurrency.

The 0.4 cycle (`0.4.0`, `0.4.1`, and `0.4.2`) is the most significant
release since the initial v0.3 line. This note covers what's
new, why each piece exists, and lessons from shipping it.

## TL;DR

Five things landed in 0.4:

1. **Multi-tree DB layer with cross-tree atomic batches.** A
   `DB` now hosts multiple named `Tree`s, with `DB::atomic`
   making cross-tree mutations all-or-nothing under one
   envelope WAL record.
2. **Verus formal verification.** A separate `verified/`
   subcrate models the ART shape invariants and the DB
   catalog state machine. The DB catalog model proves
   visibility-transition correctness for tree
   create/drop/finalize.
3. **`SIGKILL`-based crash soak harness.** A new `holt-soak`
   tool runs five workload modes including `crash` and
   `db-crash` — the parent process repeatedly `SIGKILL`s a
   child writer and verifies every `fsync`-acknowledged
   operation replays on reopen.
4. **Cold-read mitigations.** Route anchor residency keeps
   root and hot prefix anchors pinned through eviction; a
   TinyLFU admission filter replaces pure LRU on the
   file-backed buffer pool.
5. **Runtime telemetry.** Cache hit/miss, WAL queue / write /
   flush progress, checkpoint debt, dirty blob counts,
   route-cache behaviour, admission/eviction decisions all
   exposed via `Tree::stats()` / `DB::stats()`.

## 1. From `Tree` to `DB`

In 0.3, a Holt store was a single `Tree`. Real metadata
backends typically have multiple logical namespaces —
buckets vs. objects vs. multipart uploads in an S3
implementation, or dentry vs. inode vs. xattr in a POSIX
filesystem. The 0.3 workaround was prefix segmentation
inside one tree; the new 0.4 model is a proper multi-tree
DB:

```rust
use holt::{DB, DBAtomicBatch};

let db = DB::open("/var/lib/myapp/meta.holt")?;

db.create_tree("buckets")?;
db.create_tree("objects")?;
db.create_tree("multipart-uploads")?;

// Cross-tree atomic batch: bucket update + object create
// + multipart cleanup, all or nothing under one WAL record.
db.atomic(|b: &mut DBAtomicBatch| {
    b.tree("buckets")
     .compare_and_put(b"acme", bucket_version, &updated_bucket);
    b.tree("objects")
     .put_if_absent(b"acme/path/to/object", &object_meta);
    b.tree("multipart-uploads")
     .delete(b"acme/upload-id");
})?;
```

The preflight/apply protocol is overlay-based: the closure
buffers ops, the engine walks the overlay under
`maintenance_gate.enter_exclusive()` to evaluate every guard
(`put_if_absent`, `compare_and_put`, `delete_if_version`,
`assert_prefix_empty`) against the projected state, and
only emits the envelope WAL record if every guard passes.
This gives etcd-Txn-class semantics in an embedded library.

## 2. Verus formal verification

The 0.3.2 release added a small Verus model for the ART
shape invariants — node capacity classes, grow/shrink
hysteresis, prefix splits, delimiter rollup bounds, leaf
extent alignment. 0.4 extends this to cover the new DB
catalog state machine:

* Only live trees are visible catalog entries.
* Dropped trees stay hidden until finalised.
* User tree-id allocation is monotonic.
* The reserved catalog id is never handed out to a user
  tree.

Verus lives in a separate subcrate so the production build
doesn't depend on it; `cargo verus build` runs the proof
checker, and the existing `cargo test` path is unaffected.

This isn't full ART correctness — it's a deliberately small
set of invariants that the implementation must respect,
chosen so each spec corresponds to a real invariant in
`src/`. The point is to harden the highest-risk pieces
(catalog visibility transitions, node shape preservation)
rather than to formalise everything.

## 3. Crash soak testing

The `tools/soak/` harness runs five workload modes:

* `normal` — multi-threaded read/write/delete + key-only
  prefix scan + atomic batch + checkpoint + reopen + oracle
  verification.
* `db-normal` — same protocol but on a multi-tree DB with
  cross-tree atomic batches.
* `crash` — parent process repeatedly starts a child writer,
  kills it with `SIGKILL`, reopens the tree, and verifies
  every operation the child acknowledged in `soak-ack.log`
  actually replayed.
* `db-crash` — same kill/reopen protocol, but each ack'd
  operation is one cross-tree `DB::atomic` transaction
  recorded in `soak-db-ack.log`.
* `child` — internal mode used by `crash` / `db-crash`.

The interesting ones are the `crash` family. The harness
maintains an append-only ack log of operations the child
`fsync`ed; after each kill, the parent reopens the store
and replays the log to check that every ack'd op survived.
This catches the class of bugs where a write returns `Ok`
to the caller but the WAL record didn't actually make it
to disk — a category of bugs that fully escape unit tests.

Inspired by the TigerBeetle / Jepsen-style testing
discipline, scaled to a single-developer project.

## 4. Cold-read mitigations

A 512 KB blob is great when locality lets you amortise the
read across thousands of leaves — `list_dir`, prefix scans,
and burst writes all benefit. It's not great when your
working set is much larger than your buffer pool and access
patterns are uniform: pulling 512 KB for one leaf is
wasteful.

0.4 ships two mitigations:

* **Route anchor residency.** The root blob and frequently
  walked prefix-anchor blobs stay pinned through ordinary
  leaf eviction. Tree-level invariant: the descent path for
  hot prefixes stays in cache even when the total working
  set thrashes.
* **TinyLFU admission for file-backed cache.** Pure LRU
  thrashes badly when a single full-tree scan touches every
  blob once. TinyLFU's frequency sketch + admission policy
  rejects one-shot scans from polluting the cache,
  protecting actual hot blobs.

Together these reduce p99 latency variance on prefix-heavy
metadata reads under buffer-pool pressure.

## 5. Runtime telemetry

Production deployment needs observability. 0.4 exposes:

```rust
let stats = db.stats()?;
println!("cache hit rate: {:.2}%",
    stats.bm.cache_hit_rate() * 100.0);
println!("WAL queue depth: {}", stats.journal.queued_records);
println!("checkpoint debt blobs: {}", stats.checkpointer.debt);
println!("route cache hits: {}", stats.route_cache.hits);
```

These are read-only sketches over the existing internal
counters — no impact on hot paths.

## What's next

* **Predicate-native primitives.** Extending the storage
  layer with crash-safe subtree summaries (live count,
  negative filter, component summary) so high-level metadata
  operations like `assert_prefix_empty` and delimiter rollup
  answer in O(depth), not O(leaves). This is the subject of
  an in-progress HotStorage 2026 submission.
* **External integration.** HoltFS — a Rust-C++ FFI wrapper
  exposing Holt indexes inside DuckDB — was merged into
  `duckdb/community-extensions` in May. DuckDB users can
  `INSTALL holtfs`.

## Try it

```toml
[dependencies]
holt = "0.4.2"
```

Examples: [`basic_kv`](https://github.com/feichai0017/holt/blob/main/examples/basic_kv.rs),
[`filesystem_meta`](https://github.com/feichai0017/holt/blob/main/examples/filesystem_meta.rs),
[`s3_metadata`](https://github.com/feichai0017/holt/blob/main/examples/s3_metadata.rs).

Architecture deep-dive (~410 lines):
[`ARCHITECTURE.md`](https://github.com/feichai0017/holt/blob/main/ARCHITECTURE.md).

Feedback / issues / contributions all welcome. The
`good first issue` label has a handful of accessible
starting points.

---

GitHub: <https://github.com/feichai0017/holt>  
crates.io: <https://crates.io/crates/holt>  
docs.rs: <https://docs.rs/holt>