# Holt 0.4: Multi-Tree DB, Verus Verification, and Crash Soak Testing
*May 2026*
[`holt`](https://github.com/feichai0017/holt) is an embedded
persistent Adaptive Radix Tree (ART) engine designed for
path-shaped metadata workloads — S3 object catalogs, POSIX
filesystem metadata, container image layer indexes, and AI
artifact registries. It packs variable-size ART nodes into
self-describing 512 KB blob frames with a cross-blob walker
abstraction, with LeanStore-style hybrid latching for
concurrency.
The 0.4 cycle (`0.4.0`, `0.4.1`, and `0.4.2`) is the most significant
release since the initial v0.3 line. This note covers what's
new, why each piece exists, and lessons from shipping it.
## TL;DR
Five things landed in 0.4:
1. **Multi-tree DB layer with cross-tree atomic batches.** A
`DB` now hosts multiple named `Tree`s, with `DB::atomic`
making cross-tree mutations all-or-nothing under one
envelope WAL record.
2. **Verus formal verification.** A separate `verified/`
subcrate models the ART shape invariants and the DB
catalog state machine. The DB catalog model proves
visibility-transition correctness for tree
create/drop/finalize.
3. **`SIGKILL`-based crash soak harness.** A new `holt-soak`
tool runs five workload modes including `crash` and
`db-crash` — the parent process repeatedly `SIGKILL`s a
child writer and verifies every `fsync`-acknowledged
operation replays on reopen.
4. **Cold-read mitigations.** Route anchor residency keeps
root and hot prefix anchors pinned through eviction; a
TinyLFU admission filter replaces pure LRU on the
file-backed buffer pool.
5. **Runtime telemetry.** Cache hit/miss, WAL queue / write /
flush progress, checkpoint debt, dirty blob counts,
route-cache behaviour, admission/eviction decisions all
exposed via `Tree::stats()` / `DB::stats()`.
## 1. From `Tree` to `DB`
In 0.3, a Holt store was a single `Tree`. Real metadata
backends typically have multiple logical namespaces —
buckets vs. objects vs. multipart uploads in an S3
implementation, or dentry vs. inode vs. xattr in a POSIX
filesystem. The 0.3 workaround was prefix segmentation
inside one tree; the new 0.4 model is a proper multi-tree
DB:
```rust
use holt::{DB, DBAtomicBatch};
let db = DB::open("/var/lib/myapp/meta.holt")?;
db.create_tree("buckets")?;
db.create_tree("objects")?;
db.create_tree("multipart-uploads")?;
// Cross-tree atomic batch: bucket update + object create
// + multipart cleanup, all or nothing under one WAL record.
.compare_and_put(b"acme", bucket_version, &updated_bucket);
b.tree("objects")
.put_if_absent(b"acme/path/to/object", &object_meta);
b.tree("multipart-uploads")
.delete(b"acme/upload-id");
})?;
```
The preflight/apply protocol is overlay-based: the closure
buffers ops, the engine walks the overlay under
`maintenance_gate.enter_exclusive()` to evaluate every guard
(`put_if_absent`, `compare_and_put`, `delete_if_version`,
`assert_prefix_empty`) against the projected state, and
only emits the envelope WAL record if every guard passes.
This gives etcd-Txn-class semantics in an embedded library.
## 2. Verus formal verification
The 0.3.2 release added a small Verus model for the ART
shape invariants — node capacity classes, grow/shrink
hysteresis, prefix splits, delimiter rollup bounds, leaf
extent alignment. 0.4 extends this to cover the new DB
catalog state machine:
* Only live trees are visible catalog entries.
* Dropped trees stay hidden until finalised.
* User tree-id allocation is monotonic.
* The reserved catalog id is never handed out to a user
tree.
Verus lives in a separate subcrate so the production build
doesn't depend on it; `cargo verus build` runs the proof
checker, and the existing `cargo test` path is unaffected.
This isn't full ART correctness — it's a deliberately small
set of invariants that the implementation must respect,
chosen so each spec corresponds to a real invariant in
`src/`. The point is to harden the highest-risk pieces
(catalog visibility transitions, node shape preservation)
rather than to formalise everything.
## 3. Crash soak testing
The `tools/soak/` harness runs five workload modes:
* `normal` — multi-threaded read/write/delete + key-only
prefix scan + atomic batch + checkpoint + reopen + oracle
verification.
* `db-normal` — same protocol but on a multi-tree DB with
cross-tree atomic batches.
* `crash` — parent process repeatedly starts a child writer,
kills it with `SIGKILL`, reopens the tree, and verifies
every operation the child acknowledged in `soak-ack.log`
actually replayed.
* `db-crash` — same kill/reopen protocol, but each ack'd
operation is one cross-tree `DB::atomic` transaction
recorded in `soak-db-ack.log`.
* `child` — internal mode used by `crash` / `db-crash`.
The interesting ones are the `crash` family. The harness
maintains an append-only ack log of operations the child
`fsync`ed; after each kill, the parent reopens the store
and replays the log to check that every ack'd op survived.
This catches the class of bugs where a write returns `Ok`
to the caller but the WAL record didn't actually make it
to disk — a category of bugs that fully escape unit tests.
Inspired by the TigerBeetle / Jepsen-style testing
discipline, scaled to a single-developer project.
## 4. Cold-read mitigations
A 512 KB blob is great when locality lets you amortise the
read across thousands of leaves — `list_dir`, prefix scans,
and burst writes all benefit. It's not great when your
working set is much larger than your buffer pool and access
patterns are uniform: pulling 512 KB for one leaf is
wasteful.
0.4 ships two mitigations:
* **Route anchor residency.** The root blob and frequently
walked prefix-anchor blobs stay pinned through ordinary
leaf eviction. Tree-level invariant: the descent path for
hot prefixes stays in cache even when the total working
set thrashes.
* **TinyLFU admission for file-backed cache.** Pure LRU
thrashes badly when a single full-tree scan touches every
blob once. TinyLFU's frequency sketch + admission policy
rejects one-shot scans from polluting the cache,
protecting actual hot blobs.
Together these reduce p99 latency variance on prefix-heavy
metadata reads under buffer-pool pressure.
## 5. Runtime telemetry
Production deployment needs observability. 0.4 exposes:
```rust
let stats = db.stats()?;
println!("cache hit rate: {:.2}%",
stats.bm.cache_hit_rate() * 100.0);
println!("WAL queue depth: {}", stats.journal.queued_records);
println!("checkpoint debt blobs: {}", stats.checkpointer.debt);
println!("route cache hits: {}", stats.route_cache.hits);
```
These are read-only sketches over the existing internal
counters — no impact on hot paths.
## What's next
* **Predicate-native primitives.** Extending the storage
layer with crash-safe subtree summaries (live count,
negative filter, component summary) so high-level metadata
operations like `assert_prefix_empty` and delimiter rollup
answer in O(depth), not O(leaves). This is the subject of
an in-progress HotStorage 2026 submission.
* **External integration.** HoltFS — a Rust-C++ FFI wrapper
exposing Holt indexes inside DuckDB — was merged into
`duckdb/community-extensions` in May. DuckDB users can
`INSTALL holtfs`.
## Try it
```toml
[dependencies]
holt = "0.4.2"
```
Examples: [`basic_kv`](https://github.com/feichai0017/holt/blob/main/examples/basic_kv.rs),
[`filesystem_meta`](https://github.com/feichai0017/holt/blob/main/examples/filesystem_meta.rs),
[`s3_metadata`](https://github.com/feichai0017/holt/blob/main/examples/s3_metadata.rs).
Architecture deep-dive (~410 lines):
[`ARCHITECTURE.md`](https://github.com/feichai0017/holt/blob/main/ARCHITECTURE.md).
Feedback / issues / contributions all welcome. The
`good first issue` label has a handful of accessible
starting points.
---
GitHub: <https://github.com/feichai0017/holt>
crates.io: <https://crates.io/crates/holt>
docs.rs: <https://docs.rs/holt>