# iqdb v0.4.0 — Durable Storage
**It survives a restart now.** v0.4.0 wires `Iqdb::open(path)` through to a directory-backed durable store: a length-prefixed binary snapshot, a write-ahead log of changes since the snapshot, and a cross-platform `full_sync` primitive that takes the strongest sync each operating system exposes — `F_FULLFSYNC` on macOS via a direct `fcntl` call, `fsync(2)` on other Unix, `FlushFileBuffers` on Windows. `Iqdb::close` runs a compaction (snapshot rewrite + atomic rename + WAL truncate) so the next open is a single-file load. Recovery handles corrupted WAL tails by truncating to the last known-good offset. The v0.3.0 CRUD and search surface is unchanged; every method dispatches through a `pub(crate)` `Backend` enum so in-memory and file-backed paths share the same public API with zero dynamic-dispatch cost.
## What is iqdb?
An embedded vector database for Rust — a single-process, in-application similarity-search engine designed for high-dimensional workloads where every microsecond on the query path matters. It targets the same operational shape as `sqlite` or `redb`: no daemon, no network hop, no separate runtime. Open a handle, write vectors, query nearest neighbours — all from inside your binary. The engine is built against a lock-free hot path, an allocation-free steady state, and a cache-aware on-disk layout, with pluggable indices and pluggable storage so workloads can trade recall for latency without rewriting the surrounding application.
## What's new in 0.4.0
### `Iqdb::open(path)` is load-bearing
The simplest possible promotion: what used to be `Err(Error::NotImplemented)` is now a fully wired durable open path. Pass a directory; iqdb creates or opens it; records survive process restart:
```rust,no_run
use iqdb::{Iqdb, Record, RecordId, Result, Vector};

fn run() -> Result<()> {
    // Creates the directory if it does not exist, otherwise reopens it.
    let db = Iqdb::open("./data/my-db")?;
    db.upsert(Record::new(
        RecordId::new(1),
        Vector::new(vec![0.1, 0.2, 0.3])?,
    ))?;
    db.flush()?; // makes the WAL append durable
    db.close() // compacts: snapshot rewrite + WAL truncate
}
```
Subsequent opens of the same path return a handle whose in-memory map is reconstructed from the on-disk snapshot + WAL. The first session's records survive byte-for-byte.
### Directory layout
The store owns two files inside the path you give it:
```text
<path>/
├── snap — most recent durable snapshot (all records)
└── wal — write-ahead log of changes since the snapshot
```
That's the entire on-disk footprint. No journal files and no lock files: single-process safety is a documented limitation in v0.4.0, and multi-process opens of the same path are undefined behaviour. If you need that protection, wrap `Iqdb::open` in an application-level lock. There is no background compaction thread either; compaction runs once per `close`.
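One way such an application-level lock could look is a sibling lock file created with `create_new`, which fails if the file already exists. This is a minimal sketch using only the standard library; the `DirLock` type and the `<path>.lock` naming are hypothetical and not part of iqdb.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical guard: holds an exclusive lock file for as long as it lives.
pub struct DirLock {
    lock_path: PathBuf,
}

impl DirLock {
    /// Tries to take the lock for `db_path` by creating `<db_path>.lock`.
    /// `create_new` fails with `AlreadyExists` if another holder got there first.
    pub fn acquire(db_path: &Path) -> io::Result<DirLock> {
        let mut name = db_path.as_os_str().to_owned();
        name.push(".lock");
        let lock_path = PathBuf::from(name);
        OpenOptions::new()
            .write(true)
            .create_new(true)
            .open(&lock_path)?;
        Ok(DirLock { lock_path })
    }
}

impl Drop for DirLock {
    fn drop(&mut self) {
        // Best effort only: a crashed process leaves a stale lock file behind,
        // which has to be removed by hand before the next acquire succeeds.
        let _ = fs::remove_file(&self.lock_path);
    }
}
```

The caller would acquire the guard first, then call `Iqdb::open`, and keep the guard alive until after `close`. The lock file lives next to the store directory rather than inside it, so the two-file layout above is left untouched.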
### Write path — WAL append, then memory update
Every `upsert` and `delete` runs the same sequence:
1. Encode the op into a framed binary entry (length prefix + body + CRC32).
2. Append the entry to the WAL file.
3. Apply the change to the in-memory map.
The WAL append happens **before** the memory update. If the append fails (`ENOSPC`, `EIO`, …), the in-memory map is not touched — the operation is reported as failed and no state change is observable to subsequent reads.
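The ordering guarantee can be illustrated with a small stand-in store. This is a sketch under stated assumptions, not iqdb's internals: `SketchStore`, `FullDisk`, and the pre-framed `framed` argument are hypothetical, and any `Write` sink plays the role of the WAL file.

```rust
use std::collections::HashMap;
use std::io::{self, Write};

/// Stand-in store: any `Write` sink plays the role of the WAL file.
pub struct SketchStore<W: Write> {
    wal: W,
    map: HashMap<u64, Vec<f32>>,
}

impl<W: Write> SketchStore<W> {
    pub fn new(wal: W) -> Self {
        SketchStore { wal, map: HashMap::new() }
    }

    /// Appends the already-framed entry first; touches the map only on success.
    pub fn upsert(&mut self, id: u64, vector: Vec<f32>, framed: &[u8]) -> io::Result<()> {
        // Step 2: an append error returns here, leaving the map untouched.
        self.wal.write_all(framed)?;
        // Step 3: only reached after a successful append.
        self.map.insert(id, vector);
        Ok(())
    }

    pub fn get(&self, id: u64) -> Option<&Vec<f32>> {
        self.map.get(&id)
    }
}

/// A sink that always fails, standing in for `ENOSPC` or `EIO`.
pub struct FullDisk;

impl Write for FullDisk {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::Other, "simulated ENOSPC"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

With a `Vec<u8>` as the sink the upsert succeeds and the record is readable; with `FullDisk` the upsert returns an error and a following `get` still returns `None`.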
The WAL append does **not** `fsync` by default. The OS page cache holds the bytes; durability against power loss requires an explicit `flush`. This is similar in spirit to SQLite's `synchronous=NORMAL` setting in WAL mode, and it is the right trade-off for the vast majority of workloads: per-write fsync would multiply small-write latency by 10–100× on rotating media. Per-write sync mode is reserved for a future milestone behind a builder knob.
### Durability contract
A successful `upsert` followed by a successful `flush` is durable across a power cut. An `upsert` whose `flush` has not yet returned may be lost on a crash.
A clean process exit followed by a new process opening the same path sees every `upsert` and `delete` that returned successfully, flushed or not: the page cache survives a clean drop and the WAL replay reconstructs the state. The integration test `upsert_without_flush_or_close_is_still_replayed_on_reopen` pins this behaviour.
### Recovery on open
`Iqdb::open(path)` rebuilds the in-memory map in three steps:
1. Read `<path>/snap` into the map (if it exists).
2. Replay `<path>/wal` on top, applying each framed op in order.
3. Truncate the WAL to the last known-good offset (handles corrupt tails left over from a prior crash).
A corrupt frame mid-WAL stops replay. Records before the corruption are preserved; anything after is discarded as unrecoverable, and the WAL is truncated so the next append is contiguous with the recovered history. The integration test `corrupt_wal_tail_is_truncated_silently` pins the truncation behaviour: appending junk to the WAL after a clean upsert leaves the original record recoverable on next open and the WAL back in a write-ready state.
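Finding the last known-good offset amounts to walking frames until one fails to parse or fails its checksum. The sketch below assumes that `payload_len` counts the body bytes only and that the checksum is the common reflected CRC-32 (polynomial `0xEDB88320`); neither detail is confirmed by these notes, and `scan_wal` and `frame` are hypothetical names.

```rust
/// Bitwise CRC-32 (reflected polynomial 0xEDB88320). Assumed variant.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Builds one frame: length prefix, body, CRC over the body.
pub fn frame(body: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(body.len() + 8);
    out.extend_from_slice(&(body.len() as u32).to_le_bytes());
    out.extend_from_slice(body);
    out.extend_from_slice(&crc32(body).to_le_bytes());
    out
}

/// Walks the WAL bytes and returns the bodies of all valid frames plus the
/// byte offset just past the last valid frame (the last known-good offset).
pub fn scan_wal(wal: &[u8]) -> (Vec<&[u8]>, usize) {
    let mut bodies = Vec::new();
    let mut offset = 0usize;
    loop {
        let rest = &wal[offset..];
        if rest.len() < 4 {
            break; // not even a full length prefix left
        }
        let len = u32::from_le_bytes([rest[0], rest[1], rest[2], rest[3]]) as usize;
        let frame_len = match len.checked_add(8) {
            Some(n) => n,
            None => break,
        };
        if rest.len() < frame_len {
            break; // torn frame: the tail was cut mid-write
        }
        let body = &rest[4..4 + len];
        let c = &rest[4 + len..frame_len];
        let stored = u32::from_le_bytes([c[0], c[1], c[2], c[3]]);
        if stored != crc32(body) {
            break; // checksum mismatch: stop replay here
        }
        bodies.push(body);
        offset += frame_len;
    }
    (bodies, offset)
}
```

A recovery routine built on this would apply the returned bodies in order and then shrink the file to the returned offset, for example with `File::set_len(offset as u64)`.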
A corrupt snapshot is a harder error — it fails the open with `Error::Corrupt { reason }`. The integration test `corrupt_snapshot_surfaces_corrupt_error` writes bytes that fail the magic check and asserts the open fails. Snapshot recovery from a checkpoint chain (older snapshots kept around as fall-back) is a future-milestone concern.
### Close / compact
`Iqdb::close` on a file-backed handle runs a **compaction**:
1. Acquire the write lock so the in-memory map cannot change.
2. Write a fresh snapshot to `<path>/snap.tmp`.
3. `full_sync` the snapshot file.
4. **Atomically** rename `snap.tmp` → `snap` (`MoveFileExW` with `MOVEFILE_REPLACE_EXISTING` on Windows, `rename(2)` on Unix — both single-step on the same volume).
5. Truncate the WAL to zero bytes.
6. `full_sync` the WAL.
After `close` returns, the on-disk state is the snapshot alone. The next open is a single-file load — no WAL replay required. The integration test `close_truncates_wal_to_zero_bytes` pins this behaviour.
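If the process crashes between steps 3 and 4 (snapshot durable but rename not yet visible), the next open sees the previous snapshot and the untruncated WAL, which is the ordinary recovery case. If it crashes between steps 4 and 5, the next open sees the new snapshot together with the untruncated WAL; replaying ops the snapshot already contains leaves the map unchanged, because upserts and deletes by id are idempotent.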
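The file-level part of that sequence can be sketched with the standard library alone. This is an illustration under assumptions, not iqdb's code: `compact` is a hypothetical free function, the snapshot arrives pre-encoded, step 1 (the write lock) is left to the caller, and `File::sync_all` stands in for `full_sync`.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Sketch of the close-time compaction over a store directory.
/// `snapshot_bytes` is the already-encoded snapshot.
pub fn compact(dir: &Path, snapshot_bytes: &[u8]) -> io::Result<()> {
    let tmp = dir.join("snap.tmp");
    let snap = dir.join("snap");
    let wal = dir.join("wal");

    // Steps 2-3: write the fresh snapshot and sync it before it becomes visible.
    let mut f = File::create(&tmp)?;
    f.write_all(snapshot_bytes)?;
    f.sync_all()?; // stand-in for full_sync
    drop(f);

    // Step 4: replace the old snapshot in one step.
    fs::rename(&tmp, &snap)?;

    // Steps 5-6: empty the WAL and sync the truncation.
    let w = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(false)
        .open(&wal)?;
    w.set_len(0)?;
    w.sync_all()?;
    Ok(())
}
```

Because the snapshot is synced before the rename, a reader never observes a `snap` file whose contents are still only in the page cache.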
### Cross-platform `full_sync`
The strongest sync each platform exposes:
| OS | Primitive |
|-----------|--------------------------------------------|
| Linux | `fsync(2)` (via `File::sync_all`) |
| macOS | `fcntl(fd, F_FULLFSYNC, 0)` (direct libc) |
| Windows | `FlushFileBuffers` (via `File::sync_all`) |
| other Unix| `fsync(2)` (via `File::sync_all`) |
macOS is the only platform where a plain `fsync(2)` is not enough: on both Apple silicon and Intel Macs, `fsync` returns once the data has reached the drive's write-back cache, not the platter or flash. The drive controller's cache can be lost on a power cut. `F_FULLFSYNC` asks the drive to flush its cache to durable storage before returning. SQLite, Apple's own Core Data, and most embedded databases use it for the same reason.
The trade-off is throughput: `F_FULLFSYNC` is materially slower than `fsync` on rotating media, and somewhat slower on SSDs. Callers that prioritise throughput over durability can still get the faster `fsync` semantics by syncing the WAL file themselves with a plain `fsync(2)` (the public `Iqdb::flush` always takes the strongest primitive). Note that recent Rust standard library versions issue `F_FULLFSYNC` from `File::sync_all` on Apple targets, so on macOS that call may not be the faster path.
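A cfg-gated helper along the lines of the table above might look like the following. It is a sketch under assumptions, not iqdb's source: the free-function signature, the inline `fcntl` declaration, and the value 51 for `F_FULLFSYNC` (as defined in the macOS `<sys/fcntl.h>` header) are all illustrative choices that avoid an external `libc` dependency.

```rust
use std::fs::File;
use std::io;

/// macOS: ask the drive itself to flush its cache via F_FULLFSYNC.
#[cfg(target_os = "macos")]
pub fn full_sync(file: &File) -> io::Result<()> {
    use std::os::raw::c_int;
    use std::os::unix::io::AsRawFd;

    unsafe extern "C" {
        fn fcntl(fd: c_int, cmd: c_int, ...) -> c_int;
    }
    // Assumed value of F_FULLFSYNC from the macOS headers.
    const F_FULLFSYNC: c_int = 51;

    // SAFETY: the descriptor stays valid for as long as `file` is borrowed.
    let rc = unsafe { fcntl(file.as_raw_fd(), F_FULLFSYNC, 0 as c_int) };
    if rc == -1 {
        Err(io::Error::last_os_error())
    } else {
        Ok(())
    }
}

/// Everywhere else: `sync_all` maps to fsync(2) or FlushFileBuffers.
#[cfg(not(target_os = "macos"))]
pub fn full_sync(file: &File) -> io::Result<()> {
    file.sync_all()
}
```

Keeping both variants behind one function name means call sites such as a flush or compaction step do not need any platform checks of their own.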
### Binary frame codec
Both the snapshot and the WAL use the same self-describing format:
```text
+-----------------+----------------+
| payload_len: u32 LE |
+-----------------+----------------+
| body bytes (op_kind + …) |
+-----------------+----------------+
| crc32 over body: u32 LE |
+-----------------+----------------+
```
The snapshot file additionally carries an 8-byte header — the ASCII bytes `"IQDB"` followed by a `u32 LE` format version. The WAL has no header; its frames stand alone and are versioned implicitly with the snapshot they accompany.
Body format depends on the op kind:
- `OP_UPSERT (0)`: `id: u64 | dim: u32 | f32 × dim | has_payload: u8 | payload?`
- `OP_DELETE (1)`: `id: u64`
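The two body layouts can be sketched as a small encoder and decoder. The assumptions here are that the op kind is a single leading `u8`, and that only payload-free upserts are handled (`has_payload = 0`); the `Op` enum and function names are hypothetical, and the surrounding length prefix and CRC are out of scope for this block.

```rust
pub const OP_UPSERT: u8 = 0;
pub const OP_DELETE: u8 = 1;

#[derive(Debug, PartialEq)]
pub enum Op {
    Upsert { id: u64, vector: Vec<f32> },
    Delete { id: u64 },
}

/// Encodes an op body; every multi-byte field is little-endian.
pub fn encode_body(op: &Op) -> Vec<u8> {
    let mut out = Vec::new();
    match op {
        Op::Upsert { id, vector } => {
            out.push(OP_UPSERT);
            out.extend_from_slice(&id.to_le_bytes());
            out.extend_from_slice(&(vector.len() as u32).to_le_bytes());
            for x in vector {
                out.extend_from_slice(&x.to_le_bytes());
            }
            out.push(0); // has_payload = 0 in this sketch
        }
        Op::Delete { id } => {
            out.push(OP_DELETE);
            out.extend_from_slice(&id.to_le_bytes());
        }
    }
    out
}

/// Decodes a body produced by `encode_body`; returns `None` on any mismatch.
pub fn decode_body(body: &[u8]) -> Option<Op> {
    let (&kind, rest) = body.split_first()?;
    let mut id_bytes = [0u8; 8];
    id_bytes.copy_from_slice(rest.get(..8)?);
    let id = u64::from_le_bytes(id_bytes);
    match kind {
        OP_DELETE if rest.len() == 8 => Some(Op::Delete { id }),
        OP_UPSERT => {
            let d = rest.get(8..12)?;
            let dim = u32::from_le_bytes([d[0], d[1], d[2], d[3]]) as usize;
            let end = 12usize.checked_add(dim.checked_mul(4)?)?;
            let floats = rest.get(12..end)?;
            // Only payload-free records are handled here.
            if rest.get(end..)? != &[0u8][..] {
                return None;
            }
            let vector = floats
                .chunks_exact(4)
                .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
                .collect();
            Some(Op::Upsert { id, vector })
        }
        _ => None,
    }
}
```

Under these assumptions a two-dimensional upsert body is 22 bytes (1 + 8 + 4 + 8 + 1) and a delete body is 9 bytes.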
Payload encoding is a tagged union: each `PayloadValue` is a single tag byte followed by its native binary representation. Strings are `u32 LE` length + UTF-8 bytes. Nested objects and arrays are `u32 LE` count + repeated entries. `BTreeMap` ordering is preserved on encode (and reconstructed via insertion order on decode), so payload hashes and `serde` round-trips remain stable.
Every multi-byte value is written **little-endian** regardless of host byte order. A database written on x86_64 reads back identically on aarch64 — verified by the v0.3.0 cross-platform CI matrix, which now includes the v0.4.0 persistence tests.
### Backend enum dispatch
Internally, the `Iqdb` handle now wraps a `pub(crate) enum Backend { Memory(MemoryStore), File(FileStore) }`. Every public method (`upsert`, `get`, `delete`, `len`, `is_empty`, `flush`, `close`, `with_records` via the search kernel) dispatches through a hand-written `match`. Enum dispatch keeps the hot paths free of dynamic dispatch — the compiler can inline the match arm into the calling function and the inner search loop sees a concrete `HashMap` borrow with no virtual indirection.
The search kernel binds to both backends through a single `Backend::with_records` method, which itself dispatches to `MemoryStore::with_records` or `FileStore::with_records` (both of which have the same `FnOnce(&HashMap<RecordId, Record>) -> R` signature). The kernel's monomorphisation budget over the filter closure remains unchanged from v0.3.0.
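The dispatch shape described above can be sketched with stand-in types. The store internals here are assumptions for illustration (a plain `HashMap<u64, Vec<f32>>` instead of `HashMap<RecordId, Record>`, and no WAL handle in `FileStore`); only the `Backend` variant names and the `with_records` closure shape come from the notes.

```rust
use std::collections::HashMap;

type Records = HashMap<u64, Vec<f32>>;

pub struct MemoryStore {
    records: Records,
}

pub struct FileStore {
    records: Records, // a real file store would also hold the WAL handle
}

impl MemoryStore {
    pub fn new(records: Records) -> Self {
        MemoryStore { records }
    }
    fn with_records<R>(&self, f: impl FnOnce(&Records) -> R) -> R {
        f(&self.records)
    }
}

impl FileStore {
    pub fn new(records: Records) -> Self {
        FileStore { records }
    }
    fn with_records<R>(&self, f: impl FnOnce(&Records) -> R) -> R {
        f(&self.records)
    }
}

pub enum Backend {
    Memory(MemoryStore),
    File(FileStore),
}

impl Backend {
    /// One hand-written match; each arm is a direct, statically known call.
    pub fn with_records<R>(&self, f: impl FnOnce(&Records) -> R) -> R {
        match self {
            Backend::Memory(store) => store.with_records(f),
            Backend::File(store) => store.with_records(f),
        }
    }

    pub fn len(&self) -> usize {
        self.with_records(|records| records.len())
    }
}
```

Since the closure is a generic parameter rather than a `dyn Fn`, each call site is monomorphised and no trait object or vtable is involved on either arm.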
### `Error::Corrupt { reason }`
New variant on the unified, `#[non_exhaustive]` error enum. Surfaced exclusively by `Iqdb::open(path)` when the snapshot fails an integrity check. The `reason` is a static string identifying which check failed (`"bad magic"`, `"unknown format version"`, `"truncated header"`, `"frame crc mismatch"`, `"unknown op kind"`, etc.); it never contains user-supplied data, so log forwarding stays safe by default.
WAL corruption does **not** surface as `Error::Corrupt`. The recovery path silently truncates the corrupt tail and returns a clean handle — the WAL is, by its nature, allowed to be truncated mid-frame on a crash, and treating that as an error would make every crash recovery a manual operation.
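As an illustration of how static reasons could be attached to snapshot checks, the sketch below validates the 8-byte header (`"IQDB"` plus a `u32 LE` version). The stand-in `Error` enum, the `check_header` function, and the accepted version number 1 are assumptions; only the variant shape and the reason strings come from these notes.

```rust
/// Stand-in for the crate's error enum; only `Corrupt` matters here.
#[derive(Debug)]
#[non_exhaustive]
pub enum Error {
    NotImplemented,
    InvalidConfig,
    Corrupt { reason: &'static str },
}

/// Format version this sketch accepts (assumed value).
const SUPPORTED_VERSION: u32 = 1;

/// Validates the snapshot header and returns the format version.
pub fn check_header(bytes: &[u8]) -> Result<u32, Error> {
    if bytes.len() < 8 {
        return Err(Error::Corrupt { reason: "truncated header" });
    }
    if &bytes[..4] != b"IQDB" {
        return Err(Error::Corrupt { reason: "bad magic" });
    }
    let version = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    if version != SUPPORTED_VERSION {
        return Err(Error::Corrupt { reason: "unknown format version" });
    }
    Ok(version)
}
```

Because every `reason` is a `&'static str` chosen by the checker, nothing read from the file ever ends up in the error value, which matches the log-safety property described above.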
### Persistence test suite
New integration test file at [`tests/persistence.rs`](../../tests/persistence.rs) — 12 tests covering:
- Fresh-directory open (creates the directory).
- File-path rejection (`Error::InvalidConfig` when the path exists and is not a directory).
- Upsert / get / delete round-trip across close + reopen.
- Payload (including `Bytes`, `Bool`, `Int`, `Float`, `Text`) round-trip through compaction.
- Recovery without close (drop the handle, the WAL is replayed on next open).
- Recovery without flush (the OS page cache holds the bytes until a clean process exit; the WAL is still replayed on next open).
- Search runs correctly against recovered data.
- Multi-cycle state preservation (5 open / upsert / close cycles all accumulate into the same database).
- `close` truncates the WAL to zero bytes.
- Corrupt snapshot surfaces `Error::Corrupt`.
- Corrupt WAL tail is silently truncated to last known-good offset.
Plus a new property test at [`tests/properties.rs`](../../tests/properties.rs): `persistence_round_trip_preserves_records` exercises the full open → upsert → close → reopen sequence over arbitrary record sets generated by proptest.
### New benchmark group — `file_store`
Two new Criterion benches:
- **`file_store/upsert_dim128_then_flush`** — full durable-write throughput: open a fresh DB, upsert a single 128-dim vector, flush, drop. Measures the per-operation cost of `WAL append + memory insert + full_sync`. The fresh-DB-per-iteration setup keeps the bench from being dominated by the cumulative cost of a growing WAL.
- **`file_store/open_snapshot_only_1k_records_dim128`** — recovery throughput against a prepared snapshot-only database (1 000 records, dim 128, WAL empty). Measures the cost of `open + load snapshot + reconstruct in-memory map`.
`cargo bench --bench vector_ops -- file_store` produces both.
### Persistence example — `examples/persistence.rs`
New three-session walkthrough:
1. **Session 1**: open at `./data/iqdb-persistence-demo/`, upsert 3 records with topic payloads, flush, close.
2. **Session 2**: reopen, verify `len == 3`, run a cosine search, delete record 3, close.
3. **Session 3**: reopen, verify `len == 2`, confirm record 3 is absent, close.
Run with `cargo run --example persistence --release`. The example leaves the directory in place so you can inspect the on-disk files (`snap`, `wal`) afterwards.
## Breaking changes
**Effectively none.** The surface is additive on top of v0.3.0, with two methods promoted from "returns `Error::NotImplemented`" to load-bearing:
- `Iqdb::open(path)` previously returned `Error::NotImplemented` unconditionally; v0.4.0 returns a real handle. **Existing callers that branched on the `NotImplemented` variant** see the branch become dead code, not a compilation error — `Error` is `#[non_exhaustive]` and remains so.
- `Iqdb::flush` previously returned `Error::NotImplemented` on the in-memory backend; v0.4.0 returns `Ok(())` (no-op for memory, real `full_sync` for files). Same story: dead code, not a compilation error.
The v0.3.0 `tests/in_memory.rs` test `flush_and_open_path_still_not_implemented_in_v0_2_0` is renamed to `flush_and_close_on_in_memory_are_ok_in_v0_4_0` and updated to assert the new `Ok(())` semantics. No other test required changes.
No existing v0.3.0 type changes shape. No new public types. The new `Error::Corrupt` variant is allowed by the existing `#[non_exhaustive]` annotation.
## Verification
Run on Windows x86_64 and on WSL2 Ubuntu (Rust stable 1.95.0). The same commands run in the configured CI matrix on Linux, macOS, and Windows:
```bash
cargo fmt --all -- --check
cargo clippy --all-targets -- -D warnings
cargo clippy --all-targets --all-features -- -D warnings
cargo test
cargo test --all-features
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo deny check
cargo audit --deny warnings
```
All green on both hosts. Test counts at this tag:
- **Default features:** 99 unit + 8 in_memory + 12 persistence + 9 properties + 14 search + 1 smoke (44 integration) + 39 doctests.
- **`--all-features`:** 99 unit + 11 in_memory (+3 serde JSON round-trips) + 12 persistence + 9 properties + 14 search + 1 smoke (47 integration) + 39 doctests.
`cargo deny check` reports `advisories ok, bans ok, licenses ok, sources ok`. `cargo audit` scans 60 transitive dependencies (default features on Windows) with zero advisories. On Unix the count rises by one (`libc`) for the same zero-advisory result.
## What's next
- **v0.5.0 — Approximate indices.** IVF and HNSW behind the same trait the flat index implements. Build-time index selection via a builder. Approximate recall measured against the v0.3.0 flat-search ground truth. Bench numbers comparing flat vs IVF vs HNSW at multiple recall targets.
- **v0.6.0 — Async surface.** Tokio-driven async mirror of the public API. Cancellation-safe. The file-backed write path will gain a `spawn_blocking` boundary so async-context callers do not block the executor thread on `full_sync`.
- **v0.4.x — `mmap` and `io_uring` (optional).** Both behind feature flags. `mmap` for the read-mostly snapshot path; `io_uring` for batched WAL appends on Linux.
## Installation
```toml
[dependencies]
iqdb = "0.4"

# Or, to enable the optional `serde` feature, use this line instead:
# iqdb = { version = "0.4", features = ["serde"] }
```
MSRV: Rust 1.87.
## Documentation
- [README](https://github.com/jamesgober/iqdb/blob/main/README.md)
- [API Reference](https://github.com/jamesgober/iqdb/blob/main/docs/API.md)
- [Standards (REPS)](https://github.com/jamesgober/iqdb/blob/main/REPS.md)
- [CHANGELOG](https://github.com/jamesgober/iqdb/blob/main/CHANGELOG.md)
- [docs.rs/iqdb](https://docs.rs/iqdb)
---
**Full diff:** [`v0.3.0...v0.4.0`](https://github.com/jamesgober/iqdb/compare/v0.3.0...v0.4.0).
**Changelog:** [`CHANGELOG.md`](https://github.com/jamesgober/iqdb/blob/main/CHANGELOG.md#040--2026-05-30).