§wal-db
A write-ahead log primitive for Rust storage engines.
A write-ahead log (WAL) is the durability substrate every database leans on:
a state change is appended to a durable, append-only log before it is
acknowledged, and that log is the source of truth used to rebuild state after
a crash. wal-db publishes that primitive as a small, audited, benchmarked
crate so the storage engines in the portfolio — lsm-db, txn-db,
raft-io, Hive DB — share one well-tested implementation instead of each
re-deriving the durability contract and getting it subtly wrong.
§The four-call API
The common case is four calls: open, append, sync, iterate.
use wal_db::Wal;
// Open (or create) the log.
let wal = Wal::open(&path)?;
// Append a record; `append` returns once the bytes are in the kernel
// page cache. It does not flush the disk. The returned LSN is the record's
// byte offset — the first record starts at 0.
let lsn = wal.append(b"the first record")?;
assert_eq!(lsn.get(), 0);
// `sync` is the durability barrier: it returns once every record appended
// before it is on stable storage.
wal.sync()?;
// On restart, replay the log to rebuild state.
for entry in wal.iter()? {
    let entry = entry?;
    assert_eq!(entry.data(), b"the first record");
}
§Concurrency and group commit
Wal is built for many writers. append is lock-free: each
call reserves its byte range with a single atomic step — that range’s start
offset is the record’s Lsn — then writes its record without blocking
the others. Share one Wal behind an Arc and append from
every thread.
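The reservation step described above can be illustrated with nothing but the standard library. This is a minimal sketch, not wal-db's implementation: `Tail`, `reserve`, and `reserve_concurrently` are hypothetical names, and the sketch only hands out offsets without writing any bytes. It shows why a single atomic `fetch_add` is enough to give every writer a disjoint byte range.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the log's tail pointer; not a wal-db type.
struct Tail {
    next: AtomicU64,
}

impl Tail {
    /// Reserve `len` bytes and return the start offset of the reserved range.
    /// `fetch_add` returns the previous value, so concurrent callers always
    /// receive disjoint ranges.
    fn reserve(&self, len: u64) -> u64 {
        self.next.fetch_add(len, Ordering::Relaxed)
    }
}

/// Spawn `writers` threads that each reserve `per_writer` ranges of `len`
/// bytes, and return every start offset in sorted order.
fn reserve_concurrently(writers: usize, per_writer: usize, len: u64) -> Vec<u64> {
    let tail = Arc::new(Tail { next: AtomicU64::new(0) });
    let handles: Vec<_> = (0..writers)
        .map(|_| {
            let tail = Arc::clone(&tail);
            thread::spawn(move || {
                (0..per_writer).map(|_| tail.reserve(len)).collect::<Vec<u64>>()
            })
        })
        .collect();
    let mut offsets: Vec<u64> = handles
        .into_iter()
        .flat_map(|h| h.join().expect("writer thread panicked"))
        .collect();
    offsets.sort_unstable();
    offsets
}

fn main() {
    let offsets = reserve_concurrently(4, 100, 16);
    // 400 reservations of 16 bytes each tile 0..6400 with no gaps or overlaps.
    assert!(offsets.iter().enumerate().all(|(i, &o)| o == i as u64 * 16));
    println!("reserved {} disjoint ranges", offsets.len());
}
```

A real log would follow each reservation with a positioned write into the reserved range; the point here is only that no lock is needed to decide who owns which bytes.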
Durability is where threads cooperate. When several call sync
at once, they coalesce into a single fsync — group commit — so the cost
of making data durable is amortised across everyone committing together
rather than paid N times. append_and_sync does an
append and a group-commit-aware sync in one call.
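One common way to coalesce flushes is a leader/follower scheme: the first caller performs the flush on behalf of everything appended so far, and later callers return immediately if their data was already covered. The sketch below assumes that scheme; it is not taken from wal-db's source. `GroupCommit`, `note_append`, `sync_to`, and `flushes` are hypothetical names, a counter stands in for the real fsync, and a mutex is used for bookkeeping to keep the example short.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Condvar, Mutex};

struct State {
    appended: u64,  // bytes handed to the OS so far
    durable: u64,   // bytes known to be on stable storage
    flushing: bool, // true while one thread is inside the flush
}

/// Hypothetical group-commit coordinator; not a wal-db type.
struct GroupCommit {
    state: Mutex<State>,
    flushed: Condvar,
    flush_calls: AtomicUsize, // counts the stand-in "fsync" calls
}

impl GroupCommit {
    fn new() -> Self {
        GroupCommit {
            state: Mutex::new(State { appended: 0, durable: 0, flushing: false }),
            flushed: Condvar::new(),
            flush_calls: AtomicUsize::new(0),
        }
    }

    /// Record that `len` more bytes were appended; returns the new end offset.
    fn note_append(&self, len: u64) -> u64 {
        let mut s = self.state.lock().unwrap();
        s.appended += len;
        s.appended
    }

    /// Block until every byte before `target` is durable.
    fn sync_to(&self, target: u64) {
        let mut s = self.state.lock().unwrap();
        // Never wait for bytes that were not appended.
        let target = target.min(s.appended);
        loop {
            if s.durable >= target {
                return; // an earlier flush already covered this caller
            }
            if s.flushing {
                // Follower: wait for the leader's flush, then re-check.
                s = self.flushed.wait(s).unwrap();
                continue;
            }
            // Leader: one flush covers every byte appended so far.
            s.flushing = true;
            let covered = s.appended;
            drop(s); // do not hold the lock across the slow flush
            self.flush_calls.fetch_add(1, Ordering::Relaxed); // real fsync goes here
            s = self.state.lock().unwrap();
            s.durable = covered;
            s.flushing = false;
            self.flushed.notify_all();
        }
    }

    fn flushes(&self) -> usize {
        self.flush_calls.load(Ordering::Relaxed)
    }
}

fn main() {
    let gc = GroupCommit::new();
    let a = gc.note_append(10);
    let b = gc.note_append(20);
    gc.sync_to(b); // one flush covers both records
    gc.sync_to(a); // already durable: no second flush
    assert_eq!(gc.flushes(), 1);
}
```

The amortisation comes from the `durable >= target` check: however many callers arrive while a flush is in flight, each pays for at most one further flush rather than one apiece.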
§The durability contract
Two operations, two distinct guarantees. Confusing them is the single most common way to lose data with a WAL, so they are kept explicit:
- Wal::append returns when the record is in the operating system’s page cache. A crash after append but before sync may lose the record.
- Wal::sync returns only when every previously appended record is on stable storage and will survive a power loss.
The flush is platform-correct on each target, which is not the same call everywhere:
| Platform | Durability call |
|---|---|
| Linux | fdatasync (via std::fs::File::sync_data) |
| Windows | FlushFileBuffers (via std::fs::File::sync_data) |
| macOS | fcntl(F_FULLFSYNC) — not plain fsync, which leaves data in the drive’s write cache |
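The two guarantees can be seen with the standard library alone. The sketch below uses only std::fs and none of wal-db's API: `append_durably` is a hypothetical helper, and the temporary file name is arbitrary. `write_all` corresponds to the append guarantee (bytes handed to the operating system) and `sync_data` to the sync guarantee, using the call the table lists for Linux and Windows.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Append `payload` to the file at `path`, force it to stable storage,
/// and return the file's length afterwards. Hypothetical helper.
fn append_durably(path: &Path, payload: &[u8]) -> io::Result<u64> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    // After `write_all` the bytes have been handed to the OS but may still
    // sit in the page cache; a power loss here could drop them.
    file.write_all(payload)?;
    // `sync_data` is the barrier: it returns once the OS reports the file's
    // data as written to the storage device.
    file.sync_data()?;
    Ok(file.metadata()?.len())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join(format!("wal-sketch-{}.log", std::process::id()));
    let _ = fs::remove_file(&path); // start from a clean file
    let len = append_durably(&path, b"the first record")?;
    assert_eq!(len, 16);
    fs::remove_file(&path)?;
    Ok(())
}
```

Calling the barrier after every write, as this helper does, is the slow path; separating append from sync is what lets a log batch many records behind one flush.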
§Recovery
Every record carries a CRC32C checksum over its own bytes. Recovery walks
the log forward and stops at the first record whose checksum fails or whose
bytes are incomplete — a torn write from a crash mid-append. Records up to
that point are returned; the torn tail is discarded. Recovery never reads a
partially written record as if it were complete, and a corrupt length prefix
can never trigger an unbounded allocation: lengths are validated against
WalConfig::max_record_size before a single byte of payload is read.
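The forward walk can be sketched as follows. The on-disk layout used here is an assumption made for illustration, not wal-db's frozen record format: each record is a 4-byte little-endian length, a 4-byte little-endian CRC32C of the payload, then the payload. `encode`, `recover`, `read_u32`, and `MAX_RECORD_SIZE` are hypothetical names, the last standing in for WalConfig::max_record_size. The checksum is a plain bitwise CRC32C (Castagnoli polynomial), chosen for brevity rather than speed.

```rust
/// Stand-in for `WalConfig::max_record_size`; the value is arbitrary.
const MAX_RECORD_SIZE: usize = 1 << 20;
const HEADER: usize = 8; // 4-byte length + 4-byte checksum (assumed layout)

/// Bitwise CRC32C using the reflected Castagnoli polynomial.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0x82F6_3B78 } else { crc >> 1 };
        }
    }
    !crc
}

/// Frame one payload as [len][crc][payload].
fn encode(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32c(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

fn read_u32(log: &[u8], pos: usize) -> u32 {
    u32::from_le_bytes([log[pos], log[pos + 1], log[pos + 2], log[pos + 3]])
}

/// Walk the log forward and return every intact record before the first
/// damaged or incomplete one.
fn recover(log: &[u8]) -> Vec<&[u8]> {
    let mut records = Vec::new();
    let mut pos = 0;
    while log.len() - pos >= HEADER {
        let len = read_u32(log, pos) as usize;
        let crc = read_u32(log, pos + 4);
        // Validate the length before touching the payload: reject absurd
        // sizes and records that run past the end of the log.
        if len > MAX_RECORD_SIZE || len > log.len() - pos - HEADER {
            break;
        }
        let payload = &log[pos + HEADER..pos + HEADER + len];
        if crc32c(payload) != crc {
            break; // corrupt record: stop here and discard the tail
        }
        records.push(payload);
        pos += HEADER + len;
    }
    records
}

fn main() {
    let mut log = encode(b"alpha");
    log.extend(encode(b"beta"));
    let torn = encode(b"gamma");
    log.extend_from_slice(&torn[..torn.len() - 2]); // crash mid-append
    assert_eq!(recover(&log), vec![&b"alpha"[..], &b"beta"[..]]);
}
```

Because the length check runs before the payload slice is taken, a corrupted length prefix ends recovery instead of driving a huge read or allocation.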
§Backends
Wal::open uses the file-backed FileStore. Custom backends — in-memory
for tests, or an alternative storage layer — implement the WalStore trait
and plug in through Wal::with_store. An in-memory MemStore ships for
testing and examples.
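To make the idea of a pluggable backend concrete, here is a sketch of what such a trait and an in-memory implementation could look like. The signatures are assumptions: `SketchStore` and `SketchMemStore` are hypothetical and are not the crate's WalStore or MemStore, whose actual methods should be taken from their own documentation. The sketch only mirrors the shape described on this page: byte-addressable, append-only, with an explicit durability barrier.

```rust
use std::io;
use std::sync::Mutex;

/// Hypothetical shape of a storage backend; the real `WalStore` may differ.
trait SketchStore {
    /// Append bytes and return the offset at which they were stored.
    fn append(&self, bytes: &[u8]) -> io::Result<u64>;
    /// Copy up to `buf.len()` bytes starting at `offset`; returns the count.
    fn read_at(&self, offset: u64, buf: &mut [u8]) -> io::Result<usize>;
    /// Durability barrier.
    fn sync(&self) -> io::Result<()>;
}

/// In-memory backend: a `Vec<u8>` behind a lock held only briefly.
struct SketchMemStore {
    bytes: Mutex<Vec<u8>>,
}

impl SketchMemStore {
    fn new() -> Self {
        SketchMemStore { bytes: Mutex::new(Vec::new()) }
    }
}

impl SketchStore for SketchMemStore {
    fn append(&self, bytes: &[u8]) -> io::Result<u64> {
        let mut guard = self.bytes.lock().unwrap();
        let offset = guard.len() as u64;
        guard.extend_from_slice(bytes);
        Ok(offset)
    }

    fn read_at(&self, offset: u64, buf: &mut [u8]) -> io::Result<usize> {
        let guard = self.bytes.lock().unwrap();
        let start = (offset as usize).min(guard.len());
        let n = buf.len().min(guard.len() - start);
        buf[..n].copy_from_slice(&guard[start..start + n]);
        Ok(n)
    }

    fn sync(&self) -> io::Result<()> {
        Ok(()) // memory has no stable storage to flush to
    }
}

fn main() -> io::Result<()> {
    let store = SketchMemStore::new();
    assert_eq!(store.append(b"hello")?, 0);
    assert_eq!(store.append(b"world")?, 5);
    store.sync()?;
    let mut buf = [0u8; 5];
    assert_eq!(store.read_at(5, &mut buf)?, 5);
    assert_eq!(&buf, b"world");
    Ok(())
}
```

An in-memory backend like this is useful in tests precisely because its sync is a no-op: the log's logic can be exercised without touching the filesystem.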
§Status
This is the 0.3 core: lock-free multi-writer append, group commit, and a
frozen record format, on top of the platform-correct durability and
torn-write recovery from 0.2. Segment rotation follows in 0.3.1. The
four-call API is stable and will not change shape.
Re-exports§
pub use pack_io;
Modules§
- prelude
- The common imports for working with a log.
Structs§
- FileStore
- A file-backed WalStore: the default storage for Wal::open.
- Lsn
- A log sequence number: a record’s byte position in the log, assigned at append time.
- MemStore
- An in-memory WalStore backed by a Vec<u8> behind a short lock.
- Record
- One record read back during iteration: its Lsn and its payload bytes.
- SegmentedStore
- A WalStore that stripes one flat byte space across fixed-size segment files in a directory.
- Wal
- A durable, append-only log.
- WalConfig
- Tunable parameters for a Wal.
- WalIter
- The iterator returned by Wal::iter.
Enums§
- RecoveryPolicy
- How Wal::iter reacts to a damaged record.
- WalError
- Everything that can go wrong while appending to, syncing, or recovering a log.
Traits§
- WalStore
- A byte-addressable, append-only store with an explicit durability barrier.