Crate wal_db

§wal-db

A write-ahead log primitive for Rust storage engines.

A write-ahead log (WAL) is the durability substrate every database leans on: a state change is appended to a durable, append-only log before it is acknowledged, and that log is the source of truth used to rebuild state after a crash. wal-db publishes that primitive as a small, audited, benchmarked crate so the storage engines in the portfolio — lsm-db, txn-db, raft-io, Hive DB — share one well-tested implementation instead of each re-deriving the durability contract and getting it subtly wrong.

§The four-call API

The common case is four calls: open, append, sync, iterate.

use wal_db::Wal;

// Open (or create) the log.
let wal = Wal::open(&path)?;

// Append a record; `append` returns once the bytes are in the kernel
// page cache. It does not flush the disk. The returned LSN is the record's
// byte offset — the first record starts at 0.
let lsn = wal.append(b"the first record")?;
assert_eq!(lsn.get(), 0);

// `sync` is the durability barrier: it returns once every record appended
// before it is on stable storage.
wal.sync()?;

// On restart, replay the log to rebuild state.
for entry in wal.iter()? {
    let entry = entry?;
    assert_eq!(entry.data(), b"the first record");
}

§Concurrency and group commit

Wal is built for many writers. append is lock-free: each call reserves its byte range with a single atomic step — that range’s start offset is the record’s Lsn — then writes its record without blocking the others. Share one Wal behind an Arc and append from every thread.
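The reservation step described above can be sketched with a single atomic counter. This is a minimal illustration using only the standard library, not the crate's implementation: `Tail`, `reserve`, and `reserve_concurrently` are hypothetical names, and the sketch only hands out offsets without writing any bytes.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the log's tail pointer (not a wal_db type).
struct Tail {
    next: AtomicU64,
}

impl Tail {
    fn new() -> Self {
        Tail { next: AtomicU64::new(0) }
    }

    /// Reserve `len` bytes and return the start offset of the reserved range.
    /// `fetch_add` is one atomic step, so no two callers receive overlapping
    /// ranges and no caller waits on a lock.
    fn reserve(&self, len: u64) -> u64 {
        self.next.fetch_add(len, Ordering::Relaxed)
    }
}

/// Spawn `writers` threads that each reserve one `len`-byte range,
/// then return the start offsets in ascending order.
fn reserve_concurrently(writers: usize, len: u64) -> Vec<u64> {
    let tail = Arc::new(Tail::new());
    let handles: Vec<_> = (0..writers)
        .map(|_| {
            let tail = Arc::clone(&tail);
            thread::spawn(move || tail.reserve(len))
        })
        .collect();
    let mut offsets: Vec<u64> = handles
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .collect();
    offsets.sort_unstable();
    offsets
}
```

Whichever order the threads run in, the set of start offsets is the same: each is a distinct multiple of `len`, which is why the offset can double as a unique record identifier.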

Durability is where threads cooperate. When several call sync at once, they coalesce into a single fsync — group commit — so the cost of making data durable is amortised across everyone committing together rather than paid N times. append_and_sync does an append and a group-commit-aware sync in one call.
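One common way to build group commit is a leader/follower scheme: the first thread to ask for durability performs the flush, and threads that arrive meanwhile wait and then reuse its result. The sketch below assumes that scheme; it is not taken from the crate's source. `GroupCommit`, `State`, and the `flushes` counter (which stands in for a real fsync) are all hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Condvar, Mutex};

/// Durability bookkeeping shared by all callers (hypothetical).
struct State {
    durable: u64,   // every byte below this offset has been flushed
    flushing: bool, // true while a leader is performing the flush
}

struct GroupCommit {
    appended: AtomicU64, // end offset of the last appended byte
    state: Mutex<State>,
    done: Condvar,
    flushes: AtomicU64, // counts simulated fsync calls
}

impl GroupCommit {
    fn new() -> Self {
        GroupCommit {
            appended: AtomicU64::new(0),
            state: Mutex::new(State { durable: 0, flushing: false }),
            done: Condvar::new(),
            flushes: AtomicU64::new(0),
        }
    }

    /// Record that `len` more bytes were appended; returns the new end offset.
    fn append(&self, len: u64) -> u64 {
        self.appended.fetch_add(len, Ordering::SeqCst) + len
    }

    /// Return once everything appended before this call is "durable".
    fn sync(&self) {
        let target = self.appended.load(Ordering::SeqCst);
        let mut st = self.state.lock().unwrap();
        loop {
            if st.durable >= target {
                return; // an earlier flush already covered this caller
            }
            if st.flushing {
                // Follower: wait for the leader, then re-check.
                st = self.done.wait(st).unwrap();
                continue;
            }
            // Leader: one flush covers every byte appended so far.
            st.flushing = true;
            let covered = self.appended.load(Ordering::SeqCst);
            drop(st);
            self.flushes.fetch_add(1, Ordering::SeqCst); // stand-in for fsync
            st = self.state.lock().unwrap();
            st.durable = covered;
            st.flushing = false;
            self.done.notify_all();
        }
    }
}
```

A real log must also make sure that every byte below `covered` has actually been written before the flush starts; the sketch omits that step to keep the coalescing logic visible.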

§The durability contract

Two operations, two distinct guarantees. Confusing them is the single most common way to lose data with a WAL, so they are kept explicit:

  • Wal::append returns when the record is in the operating system’s page cache. A crash after append but before sync may lose the record.
  • Wal::sync returns only when every previously appended record is on stable storage and will survive a power loss.

The flush is platform-correct on each target, which is not the same call everywhere:

Platform | Durability call
Linux | fdatasync (via std::fs::File::sync_data)
Windows | FlushFileBuffers (via std::fs::File::sync_data)
macOS | fcntl(F_FULLFSYNC), not plain fsync, which leaves data in the drive’s write cache
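The two guarantees can be seen with the standard library alone. The following is a sketch of the append-then-sync split, not the crate's code: `append_durably` is a hypothetical helper, and it uses `File::sync_data`, the call the table lists for Linux and Windows.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append `payload` to the file at `path`, then force it to stable storage.
/// Returns the file length after the append. Hypothetical helper.
fn append_durably(path: &Path, payload: &[u8]) -> io::Result<u64> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;

    // After `write_all` returns, the bytes have been handed to the operating
    // system. They may still sit only in the page cache, so a power loss at
    // this point could drop them.
    file.write_all(payload)?;

    // The durability barrier: returns once the file's data has been flushed.
    file.sync_data()?;

    Ok(file.metadata()?.len())
}
```

Calling the barrier after every append is the slowest possible policy; separating the two operations is what lets a caller batch many appends behind one flush.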

§Recovery

Every record carries a CRC32C checksum over its own bytes. Recovery walks the log forward and stops at the first record whose checksum fails or whose bytes are incomplete — a torn write from a crash mid-append. Records up to that point are returned; the torn tail is discarded. Recovery never reads a partially written record as if it were complete, and a corrupt length prefix can never trigger an unbounded allocation: lengths are validated against WalConfig::max_record_size before a single byte of payload is read.
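The recovery walk can be sketched as follows. The record layout here is an assumption made for illustration (a 4-byte little-endian length, a 4-byte little-endian CRC32C of the payload, then the payload); the crate's frozen format may differ, and `encode`, `recover`, and `read_u32` are hypothetical names. The checksum is a plain bitwise CRC32C so the example needs no dependencies.

```rust
const HEADER: usize = 8; // assumed layout: u32 length + u32 checksum

/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0x82F6_3B78 } else { crc >> 1 };
        }
    }
    !crc
}

/// Frame one payload using the assumed layout.
fn encode(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32c(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

fn read_u32(log: &[u8], pos: usize) -> u32 {
    u32::from_le_bytes([log[pos], log[pos + 1], log[pos + 2], log[pos + 3]])
}

/// Walk the log forward and return every intact payload, stopping at the
/// first oversized, incomplete, or corrupt record.
fn recover(log: &[u8], max_record_size: usize) -> Vec<&[u8]> {
    let mut records = Vec::new();
    let mut pos = 0;
    while log.len() - pos >= HEADER {
        let len = read_u32(log, pos) as usize;
        let expected = read_u32(log, pos + 4);
        // Validate the length before touching the payload, so a corrupt
        // prefix cannot drive a huge read or allocation.
        if len > max_record_size {
            break;
        }
        let start = pos + HEADER;
        if log.len() - start < len {
            break; // torn write: the payload is incomplete
        }
        let payload = &log[start..start + len];
        if crc32c(payload) != expected {
            break; // checksum mismatch: treat as the end of the valid log
        }
        records.push(payload);
        pos = start + len;
    }
    records
}
```

Because the walk stops at the first bad record rather than skipping it, anything after a damaged record is treated as part of the discarded tail.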

§Backends

Wal::open uses the file-backed FileStore. Custom backends — in-memory for tests, or an alternative storage layer — implement the WalStore trait and plug in through Wal::with_store. An in-memory MemStore ships for testing and examples.
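To give a feel for what a pluggable backend involves, here is a sketch of an in-memory store behind a small trait. The trait below is hypothetical: `Store`, its three methods, and `VecStore` are invented for illustration and are not the signatures of `WalStore` or `MemStore`, which should be taken from their own documentation.

```rust
use std::io;
use std::sync::Mutex;

/// Hypothetical backend trait: append-only bytes plus a durability barrier.
trait Store {
    /// Append `bytes` and return the offset at which they were written.
    fn append(&self, bytes: &[u8]) -> io::Result<u64>;
    /// Copy up to `buf.len()` bytes starting at `offset`; returns the count.
    fn read_at(&self, offset: u64, buf: &mut [u8]) -> io::Result<usize>;
    /// Make everything appended so far durable.
    fn sync(&self) -> io::Result<()>;
}

/// In-memory backend: a byte vector behind a lock held only briefly.
struct VecStore {
    bytes: Mutex<Vec<u8>>,
}

impl VecStore {
    fn new() -> Self {
        VecStore { bytes: Mutex::new(Vec::new()) }
    }
}

impl Store for VecStore {
    fn append(&self, bytes: &[u8]) -> io::Result<u64> {
        let mut guard = self.bytes.lock().unwrap();
        let offset = guard.len() as u64;
        guard.extend_from_slice(bytes);
        Ok(offset)
    }

    fn read_at(&self, offset: u64, buf: &mut [u8]) -> io::Result<usize> {
        let guard = self.bytes.lock().unwrap();
        let start = (offset as usize).min(guard.len());
        let n = buf.len().min(guard.len() - start);
        buf[..n].copy_from_slice(&guard[start..start + n]);
        Ok(n)
    }

    /// Memory has no stable storage, so the barrier is a no-op here.
    fn sync(&self) -> io::Result<()> {
        Ok(())
    }
}
```

A store like this is useful in tests precisely because `sync` costs nothing, but for the same reason it says nothing about real crash behaviour.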

§Status

This is the 0.3 core: lock-free multi-writer append, group commit, and a frozen record format, on top of the platform-correct durability and torn-write recovery from 0.2. Segment rotation follows in 0.3.1. The four-call API is stable and will not change shape.

Re-exports§

pub use pack_io;

Modules§

prelude
The common imports for working with a log.

Structs§

FileStore
A file-backed WalStore: the default storage for Wal::open.
Lsn
A log sequence number: a record’s byte position in the log, assigned at append time.
MemStore
An in-memory WalStore backed by a Vec<u8> behind a short lock.
Record
One record read back during iteration: its Lsn and its payload bytes.
SegmentedStore
A WalStore that stripes one flat byte space across fixed-size segment files in a directory.
Wal
A durable, append-only log.
WalConfig
Tunable parameters for a Wal.
WalIter
The iterator returned by Wal::iter.

Enums§

RecoveryPolicy
How Wal::iter reacts to a damaged record.
WalError
Everything that can go wrong while appending to, syncing, or recovering a log.

Traits§

WalStore
A byte-addressable, append-only store with an explicit durability barrier.

Type Aliases§

Result
A specialised Result for log operations.