§nodedb-wal
Deterministic, O_DIRECT write-ahead log with group commit.
This crate bypasses the Linux page cache entirely. Every WAL write goes
directly to NVMe via O_DIRECT (and eventually io_uring). This is
non-negotiable: if AI agents dump 10 GB of telemetry logs, the OS must NOT
evict hot HNSW vector indexes from RAM to cache WAL pages.
§Design
- O_DIRECT: All writes bypass the page cache. Aligned to 4 KiB.
- Group commit: Thousands of concurrent writes are batched into a single
fsync, maximizing NVMe IOPS.
- CRC32C: Every record has a checksum for silent bit-rot detection.
- Deterministic replay: WAL replay is idempotent. After a crash at any point, recovery yields a consistent prefix of the log.
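The last two design points can be illustrated with a minimal sketch. It is not the crate's record format: the 8-byte header layout (payload length, then CRC32C of the payload) and the names `encode_record` and `replay_prefix` are assumptions made for illustration, and the checksum is a slow, dependency-free bitwise CRC32C rather than a hardware-accelerated one. Replay scans from the start and stops at the first truncated or corrupt record, so running it again over the same bytes returns the same prefix.

```rust
/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
/// Dependency-free and slow; production code would typically use a table
/// or hardware CRC instructions.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc: u32 = !0;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Hypothetical header size: 4-byte LE length + 4-byte LE CRC32C.
const HEADER: usize = 8;

fn read_u32_le(buf: &[u8], at: usize) -> u32 {
    let mut b = [0u8; 4];
    b.copy_from_slice(&buf[at..at + 4]);
    u32::from_le_bytes(b)
}

/// Appends one record using the assumed framing: length, checksum, payload.
fn encode_record(payload: &[u8], out: &mut Vec<u8>) {
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32c(payload).to_le_bytes());
    out.extend_from_slice(payload);
}

/// Replays records from the start of `log` and stops at the first record
/// that is truncated, zero-length, or fails its checksum.
/// Returns the valid payloads and the byte offset where the prefix ends.
fn replay_prefix(log: &[u8]) -> (Vec<Vec<u8>>, usize) {
    let mut records = Vec::new();
    let mut offset = 0usize;
    while log.len() - offset >= HEADER {
        let len = read_u32_le(log, offset) as usize;
        let stored = read_u32_le(log, offset + 4);
        // Treat a zero length as end of log so that zero-filled padding
        // is never mistaken for a run of empty records.
        if len == 0 {
            break;
        }
        let start = offset + HEADER;
        let end = match start.checked_add(len) {
            Some(e) if e <= log.len() => e,
            _ => break, // truncated tail from a crash mid-write
        };
        let payload = &log[start..end];
        if crc32c(payload) != stored {
            break; // corrupt record: everything before it is the prefix
        }
        records.push(payload.to_vec());
        offset = end;
    }
    (records, offset)
}
```

The returned offset is where a writer could safely resume appending; bytes past it are discarded as an incomplete or damaged tail.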
§Validation target
Sustain 100,000+ async writes/sec with sub-millisecond p99 latency.
The cached memory figure reported by free -m must not change during the benchmark.
Re-exports§
pub use error::Result;
pub use error::WalError;
pub use group_commit::GroupCommitter;
pub use lazy_reader::LazyWalReader;
pub use record::RecordHeader;
pub use record::RecordType;
pub use record::WalRecord;
pub use recovery::RecoveryInfo;
pub use recovery::recover;
pub use segmented::SegmentedWal;
pub use segmented::SegmentedWalConfig;
pub use writer::WalWriter;
Modules§
- align
- O_DIRECT alignment utilities.
- crypto
- WAL payload encryption using AES-256-GCM.
- double_write
- Double-write buffer for torn write protection.
- error
- group_commit
- Group commit coordinator.
- lazy_reader
- Lazy WAL reader: reads headers without payload for selective replay.
- mmap_reader
- Memory-mapped WAL segment reader for Event Plane catchup.
- reader
- WAL reader for crash recovery and replay.
- record
- WAL record format.
- recovery
- WAL recovery: scan an existing WAL file to determine the last committed LSN and file offset, enabling safe reopening for continued writes.
- segment
- WAL segment management.
- segmented
- Segmented WAL: manages a directory of segment files with automatic rollover and truncation.
- writer
- WAL writer with O_DIRECT and group commit.
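The align, group_commit, and writer modules are described above only by their one-line summaries, so their actual APIs are not shown here. As a rough sketch of the idea they share, the following stages many records in memory, pads the batch to a 4 KiB multiple, and makes the whole batch durable with one write and one sync. `DurableSink`, `BatchWriter`, and `MemSink` are hypothetical names, not items from this crate. Real O_DIRECT I/O on Linux also needs the buffer address and file offset to be suitably aligned, which a plain `Vec<u8>` does not guarantee; this sketch only handles the length.

```rust
use std::io;

/// Block size assumed for O_DIRECT I/O (the 4 KiB figure from the Design section).
const BLOCK: usize = 4096;

/// Rounds `n` up to the next multiple of `BLOCK`.
fn align_up(n: usize) -> usize {
    (n + BLOCK - 1) / BLOCK * BLOCK
}

/// Hypothetical sink: accepts a block of bytes and can make it durable.
trait DurableSink {
    fn write_block(&mut self, bytes: &[u8]) -> io::Result<()>;
    fn sync(&mut self) -> io::Result<()>;
}

/// Hypothetical batcher: many appends, then one write and one sync per commit.
struct BatchWriter<S: DurableSink> {
    sink: S,
    staged: Vec<u8>,
    pending: usize,
}

impl<S: DurableSink> BatchWriter<S> {
    fn new(sink: S) -> Self {
        BatchWriter { sink, staged: Vec::new(), pending: 0 }
    }

    /// Stages a record in memory; nothing is durable until `commit`.
    fn append(&mut self, record: &[u8]) {
        self.staged.extend_from_slice(record);
        self.pending += 1;
    }

    /// Pads the staged bytes to a block multiple, writes them once, syncs once.
    /// Returns how many records the single sync covered.
    fn commit(&mut self) -> io::Result<usize> {
        if self.pending == 0 {
            return Ok(0);
        }
        let padded = align_up(self.staged.len());
        self.staged.resize(padded, 0); // zero padding up to the block boundary
        self.sink.write_block(&self.staged)?;
        self.sink.sync()?;
        self.staged.clear();
        let committed = self.pending;
        self.pending = 0;
        Ok(committed)
    }
}

/// In-memory stand-in for a file, counting syncs so batching is observable.
struct MemSink {
    data: Vec<u8>,
    syncs: usize,
}

impl DurableSink for MemSink {
    fn write_block(&mut self, bytes: &[u8]) -> io::Result<()> {
        self.data.extend_from_slice(bytes);
        Ok(())
    }

    fn sync(&mut self) -> io::Result<()> {
        self.syncs += 1;
        Ok(())
    }
}
```

Appending 1,000 records of 16 bytes and calling `commit` once writes 16,384 bytes (16,000 rounded up to four blocks) and issues a single sync, which is the amortization that group commit relies on. A concurrent version would additionally need a way to wake each waiting writer once its batch is durable.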