§nodedb-wal
Deterministic, O_DIRECT write-ahead log with group commit.
This crate bypasses the Linux page cache entirely. Every WAL write goes
directly to NVMe via O_DIRECT (and eventually io_uring). This is
non-negotiable: if AI agents dump 10 GB of telemetry logs, the OS must NOT
evict hot HNSW vector indexes from RAM to cache WAL pages.
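O_DIRECT transfers generally require the buffer address, the file offset, and the transfer length to be aligned to the device's logical block size. The sketch below shows that bookkeeping, assuming the 4 KiB alignment named in the Design notes. `Block`, `align_up`, and `to_blocks` are hypothetical names chosen for illustration, not this crate's `align` API.

```rust
/// Alignment assumed for this sketch (the 4 KiB figure from the Design notes).
const ALIGN: usize = 4096;

/// One 4 KiB block whose address is also 4 KiB-aligned.
/// `repr(align)` needs a literal, so it repeats the value of `ALIGN`.
#[repr(align(4096))]
#[derive(Clone, Copy)]
struct Block([u8; ALIGN]);

/// Round `len` up to the next multiple of `ALIGN` (hypothetical helper).
fn align_up(len: usize) -> usize {
    (len + ALIGN - 1) & !(ALIGN - 1)
}

/// Copy `payload` into zero-padded, address-aligned blocks.
/// The result covers `align_up(payload.len())` bytes in total.
fn to_blocks(payload: &[u8]) -> Vec<Block> {
    let count = align_up(payload.len()) / ALIGN;
    let mut blocks = vec![Block([0u8; ALIGN]); count];
    for (block, chunk) in blocks.iter_mut().zip(payload.chunks(ALIGN)) {
        block.0[..chunk.len()].copy_from_slice(chunk);
    }
    blocks
}
```

A 5,000-byte payload, for example, occupies two blocks (8,192 bytes), with the tail of the second block left as zero padding.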
§Design
- O_DIRECT: All writes bypass the page cache. Aligned to 4 KiB.
- Group commit: Thousands of concurrent writes are batched into a single fsync, maximizing NVMe IOPS.
- CRC32C: Every record has a checksum for silent bit-rot detection.
- Deterministic replay: WAL replay is idempotent — crash at any point, recover to a consistent prefix.
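The group-commit idea from the list above can be sketched without any concurrency: writers queue records, and one flush makes the whole batch durable. This is a simplified illustration, not the crate's `GroupCommitter`. `BatchCommitter` and its methods are hypothetical, and `Write::flush` stands in for the real fsync.

```rust
use std::io::{self, Write};

/// Hypothetical single-threaded batcher illustrating group commit.
struct BatchCommitter<W: Write> {
    sink: W,
    pending: Vec<Vec<u8>>,
    next_lsn: u64,
    /// Number of flushes issued so far (one per committed batch).
    syncs: u64,
}

impl<W: Write> BatchCommitter<W> {
    fn new(sink: W) -> Self {
        BatchCommitter { sink, pending: Vec::new(), next_lsn: 0, syncs: 0 }
    }

    /// Queue a record and return the LSN assigned to it.
    /// Nothing is durable until `commit` succeeds.
    fn submit(&mut self, payload: &[u8]) -> u64 {
        let lsn = self.next_lsn;
        self.next_lsn += 1;
        self.pending.push(payload.to_vec());
        lsn
    }

    /// Write every queued record, then flush once for the whole batch.
    /// Returns the highest LSN covered by the flush, or `None` if idle.
    fn commit(&mut self) -> io::Result<Option<u64>> {
        if self.pending.is_empty() {
            return Ok(None);
        }
        for record in self.pending.drain(..) {
            self.sink.write_all(&record)?;
        }
        self.sink.flush()?; // stand-in for fsync on a real file
        self.syncs += 1;
        Ok(Some(self.next_lsn - 1))
    }
}
```

With 1,000 queued one-byte records, a single `commit` call issues one flush and returns `Some(999)`, so the sync cost is shared by every record in the batch.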
§Validation target
Sustain 100,000+ async writes/sec with sub-millisecond p99 latency.
The cached-memory figure reported by free -m must not move during the benchmark.
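One way to automate the page-cache check is to sample the cache size before and after the benchmark. The sketch below assumes a Linux host and uses the `Cached:` field of /proc/meminfo as a proxy for the figure `free` prints (`free` computes its cache column from several meminfo fields, so the two numbers may differ slightly). The function names are hypothetical.

```rust
/// Extract the `Cached:` value in KiB from the text of /proc/meminfo.
/// `SwapCached:` is not matched because the prefix test is anchored
/// at the start of the line.
fn parse_cached_kib(meminfo: &str) -> Option<u64> {
    meminfo.lines().find_map(|line| {
        let rest = line.strip_prefix("Cached:")?;
        rest.trim().strip_suffix("kB")?.trim().parse::<u64>().ok()
    })
}

/// Read the current value from the running system (Linux only).
fn read_cached_kib() -> std::io::Result<Option<u64>> {
    let text = std::fs::read_to_string("/proc/meminfo")?;
    Ok(parse_cached_kib(&text))
}

/// Absolute drift between two samples, in whole MiB.
fn drift_mib(before_kib: u64, after_kib: u64) -> u64 {
    before_kib.abs_diff(after_kib) / 1024
}
```

A benchmark harness could call `read_cached_kib` before and after the run and fail if `drift_mib` exceeds a tolerance it chooses. The acceptable tolerance is not specified here.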
Re-exports§
- pub use double_write::DoubleWriteBuffer;
- pub use double_write::DwbMode;
- pub use double_write::wal_dwb_bytes_written_total;
- pub use error::Result;
- pub use error::WalError;
- pub use group_commit::GroupCommitter;
- pub use lazy_reader::LazyWalReader;
- pub use preamble::CIPHER_AES_256_GCM;
- pub use preamble::PREAMBLE_SIZE;
- pub use preamble::PREAMBLE_VERSION;
- pub use preamble::SEG_PREAMBLE_MAGIC;
- pub use preamble::SegmentPreamble;
- pub use preamble::WAL_PREAMBLE_MAGIC;
- pub use record::CalvinAppliedPayload;
- pub use record::RecordHeader;
- pub use record::RecordType;
- pub use record::WalRecord;
- pub use recovery::RecoveryInfo;
- pub use recovery::recover;
- pub use replay::TombstoneSet;
- pub use replay::extract_tombstones;
- pub use secure_mem::SecureKey;
- pub use segmented::SegmentedWal;
- pub use segmented::SegmentedWalConfig;
- pub use temporal_purge::TemporalPurgeEngine;
- pub use temporal_purge::TemporalPurgePayload;
- pub use tombstone::CollectionTombstonePayload;
- pub use tombstone::MAX_COLLECTION_NAME_LEN;
- pub use writer::WalWriter;
Modules§
- align
- O_DIRECT alignment utilities.
- crypto
- WAL payload encryption using AES-256-GCM.
- double_write
- Double-write buffer for torn write protection.
- error
- group_commit
- Group commit coordinator.
- lazy_reader
- Lazy WAL reader: reads headers without payload for selective replay.
- mmap_reader
- Memory-mapped WAL segment reader for Event Plane catchup.
- preamble
- Segment preamble: 16-byte plaintext header written at offset 0 of every WAL segment file and every storage segment file.
- reader
- WAL reader for crash recovery and replay.
- record
- recovery
- WAL recovery: scan an existing WAL file to determine the last committed LSN and file offset, enabling safe reopening for continued writes.
- replay
- Replay-time utilities layered over raw [WalRecord] streams.
- secure_mem
- Secure memory utilities for key material.
- segment
- WAL segment management.
- segmented
- Segmented WAL: manages a directory of segment files with automatic rollover and truncation.
- temporal_purge
- TemporalPurge record payload codec.
- tombstone
- CollectionTombstoned record payload codec.
- writer
- WAL writer with O_DIRECT and group commit.
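To illustrate how per-record CRC32C checksums let a recovery scan find the last committed LSN and a safe reopen offset, here is a minimal sketch. The record layout (8-byte LSN, 4-byte length, 4-byte CRC, payload, all little-endian) is an assumption made for this example and is not the crate's actual `RecordHeader` format. `encode` and `scan` are hypothetical names.

```rust
/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Assumed header: lsn (8) | payload len (4) | crc (4).
const HEADER_LEN: usize = 16;

fn read_u32(bytes: &[u8]) -> u32 {
    let mut b = [0u8; 4];
    b.copy_from_slice(&bytes[..4]);
    u32::from_le_bytes(b)
}

fn read_u64(bytes: &[u8]) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&bytes[..8]);
    u64::from_le_bytes(b)
}

/// Serialize one record; the CRC covers lsn, length, and payload.
fn encode(lsn: u64, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.extend_from_slice(&lsn.to_le_bytes());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    let mut covered = out.clone();
    covered.extend_from_slice(payload);
    out.extend_from_slice(&crc32c(&covered).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Walk the log from the start and stop at the first torn or corrupt
/// record. Returns (last valid LSN, offset just past that record).
fn scan(log: &[u8]) -> (Option<u64>, usize) {
    let mut offset = 0;
    let mut last_lsn = None;
    while log.len() - offset >= HEADER_LEN {
        let lsn = read_u64(&log[offset..]);
        let len = read_u32(&log[offset + 8..]) as usize;
        let stored = read_u32(&log[offset + 12..]);
        let end = offset + HEADER_LEN + len;
        if end > log.len() {
            break; // payload truncated by a crash mid-write
        }
        let mut covered = log[offset..offset + 12].to_vec();
        covered.extend_from_slice(&log[offset + HEADER_LEN..end]);
        if crc32c(&covered) != stored {
            break; // checksum mismatch: bit rot or torn write
        }
        last_lsn = Some(lsn);
        offset = end;
    }
    (last_lsn, offset)
}
```

Because `scan` depends only on the bytes it is given, running it again on the same log yields the same prefix, which is the property that makes replay safe to repeat after a crash.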