raft-wal
A minimal append-only WAL (Write-Ahead Log) optimized for Raft consensus.
General-purpose KV stores like sled or RocksDB carry unnecessary overhead for Raft log storage. raft-wal focuses on four operations: append, range read, truncate, and metadata — nothing else.
Features
- Fast: ~210 ns append (with HW-accelerated CRC32C), ~1 ns get (O(1) via `VecDeque`)
- Minimal dependencies: only `crc32c` required; `tokio` and `openraft` are optional
- Sync & async: `RaftWal` and `AsyncRaftWal` share the same optimized core
- Raft-correct durability: metadata (term/vote) is always fsynced; log entries are buffered with opt-in `sync()`
- Integrity: every entry is protected by a CRC32C checksum; corrupted or partial writes are detected on recovery
- Segment-based storage: log is split into segment files (default 64 MB); `compact()` deletes old segments without rewriting
- Parallel recovery: segment files are read and CRC-verified in parallel across CPU cores
- openraft integration: optional `RaftLogStorage` trait implementation
- Cross-platform: Linux, macOS, Windows
Usage
[dependencies]
raft-wal = "0.2"
# For async support:
# raft-wal = { version = "0.2", features = ["tokio"] }
# For openraft integration:
# raft-wal = { version = "0.2", features = ["openraft-storage"] }
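use raft_wal::RaftWal;
// The path and payload values below are placeholders
let mut wal = RaftWal::open("./my-raft-data").unwrap();
// Append log entries
wal.append(1, b"entry-1").unwrap();
wal.append(2, b"entry-2").unwrap();
// Read entries
assert_eq!(wal.get(1).unwrap(), b"entry-1");
let entries: Vec<_> = wal.iter_range(1..=2).collect();
assert_eq!(entries.len(), 2);
// Store Raft metadata (always fsynced)
wal.set_meta("term", b"5").unwrap();
wal.set_meta("vote", b"node-2").unwrap();
// Snapshot compaction: deletes old segment files
wal.compact(1).unwrap(); // discard index <= 1
// Conflict resolution
wal.truncate(2).unwrap(); // discard index >= 2
// Opt-in durable write
wal.append(2, b"entry-2b").unwrap();
wal.sync().unwrap(); // fsync to disk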
Async
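use raft_wal::AsyncRaftWal;
// The path and payload values below are placeholders
let mut wal = AsyncRaftWal::open("./my-raft-data").await.unwrap();
wal.append(1, b"entry-1").await.unwrap();
wal.set_meta("term", b"5").await.unwrap();
// Must call close(): tokio can't flush in Drop
wal.close().await.unwrap();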
openraft Integration
Enable openraft-storage to get RaftLogStorage + RaftLogReader implementations:
raft-wal = { version = "0.2", features = ["openraft-storage"] }
use raft_wal::OpenRaftLogStorage;
// The path below is a placeholder
let storage = OpenRaftLogStorage::open("./raft-data").await?;
C::Entry, VoteOf<C>, and LogIdOf<C> must implement serde::Serialize + serde::Deserialize.
Durability
| Operation | Behavior |
|---|---|
| `set_meta` / `remove_meta` | Always fsynced (Raft election safety) |
| `append` / `append_batch` | Buffered, no fsync |
| `sync()` | Flushes + fsyncs log entries |
| `flush()` | Flushes to OS without fsync |
Metadata (term, votedFor) must survive crashes per the Raft paper. set_meta writes to a temp file, fsyncs, then atomically renames.
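The temp-file, fsync, rename sequence can be sketched with the standard library alone. This is a minimal illustration, not the crate's actual code: `write_meta_atomic`, the `.tmp` suffix, and the one-file-per-key layout are assumptions made for the example, and the directory fsync at the end is an extra step that the text above does not mention.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Atomically replace `dir/name` with `value`.
/// Hypothetical helper for illustration; not part of raft-wal's API.
fn write_meta_atomic(dir: &Path, name: &str, value: &[u8]) -> io::Result<()> {
    let tmp = dir.join(format!("{name}.tmp"));
    let dst = dir.join(name);

    // 1. Write the new value to a temporary file and fsync it,
    //    so the bytes are on disk before the name changes.
    let mut f = File::create(&tmp)?;
    f.write_all(value)?;
    f.sync_all()?;
    drop(f);

    // 2. Rename over the destination. Readers see either the old
    //    file or the new one, never a partially written file.
    fs::rename(&tmp, &dst)?;

    // 3. Assumption: also fsync the directory so the rename itself
    //    survives a crash (only attempted on Unix-like systems).
    #[cfg(unix)]
    File::open(dir)?.sync_all()?;

    Ok(())
}
```

A crash before step 2 leaves the old value in place plus a stray temp file; a crash after step 2 leaves the new value. Neither case exposes a torn write.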
Log entries are buffered for performance. Call sync() after append if your Raft implementation requires durable entries before acknowledging AppendEntries.
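The difference between buffering, `flush()`, and `sync()` maps onto three layers: a userspace buffer, the OS page cache, and stable storage. The sketch below shows those layers using only `std`; `write_records` and its `durable` flag are hypothetical names for this example and do not reflect how raft-wal is implemented internally.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Write `records` through a 64 KB userspace buffer, then flush and,
/// if `durable` is set, fsync. Illustrative only; not raft-wal's API.
fn write_records(path: &Path, records: &[&[u8]], durable: bool) -> io::Result<()> {
    let file = File::create(path)?;
    let mut w = BufWriter::with_capacity(64 * 1024, file);

    for r in records {
        // Stays in the userspace buffer until the buffer fills.
        w.write_all(r)?;
    }

    // Hands the buffered bytes to the OS; they may still be lost
    // if the machine loses power.
    w.flush()?;

    if durable {
        // Forces the file data to stable storage.
        w.get_ref().sync_data()?;
    }
    Ok(())
}
```

A Raft follower that must persist entries before replying to AppendEntries would take the `durable` path; a deployment that tolerates losing the unsynced tail on power failure could skip it.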
Integrity
Each entry on disk is prefixed with a CRC32C checksum covering the index, payload length, and payload bytes. On recovery, entries with invalid checksums or incomplete writes are silently discarded from the tail — the WAL recovers up to the last good entry.
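The following sketch shows how checksum-guarded recovery of this kind can work. It assumes the entry layout listed under Design (`[u32 crc32c LE][u64 index LE][u32 payload_len LE][payload]`) and uses a slow bitwise CRC32C so the example needs no external crates. The names `crc32c`, `encode_entry`, and `recover` are made up for the example and are not the crate's functions.

```rust
const HEADER_LEN: usize = 16;

/// Bitwise CRC32C (Castagnoli). Slow but dependency-free; a real
/// implementation would use hardware instructions.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &b in data {
        crc ^= b as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

fn le_u32(b: &[u8]) -> u32 {
    let mut a = [0u8; 4];
    a.copy_from_slice(&b[..4]);
    u32::from_le_bytes(a)
}

fn le_u64(b: &[u8]) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(&b[..8]);
    u64::from_le_bytes(a)
}

/// Append one entry to `out`; the CRC covers index, length, and payload.
fn encode_entry(index: u64, payload: &[u8], out: &mut Vec<u8>) {
    let mut body = Vec::with_capacity(12 + payload.len());
    body.extend_from_slice(&index.to_le_bytes());
    body.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    body.extend_from_slice(payload);
    out.extend_from_slice(&crc32c(&body).to_le_bytes());
    out.extend_from_slice(&body);
}

/// Scan `buf` from the start and stop at the first torn or corrupt entry.
fn recover(buf: &[u8]) -> Vec<(u64, Vec<u8>)> {
    let mut entries = Vec::new();
    let mut pos = 0;
    while buf.len() - pos >= HEADER_LEN {
        let crc = le_u32(&buf[pos..]);
        let index = le_u64(&buf[pos + 4..]);
        let len = le_u32(&buf[pos + 12..]) as usize;
        // Incomplete payload: the write was cut short.
        if len > buf.len() - pos - HEADER_LEN {
            break;
        }
        let end = pos + HEADER_LEN + len;
        // Checksum mismatch: discard this entry and everything after it.
        if crc32c(&buf[pos + 4..end]) != crc {
            break;
        }
        entries.push((index, buf[pos + HEADER_LEN..end].to_vec()));
        pos = end;
    }
    entries
}
```

Because the length field is itself covered by the checksum, a corrupted length either runs past the end of the buffer or fails the CRC check, so the scan cannot silently skip into garbage.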
Benchmarks
Measured on Linux with 128-byte entries:
| Operation | Latency |
|---|---|
| `append` | ~210 ns |
| `append_batch` (10 entries) | ~2.9 µs |
| `get` | ~1 ns |
| `read_range` (100 entries) | ~3.2 µs |
| recovery (10k entries, 1 segment) | ~1.2 ms |
| recovery (10k entries, multi-segment) | ~2.0 ms |
Design
- In-memory index: `VecDeque<Vec<u8>>` with a base offset, giving O(1) append and lookup. All entries are held in memory; call `compact()` periodically after snapshots to free memory. Use `estimated_memory()` to monitor usage.
- Segment files: the log is split into segment files (`{index}.seg`). When the active segment exceeds `max_segment_size` (default 64 MB), it is sealed and a new segment begins. `compact()` deletes old segments with a file remove; no rewrite is needed.
- Entry format: `[u32 crc32c LE][u64 index LE][u32 payload_len LE][payload]`, a 16-byte header per entry
- Buffered writes: 64 KB `BufWriter` (sync) or userspace buffer (async); syscalls happen only when the buffer fills
- Parallel recovery: segment files are read and CRC-verified concurrently using one thread per CPU core (`std::thread::scope`) or `tokio::spawn` (async)
- Atomic metadata: `set_meta` writes to a temp file, fsyncs, then renames, which makes the update crash-safe
- CRC32C: hardware-accelerated via the `crc32c` crate (SSE4.2 on x86, ARM CRC on aarch64, software fallback elsewhere)
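To make the in-memory index concrete, here is a small sketch of a `VecDeque<Vec<u8>>` with a base offset, including the "discard index <= n" and "discard index >= n" semantics that `compact()` and `truncate()` are described as having. `MemIndex` and its method signatures are hypothetical; in particular, this version assigns indices itself, which may differ from the crate.

```rust
use std::collections::VecDeque;

/// Illustrative in-memory log index; not raft-wal's actual type.
struct MemIndex {
    /// Log index of `entries[0]`.
    base: u64,
    entries: VecDeque<Vec<u8>>,
}

impl MemIndex {
    fn new(first_index: u64) -> Self {
        MemIndex { base: first_index, entries: VecDeque::new() }
    }

    /// O(1) amortized append; returns the index assigned to the entry.
    fn append(&mut self, payload: Vec<u8>) -> u64 {
        let index = self.base + self.entries.len() as u64;
        self.entries.push_back(payload);
        index
    }

    /// O(1) lookup: subtract the base offset and index into the deque.
    fn get(&self, index: u64) -> Option<&[u8]> {
        let off = index.checked_sub(self.base)?;
        self.entries.get(off as usize).map(|e| e.as_slice())
    }

    /// Discard every entry with index <= `up_to` (after a snapshot).
    fn compact(&mut self, up_to: u64) {
        let n = (up_to + 1)
            .saturating_sub(self.base)
            .min(self.entries.len() as u64) as usize;
        self.entries.drain(..n);
        self.base += n as u64;
    }

    /// Discard every entry with index >= `from` (conflict resolution).
    fn truncate(&mut self, from: u64) {
        let keep = from
            .saturating_sub(self.base)
            .min(self.entries.len() as u64) as usize;
        self.entries.truncate(keep);
    }
}
```

Compaction only moves the base forward and drops the front of the deque, so lookups for surviving entries stay O(1) without re-indexing anything.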
Status
This crate is in early development and has not been battle-tested in production yet. It is planned for use in Nyx Studio infrastructure. Use at your own risk.
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for commit conventions and guidelines.
License
Licensed under either of Apache License, Version 2.0 or MIT License at your option.