//! # wal-db
//!
//! A write-ahead log primitive for Rust storage engines.
//!
//! A write-ahead log (WAL) is the durability substrate every database leans on:
//! a state change is appended to a durable, append-only log *before* it is
//! acknowledged, and that log is the source of truth used to rebuild state after
//! a crash. `wal-db` publishes that primitive as a small, audited, benchmarked
//! crate so the storage engines in the portfolio — `lsm-db`, `txn-db`,
//! `raft-io`, Hive DB — share one well-tested implementation instead of each
//! re-deriving the durability contract and getting it subtly wrong.
//!
//! ## The four-call API
//!
//! The common case is four calls: open, append, sync, iterate.
//!
//! ```
//! use wal_db::Wal;
//!
//! # fn main() -> Result<(), wal_db::WalError> {
//! # let dir = tempfile::tempdir().map_err(wal_db::WalError::from)?;
//! # let path = dir.path().join("app.wal");
//! // Open (or create) the log.
//! let wal = Wal::open(&path)?;
//!
//! // Append a record; `append` returns once the bytes are in the kernel
//! // page cache. It does not flush the disk. The returned LSN is the record's
//! // byte offset — the first record starts at 0.
//! let lsn = wal.append(b"the first record")?;
//! assert_eq!(lsn.get(), 0);
//!
//! // `sync` is the durability barrier: it returns once every record appended
//! // before it is on stable storage.
//! wal.sync()?;
//!
//! // On restart, replay the log to rebuild state.
//! for entry in wal.iter()? {
//!     let entry = entry?;
//!     assert_eq!(entry.data(), b"the first record");
//! }
//! # Ok(())
//! # }
//! ```
//!
//! ## Concurrency and group commit
//!
//! `Wal` is built for many writers. [`append`](Wal::append) is lock-free: each
//! call reserves its byte range with a single atomic step — that range's start
//! offset *is* the record's [`Lsn`] — then writes its record without blocking
//! the others. Share one `Wal` behind an [`Arc`](std::sync::Arc) and append from
//! every thread.
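//!
//! As an illustration of that reservation step, here is a minimal std-only
//! sketch. It is not this crate's implementation: the `reserve` helper and the
//! shared `tail` counter are hypothetical names. Each caller claims a disjoint
//! byte range with one `fetch_add`, and the value returned is the range's start
//! offset, which plays the role of the LSN.
//!
//! ```
//! use std::sync::atomic::{AtomicU64, Ordering};
//! use std::thread;
//!
//! /// Hypothetical helper: claim `len` bytes at the tail, return the start offset.
//! fn reserve(tail: &AtomicU64, len: u64) -> u64 {
//!     tail.fetch_add(len, Ordering::AcqRel)
//! }
//!
//! let tail = AtomicU64::new(0);
//!
//! // Four writers each reserve 16 bytes concurrently.
//! let mut starts: Vec<u64> = thread::scope(|s| {
//!     let handles: Vec<_> = (0..4).map(|_| s.spawn(|| reserve(&tail, 16))).collect();
//!     handles.into_iter().map(|h| h.join().unwrap()).collect()
//! });
//!
//! // Whatever the interleaving, the ranges tile the log without overlap.
//! starts.sort_unstable();
//! assert_eq!(starts, vec![0, 16, 32, 48]);
//! assert_eq!(tail.load(Ordering::Acquire), 64);
//! ```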
//!
//! Durability is where threads cooperate. When several call [`sync`](Wal::sync)
//! at once, they coalesce into a single fsync — **group commit** — so the cost
//! of making data durable is amortised across everyone committing together
//! rather than paid N times. [`append_and_sync`](Wal::append_and_sync) does an
//! append and a group-commit-aware sync in one call.
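//!
//! The coalescing idea can be sketched with std alone. This is one simple way
//! to get the effect, not necessarily how `Wal` does it: `GroupCommit` and its
//! fields are hypothetical, the flush is simulated by a counter, and the sketch
//! ignores the gap between reserving a range and finishing the write into it.
//! A caller that finds its records already covered by someone else's flush
//! returns without flushing again.
//!
//! ```
//! use std::sync::atomic::{AtomicU64, Ordering};
//! use std::sync::Mutex;
//!
//! /// Hypothetical stand-in for a log; it counts flushes instead of calling fsync.
//! struct GroupCommit {
//!     appended: AtomicU64, // end offset of everything appended so far
//!     durable: Mutex<u64>, // end offset covered by the last flush
//!     flushes: AtomicU64,  // how many flushes were actually issued
//! }
//!
//! impl GroupCommit {
//!     fn append(&self, len: u64) -> u64 {
//!         self.appended.fetch_add(len, Ordering::AcqRel)
//!     }
//!
//!     fn sync(&self) {
//!         // Everything appended before this call must be durable on return.
//!         let target = self.appended.load(Ordering::Acquire);
//!         let mut durable = self.durable.lock().unwrap();
//!         if *durable >= target {
//!             return; // a flush issued by another caller already covered us
//!         }
//!         // Flush up to the current tail so that later waiters are covered too.
//!         let upto = self.appended.load(Ordering::Acquire);
//!         self.flushes.fetch_add(1, Ordering::Relaxed); // a real log would fsync here
//!         *durable = upto;
//!     }
//! }
//!
//! let log = GroupCommit {
//!     appended: AtomicU64::new(0),
//!     durable: Mutex::new(0),
//!     flushes: AtomicU64::new(0),
//! };
//! assert_eq!(log.append(10), 0);
//! assert_eq!(log.append(10), 10);
//! log.sync(); // one flush covers both records
//! log.sync(); // nothing new to flush
//! assert_eq!(log.flushes.load(Ordering::Relaxed), 1);
//! ```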
//!
//! ## The durability contract
//!
//! Two operations, two distinct guarantees. Confusing them is the single most
//! common way to lose data with a WAL, so they are kept explicit:
//!
//! - [`Wal::append`] returns when the record is in the operating system's page
//!   cache. A crash *after* `append` but *before* `sync` may lose the record.
//! - [`Wal::sync`] returns only when every previously appended record is on
//!   stable storage and will survive a power loss.
//!
//! The flush uses the correct durability call for each target, and that call
//! differs by platform:
//!
//! | Platform | Durability call |
//! |----------|-----------------|
//! | Linux    | `fdatasync` (via [`std::fs::File::sync_data`]) |
//! | Windows  | `FlushFileBuffers` (via [`std::fs::File::sync_data`]) |
//! | macOS    | `fcntl(F_FULLFSYNC)` — **not** plain `fsync`, which leaves data in the drive's write cache |
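//!
//! To make the gap between the two guarantees concrete, the following sketch
//! uses a plain [`std::fs::File`] rather than `wal-db`. The scratch path is a
//! hypothetical name chosen only for this illustration. The write alone leaves
//! the bytes in the page cache; only the explicit `sync_data` call asks the
//! platform to push them to stable storage.
//!
//! ```
//! use std::fs::{self, File};
//! use std::io::Write;
//!
//! # fn main() -> std::io::Result<()> {
//! // Hypothetical scratch file, used only for this illustration.
//! let path = std::env::temp_dir().join(format!("wal-sketch-{}.log", std::process::id()));
//! let mut file = File::create(&path)?;
//!
//! // After `write_all`, the bytes are in the page cache and a power loss may drop them.
//! file.write_all(b"record")?;
//!
//! // After `sync_data` returns, the platform's flush call has completed.
//! file.sync_data()?;
//!
//! drop(file);
//! fs::remove_file(&path)?;
//! # Ok(())
//! # }
//! ```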
//!
//! ## Recovery
//!
//! Every record carries a CRC32C checksum over its own bytes. Recovery walks
//! the log forward and stops at the first record whose checksum fails or whose
//! bytes are incomplete — a torn write from a crash mid-append. Records up to
//! that point are returned; the torn tail is discarded. Recovery never reads a
//! partially written record as if it were complete, and a corrupt length prefix
//! can never trigger an unbounded allocation: lengths are validated against
//! [`WalConfig::max_record_size`] before a single byte of payload is read.
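//!
//! The forward scan can be sketched in a few lines of std-only Rust. The
//! framing used here, `[len: u32 LE][crc: u32 LE][payload]` with the checksum
//! taken over the payload, is an assumption made for the example and is not
//! the crate's frozen record format. The `frame` and `recover` helpers are
//! hypothetical names as well.
//!
//! ```
//! /// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
//! fn crc32c(data: &[u8]) -> u32 {
//!     let mut crc = !0u32;
//!     for &byte in data {
//!         crc ^= u32::from(byte);
//!         for _ in 0..8 {
//!             crc = if crc & 1 == 1 { (crc >> 1) ^ 0x82F6_3B78 } else { crc >> 1 };
//!         }
//!     }
//!     !crc
//! }
//!
//! /// Hypothetical framing: [len: u32 LE][crc: u32 LE][payload].
//! fn frame(payload: &[u8]) -> Vec<u8> {
//!     let mut out = (payload.len() as u32).to_le_bytes().to_vec();
//!     out.extend_from_slice(&crc32c(payload).to_le_bytes());
//!     out.extend_from_slice(payload);
//!     out
//! }
//!
//! /// Walk forward, returning every intact record and stopping at the first bad one.
//! fn recover(log: &[u8], max_record_size: usize) -> Vec<&[u8]> {
//!     let mut records = Vec::new();
//!     let mut pos = 0;
//!     // `get` returns `None` when fewer than 8 header bytes remain.
//!     while let Some(header) = log.get(pos..pos + 8) {
//!         let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
//!         let crc = u32::from_le_bytes([header[4], header[5], header[6], header[7]]);
//!         // Reject an implausible length before touching the payload.
//!         if len > max_record_size {
//!             break;
//!         }
//!         match log.get(pos + 8..pos + 8 + len) {
//!             Some(payload) if crc32c(payload) == crc => {
//!                 records.push(payload);
//!                 pos += 8 + len;
//!             }
//!             _ => break, // incomplete or corrupt: treat it as the torn tail
//!         }
//!     }
//!     records
//! }
//!
//! let mut log = frame(b"first");
//! log.extend_from_slice(&frame(b"second"));
//! let torn = frame(b"third");
//! log.extend_from_slice(&torn[..torn.len() - 2]); // simulate a crash mid-append
//!
//! assert_eq!(crc32c(b"123456789"), 0xE306_9283); // standard CRC32C check value
//! assert_eq!(recover(&log, 1024), vec![&b"first"[..], &b"second"[..]]);
//! ```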
//!
//! ## Backends
//!
//! [`Wal::open`] uses the file-backed [`FileStore`]. Custom backends — in-memory
//! for tests, or an alternative storage layer — implement the [`WalStore`] trait
//! and plug in through [`Wal::with_store`]. An in-memory [`MemStore`] ships for
//! testing and examples.
//!
//! ## Status
//!
//! This is the `0.3` core: lock-free multi-writer append, group commit, and a
//! frozen record format, on top of the platform-correct durability and
//! torn-write recovery from `0.2`. Segment rotation follows in `0.3.1`. The
//! four-call API is stable and will not change shape.

#![deny(warnings)]
#![deny(missing_docs)]
#![deny(unsafe_op_in_unsafe_fn)]
#![deny(unused_must_use)]
#![deny(unused_results)]
#![deny(clippy::unwrap_used)]
#![deny(clippy::expect_used)]
#![deny(clippy::todo)]
#![deny(clippy::unimplemented)]
#![deny(clippy::print_stdout)]
#![deny(clippy::print_stderr)]
#![deny(clippy::dbg_macro)]
#![deny(clippy::unreachable)]
#![deny(clippy::undocumented_unsafe_blocks)]
#![deny(clippy::missing_safety_doc)]

mod commit;
mod config;
mod error;
mod lsn;
mod record;
mod segment;
mod store;
mod sync;
mod wal;

pub use crate::config::{RecoveryPolicy, WalConfig};
pub use crate::error::{Result, WalError};
pub use crate::lsn::Lsn;
pub use crate::segment::SegmentedStore;
pub use crate::store::{FileStore, MemStore, WalStore};
pub use crate::wal::{Record, Wal, WalIter};

/// The `pack-io` codec, re-exported so typed-record consumers can derive
/// `Serialize`/`Deserialize` without adding the dependency themselves.
///
/// Available only with the `pack-io` feature. Use it as
/// `use wal_db::pack_io::{Serialize, Deserialize};` alongside
/// [`Wal::append_typed`] and [`Record::decode`].
#[cfg(feature = "pack-io")]
pub use pack_io;

/// The common imports for working with a log.
///
/// Glob-importing the prelude pulls in the four-call API and the types its
/// methods return, which is enough for the great majority of uses.
///
/// ```
/// use wal_db::prelude::*;
///
/// # fn main() -> Result<()> {
/// # let dir = tempfile::tempdir().map_err(WalError::from)?;
/// # let path = dir.path().join("p.wal");
/// let wal = Wal::open(&path)?;
/// let _lsn: Lsn = wal.append(b"record")?;
/// wal.sync()?;
/// # Ok(())
/// # }
/// ```
pub mod prelude {
    pub use crate::config::{RecoveryPolicy, WalConfig};
    pub use crate::error::{Result, WalError};
    pub use crate::lsn::Lsn;
    pub use crate::store::WalStore;
    pub use crate::wal::{Record, Wal};
}