Module pager

Long-lived page cache + WAL-backed commits.

A Pager wraps an open .sqlrite file plus its -wal sidecar. It owns three maps of page bytes:

  • on_disk: snapshot of the main file as last checkpointed. Frozen across regular commits — the main file is only rewritten when the checkpointer (Phase 4d) runs.
  • wal_cache: latest committed body for each page that has been appended to the WAL since the last checkpoint. Populated at open by replaying the WAL, and kept in lockstep with each successful commit.
  • staged: pages queued for the next commit, not yet in the WAL.

Read precedence. read_page consults staged → wal_cache → on_disk, so both uncommitted writes and WAL-resident committed writes shadow the frozen main file. A bounds check against current_header.page_count hides pages that have been logically truncated by a shrink-commit even though their bytes are still present in on_disk (the real truncation waits for the checkpointer).
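The lookup order above can be sketched with a simplified in-memory model. `PagerModel`, the tiny `PAGE_SIZE`, the 0-based page numbering, and applying the bounds check before any layer is consulted are assumptions made for illustration; the real `Pager` may differ in all of these.

```rust
use std::collections::HashMap;

/// Hypothetical, deliberately tiny page size for this sketch only.
const PAGE_SIZE: usize = 16;
type Page = [u8; PAGE_SIZE];

/// Simplified stand-in for the pager's three maps plus the header's page count.
struct PagerModel {
    on_disk: HashMap<u32, Page>,
    wal_cache: HashMap<u32, Page>,
    staged: HashMap<u32, Page>,
    /// Mirrors current_header.page_count.
    page_count: u32,
}

impl PagerModel {
    /// Returns the freshest bytes for `page_num`, or None if the page is
    /// outside the logical file or has never been written.
    fn read_page(&self, page_num: u32) -> Option<&Page> {
        // Assumption: the bounds check runs first, so pages that were
        // logically truncated stay hidden even if on_disk still holds them.
        if page_num >= self.page_count {
            return None;
        }
        // staged shadows wal_cache, which shadows on_disk.
        self.staged
            .get(&page_num)
            .or_else(|| self.wal_cache.get(&page_num))
            .or_else(|| self.on_disk.get(&page_num))
    }
}
```

In this model, lowering `page_count` is enough to make a page unreadable; nothing has to be removed from `on_disk`.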

Commit flow. commit compares each staged page against the effective committed state (wal_cache layered on on_disk) and appends a WAL frame only for pages whose bytes actually differ. A final “commit” frame for page 0 carries the new encoded header and the post-commit page count in its commit_page_count field. That frame is fsync’d. The main file is not touched.
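As a rough illustration of the diff step only, the function below picks out which staged pages would need a WAL frame. The name `pages_to_append`, the map types, and the small `PAGE_SIZE` are hypothetical; the sketch leaves out frame encoding, the final commit frame for page 0, and the fsync.

```rust
use std::collections::HashMap;

/// Hypothetical, deliberately tiny page size for this sketch only.
const PAGE_SIZE: usize = 16;
type Page = [u8; PAGE_SIZE];

/// Returns, in ascending order, the staged page numbers whose bytes differ
/// from the effective committed state (wal_cache layered on on_disk).
fn pages_to_append(
    staged: &HashMap<u32, Page>,
    wal_cache: &HashMap<u32, Page>,
    on_disk: &HashMap<u32, Page>,
) -> Vec<u32> {
    let mut dirty: Vec<u32> = staged
        .iter()
        .filter(|(num, bytes)| {
            // The WAL copy, if any, shadows the frozen main-file copy.
            let committed = wal_cache.get(*num).or_else(|| on_disk.get(*num));
            // A page with no committed copy is new and always needs a frame.
            committed != Some(*bytes)
        })
        .map(|(num, _)| *num)
        .collect();
    dirty.sort_unstable();
    dirty
}
```

Note that the comparison is against the layered state: a staged page that matches `on_disk` still needs a frame if a newer, different copy already sits in `wal_cache`.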

Checkpoint flow (Phase 4d). When the WAL accumulates past AUTO_CHECKPOINT_THRESHOLD_FRAMES frames (tracked on Wal), commit opportunistically folds them back into the main file: write every WAL-resident page at its proper offset, overwrite the main-file header, truncate the file to page_count * PAGE_SIZE bytes, fsync, then Wal::truncate the sidecar (which rolls the salt so any stale tail bytes from the old generation can’t be misread as valid). Reads stay consistent if a crash hits mid-checkpoint — the WAL still holds the authoritative bytes until its header is rewritten, and the checkpointer is idempotent, so rerunning is safe.
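The fold-back sequence can be modeled against an in-memory byte vector standing in for the main file. Everything named here (`fold_wal_into_main`, `write_at`, the 8-byte `PAGE_SIZE`, a header that occupies exactly page 0) is an assumption for the sketch; fsync and `Wal::truncate` are left to the caller and not modeled.

```rust
use std::collections::BTreeMap;

/// Hypothetical, deliberately tiny page size for this sketch only.
const PAGE_SIZE: usize = 8;
type Page = [u8; PAGE_SIZE];

/// Writes one page at its offset, growing the image if needed.
fn write_at(file: &mut Vec<u8>, page_num: u32, bytes: &Page) {
    let start = page_num as usize * PAGE_SIZE;
    let end = start + PAGE_SIZE;
    if file.len() < end {
        file.resize(end, 0);
    }
    file[start..end].copy_from_slice(bytes);
}

/// Folds WAL-resident pages into the main-file image, then applies the
/// header and the logical page count. The WAL map is only read, never
/// modified, so running this twice yields the same image.
fn fold_wal_into_main(
    main_file: &mut Vec<u8>,
    wal_pages: &BTreeMap<u32, Page>,
    header: &Page,
    page_count: u32,
) {
    // 1. Write every WAL-resident page at its proper offset.
    for (&num, bytes) in wal_pages {
        write_at(main_file, num, bytes);
    }
    // 2. Overwrite the header (assumed here to live in page 0).
    write_at(main_file, 0, header);
    // 3. Truncate (or pad) to page_count * PAGE_SIZE bytes.
    main_file.resize(page_count as usize * PAGE_SIZE, 0);
}
```

Because the WAL contents are untouched until the caller resets the sidecar, a crash between any two steps can be recovered by running the same fold again.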

The commit-time diff matters because higher layers re-serialize the entire database on every auto-save. Without the diff, even a one-row UPDATE would append a frame for every page of every table. With the diff, unchanged tables, whose pages encode to the same bytes across saves, simply stay out of the WAL.

Locking (Phase 4a → 4e). Every Pager takes an advisory lock on its main file and on its WAL sidecar. The mode is driven by AccessMode:

  • ReadWrite → flock(LOCK_EX): one writer, no other openers.
  • ReadOnly → flock(LOCK_SH): multiple readers coexist; any writer is excluded.

Both locks are tied to their file descriptors and release automatically when the Pager drops. On collision the opener gets a clean typed error rather than racing silently. flock semantics are “multiple readers OR one writer”, not both; true concurrent reader-and-writer access would need a shared-memory coordination file and read marks, which is not on the roadmap.
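The compatibility rule can be written down as a small pure-Rust model. It makes no OS calls and is not how the Pager acquires its locks; `try_open`, the `holders` list, and the `LockCollision` error are hypothetical names, and only the two `AccessMode` variants come from this module.

```rust
/// Mirrors the two modes described above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AccessMode {
    ReadWrite,
    ReadOnly,
}

/// Hypothetical stand-in for the typed error an opener gets on collision.
#[derive(Debug, PartialEq, Eq)]
struct LockCollision;

/// Models "multiple readers OR one writer": `holders` lists the modes of
/// the openers that currently hold the file.
fn try_open(
    holders: &mut Vec<AccessMode>,
    requested: AccessMode,
) -> Result<(), LockCollision> {
    let compatible = match requested {
        // A shared lock coexists only with other shared locks.
        AccessMode::ReadOnly => holders.iter().all(|h| *h == AccessMode::ReadOnly),
        // An exclusive lock needs the file to itself.
        AccessMode::ReadWrite => holders.is_empty(),
    };
    if compatible {
        holders.push(requested);
        Ok(())
    } else {
        Err(LockCollision)
    }
}
```

With two readers already registered, a `ReadWrite` request fails immediately instead of waiting, which matches the "clean typed error" behavior described above.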

Structs

Pager

Enums

AccessMode
How a Pager (or Wal) intends to use the file: mutating writes vs. consistent-snapshot reads. Drives the OS-level lock mode, and the Pager uses it to reject mutation attempts on read-only openers.