Module manifest


In-memory manifest tracking indexed files for online reconciliation.

Each entry stores cheap stat data, (mtime, size, inode) on Unix, plus a blake3 content hash. The inode is recorded as 0 on Windows and on filesystems that do not expose one. Reconciliation runs on every search via RipvecIndex::diff_against:

  1. Walk the corpus with the same [WalkOptions] used at index construction.
  2. For each walked file: compare the stat tuple to the manifest entry. Match → treated as unchanged, skip.
  3. For mismatches: read the file, blake3-hash, compare against the stored hash. Match → metadata-only change (vim save-no-edit, build-tool touch), update the manifest’s stat tuple in place to short-circuit future diffs. Mismatch → record as dirty.
  4. Manifest entries not seen during the walk → deleted.
  5. Walked paths not in the manifest → new.

If the resulting Diff is empty, the existing index is up-to-date and no work is needed. Otherwise the caller rebuilds.
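The five steps above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the crate's implementation: `StatTuple`, `Entry`, `Walked`, `Changes`, and `reconcile` are hypothetical names, the walk is modelled as an in-memory slice instead of a real filesystem traversal, and `DefaultHasher` from the standard library stands in for blake3 so the sketch has no external dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, HashSet};
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Cheap stat tuple; `inode` is 0 where the platform does not provide one.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct StatTuple {
    pub mtime_ns: u128,
    pub size: u64,
    pub inode: u64,
}

/// Hypothetical manifest entry: stat tuple plus content hash.
#[derive(Clone, Debug)]
pub struct Entry {
    pub stat: StatTuple,
    pub hash: u64,
}

/// Hypothetical walk result; `content` stands in for reading the file on demand.
pub struct Walked {
    pub path: PathBuf,
    pub stat: StatTuple,
    pub content: Vec<u8>,
}

/// Hypothetical counterpart of the documented `Diff`.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Changes {
    pub dirty: Vec<PathBuf>,
    pub deleted: Vec<PathBuf>,
    pub new: Vec<PathBuf>,
}

impl Changes {
    pub fn is_empty(&self) -> bool {
        self.dirty.is_empty() && self.deleted.is_empty() && self.new.is_empty()
    }
}

/// ASSUMPTION: std's DefaultHasher replaces blake3 to keep the sketch std-only.
pub fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

pub fn reconcile(manifest: &mut HashMap<PathBuf, Entry>, walk: &[Walked]) -> Changes {
    let mut changes = Changes::default();
    let mut seen: HashSet<&Path> = HashSet::new();

    for file in walk {
        seen.insert(file.path.as_path());
        match manifest.get_mut(&file.path) {
            // Step 5: walked path not in the manifest.
            None => changes.new.push(file.path.clone()),
            Some(entry) => {
                // Step 2: stat tuple matches, skip without reading the file.
                if entry.stat == file.stat {
                    continue;
                }
                // Step 3: stat mismatch, so fall back to the content hash.
                if content_hash(&file.content) == entry.hash {
                    // Metadata-only change: refresh the tuple in place.
                    entry.stat = file.stat;
                } else {
                    changes.dirty.push(file.path.clone());
                }
            }
        }
    }

    // Step 4: manifest entries the walk never visited.
    for path in manifest.keys() {
        if !seen.contains(path.as_path()) {
            changes.deleted.push(path.clone());
        }
    }
    changes.deleted.sort(); // HashMap iteration order is unspecified.
    changes
}
```

Note that a touched-but-identical file leaves `Changes` empty yet still mutates the manifest, which is why the sketch takes `&mut` access: the refreshed tuple lets the next diff skip that file at stat cost.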

§Why blake3 + the stat tuple

The stat tuple is the cheap pre-filter: a warm stat() costs ~1 µs per file, so checking every tuple in a 200-file repo takes roughly 200 µs, well under a millisecond. Most files have no stat change between queries, so the cheap path skips them entirely.

When the stat tuple does mismatch, the question is whether the content actually changed. Reading and blake3-hashing a typical 1-30 KB source file costs ~1-20 µs warm, two to three orders of magnitude cheaper than the ~1-5 ms cost of re-chunking and re-embedding it. Hashing first adds the hash cost to every mismatch but saves a full rebuild whenever the mismatch turns out to be a touch, so it pays off once the touch rate exceeds the ratio of hash cost to rebuild cost. That puts the break-even at “blake3 is worth it when more than 0.7% of stat changes are touches rather than real edits”; real-world workflows have 5-50% touch rates (vim :w with no edits, autoformatters that hash-equal their input, build tools that touch source for dependency tracking).
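The break-even arithmetic can be checked with a small cost model. This is a sketch under assumptions: the function names are hypothetical, and the inputs used in the usage note (20 µs to hash, 3 ms to rebuild) are illustrative picks from within the ranges quoted above, chosen because they land near the stated 0.7%; the source does not say which exact figures it used.

```rust
/// Expected cost (µs) per stat mismatch when every mismatch triggers a rebuild.
pub fn cost_without_hash(rebuild_us: f64) -> f64 {
    rebuild_us
}

/// Expected cost (µs) per stat mismatch when hashing first:
/// always pay the hash, rebuild only when the content really changed.
/// `touch_rate` is the fraction of mismatches that are metadata-only.
pub fn cost_with_hash(hash_us: f64, rebuild_us: f64, touch_rate: f64) -> f64 {
    hash_us + (1.0 - touch_rate) * rebuild_us
}

/// Touch rate above which hashing first is cheaper.
/// From hash + (1 - p) * rebuild < rebuild it follows that p > hash / rebuild.
pub fn break_even_touch_rate(hash_us: f64, rebuild_us: f64) -> f64 {
    hash_us / rebuild_us
}
```

With the assumed 20 µs and 3000 µs, `break_even_touch_rate` gives about 0.0067, or roughly 0.7%. At a 5% touch rate, `cost_with_hash(20.0, 3000.0, 0.05)` is 2870 µs against 3000 µs without hashing, and the gap widens as the touch rate rises.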

§Inode as a third dimension

(mtime, size) alone has a rare blind spot: a content swap that keeps the same byte count and leaves mtime unchanged. Atomic-rename saves (the modern editor default) bump the inode, so adding inode to the tuple catches those without a blake3 round-trip. Inode is best-effort: it is 0 on Windows, where we fall back to (mtime, size). The blake3 verification path still guarantees correctness even when the inode signal is unavailable.
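One way to collect such a tuple with only the standard library is sketched below. The names `StatTuple`, `inode_of`, and `stat_tuple` are hypothetical, as is the choice to store mtime as nanoseconds since the Unix epoch; the actual manifest may represent these fields differently. The only platform-specific call is `MetadataExt::ino`, which exists on Unix targets.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::UNIX_EPOCH;

/// Hypothetical stat tuple: (mtime, size, inode), with inode = 0 when unavailable.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct StatTuple {
    pub mtime_ns: u128,
    pub size: u64,
    pub inode: u64,
}

/// On Unix, read the inode number from the file metadata.
#[cfg(unix)]
fn inode_of(meta: &fs::Metadata) -> u64 {
    use std::os::unix::fs::MetadataExt;
    meta.ino()
}

/// Elsewhere, report 0 so comparisons degrade to (mtime, size).
#[cfg(not(unix))]
fn inode_of(_meta: &fs::Metadata) -> u64 {
    0
}

/// Build the stat tuple for `path` with a single metadata call.
pub fn stat_tuple(path: &Path) -> io::Result<StatTuple> {
    let meta = fs::metadata(path)?;
    // Timestamps before the epoch are clamped to 0 in this sketch.
    let mtime_ns = meta
        .modified()?
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    Ok(StatTuple {
        mtime_ns,
        size: meta.len(),
        inode: inode_of(&meta),
    })
}
```

On Unix, writing a replacement to a temporary file and renaming it over the original yields a tuple with a different `inode` even when `size` is identical, because the two files coexisted before the rename and so cannot share an inode number.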

Structs§

Diff
Categorized filesystem changes detected by diff_against_walk.
FileEntry
One file’s tracked state in the manifest.
Manifest
Per-root manifest of indexed files.

Functions§

diff_against_walk
Compare the manifest to the current filesystem state and produce a Diff.