Module worktree

Worktree → tree-object builder.

Walks a directory, applies .mkitignore, hashes each file as a Blob, recurses into subdirectories, validates symlink targets against path traversal, and writes a single root Tree into the supplied ObjectStore.

Notes:

  • Files at or below CHUNK_THRESHOLD are stored as a single Blob. Files above the threshold are chunked with crate::chunker::FastCdc::v1; each chunk is stored as a Blob and the file is represented by a ChunkedBlob manifest whose hash is what lands in the parent tree.
  • Symlinks are never followed while walking. On Linux and macOS, read_link reports the target verbatim, and that target is hashed as a blob.
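
The size rule in the first note can be sketched as a small decision function. This is only an illustration: the constant value is taken from the documented 1 MiB figure, and the Representation enum and representation_for function are hypothetical names, not part of this module.

```rust
/// Assumed value, taken from the documented "1 MiB" threshold.
const CHUNK_THRESHOLD: u64 = 1024 * 1024;

/// Hypothetical enum naming the two representations described in the notes.
#[derive(Debug, PartialEq, Eq)]
enum Representation {
    /// The whole file is stored as one Blob.
    SingleBlob,
    /// The file is split into chunk Blobs plus a ChunkedBlob manifest.
    ChunkedBlob,
}

/// Pick the representation for a file of `len` bytes.
/// Files at or below the threshold stay whole; only larger files are chunked.
fn representation_for(len: u64) -> Representation {
    if len <= CHUNK_THRESHOLD {
        Representation::SingleBlob
    } else {
        Representation::ChunkedBlob
    }
}
```

Note that the comparison is inclusive: a file of exactly CHUNK_THRESHOLD bytes is still stored as a single Blob.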

Structs§

StatObservation
A hash-time stat observation. While building a tree, path was re-hashed (its cache was absent or racy-smudged) and the result equals the staging index’s hash, so the stat captured from the opened file descriptor before its content was read proves the entry clean. status consumes these to heal the stat cache without ever pairing a post-verification stat with a pre-verification hash (the unsound verify-then-stat order).

Enums§

WorktreeError
Errors returned by this module.

Constants§

CHUNK_THRESHOLD
Files larger than this go through the chunker (1 MiB).
MAX_FILE_BYTES
Hard cap on a single file (1 GiB).
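
A cap such as MAX_FILE_BYTES is only reliable if it is checked against the bytes actually read, since a file can grow after its metadata was inspected. The sketch below shows one common way to do that with the standard library. The function name read_bounded and its cap parameter are hypothetical, and it works on any reader; it does not reproduce the no-follow open or the metadata check that read_regular_file_bounded is documented to perform.

```rust
use std::io::{self, Read};

/// Read everything from `reader`, failing if more than `cap` bytes arrive.
/// Hypothetical helper for illustration; not this module's implementation.
fn read_bounded<R: Read>(reader: R, cap: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // Allow one byte past the cap so "exactly cap bytes" can be told
    // apart from "more than cap bytes" without reading the whole input.
    reader.take(cap.saturating_add(1)).read_to_end(&mut buf)?;
    if buf.len() as u64 > cap {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "input exceeds the size cap",
        ));
    }
    Ok(buf)
}
```

Because the read itself is limited, memory use stays bounded by cap + 1 bytes even when the input is much larger than its metadata claimed.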

Functions§

build_tree
Build a tree object for dir and its subdirectories. Honours the .gitignore and .mkitignore files loaded from dir.
build_tree_filtered
Like build_tree, but the caller supplies the authoritative tracked set (index). Callers that seed their index from HEAD when no index file exists yet (status, restore safety) MUST pass it here; otherwise a tracked file that matches an ignore rule would be dropped right after a checkout. Passing None falls back to the on-disk <dir>/.mkit/index (empty if absent).
build_tree_filtered_observed
Like build_tree_filtered, but additionally reports every StatObservation (a file re-hashed to a hash matching its index entry) into observations, so callers can heal the stat cache from hash-time stats.
build_tree_from_index
Build a tree object from an Index (the staging area).
build_tree_from_index_with
Like build_tree_from_index, but writes tree objects through sink. Pass a WriteBatch to amortise the flush cost of all materialised trees into the batch’s single commit. store is still needed read-only to validate that staged hashes point at blob-shaped objects (a sink cannot read).
hash_file
Read a file from disk, hash it, store it, and return the content-address of the resulting object.
hash_file_object
Content-address data exactly as store_file_object would, without storing anything. Computes per-chunk blob hashes via the streaming hasher and assembles the ChunkedBlob manifest in memory. Backs change detection (status, rm, restore safety checks), where only the answer to “would this file hash to X?” is needed; writing objects there would turn a read-only query into a store mutation. Equivalence with store_file_object is pinned by a test.
mtime_nanos
A file’s mtime as nanoseconds since the Unix epoch, saturating; 0 (the “no cache” sentinel) when the mtime is unavailable or predates the epoch.
read_blob
Reassemble the full byte content of a Blob or ChunkedBlob object addressed by hash.
read_regular_file_bounded
Read a regular file without following the final path component on Unix, enforcing MAX_FILE_BYTES against both the opened handle’s metadata and the actual bytes read.
stat_cache_fields
The full stat-cache observation for meta, in index-entry field order: (mtime_ns, size, ino, ctime_ns). This is the single producer-side dual of stat_matches: every site that records the cache uses it, so the recorded and compared field sets can never drift. ino and ctime_ns are 0 (meaning “don’t check”) on platforms without them.
stat_matches
True iff meta proves the worktree file behind entry is byte-identical to entry.object_hash without reading it. All of the following must hold: the cached mtime is nonzero (cache present, not racy-smudged) and equal; the size is equal; the inode and ctime match when recorded (catching replace-by-rename and touch -r-style timestamp restoration, since ctime cannot be set from userspace); and the live mode’s exec class matches the staged status. Symlink entries never stat-match, because the target re-read is cheap and meta semantics differ.
store_file_object
Store a regular file’s bytes as the canonical object and return its content-address.
validate_symlink_target
Validate a symlink target: must be relative and contain no .. segments.
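
The rule stated for validate_symlink_target (relative, no .. segments) can be sketched with std::path::Component. This is a minimal illustration under assumptions: the function name check_symlink_target and the SymlinkTargetError enum are hypothetical, and the real function may apply further checks (for example on empty targets, which this sketch accepts).

```rust
use std::path::{Component, Path};

/// Hypothetical error type for this sketch; the module's own errors
/// are reported through WorktreeError.
#[derive(Debug, PartialEq, Eq)]
enum SymlinkTargetError {
    /// The target starts at a filesystem root or drive prefix.
    Absolute,
    /// The target contains a `..` segment.
    ParentSegment,
}

/// Accept only relative targets with no `..` segments.
fn check_symlink_target(target: &Path) -> Result<(), SymlinkTargetError> {
    for component in target.components() {
        match component {
            // A root or prefix component means the path is not relative.
            Component::Prefix(_) | Component::RootDir => {
                return Err(SymlinkTargetError::Absolute);
            }
            // Any `..` could climb out of the directory being walked.
            Component::ParentDir => return Err(SymlinkTargetError::ParentSegment),
            // `.` and ordinary names are harmless.
            Component::CurDir | Component::Normal(_) => {}
        }
    }
    Ok(())
}
```

Rejecting every .. segment is stricter than resolving the path and checking where it ends up: a target such as a/../b stays inside the tree but is still refused, which keeps the check purely lexical.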

Type Aliases§

WorktreeResult
Result alias used throughout this module.