Worktree → tree-object builder.
Walks a directory, applies .mkitignore, hashes each file as a
Blob, recurses on subdirectories, validates
symlink targets against path-traversal, and writes a single root
Tree into the supplied ObjectStore.
Notes:
- Files at or below `CHUNK_THRESHOLD` are stored as a single `Blob`. Files above the threshold are chunked with `crate::chunker::FastCdc::v1`; each chunk is stored as a `Blob`, and the file is represented by a `ChunkedBlob` manifest whose hash is what lands in the parent tree.
- We never follow symlinks while walking. On Linux/macOS, `read_link` reports the target verbatim and we hash it as a blob.
Structs
- `StatObservation`: A hash-time stat observation. While building a tree we re-hashed `path` (its cache was absent or racy-smudged) and the result equals the staging index’s hash, so the stat captured from the OPENED file descriptor before its content was read proves the entry clean. `status` consumes these to heal the stat cache without ever pairing a post-verification stat with a pre-verification hash (the unsound verify-then-stat order).
Enums
- `WorktreeError`: Errors returned by this module.
Constants
- `CHUNK_THRESHOLD`: Files larger than this go through the chunker (1 MiB).
- `MAX_FILE_BYTES`: Hard cap on a single file (1 GiB).
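The two constants together imply a three-way decision on file size. The sketch below illustrates that decision only; `StoragePlan` and `plan_storage` are hypothetical names, the `u64` constant type is an assumption, and treating a file of exactly `MAX_FILE_BYTES` as still acceptable is also an assumption, since the page only says "hard cap".

```rust
// Values as documented on this page; the integer type is an assumption.
const CHUNK_THRESHOLD: u64 = 1024 * 1024; // 1 MiB
const MAX_FILE_BYTES: u64 = 1024 * 1024 * 1024; // 1 GiB

/// Hypothetical classification of a file by size (not a type from this module).
#[derive(Debug, PartialEq, Eq)]
enum StoragePlan {
    /// At or below the threshold: one `Blob`.
    SingleBlob,
    /// Above the threshold: chunk blobs plus a `ChunkedBlob` manifest.
    Chunked,
    /// Above the hard cap: rejected (assumed to surface as an error).
    TooLarge,
}

/// Decide how a file of `len` bytes would be stored under the documented limits.
fn plan_storage(len: u64) -> StoragePlan {
    if len > MAX_FILE_BYTES {
        StoragePlan::TooLarge
    } else if len > CHUNK_THRESHOLD {
        StoragePlan::Chunked
    } else {
        StoragePlan::SingleBlob
    }
}
```

Note that the threshold itself falls on the single-blob side, matching "at or below" in the module notes.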
Functions
- `build_tree`: Build a tree object for `dir` and its subdirectories. Honours the `.gitignore` and `.mkitignore` ignore files loaded from `dir`.
- `build_tree_filtered`: Like `build_tree`, but the caller supplies the authoritative tracked set (`index`). Callers that seed their index from `HEAD` when no index file exists yet (status, restore safety) MUST pass it here so a tracked file that matches an ignore rule is not dropped right after a checkout. `None` falls back to the on-disk `<dir>/.mkit/index` (empty if absent).
- `build_tree_filtered_observed`: `build_tree_filtered` that additionally reports every `StatObservation` (a file re-hashed to a hash matching its index entry) into `observations`, so callers can heal the stat cache from hash-time stats.
- `build_tree_from_index`: Build a tree object from an `Index` (the staging area).
- `build_tree_from_index_with`: `build_tree_from_index` writing tree objects through `sink`. Pass a `WriteBatch` to amortise the flush cost of all materialised trees into the batch’s single commit. `store` is still needed read-only to validate that staged hashes point at blob-shaped objects (a sink cannot read).
- `hash_file`: Read a file from disk, hash it, store it, and return the content-address of the resulting object.
- `hash_file_object`: Content-address `data` exactly as `store_file_object` would, without storing anything. Computes per-chunk blob hashes via the streaming hasher and assembles the `ChunkedBlob` manifest in memory. Backs change detection (`status`, `rm`, restore safety checks) where only the answer to “would this file hash to X?” is needed; writing objects there would turn a read-only query into a store mutation. Equivalence with `store_file_object` is pinned by a test.
- `mtime_nanos`: A file’s mtime as nanoseconds since the Unix epoch, saturating; `0` (the “no cache” sentinel) when the mtime is unavailable or predates the epoch.
- `read_blob`: Reassemble the full byte content of a `Blob` or `ChunkedBlob` object addressed by `hash`.
- `read_regular_file_bounded`: Read a regular file without following the final path component on Unix, enforcing `MAX_FILE_BYTES` against both the opened handle’s metadata and the actual bytes read.
- `stat_cache_fields`: The full stat-cache observation for `meta`, in index-entry field order: `(mtime_ns, size, ino, ctime_ns)`. The single producer-side dual of `stat_matches`: every site that records the cache uses this, so the recorded and compared field sets can never drift. `ino`/`ctime_ns` are 0 (= don’t check) on platforms without them.
- `stat_matches`: True iff `meta` proves the worktree file behind `entry` is byte-identical to `entry.object_hash` without reading it: the cached mtime is nonzero (cache present, not racy-smudged) and equal, the size is equal, the inode and ctime match when recorded (catching replace-by-rename and `touch -r`-style timestamp restoration, since ctime cannot be set from userspace), and the live mode’s exec class matches the staged status. Symlink entries never stat-match: the target re-read is cheap and `meta` semantics differ.
- `store_file_object`: Store a regular file’s bytes as the canonical object and return its content-address.
- `validate_symlink_target`: Validate a symlink target: must be relative and contain no `..` segments.
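The symlink rule (relative, no `..` segments) can be illustrated with the standard library's path components. This is a minimal sketch rather than the module's implementation: `is_safe_symlink_target` is a hypothetical name, it returns a `bool` instead of the module's result type, and it does not decide how an empty target should be treated.

```rust
use std::path::{Component, Path};

/// Hypothetical check: accept a target only if every component is a plain
/// name or `.`. This rejects absolute paths (root or prefix components)
/// and any `..` segment, wherever it appears.
fn is_safe_symlink_target(target: &Path) -> bool {
    target
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}
```

Matching on components rather than searching the string for `..` avoids rejecting legitimate names such as `..notes`, which parse as a normal component.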
Type Aliases
- `WorktreeResult`: Result alias used throughout this module.
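The comparison described for `stat_matches` can be sketched over plain structs. Everything here is an assumption made for illustration: `CachedStat`, `LiveStat`, and `stat_matches_sketch` are hypothetical names, the field types are guesses, and the "exec class" is reduced to a single boolean. The real function takes filesystem metadata and an index entry.

```rust
/// Hypothetical stand-in for the stat fields cached in an index entry.
#[derive(Clone, Copy)]
struct CachedStat {
    mtime_ns: u64, // 0 = no cache (absent or racy-smudged)
    size: u64,
    ino: u64,      // 0 = not recorded, don't check
    ctime_ns: u64, // 0 = not recorded, don't check
    executable: bool,
    is_symlink: bool,
}

/// Hypothetical stand-in for a fresh stat of the worktree file.
#[derive(Clone, Copy)]
struct LiveStat {
    mtime_ns: u64,
    size: u64,
    ino: u64,
    ctime_ns: u64,
    executable: bool,
}

/// True only when the live stat proves the file unchanged without reading it.
fn stat_matches_sketch(cached: &CachedStat, live: &LiveStat) -> bool {
    // Symlink entries never stat-match; their target is re-read instead.
    if cached.is_symlink {
        return false;
    }
    // A zero cached mtime means "no cache", so it can never prove cleanliness.
    if cached.mtime_ns == 0 || cached.mtime_ns != live.mtime_ns {
        return false;
    }
    if cached.size != live.size {
        return false;
    }
    // Inode and ctime are compared only when they were recorded.
    if cached.ino != 0 && cached.ino != live.ino {
        return false;
    }
    if cached.ctime_ns != 0 && cached.ctime_ns != live.ctime_ns {
        return false;
    }
    cached.executable == live.executable
}
```

Every failed check falls back to "not proven clean", which in the module's terms means the file is re-hashed rather than trusted.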