Module scanner


Scanner: walk git history, find LFS pointer blobs.

This is the entry point used by git lfs fetch/pull/push to enumerate the LFS pointers reachable from a set of refs. The pipeline mirrors upstream:

  1. rev_list emits every reachable object (commits, trees, blobs).
  2. CatFileBatchCheck filters those to blobs whose size could fit in a pointer file (≤ MAX_POINTER_SIZE). This is a cheap header-only lookup against the object database, with no content I/O.
  3. CatFileBatch reads the surviving candidates’ content. Each is parsed as a Pointer; non-pointers are silently skipped.
  4. The output is deduplicated by LFS OID (the pointer’s content OID, not the git blob OID): the same LFS object can appear in many blobs/paths, but we only need to fetch it once.
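Steps 2 to 4 can be sketched in a few lines of Rust. This is a minimal illustration, not the module's implementation: `SketchPointer`, `parse_pointer`, `scan_candidates` and the 1024-byte value used for `MAX_POINTER_SIZE` are assumptions made for the example, and the blobs are passed in already loaded, whereas the real pipeline learns each size from the batch-check header before reading any content.

```rust
use std::collections::HashSet;

/// Assumed cap for this sketch; the module's real MAX_POINTER_SIZE may differ.
const MAX_POINTER_SIZE: usize = 1024;

/// Hypothetical minimal pointer type, standing in for the module's `Pointer`.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SketchPointer {
    lfs_oid: String,
    size: u64,
}

/// Parse `version` / `oid sha256:<hex>` / `size <n>` lines.
/// Returns None for anything that is not shaped like a pointer file.
fn parse_pointer(content: &[u8]) -> Option<SketchPointer> {
    let text = std::str::from_utf8(content).ok()?;
    let mut version_seen = false;
    let mut oid: Option<String> = None;
    let mut size: Option<u64> = None;
    for line in text.lines() {
        let (key, value) = line.split_once(' ')?;
        match key {
            "version" => version_seen = true,
            "oid" => {
                oid = value
                    .strip_prefix("sha256:")
                    .filter(|h| h.len() == 64 && h.bytes().all(|b| b.is_ascii_hexdigit()))
                    .map(str::to_owned)
            }
            "size" => size = value.parse::<u64>().ok(),
            _ => {}
        }
    }
    if !version_seen {
        return None;
    }
    Some(SketchPointer { lfs_oid: oid?, size: size? })
}

/// Size filter, pointer parse, then dedup by LFS OID (not by git blob OID).
/// Input items are (git blob OID, blob content).
fn scan_candidates(blobs: &[(&str, Vec<u8>)]) -> Vec<SketchPointer> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for (_blob_oid, content) in blobs {
        if content.len() > MAX_POINTER_SIZE {
            continue; // too large to be a pointer file
        }
        if let Some(ptr) = parse_pointer(content) {
            // HashSet::insert returns false when the LFS OID was already seen.
            if seen.insert(ptr.lfs_oid.clone()) {
                out.push(ptr);
            }
        }
    }
    out
}
```

Keying the set on the LFS OID is what lets two different git blobs that point at the same LFS object collapse into a single entry.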

Structs§

PointerEntry
One LFS pointer discovered by the scanner.
TreeBlob
One blob found while walking a tree, before any pointer-parsing or size-based filtering. Paths and OIDs are reported verbatim from git ls-tree.
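To make the TreeBlob description concrete, here is a rough sketch of turning `git ls-tree -r -z` output into blob records. Each record in that output has the form `<mode> <type> <object>`, a TAB, then the path, terminated by NUL. `SketchTreeBlob`, its field names and both function names are hypothetical, and whether the module passes `-r -z` is an assumption.

```rust
/// Hypothetical stand-in for the module's `TreeBlob`; field names are assumptions.
#[derive(Debug, PartialEq, Eq)]
struct SketchTreeBlob {
    oid: String,
    /// Kept as raw bytes: with -z, git emits paths verbatim, without quoting.
    path: Vec<u8>,
}

/// Parse one NUL-delimited record; returns None for non-blob or malformed records.
fn parse_record(record: &[u8]) -> Option<SketchTreeBlob> {
    // The first TAB separates the metadata from the path.
    let tab = record.iter().position(|&b| b == b'\t')?;
    let meta = std::str::from_utf8(&record[..tab]).ok()?;
    let mut fields = meta.split(' ');
    let _mode = fields.next()?;
    let kind = fields.next()?;
    let oid = fields.next()?;
    if kind != "blob" {
        return None; // skips trees and submodule (commit) entries
    }
    Some(SketchTreeBlob {
        oid: oid.to_owned(),
        path: record[tab + 1..].to_vec(),
    })
}

/// Split the whole output on NUL and keep only blob entries.
fn parse_ls_tree_z(output: &[u8]) -> Vec<SketchTreeBlob> {
    output.split(|&b| b == 0).filter_map(parse_record).collect()
}
```

No size check or pointer parsing happens here, which matches the "before any pointer-parsing or size-based filtering" wording above.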

Functions§

scan_index_lfs
Scan the index for LFS pointers via git ls-files --stage -z -- :(attr:filter=lfs).
scan_index_pointers
LFS pointers in the index or working tree that differ from ref (typically HEAD). Mirrors upstream’s lfs/gitscanner_index.go::scanIndex: runs git diff-index <ref> and git diff-index --cached <ref> to surface staged and working-tree changes, then deduplicates by (sha, path).
scan_pointers
Walk history reachable from include minus exclude, return unique LFS pointers.
scan_pointers_with_args
scan_pointers with extra rev-list command-line arguments. See rev_list_with_args.
scan_previous_versions
Walk git log -G "oid sha256:" -p <ref>, limited to commits after the since cutoff, returning every LFS pointer that appears as the previous state of a modified file (i.e. lives on the - side of a unified diff).
scan_stashed
LFS pointers reachable from refs/stash and its associated WIP / index / untracked merge parents. Mirrors upstream’s lfs/gitscanner_log.go::scanStashed.
scan_tree
Walk the tree at reference, returning one entry per LFS pointer blob.
scan_tree_blobs
Walk the tree at reference and return every blob — no size filter, no pointer parsing. Used by fsck --pointers for its full-tree sweep when classifying paths against .gitattributes.
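The "- side" extraction described for scan_previous_versions can be illustrated with a small line-oriented sketch over `git log -p` output. The function name `previous_pointers` and the `(oid, size)` return shape are hypothetical. The sketch treats `-` and context lines as the old side of each hunk and ignores `+` lines. It only reports a pointer whose `oid` line was actually removed.

```rust
/// Collect (LFS OID, size) pairs for pointers on the old side of a patch.
/// Line-oriented and deliberately simple; a sketch, not a full diff parser.
fn previous_pointers(patch: &str) -> Vec<(String, u64)> {
    let mut out = Vec::new();
    let mut pending: Option<String> = None; // OID seen on a removed line
    for line in patch.lines() {
        // File headers and hunk headers start a fresh context.
        if line.starts_with("--- ") || line.starts_with("@@") {
            pending = None;
            continue;
        }
        // Added lines (and the "+++ " header) belong to the new side only.
        if line.starts_with('+') {
            continue;
        }
        let (removed, body) = if let Some(b) = line.strip_prefix('-') {
            (true, b)
        } else if let Some(b) = line.strip_prefix(' ') {
            (false, b) // context line: present on both sides
        } else {
            pending = None; // commit metadata, blank lines, etc.
            continue;
        };
        if let Some(hex) = body.strip_prefix("oid sha256:") {
            pending = if removed { Some(hex.trim_end().to_owned()) } else { None };
        } else if let Some(size) = body.strip_prefix("size ") {
            // The size may be a removed line or, if unchanged, a context line.
            if let (Some(oid), Ok(n)) = (pending.take(), size.trim_end().parse::<u64>()) {
                out.push((oid, n));
            }
        }
    }
    out
}
```

Accepting the `size` line from context covers the case where a file's content changed but its byte length did not, so only the `oid` line shows up as removed.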