Module scanner

Scanner: walk git history, find LFS pointer blobs.

This is the entry point used by git lfs fetch/pull/push to enumerate the LFS pointers reachable from a set of refs. The pipeline mirrors upstream:

  1. rev_list() emits every reachable object (commits, trees, blobs).
  2. CatFileBatchCheck filters those to blobs whose size could fit in a pointer file (≤ MAX_POINTER_SIZE). Only each object's type and size are queried, so this is a cheap header-only check with no content I/O.
  3. CatFileBatch reads the surviving candidates’ content. Each is parsed as a Pointer; non-pointers are silently skipped.
  4. The output is deduplicated by LFS OID (the pointer’s content OID, not the git blob OID): the same LFS object can appear in many blobs/paths, but we only need to fetch it once.
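
The filter, parse, and deduplicate steps above can be sketched in memory as follows. This is a minimal illustration, not the module's implementation: `SketchObject`, `SketchPointer`, `parse_pointer_sketch`, and `scan_sketch` are hypothetical names, the 1024-byte value for `MAX_POINTER_SIZE` is an assumption, and the parser only checks the `version`, `oid sha256:`, and `size` lines of the LFS pointer format rather than validating it fully.

```rust
use std::collections::HashSet;

/// Assumed size cutoff in bytes; the real constant's value is not stated in the docs.
const MAX_POINTER_SIZE: usize = 1024;

/// Hypothetical stand-in for the object kinds emitted by the rev-list step.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ObjectKind {
    Commit,
    Tree,
    Blob,
}

/// Hypothetical in-memory object. The real pipeline asks git for the size
/// (header only) before it reads any content; here both are already in memory.
#[derive(Debug, Clone)]
struct SketchObject {
    kind: ObjectKind,
    content: Vec<u8>,
}

/// Hypothetical, simplified pointer: just the LFS OID and the declared size.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SketchPointer {
    lfs_oid: String,
    size: u64,
}

/// Lenient parse of the LFS pointer text format. Returns None for non-pointers.
fn parse_pointer_sketch(content: &[u8]) -> Option<SketchPointer> {
    let text = std::str::from_utf8(content).ok()?;
    let mut has_version = false;
    let mut oid: Option<String> = None;
    let mut size: Option<u64> = None;
    for line in text.lines() {
        if line.starts_with("version ") {
            has_version = true;
        } else if let Some(rest) = line.strip_prefix("oid sha256:") {
            oid = Some(rest.to_string());
        } else if let Some(rest) = line.strip_prefix("size ") {
            size = rest.parse::<u64>().ok();
        }
    }
    if !has_version {
        return None;
    }
    Some(SketchPointer { lfs_oid: oid?, size: size? })
}

/// Steps 2 to 4: size filter, pointer parse, dedup by LFS OID (first hit wins).
fn scan_sketch(objects: &[SketchObject]) -> Vec<SketchPointer> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut out = Vec::new();
    for obj in objects {
        // Step 2: keep only blobs small enough to be a pointer file.
        if obj.kind != ObjectKind::Blob || obj.content.len() > MAX_POINTER_SIZE {
            continue;
        }
        // Step 3: non-pointers are silently skipped.
        if let Some(ptr) = parse_pointer_sketch(&obj.content) {
            // Step 4: dedup on the LFS OID, not on the git blob OID.
            if seen.insert(ptr.lfs_oid.clone()) {
                out.push(ptr);
            }
        }
    }
    out
}
```

Deduplicating with a `HashSet` while pushing into a `Vec` keeps the first-seen order of pointers, which a set alone would not preserve.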

Structs

PointerEntry
One LFS pointer discovered by the scanner.
TreeBlob
One blob found while walking a tree, before any pointer-parsing or size-based filtering. Paths and OIDs are reported verbatim from git ls-tree.

Functions

scan_pointers
Walk the history reachable from include but not from exclude, and return the unique LFS pointers.
scan_tree
Walk the tree at reference, returning one entry per LFS pointer blob.
scan_tree_blobs
Walk the tree at reference and return every blob — no size filter, no pointer parsing. Used by fsck --pointers for its full-tree sweep when classifying paths against .gitattributes.
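
As an illustration of what one blob entry from a tree walk carries, the sketch below parses a single line of default `git ls-tree` output (`<mode> <type> <object>` followed by a tab and the path). The names `SketchTreeBlob` and `parse_ls_tree_line` are hypothetical, and the docs do not say which `git ls-tree` flags the module actually passes, so treat the line format handled here as an assumption.

```rust
/// Hypothetical stand-in for a tree entry: git blob OID plus path, kept verbatim.
#[derive(Debug, PartialEq, Eq)]
struct SketchTreeBlob {
    oid: String,
    path: String,
}

/// Parse one line of default `git ls-tree` output. Returns None for
/// malformed lines and for entries that are not blobs (trees, submodules).
fn parse_ls_tree_line(line: &str) -> Option<SketchTreeBlob> {
    // Metadata and path are separated by a single tab.
    let (meta, path) = line.split_once('\t')?;
    let mut fields = meta.split(' ');
    let _mode = fields.next()?;
    let kind = fields.next()?;
    let oid = fields.next()?;
    if kind != "blob" {
        return None;
    }
    Some(SketchTreeBlob {
        oid: oid.to_string(),
        path: path.to_string(),
    })
}
```

Without `-z`, git quotes paths that contain unusual characters, so a real parser would likely prefer NUL-terminated output; this sketch leaves the path exactly as it appears on the line.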