Git-based staleness computation for symbol embeddings.
Staleness reflects how much time has passed (in commits) since a file was last updated. For symbol chunks (where doc and code are colocated), the approximation is:
- Find last_doc_commit, the most recent commit that touched the file.
- Count the commits made since last_doc_commit (exclusive). Because last_doc_commit is itself the last file touch, this counts repo-wide commits that did NOT touch the file, i.e. how many commits have elapsed with no update to this file.
- staleness = min(1.0, commits_since_doc_update as f64 / 50.0)
This gives 0.0 for files updated in the most recent commit and rises linearly with each later commit, saturating at 1.0 once 50 or more commits have landed without touching the file.
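The formula above can be sketched as a small pure function. This is only an illustration of the arithmetic: the names staleness_from_commits and SATURATION_COMMITS are hypothetical and do not come from this module, and the commit count is assumed to have been obtained already.

```rust
/// Number of commits after which the score saturates at 1.0.
/// (Hypothetical constant name; the value 50 comes from the formula above.)
const SATURATION_COMMITS: f64 = 50.0;

/// Hypothetical helper: map "commits elapsed since the file was last touched"
/// to a staleness score in the range [0.0, 1.0].
fn staleness_from_commits(commits_since_doc_update: u32) -> f64 {
    // Linear ramp, clamped at 1.0 so very old files do not exceed the range.
    (commits_since_doc_update as f64 / SATURATION_COMMITS).min(1.0)
}
```

Under this sketch, a file touched in the most recent commit scores 0.0, a file untouched for 25 commits scores 0.5, and anything untouched for 50 or more commits scores 1.0.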
§Performance
Use compute_staleness_batch to amortize the git walk across all files in
a populate run. The function deduplicates paths and walks history once per
unique file.
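The deduplication idea can be sketched as follows. This is not the actual signature or implementation of compute_staleness_batch, which is not shown on this page. The function name staleness_batch_sketch and the commits_since_last_touch callback are assumptions, with the callback standing in for whatever git history walk the real function performs.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: score each unique path once, even when the input
/// contains duplicates. `commits_since_last_touch` stands in for the git walk
/// and is assumed to return the number of commits elapsed since the path was
/// last touched.
fn staleness_batch_sketch<F>(paths: &[&str], mut commits_since_last_touch: F) -> HashMap<String, f64>
where
    F: FnMut(&str) -> u32,
{
    let mut scores: HashMap<String, f64> = HashMap::new();
    for &path in paths {
        // Skip paths that were already scored so history is walked once per file.
        if scores.contains_key(path) {
            continue;
        }
        let commits = commits_since_last_touch(path);
        scores.insert(path.to_string(), (commits as f64 / 50.0).min(1.0));
    }
    scores
}
```

Returning a map keyed by path lets callers look up the score for every chunk that shares a file without triggering another history walk.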
Functions§
- compute_staleness_batch: Compute staleness scores for a batch of file paths.