pub fn file<E>(
odb: impl Find + FindHeader,
traverse: impl IntoIterator<Item = Result<Info, E>>,
resource_cache: &mut Platform,
file_path: &BStr,
range: Option<Range<u32>>,
) -> Result<Outcome, Error>Expand description
Produce a list of consecutive BlameEntry instances to indicate in which commits the ranges of the file
at traverse[0]:<file_path> originated in.
§Paramters
odb- Access to database objects, also for used for diffing.
- Should have an object cache for good diff performance.
traverse- The list of commits from the most recent to prior ones, following all parents sorted by time.
- It’s paramount that older commits are returned after newer ones.
- The first commit returned here is the first eligible commit to be responsible for parts of
file_path.
file_path- A slash-separated worktree-relative path to the file to blame.
range- A 1-based inclusive range, in order to mirror
git’s behaviour.Some(20..40)represents 21 lines, spanning from line 20 up to and including line 40. This will be converted to19..40internally as the algorithm uses 0-based ranges that are exclusive at the end.
- A 1-based inclusive range, in order to mirror
resource_cache- Used for diffing trees.
§The algorithm
*For brevity, HEAD denotes the starting point of the blame operation. It could be any commit, or even commits that
represent the worktree state.
We begin with a single Unblamed Hunk and a single suspect, usually the HEAD commit as the commit containing the
Blamed File, so that it contains the entire file, with the first commit being a candidate for the entire Blamed File.
We traverse the commit graph starting at the first suspect, and see if there have been changes to file_path.
If so, we have found a Source File and a Suspect commit, and have hunks that represent these changes.
Now the Unblamed Hunk is split at the boundaries of each matching change, creating a new Unblamed Hunk on each side,
along with a BlameEntry to represent the match.
This is repeated until there are no non-empty Unblamed Hunks left.
At a high level, what we want to do is the following:
- get the commit
- walk through its parents
- for each parent, do a diff and mark lines that don’t have a suspect yet (this is the term
used in
libgit2), but that have been changed in this commit
- for each parent, do a diff and mark lines that don’t have a suspect yet (this is the term
used in
The algorithm in libgit2 works by going through parents and keeping a linked list of blame
suspects. It can be visualized as follows: