Function file

Source
pub fn file(
    odb: impl Find + FindHeader,
    suspect: ObjectId,
    cache: Option<Graph>,
    resource_cache: &mut Platform,
    file_path: &BStr,
    options: Options,
) -> Result<Outcome, Error>
Expand description

Produce a list of consecutive BlameEntry instances to indicate in which commits the ranges of the file at suspect:<file_path> originated in.

§Paramters

  • odb
    • Access to database objects, also for used for diffing.
    • Should have an object cache for good diff performance.
  • suspect
    • The first commit to be responsible for parts of file_path.
  • cache
    • Optionally, the commitgraph cache.
  • file_path
    • A slash-separated worktree-relative path to the file to blame.
  • range
    • A 1-based inclusive range, in order to mirror git’s behaviour. Some(20..40) represents 21 lines, spanning from line 20 up to and including line 40. This will be converted to 19..40 internally as the algorithm uses 0-based ranges that are exclusive at the end.
  • resource_cache
    • Used for diffing trees.

§The algorithm

*For brevity, HEAD denotes the starting point of the blame operation. It could be any commit, or even commits that represent the worktree state.

We begin with one or more Unblamed Hunks and a single suspect, usually the HEAD commit as the commit containing the Blamed File, so that it contains the entire file, with the first commit being a candidate for the entire Blamed File. We traverse the commit graph starting at the first suspect, and see if there have been changes to file_path. If so, we have found a Source File and a Suspect commit, and have hunks that represent these changes. Now the Unblamed Hunk is split at the boundaries of each matching change, creating a new Unblamed Hunk on each side, along with a BlameEntry to represent the match. This is repeated until there are no non-empty Unblamed Hunks left.

At a high level, what we want to do is the following:

  • get the commit
  • walk through its parents
    • for each parent, do a diff and mark lines that don’t have a suspect yet (this is the term used in libgit2), but that have been changed in this commit

The algorithm in libgit2 works by going through parents and keeping a linked list of blame suspects. It can be visualized as follows: