Function file

Source
pub fn file<E>(
    odb: impl Find + FindHeader,
    traverse: impl IntoIterator<Item = Result<Info, E>>,
    resource_cache: &mut Platform,
    file_path: &BStr,
    range: Option<Range<u32>>,
) -> Result<Outcome, Error>
where E: Into<Box<dyn Error + Send + Sync + 'static>>,
Expand description

Produce a list of consecutive BlameEntry instances to indicate in which commits the ranges of the file at traverse[0]:<file_path> originated in.

§Paramters

  • odb
    • Access to database objects, also for used for diffing.
    • Should have an object cache for good diff performance.
  • traverse
    • The list of commits from the most recent to prior ones, following all parents sorted by time.
    • It’s paramount that older commits are returned after newer ones.
    • The first commit returned here is the first eligible commit to be responsible for parts of file_path.
  • file_path
    • A slash-separated worktree-relative path to the file to blame.
  • range
    • A 1-based inclusive range, in order to mirror git’s behaviour. Some(20..40) represents 21 lines, spanning from line 20 up to and including line 40. This will be converted to 19..40 internally as the algorithm uses 0-based ranges that are exclusive at the end.
  • resource_cache
    • Used for diffing trees.

§The algorithm

*For brevity, HEAD denotes the starting point of the blame operation. It could be any commit, or even commits that represent the worktree state. We begin with a single Unblamed Hunk and a single suspect, usually the HEAD commit as the commit containing the Blamed File, so that it contains the entire file, with the first commit being a candidate for the entire Blamed File. We traverse the commit graph starting at the first suspect, and see if there have been changes to file_path. If so, we have found a Source File and a Suspect commit, and have hunks that represent these changes. Now the Unblamed Hunk is split at the boundaries of each matching change, creating a new Unblamed Hunk on each side, along with a BlameEntry to represent the match. This is repeated until there are no non-empty Unblamed Hunks left.

At a high level, what we want to do is the following:

  • get the commit
  • walk through its parents
    • for each parent, do a diff and mark lines that don’t have a suspect yet (this is the term used in libgit2), but that have been changed in this commit

The algorithm in libgit2 works by going through parents and keeping a linked list of blame suspects. It can be visualized as follows: