Module git_historian::history

Builds a tree of Git history based on the stream of changes parsed from Git.

The basic algorithm is as follows: given a set of paths we care about and a series of commits (provided by the parsing module), do the following for each commit:

  1. Call the user-provided filter to see if the user cares about this commit. If they do, call the user-provided callback to extract information. The callback can use the data provided by ParsedCommit, or it can gather its own info using the commit's SHA1 ID and git commands. (The latter is, of course, much slower.)

  2. Then, for each added/removed/changed/etc. file in the commit,

    • Create a new node representing the delta.

    • Connect it to previous nodes using the "pending edges" map (see the next step).

    • In a map of "pending edges", place an entry indicating what the file's name was before this change. If the change was a modification, the previous name is the same as the current one. If the change was a move or a copy, the previous name will be different. If the change was the addition of the file, there is no previous name to add.
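The per-file bookkeeping in step 2 can be sketched as follows. This is a minimal illustration, not this module's actual code: `Change`, `Commit`, `Node`, `build_history`, and `sample_commits` are hypothetical names invented for the example. It assumes commits arrive newest first (the order `git log` emits them by default), and it leaves out the user-provided filter and callback from step 1 as well as the restriction to a set of paths.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the kinds of change a commit can make to a file.
#[derive(Debug, Clone, PartialEq)]
enum Change {
    Added,
    Modified,
    /// Moved or copied; carries the file's name before this change.
    MovedFrom(String),
}

/// Hypothetical, simplified stand-in for a parsed commit.
struct Commit {
    id: &'static str,
    /// (file name after the change, kind of change)
    deltas: Vec<(String, Change)>,
}

/// One node per delta. `previous` indexes the next-older node for the file.
#[derive(Debug)]
struct Node {
    commit: &'static str,
    path: String,
    previous: Option<usize>,
}

/// Walks commits newest to oldest and links each file's deltas together.
fn build_history(commits: &[Commit]) -> Vec<Node> {
    let mut nodes: Vec<Node> = Vec::new();
    // Pending edges: file name -> newer nodes still waiting for a predecessor.
    let mut pending: HashMap<String, Vec<usize>> = HashMap::new();

    for commit in commits {
        for (path, change) in &commit.deltas {
            // Create a new node representing the delta.
            let index = nodes.len();
            nodes.push(Node { commit: commit.id, path: path.clone(), previous: None });

            // Connect any newer nodes that were waiting on this name.
            if let Some(waiting) = pending.remove(path) {
                for newer in waiting {
                    nodes[newer].previous = Some(index);
                }
            }

            // Record the name the file had before this change, if any.
            let previous_name = match change {
                Change::Added => None,
                Change::Modified => Some(path.clone()),
                Change::MovedFrom(old) => Some(old.clone()),
            };
            if let Some(name) = previous_name {
                pending.entry(name).or_default().push(index);
            }
        }
    }
    nodes
}

/// A three-commit history, newest first: modify, rename, add.
fn sample_commits() -> Vec<Commit> {
    vec![
        Commit { id: "c3", deltas: vec![("new.rs".to_string(), Change::Modified)] },
        Commit {
            id: "c2",
            deltas: vec![("new.rs".to_string(), Change::MovedFrom("old.rs".to_string()))],
        },
        Commit { id: "c1", deltas: vec![("old.rs".to_string(), Change::Added)] },
    ]
}
```

Running `build_history(&sample_commits())` yields three nodes. The node for `c3` points to the rename node for `c2`, which in turn points to the node for `c1` that added `old.rs`, so the chain follows the file across its rename. The map holds a list of nodes per name because a copy leaves the original name in use, so more than one newer node may be waiting on the same older name.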

The net effect is that files' histories are tracked through name changes, à la git log --follow. Currently, renaming a file counts as a change even when its contents are untouched. (This seems to be consistent with git log --follow.) If that becomes undesirable, we already track how much a file changed during a rename, so we could skip adding a node when the contents are unchanged.

Functions

gather_history

Traverses Git history, grabbing arbitrary data at each change for files in the given set.