Expand description
A content-addressable identity for software artifacts.
§What are GitOIDs?
Git Object Identifiers (GitOIDs) are a mechanism for identifying artifacts in a manner which is independently reproducible because it relies only on the contents of the artifact itself.
The GitOID scheme comes from the Git version control system, which uses this mechanism to identify commits, tags, files (called “blobs”), and directories (called “trees”).
This implementation of GitOIDs is produced by the OmniBOR working group, which uses GitOIDs as the basis for OmniBOR Artifact Identifiers.
§GitOID URL Scheme
gitoid
is also an IANA-registered URL scheme, meaning that GitOIDs
are represented and shared as URLs. A gitoid
URL looks like:
gitoid:blob:sha256:fee53a18d32820613c0527aa79be5cb30173c823a9b448fa4817767cc84c6f03
This scheme starts with “gitoid
”, followed by the object type
(“blob
” in this case), the hash algorithm (“sha256
”), and the
hash produced by the GitOID hash construction. Each of these parts is
separated by a colon.
§GitOID Hash Construction
GitOID hashes are made by hashing a prefix string containing the object type and the size of the object being hashed in bytes, followed by a null terminator, and then hashing the object itself. So GitOID hashes do not match the result of only hashing the object.
§GitOID Object Types
The valid object types for a GitOID are:
blob
tree
commit
tag
Currently, this crate implements convenient handling of blob
objects,
but does not handle ensuring the proper formatting of tree
, commit
,
or tag
objects to match the Git implementation.
§GitOID Hash Algorithms
The valid hash algorithms are:
sha1
sha1dc
sha256
sha1dc
is actually Git’s default algorithm, and is equivalent to sha1
in most cases. Where it differs is when the hasher detects what it
believes to be an attempt to generate a purposeful SHA-1 collision,
in which case it modifies the hash process to produce a different output
and avoid the malicious collision.
Git does this under the hood, but does not clearly distinguish to end users that the underlying hashing algorithm isn’t equivalent to SHA-1. This is fine for Git, where the specific hash used is an implementation detail and only matters within a single repository, but for the OmniBOR working group it’s important to distinguish whether plain SHA-1 or SHA-1DC is being used, so it’s distinguished in the code for this crate.
This means for compatibility with Git that SHA-1DC should be used.
§Why Care About GitOIDs?
GitOIDs provide a convenient mechanism to establish artifact identity and
validate artifact integrity (this artifact hasn’t been modified) and
agreement (I have the same artifact you have). The fact that they’re based
only on the type of object (“blob
”, usually) and the artifact itself
means they can be derived independently, enabling distributed artifact
identification that avoids a central decider.
Alternative identity schemes, like Package URLs (purls) or Common Platform Enumerations (CPEs) rely on central authorities to produce identifiers or define the taxonomy in which identifiers are produced.
§Using this Crate
The central type of this crate is GitOid
, which is generic over both
the hash algorithm used and the object type being identified. These are
defined by the HashAlgorithm
and ObjectType
traits.
§Example
# use gitoid::{Sha256, Blob};
type GitOid = gitoid::GitOid<Sha256, Blob>;
let gitoid = GitOid::from_str("hello, world");
println!("gitoid: {}", gitoid);
Structs§
- A Blob GitOid object.
- A Commit GitOid object.
- A struct that computes gitoids based on the selected algorithm
- SHA-1 algorithm,
- SHA-1Cd (collision detection) algorithm.
- SHA-256 algorithm.
- A Tag GitOid object.
- A Tree GitOid object.
Enums§
- An error arising during
GitOid
construction or use.