Crate git_repository
This crate provides the Repository
abstraction which serves as a hub into all the functionality of git.
It’s powerful and doesn’t sacrifice performance, while being more convenient than using the sub-crates individually. Sometimes it may hide complexity on the assumption that the performance difference matters only to very few tools. Those tools can use the underlying crates directly or file an issue.
The prelude and extensions
With use git_repository::prelude::*
you should be ready to go as it pulls in various extension traits to make functionality
available on objects that may use it.
The method signatures are still complex and may require various arguments for configuration and cache control.
Most extensions to existing objects provide an obj_with_extension.attach(&repo).an_easier_version_of_a_method()
for simpler
call signatures.
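The attach pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's actual API: `Repo`, `Id`, `AttachedId`, `IdExt`, `data()` and `with_object()` are hypothetical stand-ins invented for this example. It shows how an extension trait can add `attach()` to a plain value so that later calls no longer need a repository argument.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a repository: maps an object id to its data.
pub struct Repo {
    objects: HashMap<u32, Vec<u8>>,
}

impl Repo {
    /// Build a repository holding a single object (for demonstration only).
    pub fn with_object(id: u32, data: &[u8]) -> Self {
        let mut objects = HashMap::new();
        objects.insert(id, data.to_vec());
        Repo { objects }
    }
}

/// Hypothetical detached id: plain data without access to a repository.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Id(pub u32);

/// The same id, bundled with the repository it belongs to.
pub struct AttachedId<'repo> {
    id: Id,
    repo: &'repo Repo,
}

/// Extension trait: once it is in scope, `Id` gains an `attach()` method.
pub trait IdExt {
    fn attach(self, repo: &Repo) -> AttachedId<'_>;
}

impl IdExt for Id {
    fn attach(self, repo: &Repo) -> AttachedId<'_> {
        AttachedId { id: self, repo }
    }
}

impl<'repo> AttachedId<'repo> {
    /// The simpler call signature: no repository argument is needed.
    pub fn data(&self) -> Option<&'repo [u8]> {
        self.repo.objects.get(&self.id.0).map(|v| v.as_slice())
    }

    /// Drop the repository reference and return the plain id again.
    pub fn detach(self) -> Id {
        self.id
    }
}
```

With these definitions, `Id(1).attach(&repo).data()` reads as a single call chain, and `detach()` returns the plain value once the repository is no longer needed.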
ThreadSafe Mode
By default, the Repository isn’t Sync and thus can’t be used in certain contexts which require the Sync trait.
To help with this, convert it with .to_sync() into a ThreadSafeRepository.
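The sketch below shows why such a conversion helps, using only the standard library. `LocalRepo`, `SyncRepo`, `ref_count()` and `count_in_threads()` are hypothetical stand-ins rather than the crate's types, and the `to_sync()` here only mimics the idea: state held in `Rc<RefCell<_>>` cannot cross threads, while the same state in `Arc<Mutex<_>>` can.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical thread-local handle: `Rc<RefCell<_>>` is neither `Send` nor `Sync`.
pub struct LocalRepo {
    refs: Rc<RefCell<Vec<String>>>,
}

/// Hypothetical thread-safe counterpart: `Arc<Mutex<_>>` is `Send + Sync`.
#[derive(Clone)]
pub struct SyncRepo {
    refs: Arc<Mutex<Vec<String>>>,
}

impl LocalRepo {
    pub fn new(refs: Vec<String>) -> Self {
        LocalRepo { refs: Rc::new(RefCell::new(refs)) }
    }

    /// Copy the state into a representation that may cross thread boundaries.
    pub fn to_sync(&self) -> SyncRepo {
        SyncRepo { refs: Arc::new(Mutex::new(self.refs.borrow().clone())) }
    }
}

impl SyncRepo {
    /// Number of references currently known to this handle.
    pub fn ref_count(&self) -> usize {
        self.refs.lock().unwrap().len()
    }
}

/// Query the repository from several threads.
/// This only compiles because `SyncRepo` is `Send + Sync`.
pub fn count_in_threads(repo: &SyncRepo, threads: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let repo = repo.clone();
            thread::spawn(move || repo.ref_count())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Moving a `LocalRepo` into `thread::spawn` instead would be rejected by the compiler, which is the situation the conversion is meant to resolve.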
Object-Access Performance
Accessing objects quickly is the bread-and-butter of working with git, right after accessing references. Hence it’s vital to understand which cache levels exist and how to leverage them.
When accessing an object, the first cache that’s queried is a memory-capped LRU object cache, mapping object ids to data and kind. On a miss, the object is looked up in the object database, and if it is found in a pack, a small fixed-size cache for delta-base objects is used.
In scenarios where the same objects are accessed multiple times, an object cache can be useful and is to be configured specifically
using the object_cache_size(…)
method.
Use the cache-efficiency-debug cargo feature to learn how efficient the cache actually is. It’s easy to end up with lowered performance if the cache is hit in less than 50% of lookups.
Environment variables can also be used for configuration if the application is calling apply_environment().
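To make the idea of a memory-capped LRU object cache and its hit rate concrete, here is a small standard-library sketch. It is not the crate's implementation: `ObjectCache` and all of its methods are hypothetical, and the linear-time recency list is chosen for brevity rather than speed.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical memory-capped LRU cache mapping an object id to (kind, data).
pub struct ObjectCache {
    capacity_bytes: usize,
    used_bytes: usize,
    entries: HashMap<u64, (String, Vec<u8>)>,
    recency: VecDeque<u64>, // front = least recently used
    pub hits: u64,
    pub misses: u64,
}

impl ObjectCache {
    pub fn new(capacity_bytes: usize) -> Self {
        ObjectCache {
            capacity_bytes,
            used_bytes: 0,
            entries: HashMap::new(),
            recency: VecDeque::new(),
            hits: 0,
            misses: 0,
        }
    }

    /// Look up an object, counting hits and misses along the way.
    pub fn get(&mut self, id: u64) -> Option<&(String, Vec<u8>)> {
        if self.entries.contains_key(&id) {
            self.hits += 1;
            // Mark the entry as most recently used.
            self.recency.retain(|other| *other != id);
            self.recency.push_back(id);
            self.entries.get(&id)
        } else {
            self.misses += 1;
            None
        }
    }

    /// Insert an object, evicting least recently used entries until it fits.
    pub fn put(&mut self, id: u64, kind: &str, data: Vec<u8>) {
        if data.len() > self.capacity_bytes {
            return; // larger than the whole cache: do not store it
        }
        // Replace an existing entry with the same id, if any.
        if let Some((_, old)) = self.entries.remove(&id) {
            self.used_bytes -= old.len();
            self.recency.retain(|other| *other != id);
        }
        while self.used_bytes + data.len() > self.capacity_bytes {
            match self.recency.pop_front() {
                Some(evicted) => {
                    if let Some((_, old)) = self.entries.remove(&evicted) {
                        self.used_bytes -= old.len();
                    }
                }
                None => break,
            }
        }
        self.used_bytes += data.len();
        self.entries.insert(id, (kind.to_string(), data));
        self.recency.push_back(id);
    }

    /// Fraction of lookups served from the cache, or 0.0 before any lookup.
    pub fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        if total == 0 { 0.0 } else { self.hits as f64 / total as f64 }
    }
}
```

Tracking `hits` and `misses` like this is one way to check whether a cache pays for itself: every miss still costs a lookup plus the bookkeeping of an insertion, so a low hit rate can make things slower than having no cache.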
Shortcomings & Limitations
- Only a single crate::object or derivatives can be held in memory at a time, per Easy*.
- Changes made to the configuration, packs, and alternates aren’t picked up automatically; the current object store needs a manual refresh.
Design Sketch
The goal is to make the lower-level plumbing available without having to deal with any caches or buffers, and to avoid any allocation beyond sizing the buffer to fit the biggest object seen so far.
- No implicit object lookups, thus Oid needs to get an Object first to start out with data via object().
- Objects with Ref suffix can only exist one at a time unless they are transformed into an owned version of it OR multiple Easy handles are present, each providing another ‘slot’ for an object as long as it’s retrieved through the respective Easy object.
- ObjectRef blocks the current buffer, hence many of its operations that use the buffer are consuming.
- All methods that access any field from Easy’s mutable State are fallible, and return easy::Result<_> at least, to avoid panics if the field can’t be referenced due to borrow rules of RefCell.
- Anything attached to Access can be detached to lift the object limit or make them Send-able. They can be attached to another Access if needed.
- git-repository functions related to Access extensions will always return attached versions of return values, like Oid instead of git_hash::ObjectId, ObjectRef instead of git_odb::data::Object, or Reference instead of git_ref::Reference.
- Obtaining mutable references is currently a weak spot as these only work with Arc right now and can’t work with Rc<RefCell> due to missing GATs, presumably. All Easy*!Exclusive types are unable to provide a mutable reference to the underlying repository. However, other ways to adjust the Repository of long-running applications are possible. For instance, there could be a flag that indicates a new Repository should be created (for instance, after it was changed) which causes the next server connection to create a new one. This instance is the one to use when spawning new EasyArc instances.
- Platform types are used to hold mutable or shared versions of required state for use in dependent objects they create, like iterators. These come with the benefit of allowing for nicely readable call chains. Sometimes these are called Platform for lack of a more specific term, some are called more specifically, like Ancestors.
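Two points from the list above, the reusable buffer and fallible access to state behind a RefCell, can be illustrated with the standard library alone. In this sketch `Handle`, `BorrowConflict`, `with_object()` and `capacity()` are hypothetical names and not part of the crate. The real calls it relies on are `RefCell::try_borrow_mut` and `RefCell::try_borrow`, which return an error instead of panicking when the borrow rules are violated.

```rust
use std::cell::RefCell;

/// Hypothetical error returned instead of panicking on a borrow conflict.
#[derive(Debug, PartialEq)]
pub struct BorrowConflict;

/// Hypothetical handle owning a single reusable object buffer.
pub struct Handle {
    buf: RefCell<Vec<u8>>,
}

impl Handle {
    pub fn new() -> Self {
        Handle { buf: RefCell::new(Vec::new()) }
    }

    /// Copy `data` into the shared buffer and run `f` on it.
    /// The buffer keeps its capacity between calls, so it only grows when
    /// an object is larger than any object seen before.
    pub fn with_object<T>(
        &self,
        data: &[u8],
        f: impl FnOnce(&[u8]) -> T,
    ) -> Result<T, BorrowConflict> {
        // `try_borrow_mut` fails if the buffer is already in use.
        let mut buf = self.buf.try_borrow_mut().map_err(|_| BorrowConflict)?;
        buf.clear();
        buf.extend_from_slice(data);
        Ok(f(buf.as_slice()))
    }

    /// Current capacity of the buffer, if it is not mutably borrowed.
    pub fn capacity(&self) -> Result<usize, BorrowConflict> {
        self.buf
            .try_borrow()
            .map(|buf| buf.capacity())
            .map_err(|_| BorrowConflict)
    }
}
```

Calling `with_object` from inside the closure of another `with_object` call on the same handle yields `Err(BorrowConflict)` for the inner call, which mirrors the single-slot limitation: the one buffer is occupied until the outer object is released.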
Terminology
WorkingTree and WorkTree
When reading the documentation of the canonical git-worktree program, one gets the impression that work tree and working tree are used interchangeably. We use the term work tree only and try to do so consistently, as it’s shorter and assumed to mean the same.
Cargo-features
To make using sub-crates easier, these are re-exported into the root of this crate. Note that these may change their major version even if this crate doesn’t, which can break downstream code.
git_repository::
attrs
hash
url
actor
bstr
date
discover
index
glob
path
credentials
prompt
sec
worktree
mailmap
objs
odb
refs
revision
interrupt
tempfile
lock
traverse
diff
parallel
refspec
Progress
progress
interrupt
protocol
transport
protocol::transport::packetline
Feature Flags
Mutually Exclusive Network Client
Either the async-* or the blocking-* versions of these toggles may be enabled at a time, but not both.
- async-network-client: Make git-protocol available along with an async client.
- async-network-client-async-std: Use this if your crate uses async-std as runtime, and enable basic runtime integration when connecting to remote servers.
- blocking-network-client: Make git-protocol available along with a blocking client.
- blocking-http-transport: Stacks with blocking-network-client to provide support for HTTP/S, and implies blocking networking as a whole.
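As an illustration of how mutually exclusive toggles like these can be guarded in Rust, a crate may reject conflicting features at compile time. The sketch below is an assumption about one possible approach, not a description of what this crate does. The feature names are taken from the list above, while `client_mode()` is a hypothetical helper.

```rust
// Refuse to build when both client flavors are requested at once.
#[cfg(all(feature = "async-network-client", feature = "blocking-network-client"))]
compile_error!("enable either async-network-client or blocking-network-client, not both");

/// Report which client flavor this build was compiled with.
pub fn client_mode() -> &'static str {
    if cfg!(feature = "async-network-client") {
        "async"
    } else if cfg!(feature = "blocking-network-client") {
        "blocking"
    } else {
        "none"
    }
}
```

When built without either feature, `client_mode()` returns `"none"`. Enabling both at once would stop the build with the message passed to `compile_error!`.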
Other
- serde1: Data structures implement serde::Serialize and serde::Deserialize.
- max-performance (enabled by default): Activate other features that maximize performance, like usage of threads, zlib-ng and access to caching in object databases. Note that some platforms might suffer from compile failures, which is when max-performance-safe should be used.
- fast-sha1: If enabled, use assembly versions of sha1 on supported platforms. This might cause compile failures as well, which is why it can be turned off separately.
- max-performance-safe: Activate features that maximize performance, like usage of threads, zlib-ng and access to caching in object databases, skipping the ones known to cause compile failures on some platforms.
- cache-efficiency-debug: Print debugging information about usage of object database caches, useful for tuning cache sizes.
- regex: For use in rev-parse, which provides searching commits by running a regex on their message. If disabled, the text will be searched verbatim in any portion of the commit message, similar to how a simple unanchored regex of only ‘normal’ characters would work.
Re-exports
pub use git_actor as actor;
pub use git_attributes as attrs;
pub use git_credentials as credentials;
pub use git_date as date;
pub use git_diff as diff;
pub use git_glob as glob;
pub use git_hash as hash;
pub use git_lock as lock;
pub use git_object as objs;
pub use git_object::bstr;
pub use git_odb as odb;
pub use git_prompt as prompt;
pub use git_protocol as protocol;
pub use git_ref as refs;
pub use git_refspec as refspec;
pub use git_sec as sec;
pub use git_tempfile as tempfile;
pub use git_traverse as traverse;
pub use git_url as url;
Modules
parallel feature toggle.
progress
prodash types along with various utilities for comfort.
threading feature toggle.
Structs
.git/HEAD, able to represent all of its possible states.
Sync + Send for most
for system resources required to interact with a git repository which are loaded in once the instance is created.
Enums
Traits
Functions
Repository instead.
Repository instead.
Repository instead.
Repository instead.
Repository instead.