git-remote-object-store 0.2.0

Git remote helper backed by cloud object stores (S3, Azure Blob Storage)
git-remote-object-store-0.2.0 has been yanked.

git-remote-object-store

Push, fetch, and clone Git repositories directly against an S3-compatible store or Azure Blob Storage. No intermediary servers. One static binary, one object store.

git remote add origin 's3+https://my-bucket.s3.us-west-2.amazonaws.com/my-repo'
git push -u origin main

Or, with Azure:

git remote add origin 'az+https://myaccount.blob.core.windows.net/my-container/my-repo?credential=PROD'
git push -u origin main

That's it. Your bucket is your remote.
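
To make the URL shape above concrete, here is a minimal sketch of how such a remote URL splits into a backend, a transport, a location, and query options. The `ParsedRemote` type and `parse_remote` function are hypothetical names chosen for illustration; they are not the crate's API, and the real parser may validate more (percent-encoding is ignored here, for instance).

```rust
/// Hypothetical parsed form of a remote URL (illustration only, not the crate's type).
#[derive(Debug, PartialEq)]
pub struct ParsedRemote {
    /// Storage backend taken from the scheme prefix: "s3" or "az".
    pub backend: String,
    /// Underlying transport taken from the scheme suffix: "https" or "http".
    pub transport: String,
    /// Host and path, without the query string.
    pub location: String,
    /// Query options such as `credential=PROD`, kept in order and not decoded.
    pub query: Vec<(String, String)>,
}

/// Split a URL like `az+https://host/container/repo?credential=PROD` into its parts.
/// Returns `None` when the scheme is not one of the four `<backend>+<transport>` forms.
pub fn parse_remote(url: &str) -> Option<ParsedRemote> {
    let (scheme, rest) = url.split_once("://")?;
    let (backend, transport) = scheme.split_once('+')?;
    if !matches!(backend, "s3" | "az") || !matches!(transport, "https" | "http") {
        return None;
    }
    // Everything after the first '?' is treated as the query string.
    let (location, query_str) = match rest.split_once('?') {
        Some((loc, q)) => (loc, q),
        None => (rest, ""),
    };
    if location.is_empty() {
        return None;
    }
    let query = query_str
        .split('&')
        .filter(|pair| !pair.is_empty())
        .map(|pair| match pair.split_once('=') {
            Some((key, value)) => (key.to_string(), value.to_string()),
            None => (pair.to_string(), String::new()),
        })
        .collect();
    Some(ParsedRemote {
        backend: backend.to_string(),
        transport: transport.to_string(),
        location: location.to_string(),
        query,
    })
}
```

Calling `parse_remote` on the Azure URL above yields backend `az`, transport `https`, and a single query pair `credential=PROD`; a plain `https://` URL returns `None` because it carries no backend prefix.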

Why?

You want a private Git remote that is:

  • Owned by you, not a vendor. No SaaS subscription, no per-seat cost, no "the host got breached" risk for your private code. Just a bucket or container in an account you already control.
  • Backed by storage you already trust. Encryption at rest, IAM/RBAC at the prefix or container, lifecycle policies, regional replication, audit logs — every control your cloud storage gives you, with no application server in between.
  • One small binary. No Python runtime, no Docker image, no webhook endpoint to babysit.

Use cases that fit naturally:

  • Private repos you do not want on GitHub or GitLab.
  • Internal libraries hosted on your team's existing S3 / Azure tenant.
  • Repos consumed by AWS CodePipeline (use ?zip=1 to mirror each push as repo.zip next to the bundle).
  • Air-gapped or sovereign-cloud environments where SaaS Git hosts are not an option.

What you get

  • Two backends behind one trait. AWS S3 and Azure Blob Storage, plus any S3-compatible endpoint (MinIO, Cloudflare R2, Wasabi, Backblaze B2, RustFS, on-prem appliances).
  • Two storage engines. A bundle engine (one git bundle per push, simple and self-contained) and a packchain engine (newest-first pack manifest with GC and compaction, smaller fetches on active repos). Pick per-remote with ?engine=; default is bundle. See docs/storage-engines.md for the comparison and when to choose which.
  • Streaming uploads end-to-end. No in-memory buffering of bundles, no 5 GiB single-PUT ceiling — multipart upload is wired into both backends.
  • Locking parity across backends. If-None-Match: * on S3, mirrored on Azure; same TTL semantics; tested across both.
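
The locking bullet above relies on a create-only write: a PUT with If-None-Match: * succeeds only when no object exists at the key yet, so exactly one pusher can create the lock object. The sketch below models that idea against an in-memory map. `MemoryStore`, the lock key, and the stale-lock takeover rule are all assumptions made for illustration; the crate's actual lock format and TTL handling are not described in this README.

```rust
use std::collections::HashMap;

/// In-memory stand-in for an object store (hypothetical, illustration only).
/// Maps a lock key to the time at which the lock expires, in seconds.
#[derive(Default)]
pub struct MemoryStore {
    objects: HashMap<String, u64>,
}

impl MemoryStore {
    /// Create-only write, mimicking a PUT with `If-None-Match: *`:
    /// it succeeds only when nothing is stored at `key`.
    fn put_if_absent(&mut self, key: &str, expires_at: u64) -> bool {
        if self.objects.contains_key(key) {
            return false;
        }
        self.objects.insert(key.to_string(), expires_at);
        true
    }

    /// Try to take the lock at time `now` for `ttl` seconds.
    /// Assumption: a lock whose TTL has elapsed is stale and may be replaced.
    pub fn try_lock(&mut self, key: &str, now: u64, ttl: u64) -> bool {
        if let Some(&expires_at) = self.objects.get(key) {
            if expires_at <= now {
                // Drop the stale lock so the create-only write can succeed.
                self.objects.remove(key);
            }
        }
        self.put_if_absent(key, now + ttl)
    }

    /// Release the lock by deleting the lock object.
    pub fn unlock(&mut self, key: &str) {
        self.objects.remove(key);
    }
}
```

In this model a second `try_lock` on the same key fails until the first holder calls `unlock` or the TTL elapses. Against a real object store the stale-lock removal and the create-only write are two separate requests, so a production implementation needs extra care around that race; the sketch does not attempt to solve it.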

Quick install

See docs/getting-started.md for the full walkthrough — install, credentials for both clouds, your first push, LFS, submodules, local development against MinIO and Azurite.

The short version:

cargo xtask install

That runs cargo install --path cli and creates the four +-form helper symlinks (git-remote-s3+https, git-remote-s3+http, git-remote-az+https, git-remote-az+http) alongside the cargo binaries. Git resolves a remote helper from the URL scheme by running git-remote-<scheme>, so these names are what make s3+https:// and az+https:// URLs work. Re-runs are idempotent. Pass --bin-dir <PATH> to install into a custom directory, --no-install to refresh the symlinks only, or --dry-run to preview.
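
The link between a URL scheme and a helper name can be sketched in a few lines. Git's documented convention is that a URL of the form <transport>://<address> invokes the helper git-remote-<transport>. The function names below are hypothetical and only illustrate that mapping; they are not part of cargo xtask or the crate.

```rust
/// Name of the helper executable git looks for, given a remote URL.
/// Follows git's `git-remote-<scheme>` convention for `<scheme>://<address>` URLs.
pub fn helper_executable(url: &str) -> Option<String> {
    let (scheme, rest) = url.split_once("://")?;
    if scheme.is_empty() || rest.is_empty() {
        return None;
    }
    Some(format!("git-remote-{}", scheme))
}

/// The four helper names listed in this README, built from backend and transport.
pub fn installed_helper_names() -> Vec<String> {
    let mut names = Vec::new();
    for backend in ["s3", "az"] {
        for transport in ["https", "http"] {
            names.push(format!("git-remote-{}+{}", backend, transport));
        }
    }
    names
}
```

For the S3 URL from the opening example, `helper_executable` returns `git-remote-s3+https`, which is one of the four names `installed_helper_names` produces. If that executable is missing from git's search path, git reports that it cannot find the remote helper for the scheme.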

Using as a library

git-remote-object-store is also a Rust library crate. Remote is the entry point for reading and writing objects in the on-bucket format; the ObjectStore trait and the S3 / Azure backends are also publicly exported for building custom storage integrations. See docs/library-usage.md for a worked example and docs.rs for the full API.

Testing

make shellspec runs the fast CLI unit suite. The end-to-end shellspec suites drive git push / git fetch / git clone through the helper binaries against real backend containers; they require Docker, the matching cloud CLI on the host, and git-lfs for the LFS scenarios.

make shellspec-integration-s3       # requires docker + aws-cli + git-lfs
make shellspec-integration-azure    # requires docker + azure-cli + git-lfs
make shellspec-integration          # both

Status

The shipping surface covers both storage engines (bundle and packchain), both backends (S3 and Azure Blob), the helper-protocol REPL, parallel fetch, locked push, the management CLI (doctor, delete-branch, protect, unprotect, gc, compact), the LFS custom-transfer agent, and the signed release pipeline. See CHANGELOG.md for the current release.

Git operations are gitoxide-backed end to end — bundle read/write is native via gix-pack, and the crate spawns no git subprocess in production code. The gix surface in use covers rev-parse, is-ancestor, ref-name validation, remote-URL inspection, archive, last-commit-message, ref discovery, and object resolution.

Known limitations

A push of a multi-GB monorepo will work today on either backend — multipart upload is wired into both — but a few sharp edges are worth knowing about before you start:

  • No resume after a failed upload. If the helper process dies mid-push (network blip, signal, reboot), the next git push re-uploads the bundle from the beginning. S3 removes abandoned multipart uploads only when the bucket has a lifecycle rule that aborts incomplete multipart uploads; without one, the orphaned parts stay and keep accruing storage charges. Azure uncommitted blocks expire after seven days. Neither backend surfaces a "resume from byte N" handle today.
  • Object-size ceilings are the cloud's, not ours. S3 caps a single object at 5 TiB and a multipart upload at 10 000 parts; the single-PutObject ceiling is still 5 GiB but the helper auto-promotes large bodies to multipart well below that. Azure caps a block blob at 50 000 committed blocks (~4.75 TiB at the SDK's default block size). Repositories whose individual bundles approach those limits are outside what either backend can store.
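
The ceilings above follow from simple arithmetic, sketched here. The function names are hypothetical, and the 100 MiB Azure block size is an assumption chosen because it reproduces the roughly 4.75 TiB figure; this README does not state what the SDK's default block size actually is.

```rust
const MIB: u64 = 1024 * 1024;
const TIB: u64 = 1024 * 1024 * MIB;

/// Smallest part size (in bytes) that fits `object_size` into at most `max_parts` parts.
/// Uses ceiling division so the last part is never dropped.
pub fn min_part_size(object_size: u64, max_parts: u64) -> u64 {
    (object_size + max_parts - 1) / max_parts
}

/// Largest blob (in bytes) that `max_blocks` blocks of `block_size` bytes can hold.
pub fn max_blob_size(block_size: u64, max_blocks: u64) -> u64 {
    block_size * max_blocks
}

/// S3: a 5 TiB object split into at most 10 000 parts.
pub fn s3_min_part_size_at_ceiling() -> u64 {
    min_part_size(5 * TIB, 10_000)
}

/// Azure: 50 000 committed blocks at an assumed 100 MiB per block.
pub fn azure_ceiling_with_100_mib_blocks() -> u64 {
    max_blob_size(100 * MIB, 50_000)
}
```

Under these assumptions, an S3 object at the 5 TiB cap needs parts of at least 549,755,814 bytes (about 524 MiB) to stay within 10 000 parts, and 50 000 Azure blocks of 100 MiB come to 5,242,880,000,000 bytes, a little under 4.77 TiB.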

Verifying releases

Every v* tag publishes signed, attested artefacts (minisign over SHA256SUMS, SLSA build provenance, CycloneDX SBOMs) to GitHub Releases. See docs/verifying-releases.md for the verification recipe and SECURITY.md for vulnerability reporting.

License

Apache-2.0. See LICENSE.

Credits

Inspired by awslabs/git-remote-s3, which itself draws on bgahagan/git-remote-s3 and the LFS work in nicolas-graves/lfs-s3.