Verifiable sparse-checkout (Phase 1 scaffold).
Spec reference: docs/SPEC-SPARSE-CHECKOUT.md. Issue #158.
§What this is
Today’s mkit sparse-checkout filters paths on the client after
the server has handed over the full tree. That’s fine for the file
transport but wasteful on HTTP / S3 transports. There the server
could ship a partial subtree, provided the client can verify that
every missing entry was omitted by request rather than silently
dropped.
This module is the Phase 1 core scaffolding: build a manifest from
a Tree + filter, and verify a delivered set of TreeEntrys
against it. The actual transport-level integration (HTTP/S3 query
params, on-disk bitmap cache) is Phase 2 and is intentionally out
of scope.
§Authenticated bitmap
Authentication uses
commonware_storage::AuthenticatedBitMap, which provides
a Merkleized bitmap with bit-level inclusion proofs. The bitmap is
ALPHA-tier upstream and std-only, so this entire module sits
behind the sparse-checkout Cargo feature (default off).
Each entry in the underlying Tree is assigned a leaf index equal
to its position in the tree’s strict lexicographic byte ordering
(the same ordering enforced by Tree::is_sorted). A bit set at
index i means “the server is shipping entry i”; an unset bit
means “this entry is omitted by client request”. Tampering — the
server flipping a bit or omitting/inserting an entry — produces a
different bitmap root, which fails verification against the
root committed in the SparseManifest.
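The indexing rule can be illustrated with a std-only sketch. This is not the module's implementation: `Entry`, `build_bits`, and `check_delivery` are hypothetical names, a plain `Vec<bool>` stands in for the Merkleized bitmap, and no root or inclusion proof is computed. The check also assumes the full tree is in hand, which a real client would not have; the real verifier relies on the committed root instead.

```rust
/// Hypothetical stand-in for a tree entry: just a path as raw bytes.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Entry {
    path: Vec<u8>,
}

/// Leaf index i is the position of an entry in strict lexicographic byte
/// order. Bit i is set when the entry's path starts with any filter prefix.
/// Returns None if the entries are not strictly sorted (duplicates included).
fn build_bits(entries: &[Entry], filter: &[&[u8]]) -> Option<Vec<bool>> {
    let strictly_sorted = entries.windows(2).all(|w| w[0].path < w[1].path);
    if !strictly_sorted {
        return None;
    }
    Some(
        entries
            .iter()
            .map(|e| filter.iter().any(|p| e.path.starts_with(*p)))
            .collect(),
    )
}

/// The delivered entries must be exactly the entries whose bit is set,
/// in leaf order: nothing silently dropped, nothing inserted.
fn check_delivery(all: &[Entry], bits: &[bool], delivered: &[Entry]) -> bool {
    all.len() == bits.len()
        && all
            .iter()
            .zip(bits)
            .filter(|(_, bit)| **bit)
            .map(|(entry, _)| entry)
            .eq(delivered.iter())
}
```

For a tree holding `README.md`, `docs/a.md`, `src/lib.rs`, and `src/main.rs` with the filter `src/`, `build_bits` sets bits 2 and 3 only, and `check_delivery` rejects a delivery that carries just `src/lib.rs`.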
§Wire format
Strictly defined by docs/SPEC-SPARSE-CHECKOUT.md. Phase 1 does
not yet wire SparseProof into any transport — the type is the
in-memory carrier between build_sparse and verify_sparse.
Structs§
- SparseManifest - Manifest committing to which tree entries the server is including in a sparse delivery. See SPEC-SPARSE-CHECKOUT §2.
- SparseProof - Verifiable proof bundle accompanying a SparseManifest.
- SparseResponse - Complete server-to-client sparse delivery: manifest + entries + proof, in the order they appear on the wire. The encoder and decoder are content-stable across calls so the byte layout can be pinned in golden vectors if and when needed.
Enums§
- SparseError - Errors raised by build_sparse and verify_sparse. Phase 1 keeps this small; the transport layer will wrap these in its own error type in Phase 2.
- SparseWireError - Errors raised when encoding or decoding a SparseResponse on the wire. Kept tight; the transport layer wraps these in its own transport-error type at the call site.
Constants§
- MAX_FILTER_PATHS - Hard cap on the number of filter paths. Prevents a hostile client from sending a billion-entry filter to a server. Mirrors the transport-side bound documented in SPEC-SPARSE-CHECKOUT §4.
- MAX_LEAVES - Hard cap on the number of leaves in a tree we are willing to build a sparse manifest for. Matches the per-tree entry_count bound in SPEC-OBJECTS §4. Verifier MUST enforce the same cap so a malicious manifest.leaf_count can’t allocate unbounded memory.
- SPARSE_CACHE_DIR - Subdirectory under .mkit/ for the sparse bitmap cache.
- SPARSE_CACHE_MAGIC - Cache file header magic: b"MSPC" (mkit-sparse-cache). Distinct from the wire magic so a misnamed file can’t be parsed as either.
- SPARSE_CACHE_VERSION - Cache file format version. Bumped on any breaking change.
- SPARSE_WIRE_MAGIC - Wire envelope magic: b"MSP1" (mkit-sparse-v1). Helps the transport sanity-check the body before any deserialisation. Sits in the same family as the v1 object prologue, but the codes are distinct so a misrouted blob can’t be parsed as a sparse response.
- SPARSE_WIRE_MAX_BYTES - Maximum encoded sparse-response wire size. Caller-side cap; ~16 MiB is comfortably more than a maximum-sized bitmap (~125 KB) plus the largest possible entry stream.
- SPARSE_WIRE_VERSION - Wire-format version of SparseResponse. Bumped on any non-backward-compatible change.
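The size figures and magic values above lend themselves to a short worked check. The sketch below is illustrative only. The wire magic `b"MSP1"` is quoted from this page, but the numeric values are assumptions: the page gives only "~16 MiB" and "~125 KB", and the leaf cap of 1,000,000 is inferred from the 125 KB figure rather than stated. Placing the magic in the first four bytes is also an assumed layout, and `precheck_wire` and `Precheck` are hypothetical names.

```rust
/// Wire magic as quoted in the constant docs.
const WIRE_MAGIC: &[u8; 4] = b"MSP1";

/// ASSUMED values, for illustration only; the real constants may differ.
const ASSUMED_WIRE_MAX_BYTES: usize = 16 * 1024 * 1024;
const ASSUMED_MAX_LEAVES: usize = 1_000_000;

/// Bytes needed to store one bit per leaf, rounded up.
fn bitmap_bytes(leaf_count: usize) -> usize {
    (leaf_count + 7) / 8
}

/// Size of the largest bitmap under the assumed leaf cap.
fn assumed_max_bitmap_bytes() -> usize {
    bitmap_bytes(ASSUMED_MAX_LEAVES)
}

/// Hypothetical outcome of the cheap checks done before any parsing.
#[derive(Debug, PartialEq, Eq)]
enum Precheck {
    TooLarge,
    TooShort,
    BadMagic,
}

/// Reject oversized or misrouted input up front and return the bytes that
/// follow the magic. Assumes the magic occupies the first four bytes.
fn precheck_wire(body: &[u8]) -> Result<&[u8], Precheck> {
    if body.len() > ASSUMED_WIRE_MAX_BYTES {
        return Err(Precheck::TooLarge);
    }
    if body.len() < WIRE_MAGIC.len() {
        return Err(Precheck::TooShort);
    }
    let (magic, rest) = body.split_at(WIRE_MAGIC.len());
    if magic != &WIRE_MAGIC[..] {
        return Err(Precheck::BadMagic);
    }
    Ok(rest)
}
```

Under these assumptions a full bitmap takes 125,000 bytes, well below the assumed 16 MiB cap, and a body that starts with the cache magic `b"MSPC"` is rejected as `BadMagic` before any deserialisation.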
Functions§
- build_sparse - Build a sparse manifest from a tree and a filter.
- decode_sparse_cache - Decode the on-disk cache payload. Returns the bitmap root, filter hash, leaf count, and bitmap bytes; the caller reconstructs the SparseManifest if needed (the tree_hash field comes from the filename, not from the file body).
- decode_sparse_response - Decode a wire-format sparse response. Refuses any input larger than SPARSE_WIRE_MAX_BYTES before parsing.
- encode_sparse_cache - Encode the on-disk cache payload for a verified sparse delivery.
- encode_sparse_response - Encode a SparseResponse to the canonical wire bytes.
- hash_filter - Stable BLAKE3 hash of a path-prefix filter. Canonical form:
- tree_hash - Compute the canonical SPEC-OBJECTS tree hash.
- verify_sparse - Verify a sparse delivery against a manifest.
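The note on decode_sparse_cache implies a small reconstruction step: the cache body supplies everything except the tree hash, which comes from the file name. A minimal sketch of that step follows, under loudly stated assumptions. The struct shapes, the 32-byte hash widths, and the hex-encoded file name are all hypothetical, and none of this reflects the actual cache format, which is not shown on this page.

```rust
/// Fields the cache decoder is documented to return (shape is hypothetical).
struct CachePayload {
    bitmap_root: [u8; 32],
    filter_hash: [u8; 32],
    leaf_count: u64,
    bitmap: Vec<u8>,
}

/// Hypothetical manifest shape: payload fields plus the tree hash.
struct Manifest {
    tree_hash: [u8; 32],
    bitmap_root: [u8; 32],
    filter_hash: [u8; 32],
    leaf_count: u64,
}

/// ASSUMPTION: cache files are named by the 64-character hex tree hash.
fn tree_hash_from_filename(name: &str) -> Option<[u8; 32]> {
    let bytes = name.as_bytes();
    if bytes.len() != 64 {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, pair) in bytes.chunks_exact(2).enumerate() {
        let hi = (pair[0] as char).to_digit(16)?;
        let lo = (pair[1] as char).to_digit(16)?;
        out[i] = (hi * 16 + lo) as u8;
    }
    Some(out)
}

/// Rebuild a manifest from the file name and the decoded payload. Returns
/// None if the name is not a valid hash or the bitmap length does not
/// match one bit per leaf, rounded up to whole bytes.
fn rebuild_manifest(file_name: &str, payload: &CachePayload) -> Option<Manifest> {
    let tree_hash = tree_hash_from_filename(file_name)?;
    let leaves = usize::try_from(payload.leaf_count).ok()?;
    let expected_len = leaves / 8 + usize::from(leaves % 8 != 0);
    if payload.bitmap.len() != expected_len {
        return None;
    }
    Some(Manifest {
        tree_hash,
        bitmap_root: payload.bitmap_root,
        filter_hash: payload.filter_hash,
        leaf_count: payload.leaf_count,
    })
}
```

Keeping the tree hash out of the file body means a cache file copied under the wrong name yields a manifest for the wrong tree, which a later verification against the committed root would be expected to reject.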