SNAPPACK 1 — the snapdir pack wire format.
A pack is a single self-verifying byte stream that carries raw
content-addressed objects (and at most one manifest) between two snapdir
processes, e.g. snapdir send-pack | ssh host 'snapdir receive-pack' — the
acceleration path of the upcoming ssh:// store. Both ends of the pipe are
snapdir itself, so the format is deliberately minimal (no tar semantics, no
entry names, no padding).
§Grammar (normative)
stream := "SNAPPACK 1\n" record* "end\n"
record := "obj " hex64 " " len "\n" payload(len)
| "manifest " hex64 " " len "\n" payload(len) ; at most one; must be the LAST record
hex64 := 64 lowercase hex chars, regex ^[0-9a-f]{64}$ (validated on read AND write)
len := decimal u64
payload := exactly len raw bytes, no padding/terminator
§Invariants
- Header memory bound: every header line (including its terminating \n) is at most MAX_HEADER_BYTES bytes. The reader rejects a longer line as soon as the cap is hit, without buffering more.
- Verify-before-file: an obj payload streams through an INCREMENTAL BLAKE3 hasher while it is staged; it is committed at its claimed content-address only if the computed hash equals the claimed hex64. A mismatch removes the staged bytes (temp file) and aborts the WHOLE stream with StoreError::Integrity: a corrupt stream taints everything after it, so nothing past the bad record is trusted.
- Manifest-last / commit-at-end: the optional manifest record must be the last record (any record after it is rejected), its payload is buffered (capped at MAX_MANIFEST_BYTES), and it is committed to the sink only after the end trailer has been read. EOF before end is a hard error and the manifest is NEVER committed, so a truncated stream or dropped connection can file (verified) objects but can never make the snapshot observable, preserving the store-wide manifest-last invariant.
- Idempotent duplicates: a duplicate obj record is skipped (write-once), but its bytes are still read and hash-verified, because the stream cannot seek, and a hash mismatch on ANY record (present or not) aborts.
- No path input: the on-disk location of every payload is derived exclusively from the validated claimed checksum (snapdir_core::store::object_path / snapdir_core::store::manifest_path); there is no entry-name concept, so the path-traversal class is structurally absent.
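The header memory bound can be illustrated with a small reader that never buffers past a cap. This is a minimal sketch, not the crate's implementation: `SKETCH_MAX_HEADER_BYTES` and `read_header_line` are hypothetical names, and the value 128 is an assumption (this page states only that the longest valid header is 95 bytes).

```rust
use std::io::{self, BufRead};

/// Hypothetical cap for this sketch; the real MAX_HEADER_BYTES value is not stated on this page.
const SKETCH_MAX_HEADER_BYTES: usize = 128;

/// Reads one header line, including its terminating `\n`, and fails as soon
/// as the cap is reached without ever holding more than the cap in memory.
fn read_header_line<R: BufRead>(input: &mut R) -> io::Result<Vec<u8>> {
    let mut line = Vec::new();
    loop {
        let available = input.fill_buf()?;
        if available.is_empty() {
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "EOF inside header"));
        }
        // Only inspect as many bytes as the cap still allows.
        let room = SKETCH_MAX_HEADER_BYTES - line.len();
        let window = &available[..available.len().min(room)];
        if let Some(pos) = window.iter().position(|&b| b == b'\n') {
            line.extend_from_slice(&window[..=pos]);
            input.consume(pos + 1);
            return Ok(line);
        }
        let taken = window.len();
        line.extend_from_slice(window);
        input.consume(taken);
        if line.len() == SKETCH_MAX_HEADER_BYTES {
            // Cap reached with no newline: reject without reading further.
            return Err(io::Error::new(io::ErrorKind::InvalidData, "header line exceeds cap"));
        }
    }
}
```

Because the window is clipped to the remaining room, payload bytes that follow the header stay in the underlying reader untouched.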
§Memory profile
read_pack into a FileSink is O(1) memory per record regardless of
object size: payload bytes stream through a fixed-size buffer into a temp
sibling of the final object path (the same temp+atomic-rename discipline as
file_store.rs) while the incremental hasher runs. The generic
StreamSink buffers ONE object record at a time (its
StreamStore::put_object primitive takes whole buffers); the manifest
record is always buffered, capped at MAX_MANIFEST_BYTES.
write_pack reads one object at a time via
StreamStore::get_object (one whole object buffered at a time; the
send-pack CLI layers any further streaming on top in a later gate).
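The O(1)-per-record profile comes from pushing payload bytes through a fixed-size buffer while a hasher is updated incrementally. The following is a hedged sketch of that loop only. `stream_payload` is a hypothetical helper, and `Fnv1a` is a stand-in hasher chosen so the example needs no external crate; the format itself specifies BLAKE3, which this sketch does not implement.

```rust
use std::io::{self, Read, Write};

/// Stand-in incremental hasher (64-bit FNV-1a). This is NOT BLAKE3; it only
/// plays the role of "a hasher that can be fed chunk by chunk".
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf29ce484222325)
    }
    fn update(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

/// Copies exactly `len` payload bytes from `input` into `staged` through a
/// fixed 8 KiB buffer, hashing each chunk as it passes. Memory use does not
/// depend on `len`.
fn stream_payload<R: Read, W: Write>(input: &mut R, staged: &mut W, len: u64) -> io::Result<u64> {
    let mut buf = [0u8; 8192];
    let mut hasher = Fnv1a::new();
    let mut remaining = len;
    while remaining > 0 {
        // Never read past the end of this record's payload.
        let want = remaining.min(buf.len() as u64) as usize;
        let got = input.read(&mut buf[..want])?;
        if got == 0 {
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "EOF inside payload"));
        }
        hasher.update(&buf[..got]);
        staged.write_all(&buf[..got])?;
        remaining -= got as u64;
    }
    Ok(hasher.finish())
}
```

A caller would compare the returned digest with the claimed address and only then move the staged bytes into place; on a mismatch it would delete them and abort the stream.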
Structs§
- FileSink - File-backed PackSink over a FileStore: obj payloads stream through a fixed-size buffer straight into a unique temp sibling of the final object path, then an atomic rename commits on hash match. O(1) memory per record regardless of object size.
- PackReadReport - What read_pack filed into its sink.
- PackWriteReport - What write_pack emitted.
- StreamSink - Generic PackSink over any StreamStore: buffers one obj payload at a time in memory, then files it via the store’s verify-before-write put_object (so the store’s own integrity discipline re-checks the commit). Use FileSink for file://-rooted sinks to get O(1) memory per record.
Constants§
- MAX_HEADER_BYTES - Hard cap on a header line, INCLUDING its terminating \n. The reader rejects a longer line the moment the cap is reached; this bounds reader memory before any validation happens. (The longest valid header, manifest <hex64> <u64::MAX>\n, is 95 bytes, so the cap is comfortable.)
- MAX_MANIFEST_BYTES - Hard cap on a manifest record’s payload, which (unlike obj payloads) is buffered in memory until the end trailer commits it.
- WIRE_CAPS - The plumbing capabilities this build advertises alongside WIRE_VERSION.
- WIRE_MAGIC - The exact magic line that opens every pack stream (version baked in; a unit test pins it to WIRE_VERSION).
- WIRE_VERSION - The pack wire-format version this build speaks. Single source of truth for the wire: the capability line (snapdir version --capabilities) bakes this value, and read_pack negotiates on an exact integer match only.
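The 95-byte figure for the longest valid header follows directly from the grammar: `manifest ` is 9 bytes, the hex64 is 64, the separating space is 1, the decimal form of u64::MAX is 20 digits, and the newline is 1. A small sketch that builds that line and lets the arithmetic be checked (`longest_valid_header` is a hypothetical helper, not part of the crate):

```rust
/// Builds the longest header line the grammar allows: a `manifest` record
/// whose length field is u64::MAX. The hex64 here is a dummy value.
fn longest_valid_header() -> String {
    let hex64 = "f".repeat(64);
    // 9 ("manifest ") + 64 + 1 (space) + 20 (digits of u64::MAX) + 1 ("\n")
    format!("manifest {} {}\n", hex64, u64::MAX)
}
```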
Traits§
Functions§
- is_hex64 - Returns true when s is a syntactically valid snapdir content address: exactly 64 lowercase hex characters (^[0-9a-f]{64}$).
- read_pack - Consumes a SNAPPACK 1 stream from input, filing verified records into sink. See the module docs for the full invariant list.
- write_pack - Emits a SNAPPACK 1 stream: magic, one obj record per entry of ids IN INPUT ORDER, then (if manifest_id is given) the manifest record LAST, then the end trailer.
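To make the byte layout concrete, here is a minimal encoder sketch that follows the grammar: magic line, obj records in input order, an optional manifest record last, then the end trailer. It is not the crate's write_pack, whose signature is not shown on this page. `looks_like_hex64`, `write_record`, and `write_stream` are hypothetical names, and the sketch takes payloads as in-memory slices and does not check that an id is the hash of its payload.

```rust
use std::io::{self, Write};

/// Syntactic check matching ^[0-9a-f]{64}$ (illustrative re-implementation).
fn looks_like_hex64(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Writes one record: `<kind> <hex64> <len>\n` followed by exactly `len` raw bytes.
fn write_record<W: Write>(out: &mut W, kind: &str, id: &str, payload: &[u8]) -> io::Result<()> {
    // The grammar requires the id to be validated on write as well as on read.
    if !looks_like_hex64(id) {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "id is not 64 lowercase hex chars"));
    }
    writeln!(out, "{} {} {}", kind, id, payload.len())?;
    out.write_all(payload)
}

/// Writes a whole stream: magic, objects in input order, optional manifest last, trailer.
fn write_stream<W: Write>(
    out: &mut W,
    objects: &[(&str, &[u8])],
    manifest: Option<(&str, &[u8])>,
) -> io::Result<()> {
    out.write_all(b"SNAPPACK 1\n")?;
    for (id, payload) in objects {
        write_record(&mut *out, "obj", id, payload)?;
    }
    if let Some((id, payload)) = manifest {
        write_record(&mut *out, "manifest", id, payload)?;
    }
    out.write_all(b"end\n")
}
```

Note that a payload is followed immediately by the next header or the trailer, with no separator, which is why the reader must rely on the declared length.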