Module pack

SNAPPACK 1 — the snapdir pack wire format.

A pack is a single self-verifying byte stream that carries raw content-addressed objects (and at most one manifest) between two snapdir processes, e.g. snapdir send-pack | ssh host 'snapdir receive-pack' — the acceleration path of the upcoming ssh:// store. Both ends of the pipe are snapdir itself, so the format is deliberately minimal (no tar semantics, no entry names, no padding).

§Grammar (normative)

stream   := "SNAPPACK 1\n" record* "end\n"
record   := "obj " hex64 " " len "\n" payload(len)
          | "manifest " hex64 " " len "\n" payload(len)   ; at most one; must be the LAST record
hex64    := 64 lowercase hex chars, regex ^[0-9a-f]{64}$ (validated on read AND write)
len      := decimal u64
payload  := exactly len raw bytes, no padding/terminator
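The header grammar above can be sketched as a small parser. This is an illustrative sketch only, not the crate's source: the `Header` enum and `parse_header` are hypothetical names, and the `is_hex64` here is a stand-in for the documented function of the same name. The grammar does not say whether leading zeros in `len` are allowed; this sketch accepts them, which is an assumption. The magic line is not a record header and is not handled here.

```rust
/// Parsed form of one header line (hypothetical type, not part of the documented API).
#[derive(Debug, PartialEq)]
enum Header {
    Obj { id: String, len: u64 },
    Manifest { id: String, len: u64 },
    End,
}

/// Stand-in for the documented check: exactly 64 lowercase hex characters.
fn is_hex64(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Parses one header line, including its terminating '\n'.
fn parse_header(line: &str) -> Result<Header, String> {
    let body = line
        .strip_suffix('\n')
        .ok_or("header line must end with \\n")?;
    if body == "end" {
        return Ok(Header::End);
    }
    // Exactly three space-separated fields: kind, hex64, len.
    let mut parts = body.split(' ');
    let (kind, id, len) = match (parts.next(), parts.next(), parts.next(), parts.next()) {
        (Some(k), Some(i), Some(l), None) => (k, i, l),
        _ => return Err(format!("malformed header: {body:?}")),
    };
    if !is_hex64(id) {
        return Err(format!("not a hex64 content address: {id:?}"));
    }
    // Digits only: str::parse::<u64> on its own would also accept a leading '+'.
    if len.is_empty() || !len.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("len is not a decimal number: {len:?}"));
    }
    let len: u64 = len.parse().map_err(|e| format!("len out of range: {e}"))?;
    let id = id.to_string();
    match kind {
        "obj" => Ok(Header::Obj { id, len }),
        "manifest" => Ok(Header::Manifest { id, len }),
        other => Err(format!("unknown record kind: {other:?}")),
    }
}
```

Because validation happens on the three fields only, nothing in the header can influence where a payload is stored other than the checked `hex64` value.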

§Invariants

  • Header memory bound: every header line (including its terminating \n) is at most MAX_HEADER_BYTES bytes. The reader rejects a longer line as soon as the cap is hit, without buffering more.
  • Verify-before-file: an obj payload streams through an INCREMENTAL BLAKE3 hasher while it is staged; it is committed at its claimed content-address only if the computed hash equals the claimed hex64. A mismatch removes the staged bytes (temp file) and aborts the WHOLE stream with StoreError::Integrity — a corrupt stream taints everything after it, so nothing past the bad record is trusted.
  • Manifest-last / commit-at-end: the optional manifest record must be the last record (any record after it is rejected), its payload is buffered (capped at MAX_MANIFEST_BYTES), and it is committed to the sink only after the end trailer has been read. EOF before end is a hard error and the manifest is NEVER committed — so a truncated stream or dropped connection can file (verified) objects but can never make the snapshot observable, preserving the store-wide manifest-last invariant.
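  • Idempotent duplicates: an obj record whose object is already present in the sink is skipped (objects are write-once), but its bytes are still read and hash-verified. The stream cannot seek, so the payload has to be consumed either way, and a hash mismatch on ANY record (already present or not) aborts the stream.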
  • No path input: the on-disk location of every payload is derived exclusively from the validated claimed checksum (snapdir_core::store::object_path / snapdir_core::store::manifest_path); there is no entry-name concept, so the path-traversal class is structurally absent.
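The header memory bound from the first invariant can be sketched with the standard library alone. This is a minimal illustration under assumptions, not the crate's reader: `read_header_line` is a hypothetical helper, and `cap` stands in for MAX_HEADER_BYTES, whose actual value is not stated in this section.

```rust
use std::io::{self, BufRead, Read};

/// Reads one header line of at most `cap` bytes, including its '\n'.
/// `take(cap)` guarantees that no more than `cap` bytes are ever buffered
/// or consumed from `input`, however long the offending line is.
fn read_header_line<R: BufRead>(input: &mut R, cap: usize) -> io::Result<Vec<u8>> {
    let mut line = Vec::new();
    input
        .by_ref()
        .take(cap as u64)
        .read_until(b'\n', &mut line)?;
    if line.last() == Some(&b'\n') {
        return Ok(line);
    }
    if line.len() >= cap {
        // The cap was reached before a newline was seen: reject immediately.
        Err(io::Error::new(io::ErrorKind::InvalidData, "header line exceeds cap"))
    } else {
        // The input ended before the line was terminated.
        Err(io::Error::new(io::ErrorKind::UnexpectedEof, "EOF inside header line"))
    }
}
```

A line of exactly `cap` bytes that ends in `\n` is accepted, matching the "at most MAX_HEADER_BYTES" wording; anything longer fails after `cap` bytes without reading further.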

§Memory profile

read_pack into a FileSink is O(1) memory per record regardless of object size: payload bytes stream through a fixed-size buffer into a temp sibling of the final object path (the same temp+atomic-rename discipline as file_store.rs) while the incremental hasher runs. The generic StreamSink buffers ONE object record at a time (its StreamStore::put_object primitive takes whole buffers); the manifest record is always buffered, capped at MAX_MANIFEST_BYTES.
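The fixed-buffer streaming described above can be sketched as follows. This is an assumption-laden illustration rather than the FileSink implementation: `stream_payload` is a hypothetical helper, `staged` stands in for the temp file, the 8 KiB buffer size is arbitrary, and the `on_chunk` callback stands in for feeding the incremental hasher (with the `blake3` crate that would presumably be a call to `Hasher::update`).

```rust
use std::io::{self, Read, Write};

/// Copies exactly `len` payload bytes from `input` to `staged` through a
/// fixed-size buffer, handing every chunk to `on_chunk` on the way.
/// Memory use is constant regardless of `len`.
fn stream_payload<R: Read, W: Write>(
    input: &mut R,
    staged: &mut W,
    len: u64,
    mut on_chunk: impl FnMut(&[u8]),
) -> io::Result<()> {
    let mut buf = [0u8; 8192];
    let mut remaining = len;
    while remaining > 0 {
        // Never read past the end of this record's payload.
        let want = remaining.min(buf.len() as u64) as usize;
        let got = match input.read(&mut buf[..want]) {
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        if got == 0 {
            // EOF before `len` bytes arrived: the record is truncated.
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "payload truncated"));
        }
        on_chunk(&buf[..got]);
        staged.write_all(&buf[..got])?;
        remaining -= got as u64;
    }
    Ok(())
}
```

After this returns, a caller following the verify-before-file invariant would compare the finished hash with the claimed hex64 and only then rename the staged file into place; on a mismatch it would delete the staged file and abort.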

write_pack reads one object at a time via StreamStore::get_object (one whole object buffered at a time; the send-pack CLI layers any further streaming on top in a later gate).

Structs§

FileSink
File-backed PackSink over a FileStore: obj payloads stream through a fixed-size buffer straight into a unique temp sibling of the final object path, then an atomic rename commits on hash match — O(1) memory per record regardless of object size.
PackReadReport
What read_pack filed into its sink.
PackWriteReport
What write_pack emitted.
StreamSink
Generic PackSink over any StreamStore: buffers one obj payload at a time in memory, then files it via the store’s verify-before-write put_object (so the store’s own integrity discipline re-checks the commit). Use FileSink for file://-rooted sinks to get O(1) memory per record.

Constants§

MAX_HEADER_BYTES
Hard cap on a header line, INCLUDING its terminating \n. The reader rejects a longer line the moment the cap is reached — this bounds reader memory before any validation happens. (The longest valid header — manifest <hex64> <u64::MAX>\n — is 95 bytes, so the cap is comfortable.)
MAX_MANIFEST_BYTES
Hard cap on a manifest record’s payload, which (unlike obj payloads) is buffered in memory until the end trailer commits it.
WIRE_CAPS
The plumbing capabilities this build advertises alongside WIRE_VERSION.
WIRE_MAGIC
The exact magic line that opens every pack stream (version baked in; a unit test pins it to WIRE_VERSION).
WIRE_VERSION
The pack wire-format version this build speaks. Single source of truth for the wire: the capability line (snapdir version --capabilities) bakes in this value, and read_pack accepts a stream only on an exact integer match.

Traits§

PackSink
Where read_pack files verified records.

Functions§

is_hex64
Returns true when s is a syntactically valid snapdir content address: exactly 64 lowercase hex characters (^[0-9a-f]{64}$).
read_pack
Consumes a SNAPPACK 1 stream from input, filing verified records into sink. See the module docs for the full invariant list.
write_pack
Emits a SNAPPACK 1 stream: magic, one obj record per entry of ids IN INPUT ORDER, then (if manifest_id is given) the manifest record LAST, then the end trailer.
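The emission order that write_pack describes (magic, obj records in input order, optional manifest last, end trailer) can be sketched over in-memory payloads. This is not the real write_pack, which reads objects from a StreamStore: `emit_pack` and `write_record` are hypothetical helpers, the `is_hex64` here is a stand-in for the documented function, and ids are taken as given rather than computed with BLAKE3.

```rust
use std::io::{self, Write};

/// Stand-in for the documented check: exactly 64 lowercase hex characters.
fn is_hex64(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Writes one record: header line, then exactly `payload.len()` raw bytes.
fn write_record<W: Write>(out: &mut W, kind: &str, id: &str, payload: &[u8]) -> io::Result<()> {
    // The grammar requires hex64 to be validated on write as well as on read.
    if !is_hex64(id) {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "id is not hex64"));
    }
    write!(out, "{kind} {id} {}\n", payload.len())?;
    out.write_all(payload)
}

/// Emits magic, obj records in input order, the optional manifest last, then the trailer.
fn emit_pack<W: Write>(
    out: &mut W,
    objects: &[(&str, &[u8])],
    manifest: Option<(&str, &[u8])>,
) -> io::Result<()> {
    out.write_all(b"SNAPPACK 1\n")?;
    for (id, payload) in objects {
        write_record(out, "obj", id, payload)?;
    }
    if let Some((id, payload)) = manifest {
        write_record(out, "manifest", id, payload)?;
    }
    out.write_all(b"end\n")
}
```

With no objects and no manifest, the sketch emits just the magic line followed by the trailer, which is the shortest stream the grammar allows.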