pub struct StreamingPackBuilder<W: Write + Read + Seek> { /* private fields */ }
Streaming pack builder. Held generic over the pack writer (File
in production, Cursor<Vec<u8>> in tests).
Implementations§
impl<W: Write + Read + Seek> StreamingPackBuilder<W>
pub fn new(
    pack_writer: W,
    index_path: PathBuf,
    compression: CompressionConfig,
    bucket_dir: PathBuf,
) -> Result<Self>
Open a streaming builder against pack_writer, using
bucket_dir for transient index buckets and writing the
finalized index to index_path. The bucket dir is created if
it doesn’t exist; on a successful finalize it’s removed
(along with any bucket files left in it).
index_path is not created by new — opening happens at
finalize so a misconfigured caller doesn’t leave an empty index
file behind on early failure. It’s still recorded here so
finalize can write to a known location and the caller can
install the file by path.
The pack_writer must support Read because finalize re-streams
the body to compute the trailer checksum — see the module-level
note on the format.
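To illustrate why the writer needs Read and Seek in addition to Write, here is a minimal sketch of a re-streaming trailer pass. It is not the crate's implementation: append_trailer, fnv1a_update and FNV_OFFSET are hypothetical names, and a 64-bit FNV-1a hash stands in for BLAKE3 so that the example needs only the standard library. The actual trailer layout is described in the module-level note, not here.

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

/// FNV-1a offset basis (64-bit). Stand-in only; the real trailer uses BLAKE3.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;

/// Fold `bytes` into a running FNV-1a state.
fn fnv1a_update(mut state: u64, bytes: &[u8]) -> u64 {
    for &b in bytes {
        state ^= u64::from(b);
        state = state.wrapping_mul(0x0000_0100_0000_01b3);
    }
    state
}

/// Re-read everything written so far, checksum it in fixed-size chunks,
/// and append the checksum after the body. Hypothetical helper.
fn append_trailer<W: Write + Read + Seek>(w: &mut W) -> io::Result<u64> {
    // Seek: find the body length, then rewind to the start.
    let body_len = w.seek(SeekFrom::End(0))?;
    w.seek(SeekFrom::Start(0))?;

    // Read: stream the body back through a small buffer so the whole
    // pack is never held in memory at once.
    let mut state = FNV_OFFSET;
    let mut buf = [0u8; 8192];
    let mut remaining = body_len;
    while remaining > 0 {
        let want = remaining.min(buf.len() as u64) as usize;
        w.read_exact(&mut buf[..want])?;
        state = fnv1a_update(state, &buf[..want]);
        remaining -= want as u64;
    }

    // Write: the cursor now sits at the end of the body, so this appends.
    w.write_all(&state.to_le_bytes())?;
    Ok(state)
}
```

With a Cursor<Vec<u8>> as the writer, calling append_trailer leaves the body untouched and adds eight trailer bytes; the same function works against a File opened for both reading and writing.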
pub fn add(
    &mut self,
    hash: ContentHash,
    obj_type: ObjectType,
    data: Vec<u8>,
) -> Result<()>
Add an object with a content-hash id.
pub fn add_id(
    &mut self,
    id: PackObjectId,
    obj_type: ObjectType,
    data: Vec<u8>,
) -> Result<()>
Add an object with an explicit id. Mirrors super::PackBuilder::add_id.
§Memory shape
Per-entry, the only allocations are:
- data: Vec<u8> (the input, owned by the caller; it comes from gix's find_object and isn't ours to stream further).
- A ~40-byte stack scratch for the entry header.
- zstd’s internal compression context (~128 KB constant).
- One 50-byte index-bucket entry buffered into the bucket’s
BufWriter.
The compressed payload is never materialized as a Vec<u8> —
it streams directly through zstd::stream::write::Encoder into
the pack writer. The pack format requires a compressed_size
varint before the compressed bytes, which we don’t know yet
when we write the header; we reserve a 10-byte placeholder and
seek-back to patch it after the encoder finishes. Heddle’s
varint decoder accepts non-canonical encodings (it walks
continuation bits without enforcing minimum-byte form), so the
padded write decodes back to the same value any reader expects.
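The placeholder-and-patch technique described above can be sketched as follows. This is an illustration under stated assumptions, not Heddle's code: it assumes an LEB128-style varint (7 payload bits per byte, high bit as continuation flag), and encode_padded_varint, decode_varint and write_entry are hypothetical names. A plain write_all stands in for streaming the payload through the zstd encoder.

```rust
use std::io::{self, Seek, SeekFrom, Write};

/// Width of the reserved placeholder: a u64 needs at most 10 varint bytes.
const PADDED_LEN: usize = 10;

/// Encode `value` into exactly 10 bytes (padded, non-canonical form).
fn encode_padded_varint(mut value: u64) -> [u8; PADDED_LEN] {
    let mut out = [0u8; PADDED_LEN];
    for (i, byte) in out.iter_mut().enumerate() {
        let low = (value & 0x7f) as u8;
        value >>= 7;
        // Every byte but the last carries the continuation bit, even when
        // the remaining value is already zero. That is the padding.
        *byte = if i < PADDED_LEN - 1 { low | 0x80 } else { low };
    }
    out
}

/// Decode by walking continuation bits, without enforcing minimum-byte form.
/// Returns the value and the number of bytes consumed.
fn decode_varint(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &b) in bytes.iter().enumerate().take(PADDED_LEN) {
        value |= u64::from(b & 0x7f) << (7 * i);
        if b & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // ran out of input or exceeded 10 bytes
}

/// Reserve the size field, write the payload, then seek back and patch.
fn write_entry<W: Write + Seek>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    let size_pos = w.stream_position()?;
    w.write_all(&[0u8; PADDED_LEN])?; // placeholder for compressed_size
    let start = w.stream_position()?;
    w.write_all(payload)?; // stand-in for the streaming encoder
    let end = w.stream_position()?;
    w.seek(SeekFrom::Start(size_pos))?;
    w.write_all(&encode_padded_varint(end - start))?;
    w.seek(SeekFrom::Start(end))?; // restore position for the next entry
    Ok(())
}
```

In this sketch the value 300 decodes identically from its canonical two-byte form and from the padded ten-byte form; only the number of bytes consumed differs, which is why a reader that tolerates non-canonical encodings needs no special handling for the patched field.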
pub fn finalize(self) -> Result<(W, PackStats)>
Close the pack: patch the header count, append the BLAKE3
trailer, build the sorted index from bucket files, and clean up
the bucket directory. Returns (pack_writer, stats); the finalized index is written to index_path, so the caller can install the pack and its index into its store.
On any failure the bucket dir is left in place; rerunning the import will overwrite stale bucket files (they’re keyed by fixed name, not content) so this isn’t a correctness issue — just a small amount of disk churn until the next clean finalize.