Function compress_to_vec 

pub fn compress_to_vec<R: Read>(source: R, level: CompressionLevel) -> Vec<u8> 

Convenience function to compress some source into a Vec without reusing any resources of the compressor.

This helper eagerly buffers the full input (Read) before compression so it can provide a source-size hint to the one-shot encoder path. Peak memory can therefore be roughly input_size + output_size. For very large payloads or tighter memory budgets, prefer streaming APIs such as StreamingEncoder.

Peak-memory shape change in this revision. The implementation delegates to compress_slice_to_vec, which seeds the output Vec with a capacity of min(compress_bound(input.len()), OUTPUT_BLOCK_CAP), where OUTPUT_BLOCK_CAP = 128 KiB, instead of the previous Vec::new() (zero capacity, then power-of-two growth). For inputs from a few KiB up to about 128 KiB this is a strict improvement: the output is sized up front, so no doubling spikes occur inside the measured window. For inputs significantly larger than 128 KiB, the output allocation still grows by amortized doubling but starts from a 128 KiB floor rather than from 0. Downstream consumers that measure peak RSS on this entry point will see a different curve than before this revision; what changed is the allocation curve observed in benchmarks, not steady-state memory use.
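The difference between a seeded and an unseeded output buffer can be observed with the standard library alone. The following is a minimal sketch, not the crate's implementation: `seed_capacity` and `count_reallocations` are hypothetical helpers, `OUTPUT_SEED_CAP` mirrors the 128 KiB figure quoted above, and the `bound` parameter stands in for `compress_bound(input.len())`, which is not called here.

```rust
/// Cap on the output seed, taken from the 128 KiB figure quoted above.
const OUTPUT_SEED_CAP: usize = 128 * 1024;

/// Capacity used to seed the output Vec: the smaller of the worst-case
/// bound and the cap. `bound` stands in for `compress_bound(input.len())`
/// (an assumption of this sketch; the real function is not called).
fn seed_capacity(bound: usize) -> usize {
    bound.min(OUTPUT_SEED_CAP)
}

/// Appends `total` zero bytes to `out` in pieces of at most `chunk` bytes
/// and counts how often the Vec's capacity changes, i.e. how often it
/// had to reallocate.
fn count_reallocations(mut out: Vec<u8>, total: usize, chunk: usize) -> usize {
    assert!(chunk > 0, "chunk must be non-zero");
    let piece = vec![0u8; chunk];
    let mut reallocations = 0;
    let mut written = 0;
    while written < total {
        let n = chunk.min(total - written);
        let before = out.capacity();
        out.extend_from_slice(&piece[..n]);
        if out.capacity() != before {
            reallocations += 1;
        }
        written += n;
    }
    reallocations
}
```

Calling `count_reallocations(Vec::with_capacity(seed_capacity(bound)), total, chunk)` with `total` no larger than the seed reports no reallocations, whereas starting from `Vec::new()` reports at least one. That is the doubling behavior the seed removes for small and mid-sized outputs.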

This is NOT a streaming API. The source is fully buffered into a Vec<u8> before any compression work begins, so peak input memory scales with the source length (it is not “constant regardless of payload size”, as a stream-shaped encoder would offer). The RSS notes below apply to this materialize-then-compress shape; if the source is too large to hold in memory, use StreamingEncoder, which consumes chunks incrementally without building the input Vec up front.
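To make the materialization step concrete, here is a small standard-library sketch of what "fully buffered" means for a reader that does not advertise its length. `materialize` and `opaque_source` are hypothetical names introduced for illustration; they are not part of this crate.

```rust
use std::io::{self, Read};

/// Buffers an entire reader into memory, the way a
/// materialize-then-compress helper has to before it can hand a slice on.
fn materialize<R: Read>(mut source: R) -> io::Result<Vec<u8>> {
    let mut input = Vec::new();
    source.read_to_end(&mut input)?;
    Ok(input)
}

/// A reader that yields `len` zero bytes without exposing that length
/// through any slice-like API, standing in for a pipe or socket.
fn opaque_source(len: u64) -> impl Read {
    io::repeat(0u8).take(len)
}
```

After `materialize(opaque_source(n))` returns, the resulting Vec holds all `n` bytes and its capacity is at least `n`, so the memory held for the input grows with the payload rather than staying constant.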

The other side of the peak shape is the input buffering: this helper drives read_to_end to materialize the full source into a Vec<u8> before forwarding the slice to compress_slice_to_vec. For a Read whose size is unknown ahead of time, read_to_end grows that input Vec by power-of-two doubling, so peak input allocation can transiently reach up to 2× the final source length. Once the input reaches about 128 KiB, the output Vec seed sits at its 128 KiB cap and is live at the same time as the input buffer.

The total live working set on this entry point is approximately input.capacity() + output_vec_seed + internal_accumulators, where output_vec_seed is min(compress_bound(input.len()), 128 KiB) and internal_accumulators covers FrameCompressor::all_blocks (pre-reserved at frame start, up to ~130 KiB at the default block cap) plus per-block scratch (hash tables, literal/sequence staging). Estimate the helper’s RSS peak as input.capacity() + output_vec_seed + ~130 KiB internal + per-block scratch rather than the bare input.capacity() + 128 KiB figure quoted in earlier revisions, which only accounted for the output seed. StreamingEncoder avoids the input materialization step entirely and is the right entry point when the source is large or unbounded.
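The estimate above can be turned into a small back-of-the-envelope calculation. This is a sketch under stated assumptions: `compress_bound_estimate` is modeled on reference zstd's `ZSTD_COMPRESSBOUND` macro and may not match this crate's `compress_bound`, the constants mirror the 128 KiB and ~130 KiB figures from the text, and per-block scratch is deliberately left out because no number is given for it.

```rust
const KIB: usize = 1024;
/// Output seed cap, from the 128 KiB figure in the text.
const OUTPUT_SEED_CAP: usize = 128 * KIB;
/// Approximate pre-reserved size of the block accumulator ("~130 KiB").
const ALL_BLOCKS_RESERVE: usize = 130 * KIB;

/// Assumption: worst-case size modeled on reference zstd's
/// ZSTD_COMPRESSBOUND macro; the crate's own `compress_bound` may differ.
fn compress_bound_estimate(len: usize) -> usize {
    let margin = if len < 128 * KIB {
        (128 * KIB - len) >> 11
    } else {
        0
    };
    len + (len >> 8) + margin
}

/// Rough peak working set in bytes, excluding per-block scratch:
/// input capacity + output seed + pre-reserved block accumulator.
fn estimated_working_set(input_capacity: usize, input_len: usize) -> usize {
    let output_seed = compress_bound_estimate(input_len).min(OUTPUT_SEED_CAP);
    input_capacity + output_seed + ALL_BLOCKS_RESERVE
}
```

For a 1 MiB source whose input Vec has grown to 2 MiB of capacity, `estimated_working_set(2 << 20, 1 << 20)` gives 2 MiB + 128 KiB + 130 KiB, which is 2,361,344 bytes before per-block scratch is added.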

use structured_zstd::encoding::{compress_to_vec, CompressionLevel};
let data: &[u8] = &[0,0,0,0,0,0,0,0,0,0,0,0];
let compressed = compress_to_vec(data, CompressionLevel::Fastest);