pub struct FrameCompressor<R: Read = &'static [u8], W: Write = Vec<u8>, M: Matcher = MatchGeneratorDriver> { /* private fields */ }
An interface for compressing arbitrary data with the ZStandard compression algorithm.
FrameCompressor will generally be used by:
- Initializing a compressor with FrameCompressor::new() and providing a buffer of data with FrameCompressor::set_source
- Setting an output with FrameCompressor::set_drain, then starting compression and writing the compressed output into a vec with FrameCompressor::compress
§Examples
use structured_zstd::encoding::{FrameCompressor, CompressionLevel};
let mock_data: &[_] = &[0x1, 0x2, 0x3, 0x4];
let mut output = std::vec::Vec::new();
// Initialize a compressor.
let mut compressor = FrameCompressor::new(CompressionLevel::Uncompressed);
compressor.set_source(mock_data);
compressor.set_drain(&mut output);
// `compress` writes the compressed output into the provided buffer.
compressor.compress();
§Implementations
impl<R: Read, W: Write> FrameCompressor<R, W, MatchGeneratorDriver>
pub fn new(compression_level: CompressionLevel) -> Self
Create a new FrameCompressor
pub fn set_parameters(&mut self, params: &CompressionParameters)
Configure fine-grained compression parameters (#27).
Resets the base CompressionLevel
to the parameters’ level and installs the per-knob overrides
(window/hash/chain/search logs, strategy, LDM) applied at the next
frame. Pass None-equivalent parameters (built from a builder that overrides nothing)
to fall back to plain level-based compression.
use structured_zstd::encoding::{
CompressionLevel, CompressionParameters, FrameCompressor, Strategy,
};
let params = CompressionParameters::builder(CompressionLevel::Level(19))
.strategy(Strategy::Btultra2)
.enable_long_distance_matching(true)
.build()
.unwrap();
let mut compressor: FrameCompressor = FrameCompressor::new(CompressionLevel::Default);
compressor.set_parameters(&params);
let compressed = compressor.compress_independent_frame(b"some data to compress");
assert!(!compressed.is_empty());
pub fn compress_independent_frame_into(&mut self, input: &[u8], out: &mut Vec<u8>)
Compress one contiguous &[u8] as a single independent Zstd frame,
writing the frame bytes into out (its previous contents are
replaced and its allocation reused), reusing this compressor’s heavy
state across calls.
This is the reusable-compression-context (CCtx-equivalent) entry
point, mirroring C ZSTD_compress2 over a reused ZSTD_CCtx:
construct ONE FrameCompressor and call this in a loop to emit N
independent, self-describing frames (each carrying its own header,
blocks, and checksum, decodable in isolation, with no cross-frame
match history). Every call resets the per-frame state via
Self::prepare_frame: only the allocations are kept, so the
dominant per-frame setup cost (table allocation + dictionary prime)
is paid once instead of N times. Passing the same out buffer each
call additionally reuses the output allocation, matching C’s
caller-owned dst buffer (no per-frame output allocation).
Reusing the context + out across many small frames (the typical
per-block-frame workload) is far cheaper than a fresh
compress_slice_to_vec
per block, which allocates and primes from scratch each time.
The input is read in place: no Self::set_source /
Self::set_drain setup is required, and the input lifetime is not
baked into the compressor type, so successive calls may pass slices
with unrelated lifetimes. When the Fast (Simple) backend is active
and no dictionary is set, the matcher references the input directly
(no per-block history copy); other backends / dictionary use copy
each block into history exactly as the streaming
compress path does. The source-size hint is
derived from the input length on every call, so per-frame table
sizing tracks each frame’s actual size regardless of any earlier
hint.
A sticky dictionary set via
set_dictionary (or its variants) is primed
into every frame, mirroring ZSTD_CCtx_loadDictionary /
ZSTD_CCtx_refCDict.
§Panics
Panics on encoder error, matching Self::compress and
compress_slice_to_vec.
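The "contents replaced, allocation reused" contract for out can be illustrated without the crate. This is a minimal sketch in which encode_into is a hypothetical stand-in (it only length-prefixes the input and is not the library's encoder) that follows the same clear-then-append convention:

```rust
/// Hypothetical stand-in for an `_into`-style encoder. It replaces the
/// contents of `out` but keeps its allocation. It is NOT the crate's encoder:
/// it only writes a 4-byte little-endian length followed by the input.
fn encode_into(input: &[u8], out: &mut Vec<u8>) {
    out.clear(); // drops the old contents, keeps the capacity
    out.extend_from_slice(&(input.len() as u32).to_le_bytes());
    out.extend_from_slice(input);
}

/// Encode every payload through ONE reused output buffer.
/// Returns each frame's length and whether the capacity ever shrank.
fn frame_lengths(payloads: &[&[u8]]) -> (Vec<usize>, bool) {
    let mut out: Vec<u8> = Vec::with_capacity(64);
    let mut lens = Vec::with_capacity(payloads.len());
    let mut never_shrank = true;
    for payload in payloads {
        let cap_before = out.capacity();
        encode_into(payload, &mut out);
        never_shrank &= out.capacity() >= cap_before;
        lens.push(out.len());
    }
    (lens, never_shrank)
}
```

Because Vec::clear keeps the capacity, a loop over many small frames settles on one allocation sized for the largest frame seen so far.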
pub fn compress_independent_frame(&mut self, input: &[u8]) -> Vec<u8>
Convenience wrapper over Self::compress_independent_frame_into
that allocates and returns a fresh Vec per call. Prefer the
_into form in tight per-block-frame loops to reuse one output
buffer across frames (the CCtx-equivalent zero-per-call-alloc
output, matching C’s caller-owned dst).
use structured_zstd::encoding::{FrameCompressor, CompressionLevel};
let mut cctx: FrameCompressor = FrameCompressor::new(CompressionLevel::Default);
let frame_a = cctx.compress_independent_frame(b"first block payload");
let frame_b = cctx.compress_independent_frame(b"second block payload");
assert!(!frame_a.is_empty() && !frame_b.is_empty());
impl<R: Read, W: Write, M: Matcher> FrameCompressor<R, W, M>
pub fn new_with_matcher(matcher: M, compression_level: CompressionLevel) -> Self
Create a new FrameCompressor with a custom matching algorithm implementation
pub fn set_magicless(&mut self, magicless: bool)
Enable or disable magicless frame format (ZSTD_f_zstd1_magicless).
When set to true, emitted frames omit the 4-byte magic number
prefix. The matching decoder must be configured to expect a
magicless stream: the wire format round-trips only with a
magicless-aware decoder.
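To make the 4-byte prefix concrete, here is a small sketch built on the Zstandard magic number from RFC 8878. The helper names are hypothetical and are not part of this crate:

```rust
/// Zstandard frame magic number (RFC 8878); stored little-endian on the wire.
const ZSTD_MAGIC: u32 = 0xFD2F_B528;

/// True if `frame` starts with the 4-byte Zstandard magic prefix.
fn has_magic(frame: &[u8]) -> bool {
    frame.starts_with(&ZSTD_MAGIC.to_le_bytes())
}

/// Hypothetical helper: prepend the magic to a magicless frame so that a
/// decoder expecting the standard format can read it.
fn add_magic(magicless: &[u8]) -> Vec<u8> {
    let mut framed = Vec::with_capacity(magicless.len() + 4);
    framed.extend_from_slice(&ZSTD_MAGIC.to_le_bytes());
    framed.extend_from_slice(magicless);
    framed
}

/// Hypothetical helper: drop the magic from a standard frame, if present.
fn strip_magic(frame: &[u8]) -> Option<&[u8]> {
    if has_magic(frame) {
        Some(&frame[4..])
    } else {
        None
    }
}
```

A frame that fails has_magic is not necessarily magicless; it may simply be a different format, so the decoder still has to be told what to expect.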
pub fn set_content_checksum(&mut self, emit: bool)
Enable or disable the trailing XXH64 content checksum
(semantics of upstream ZSTD_c_checksumFlag). Default false,
matching the upstream library default (ZSTD_c_checksumFlag = 0)
so out-of-the-box frames carry the same layout and pay the same
costs as the reference implementation.
When false, emitted frames set Content_Checksum_flag = 0 and carry
no trailing digest; such frames are valid (RFC 8878) and decode
correctly in any ContentChecksum
mode. Without the hash feature no checksum is emitted regardless of
this setting.
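As an illustration of where the flag and the digest live on the wire, the sketch below inspects raw frame bytes according to RFC 8878. It assumes a single complete frame in the standard format (magic number present); trailing_checksum is a hypothetical helper, not an API of this crate:

```rust
/// Bit 2 of the Frame_Header_Descriptor byte is Content_Checksum_flag (RFC 8878).
const CONTENT_CHECKSUM_FLAG: u8 = 0b0000_0100;

/// Return the trailing 32-bit content checksum of one complete, standard
/// (non-magicless) frame, or `None` if the flag is clear or the input is
/// too short. Assumes `frame` holds exactly one frame.
fn trailing_checksum(frame: &[u8]) -> Option<u32> {
    // Byte 4 follows the 4-byte magic number and holds the descriptor.
    let descriptor = *frame.get(4)?;
    if (descriptor & CONTENT_CHECKSUM_FLAG) == 0 {
        return None;
    }
    // Magic (4) + descriptor (1) + checksum (4) is the bare minimum.
    if frame.len() < 9 {
        return None;
    }
    let n = frame.len();
    // The checksum is the last 4 bytes of the frame, little-endian.
    Some(u32::from_le_bytes([frame[n - 4], frame[n - 3], frame[n - 2], frame[n - 1]]))
}
```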
pub fn set_content_size_flag(&mut self, emit: bool)
Enable or disable recording Frame_Content_Size in the frame header
when the total size is known (semantics of upstream
ZSTD_c_contentSizeFlag). Default true, matching upstream. With
the flag off the header carries a window descriptor instead (and the
single-segment layout, which requires an FCS, is disabled).
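The relationship between Frame_Content_Size, the window descriptor, and the single-segment layout follows from the Frame_Header_Descriptor rules in RFC 8878. A sketch of those rules, with hypothetical helper names:

```rust
/// Bit 5 of the Frame_Header_Descriptor is Single_Segment_flag (RFC 8878).
const SINGLE_SEGMENT_FLAG: u8 = 0b0010_0000;

/// Size in bytes of the Frame_Content_Size field implied by a descriptor byte.
/// Bits 7-6 hold Frame_Content_Size_flag.
fn fcs_field_size(descriptor: u8) -> usize {
    let single_segment = (descriptor & SINGLE_SEGMENT_FLAG) != 0;
    match descriptor >> 6 {
        // With flag 0 the field exists only in the single-segment layout.
        0 => if single_segment { 1 } else { 0 },
        1 => 2,
        2 => 4,
        _ => 8,
    }
}

/// The 1-byte Window_Descriptor is present exactly when the frame is NOT
/// single-segment.
fn has_window_descriptor(descriptor: u8) -> bool {
    (descriptor & SINGLE_SEGMENT_FLAG) == 0
}
```

A descriptor of 0x00 therefore describes a header with a window descriptor and no recorded content size, which is the layout the flag-off case produces.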
pub fn set_dictionary_id_flag(&mut self, emit: bool)
Enable or disable recording the dictionary ID in the frame header
when a dictionary is attached (semantics of upstream
ZSTD_c_dictIDFlag). Default true, matching upstream. Frames
emitted with the flag off still decode when the decoder is handed
the dictionary explicitly.
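To check whether a given frame records a dictionary ID, the header can be read directly. The sketch below follows the RFC 8878 header layout; dictionary_id is a hypothetical helper, and its input is assumed to start at the Frame_Header_Descriptor (that is, after any magic number):

```rust
/// Read the Dictionary_ID recorded in a frame header, if any.
/// `header` must start at the Frame_Header_Descriptor byte.
fn dictionary_id(header: &[u8]) -> Option<u32> {
    let descriptor = *header.first()?;
    // Bits 1-0 are Dictionary_ID_flag: field size 0, 1, 2 or 4 bytes.
    let id_len = match descriptor & 0b11 {
        0 => return None, // no ID recorded in this frame
        1 => 1,
        2 => 2,
        _ => 4,
    };
    // A 1-byte Window_Descriptor precedes the ID unless the frame is
    // single-segment (bit 5 set).
    let single_segment = (descriptor & 0b0010_0000) != 0;
    let start = if single_segment { 1 } else { 2 };
    let bytes = header.get(start..start + id_len)?;
    // The ID is stored little-endian; zero-extend it to 32 bits.
    let mut buf = [0u8; 4];
    buf[..id_len].copy_from_slice(bytes);
    Some(u32::from_le_bytes(buf))
}
```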
pub fn set_target_block_size(&mut self, target: Option<u32>)
Set an upper bound on emitted block sizes (semantics of upstream
ZSTD_c_targetCBlockSize): every physical block’s payload is capped
at target bytes (+3-byte block header on the wire), trading some
ratio for bounded per-block latency. The value is clamped to
[MIN_TARGET_BLOCK_SIZE, MAX_BLOCK_SIZE] (the upstream bounds).
None removes the target.
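The clamping rule can be sketched as follows. The constant values are assumptions taken from upstream zstd (ZSTD_TARGETCBLOCKSIZE_MIN and ZSTD_BLOCKSIZE_MAX); this crate's MIN_TARGET_BLOCK_SIZE and MAX_BLOCK_SIZE may be defined differently, and the function names are hypothetical:

```rust
/// ASSUMED values, mirroring upstream zstd; the crate's constants may differ.
const MIN_TARGET_BLOCK_SIZE: u32 = 1340;
const MAX_BLOCK_SIZE: u32 = 128 * 1024;
/// Every Zstandard block is preceded by a 3-byte Block_Header (RFC 8878).
const BLOCK_HEADER_LEN: u32 = 3;

/// Clamp a requested target into the supported range; `None` stays `None`.
fn clamp_target(target: Option<u32>) -> Option<u32> {
    target.map(|t| t.clamp(MIN_TARGET_BLOCK_SIZE, MAX_BLOCK_SIZE))
}

/// Upper bound on one block's size on the wire, header included.
fn max_wire_block_len(target: Option<u32>) -> Option<u32> {
    clamp_target(target).map(|t| t + BLOCK_HEADER_LEN)
}
```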
pub fn set_source(&mut self, uncompressed_data: R) -> Option<R>
Before calling FrameCompressor::compress you need to set the source.
This is the data that is compressed and written into the drain.
pub fn set_drain(&mut self, compressed_data: W) -> Option<W>
Before calling FrameCompressor::compress you need to set the drain.
As the compressor compresses data, the drain serves as a place for the output to be written.
pub fn set_source_size_hint(&mut self, size: u64)
Provide a hint about the total uncompressed size for the next frame.
When set, the encoder selects smaller hash tables and windows for small inputs, matching the C zstd source-size-class behavior.
This hint applies only to frame payload bytes (size). Dictionary
history is primed separately and does not inflate the hinted size or
advertised frame window.
Must be called before compress.
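For orientation, upstream C zstd picks its default parameter table from a size class derived from the source size. The sketch below reproduces that upstream classification; whether this crate uses the same thresholds is an assumption, and source_size_class is a hypothetical name:

```rust
/// Size class in the style of upstream zstd's default-parameter selection:
/// 0 = unknown or larger than 256 KiB, 1 = <= 256 KiB, 2 = <= 128 KiB,
/// 3 = <= 16 KiB. Thresholds are ASSUMED to match upstream, not read from
/// this crate.
fn source_size_class(hint: Option<u64>) -> usize {
    const KIB: u64 = 1024;
    match hint {
        // No hint: treat the source as arbitrarily large.
        None => 0,
        Some(n) => {
            (n <= 256 * KIB) as usize + (n <= 128 * KIB) as usize + (n <= 16 * KIB) as usize
        }
    }
}
```

Smaller classes map to smaller tables and windows, which is why a hint helps most when frames are only a few kilobytes long.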
pub fn heap_size(&self) -> usize
Total heap bytes this compressor’s allocations hold, excluding the
inline struct: the match-finder tables / history / recycled buffers and
the primed-dictionary snapshot (via the matcher), the retained
Huffman tables (active + recycled spare), the retained dictionary
content, the cached dictionary entropy tables (literals Huffman +
LL/ML/OF FSE), and the per-block sidecar buffers. Lets a context
report its true footprint through ZSTD_sizeof_CCtx.
pub fn compress(&mut self)
Compress the uncompressed data from the provided source as one Zstd frame and write it to the provided drain
This will repeatedly call Read::read on the source to fill up blocks until the source returns 0 on the read call.
All compressed blocks are buffered in memory so that the frame header can include the
Frame_Content_Size field (which requires knowing the total uncompressed size). The
entire frame — header, blocks, and optional checksum — is then written to the drain
at the end. This means peak memory usage is O(compressed_size).
To avoid endlessly encoding from a potentially endless source (like a network socket), you can use the
Read::take function.
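Bounding a source with Read::take uses only the standard library. In this sketch, drain_len is a hypothetical stand-in for any consumer that reads until the source reports 0 bytes:

```rust
use std::io::{self, Read};

/// Cap any reader at `limit` bytes, so a consumer that reads until EOF
/// is guaranteed to terminate.
fn bounded<R: Read>(source: R, limit: u64) -> io::Take<R> {
    source.take(limit)
}

/// Hypothetical read-until-EOF consumer: drains the reader and reports
/// how many bytes it produced.
fn drain_len<R: Read>(mut source: R) -> io::Result<usize> {
    let mut buf = Vec::new();
    source.read_to_end(&mut buf)?;
    Ok(buf.len())
}
```

The value returned by bounded implements Read, so it can be handed to set_source like any other reader.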
Per-frame setup values resolved by Self::prepare_frame and
consumed by the block loop + Self::finish_frame. Lets the
owned compress() and the borrowed one-shot path share the exact
same reset / dict-prime / entropy-seed setup and frame tail.
pub fn last_frame_emit_info(&self) -> Option<&FrameEmitInfo>
Available on crate feature lsm only.
Layout of the most recently emitted frame.
Returns None if compress has not been
called yet on this compressor. After a successful compress()
the returned FrameEmitInfo describes the frame header range,
every emitted block’s offset / size / type, and the optional
trailing content-checksum range — all in frame-absolute byte
offsets matching the bytes written to the drain.
Behind the lsm Cargo feature.
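The block offset, size, and type that the layout reports can be cross-checked against the raw frame bytes, since each block starts with a 3-byte Block_Header (RFC 8878). The sketch below decodes that header on its own; the types are hypothetical and do not mirror the fields of FrameEmitInfo, which are not shown here:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BlockType {
    Raw,
    Rle,
    Compressed,
    Reserved,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlockHeader {
    last: bool,
    block_type: BlockType,
    size: u32,
}

/// Decode a 3-byte little-endian Block_Header:
/// bit 0 = Last_Block, bits 1-2 = Block_Type, bits 3-23 = Block_Size.
fn parse_block_header(bytes: [u8; 3]) -> BlockHeader {
    let raw = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], 0]);
    let block_type = match (raw >> 1) & 0b11 {
        0 => BlockType::Raw,
        1 => BlockType::Rle,
        2 => BlockType::Compressed,
        _ => BlockType::Reserved,
    };
    BlockHeader { last: (raw & 1) == 1, block_type, size: raw >> 3 }
}

/// Bytes the block occupies on the wire after its header. An RLE block
/// stores a single byte; its Block_Size is the regenerated length.
fn payload_len(header: &BlockHeader) -> u32 {
    match header.block_type {
        BlockType::Rle => 1,
        _ => header.size,
    }
}
```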
pub fn enable_per_block_checksums(&mut self)
Available on crate features hash and lsm only.
Opt in to per-block XXH64 checksum computation during
compress. Default off; zero cost when
disabled. The captured digests are accessible via
last_frame_block_checksums.
One checksum is emitted per physical FrameBlock written to
the drain: 1:1 cardinality with
last_frame_emit_info’s
blocks vector. On the post-split optimization path
(Level 16-22 with large window) the per-partition decompressed
range is hashed inside the partition loop so the digest count
still matches the emitted block count. The decoder collects
per-physical-block digests on the same granularity, so
element-wise equality holds round-trip.
Behind all(feature = "lsm", feature = "hash") — the XXH64
primitive lives behind the hash feature, so this method only
compiles when both are enabled.
pub fn last_frame_block_checksums(&self) -> Option<&[u32]>
Available on crate features hash and lsm only.
Per-block XXH64 (low 32 bits) digests captured during the most
recent compress() call. None unless
enable_per_block_checksums
was called before compress().
Behind all(feature = "lsm", feature = "hash").
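"Low 32 bits" means plain truncation of the 64-bit digest. A minimal sketch of that truncation and of the element-wise round-trip comparison, using hypothetical helper names and caller-supplied digests rather than a real XXH64 implementation:

```rust
/// Keep the low 32 bits of a 64-bit digest.
fn low32(digest: u64) -> u32 {
    (digest & 0xFFFF_FFFF) as u32
}

/// Truncate a list of full 64-bit per-block digests.
fn truncate_all(digests: &[u64]) -> Vec<u32> {
    digests.iter().map(|&d| low32(d)).collect()
}

/// Element-wise equality of encoder-side and decoder-side digests:
/// same block count and the same value at every index.
fn digests_match(encoder: &[u32], decoder: &[u32]) -> bool {
    encoder == decoder
}
```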
pub fn source_mut(&mut self) -> Option<&mut R>
Get a mutable reference to the source
pub fn take_source(&mut self) -> Option<R>
Retrieve the source
pub fn take_drain(&mut self) -> Option<W>
Retrieve the drain
pub fn replace_matcher(&mut self, match_generator: M) -> M
Before calling FrameCompressor::compress you can replace the matcher
pub fn set_compression_level(&mut self, compression_level: CompressionLevel) -> CompressionLevel
Before calling FrameCompressor::compress you can replace the compression level.
This also clears any fine-grained parameter overrides installed via
set_parameters: reverting to a bare level
means plain level-based tuning, not the previous frame’s customized
strategy / LDM / log overrides. To keep overriding, call
set_parameters again with the new base level.
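The interaction between the base level and the overrides can be modelled with a small state machine. Everything in this sketch is hypothetical (type names, fields, integer levels); in particular, it assumes the returned value is the previous level, which the documentation above does not state:

```rust
/// Hypothetical per-knob overrides; `None` means "use the level's default".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Overrides {
    window_log: Option<u8>,
    long_distance_matching: Option<bool>,
}

/// Hypothetical model of the compressor's tuning state.
#[derive(Debug, PartialEq, Eq)]
struct Tuning {
    level: i32,
    overrides: Option<Overrides>,
}

impl Tuning {
    /// Install a base level together with fine-grained overrides.
    fn set_parameters(&mut self, level: i32, overrides: Overrides) {
        self.level = level;
        self.overrides = Some(overrides);
    }

    /// Switch to a bare level: overrides are cleared, and the previous
    /// level is handed back (an assumption of this model).
    fn set_compression_level(&mut self, level: i32) -> i32 {
        self.overrides = None;
        std::mem::replace(&mut self.level, level)
    }
}
```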
pub fn compression_level(&self) -> CompressionLevel
Get the current compression level
pub fn set_dictionary(&mut self, dictionary: Dictionary) -> Result<Option<EncoderDictionary>, DictionaryDecodeError>
Attach a pre-parsed dictionary to be used for subsequent compressions.
In compressed modes, the dictionary id is written only when the active matcher supports dictionary priming. Uncompressed mode and non-priming matchers ignore the attached dictionary at encode time.
pub fn set_dictionary_from_bytes(&mut self, raw_dictionary: &[u8]) -> Result<Option<EncoderDictionary>, DictionaryDecodeError>
Parse and attach a serialized dictionary blob.
Parsing uses the encoder-only path, which skips building the FSE/HUF decode lookup tables that the encoder never reads. The entropy encoder tables, and thus the emitted frame, are identical to those produced by a full parse.
pub fn set_encoder_dictionary(&mut self, dictionary: EncoderDictionary) -> Result<Option<EncoderDictionary>, DictionaryDecodeError>
Attach an already-parsed EncoderDictionary without reparsing a raw
blob.
Accepts an EncoderDictionary produced once via
EncoderDictionary::from_bytes / EncoderDictionary::from_dictionary
or handed back by Self::clear_dictionary / the set_dictionary*
return value, so callers can reattach or reuse a prepared dictionary
across compressions without re-running the dictionary parse each time.
Returns the previously-attached dictionary, if any.
pub fn clear_dictionary(&mut self) -> Option<EncoderDictionary>
Remove the attached dictionary, returning it as an EncoderDictionary.
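The attach, reattach, and clear cycle maps naturally onto Option::replace and Option::take. The sketch below uses a hypothetical generic Slot type in place of the compressor's dictionary field to show how a prepared value can be moved out and handed back without being rebuilt:

```rust
/// Hypothetical holder for one optional attached value (for example, a
/// prepared dictionary). Not a type from this crate.
struct Slot<T> {
    current: Option<T>,
}

impl<T> Slot<T> {
    /// Attach `value`, returning whatever was attached before, if anything.
    fn set(&mut self, value: T) -> Option<T> {
        self.current.replace(value)
    }

    /// Detach and return the current value, leaving the slot empty.
    fn clear(&mut self) -> Option<T> {
        self.current.take()
    }
}
```

Keeping the value returned by clear and passing it back to set later avoids repeating whatever work produced it.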