Struct FrameCompressor 

Source
pub struct FrameCompressor<R: Read = &'static [u8], W: Write = Vec<u8>, M: Matcher = MatchGeneratorDriver> { /* private fields */ }

An interface for compressing arbitrary data with the Zstandard compression algorithm.

FrameCompressor will generally be used by:

  1. Creating a compressor with FrameCompressor::new(), passing the desired CompressionLevel
  2. Attaching the input with FrameCompressor::set_source and the output with FrameCompressor::set_drain
  3. Calling FrameCompressor::compress to write the compressed frame into the drain

§Examples

use structured_zstd::encoding::{FrameCompressor, CompressionLevel};
let mock_data: &[_] = &[0x1, 0x2, 0x3, 0x4];
let mut output = std::vec::Vec::new();
// Initialize a compressor.
let mut compressor = FrameCompressor::new(CompressionLevel::Uncompressed);
compressor.set_source(mock_data);
compressor.set_drain(&mut output);

// `compress` writes the compressed output into the provided buffer.
compressor.compress();

Implementations§

Source§

impl<R: Read, W: Write> FrameCompressor<R, W, MatchGeneratorDriver>

Source

pub fn new(compression_level: CompressionLevel) -> Self

Create a new FrameCompressor

Source

pub fn set_parameters(&mut self, params: &CompressionParameters)

Configure fine-grained compression parameters (#27).

Resets the base CompressionLevel to the parameters’ level and installs the per-knob overrides (window/hash/chain/search logs, strategy, LDM), which are applied at the next frame. To fall back to plain level-based compression, pass parameters from a builder that overrides nothing (the None-equivalent).

use structured_zstd::encoding::{
    CompressionLevel, CompressionParameters, FrameCompressor, Strategy,
};
let params = CompressionParameters::builder(CompressionLevel::Level(19))
    .strategy(Strategy::Btultra2)
    .enable_long_distance_matching(true)
    .build()
    .unwrap();
let mut compressor: FrameCompressor = FrameCompressor::new(CompressionLevel::Default);
compressor.set_parameters(&params);
let compressed = compressor.compress_independent_frame(b"some data to compress");
assert!(!compressed.is_empty());
Source

pub fn compress_independent_frame_into( &mut self, input: &[u8], out: &mut Vec<u8>, )

Compress one contiguous &[u8] as a single independent Zstd frame, writing the frame bytes into out (its previous contents are replaced and its allocation reused), reusing this compressor’s heavy state across calls.

This is the reusable-compression-context (CCtx-equivalent) entry point, mirroring C ZSTD_compress2 over a reused ZSTD_CCtx: construct ONE FrameCompressor and call this in a loop to emit N independent, self-describing frames (each carrying its own header, blocks, and checksum, decodable in isolation, with no cross-frame match history). Every call resets the per-frame state via Self::prepare_frame: only the allocations are kept, so the dominant per-frame setup cost (table allocation + dictionary prime) is paid once instead of N times. Passing the same out buffer each call additionally reuses the output allocation, matching C’s caller-owned dst buffer (no per-frame output allocation).

Reusing the context + out across many small frames (the typical per-block-frame workload) is far cheaper than a fresh compress_slice_to_vec per block, which allocates and primes from scratch each time.

The input is read in place: no Self::set_source / Self::set_drain setup is required, and the input lifetime is not baked into the compressor type, so successive calls may pass slices with unrelated lifetimes. When the Fast (Simple) backend is active and no dictionary is set, the matcher references the input directly (no per-block history copy); other backends / dictionary use copy each block into history exactly as the streaming compress path does. The source-size hint is derived from the input length on every call, so per-frame table sizing tracks each frame’s actual size regardless of any earlier hint.

A sticky dictionary set via set_dictionary (or its variants) is primed into every frame, mirroring ZSTD_CCtx_loadDictionary / ZSTD_CCtx_refCDict.
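The reuse pattern described above can be sketched without the crate itself. This is a minimal illustration, assuming a hypothetical helper `frame_lengths` that takes the compression step as a closure; the helper name and the closure parameter are not part of this crate.

```rust
/// Runs `compress_into` once per block while reusing a single output buffer.
/// Returns the length of each emitted frame.
///
/// `compress_into` is a stand-in for a call such as
/// `compressor.compress_independent_frame_into(input, out)`.
fn frame_lengths<F>(blocks: &[&[u8]], mut compress_into: F) -> Vec<usize>
where
    F: FnMut(&[u8], &mut Vec<u8>),
{
    // One output allocation, reused for every frame.
    let mut out = Vec::new();
    let mut lengths = Vec::with_capacity(blocks.len());
    for &block in blocks {
        // The callee replaces the contents of `out`, so each frame has to be
        // consumed (here: measured) before the next call overwrites it.
        compress_into(block, &mut out);
        lengths.push(out.len());
    }
    lengths
}
```

With a compressor from this page, the closure would be `|input, out| compressor.compress_independent_frame_into(input, out)`. Each frame must be written out or copied before the next iteration, because the same buffer is overwritten.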

§Panics

Panics on encoder error, matching Self::compress and compress_slice_to_vec.

Source

pub fn compress_independent_frame(&mut self, input: &[u8]) -> Vec<u8>

Convenience wrapper over Self::compress_independent_frame_into that allocates and returns a fresh Vec per call. Prefer the _into form in tight per-block-frame loops to reuse one output buffer across frames (the CCtx-equivalent zero-per-call-alloc output, matching C’s caller-owned dst).

use structured_zstd::encoding::{FrameCompressor, CompressionLevel};
let mut cctx: FrameCompressor = FrameCompressor::new(CompressionLevel::Default);
let frame_a = cctx.compress_independent_frame(b"first block payload");
let frame_b = cctx.compress_independent_frame(b"second block payload");
assert!(!frame_a.is_empty() && !frame_b.is_empty());
Source§

impl<R: Read, W: Write, M: Matcher> FrameCompressor<R, W, M>

Source

pub fn new_with_matcher(matcher: M, compression_level: CompressionLevel) -> Self

Create a new FrameCompressor with a custom matching algorithm implementation

Source

pub fn set_magicless(&mut self, magicless: bool)

Enable or disable magicless frame format (ZSTD_f_zstd1_magicless).

When set to true, emitted frames omit the 4-byte magic number prefix. The matching decoder must be configured to expect a magicless stream — wire-format only round-trips with a magicless-aware decoder.
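To check which format a buffer is in, a receiver can test for the magic number. The sketch below relies only on the magic value defined by RFC 8878 (0xFD2FB528, stored little-endian); the helper name `has_zstd_magic` is hypothetical and not part of this crate.

```rust
/// Magic number that starts a regular Zstandard frame (RFC 8878).
/// It is stored little-endian, so the bytes on the wire are 28 B5 2F FD.
const ZSTD_MAGIC: u32 = 0xFD2F_B528;

/// Returns true when `frame` begins with the 4-byte Zstandard magic number.
/// A frame emitted in magicless mode starts directly with the frame header
/// and is therefore not expected to pass this check.
fn has_zstd_magic(frame: &[u8]) -> bool {
    frame.starts_with(&ZSTD_MAGIC.to_le_bytes())
}
```

A false result alone does not prove that a buffer is a magicless frame; it only shows that the regular prefix is absent.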

Source

pub fn set_content_checksum(&mut self, emit: bool)

Enable or disable the trailing XXH64 content checksum (semantics of upstream ZSTD_c_checksumFlag). Default false, matching the upstream library default (ZSTD_c_checksumFlag = 0) so out-of-the-box frames carry the same layout and pay the same costs as the reference implementation.

When false, emitted frames set Content_Checksum_flag = 0 and carry no trailing digest; such frames are valid (RFC 8878) and decode correctly in any ContentChecksum mode. Without the hash feature no checksum is emitted regardless of this setting.
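Per RFC 8878, the trailing checksum is the low 4 bytes of the XXH64 digest of the original content, stored little-endian. The sketch below shows only that truncation step. It assumes the 64-bit digest has already been computed elsewhere, and the helper names are hypothetical.

```rust
/// Converts a 64-bit XXH64 digest into the 4-byte frame trailer:
/// the low 32 bits, little-endian.
fn checksum_trailer(xxh64_digest: u64) -> [u8; 4] {
    (xxh64_digest as u32).to_le_bytes()
}

/// Returns true when `frame` ends with the trailer derived from `xxh64_digest`.
/// Only meaningful for frames that were emitted with the checksum enabled.
fn trailer_matches(frame: &[u8], xxh64_digest: u64) -> bool {
    frame.ends_with(&checksum_trailer(xxh64_digest))
}
```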

Source

pub fn set_content_size_flag(&mut self, emit: bool)

Enable or disable recording Frame_Content_Size in the frame header when the total size is known (semantics of upstream ZSTD_c_contentSizeFlag). Default true, matching upstream. With the flag off the header carries a window descriptor instead (and the single-segment layout, which requires an FCS, is disabled).

Source

pub fn set_dictionary_id_flag(&mut self, emit: bool)

Enable or disable recording the dictionary ID in the frame header when a dictionary is attached (semantics of upstream ZSTD_c_dictIDFlag). Default true, matching upstream. Frames emitted with the flag off still decode when the decoder is handed the dictionary explicitly.
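The checksum, content-size, and dictionary-ID flags all surface in the Frame_Header_Descriptor byte defined by RFC 8878. The following sketch decodes that byte so the effect of each setter can be inspected on an emitted frame. The struct and function names are hypothetical; only the bit layout comes from the RFC.

```rust
/// Decoded view of a Frame_Header_Descriptor byte (RFC 8878, section 3.1.1.1.1).
#[derive(Debug, PartialEq, Eq)]
struct HeaderFlags {
    /// Width of the Frame_Content_Size field in bytes (0 when absent).
    fcs_field_bytes: u8,
    /// Single_Segment_flag: when set, no window descriptor is present.
    single_segment: bool,
    /// Content_Checksum_flag: a 4-byte checksum follows the last block.
    content_checksum: bool,
    /// Width of the Dictionary_ID field in bytes (0 when absent).
    dict_id_field_bytes: u8,
}

fn parse_descriptor(byte: u8) -> HeaderFlags {
    let fcs_flag = byte >> 6; // bits 7-6
    let single_segment = (byte & 0b0010_0000) != 0; // bit 5
    let content_checksum = (byte & 0b0000_0100) != 0; // bit 2
    let did_flag = byte & 0b0000_0011; // bits 1-0
    let fcs_field_bytes = match fcs_flag {
        // Flag 0 still carries a 1-byte size when the frame is single-segment.
        0 => u8::from(single_segment),
        1 => 2,
        2 => 4,
        _ => 8,
    };
    let dict_id_field_bytes = match did_flag {
        0 => 0,
        1 => 1,
        2 => 2,
        _ => 4,
    };
    HeaderFlags { fcs_field_bytes, single_segment, content_checksum, dict_id_field_bytes }
}
```

In a regular frame the descriptor is the byte right after the 4-byte magic number; in a magicless frame it is the first byte.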

Source

pub fn set_target_block_size(&mut self, target: Option<u32>)

Set an upper bound on emitted block sizes (semantics of upstream ZSTD_c_targetCBlockSize): every physical block’s payload is capped at target bytes (+3-byte block header on the wire), trading some ratio for bounded per-block latency. The value is clamped to [MIN_TARGET_BLOCK_SIZE, MAX_BLOCK_SIZE] (the upstream bounds). None removes the target.
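The clamping and the on-wire size can be worked through with a small sketch. The bounds are passed in as parameters because the actual values of MIN_TARGET_BLOCK_SIZE and MAX_BLOCK_SIZE are not stated here; the helper names are hypothetical.

```rust
/// Size of a Zstandard block header on the wire, in bytes.
const BLOCK_HEADER_BYTES: u32 = 3;

/// Clamps a requested target into `[min, max]`; `None` means "no target".
/// `min` and `max` stand in for MIN_TARGET_BLOCK_SIZE and MAX_BLOCK_SIZE.
fn clamp_target(target: Option<u32>, min: u32, max: u32) -> Option<u32> {
    target.map(|t| t.clamp(min, max))
}

/// Largest number of bytes one physical block can occupy on the wire
/// when its payload is capped at `payload_cap`.
fn max_wire_block_bytes(payload_cap: u32) -> u32 {
    payload_cap + BLOCK_HEADER_BYTES
}
```

For example, with a payload cap of 4096 bytes, each block occupies at most 4099 bytes in the frame.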

Source

pub fn set_source(&mut self, uncompressed_data: R) -> Option<R>

Before calling FrameCompressor::compress you need to set the source.

This is the data that is compressed and written into the drain.

Source

pub fn set_drain(&mut self, compressed_data: W) -> Option<W>

Before calling FrameCompressor::compress you need to set the drain.

As the compressor compresses data, the drain serves as a place for the output to be written.

Source

pub fn set_source_size_hint(&mut self, size: u64)

Provide a hint about the total uncompressed size for the next frame.

When set, the encoder selects smaller hash tables and windows for small inputs, matching the C zstd source-size-class behavior.

This hint applies only to frame payload bytes (size). Dictionary history is primed separately and does not inflate the hinted size or advertised frame window. Must be called before compress.
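As a rough illustration of why a size hint helps, the sketch below shrinks a window log to the smallest power of two that covers the hinted payload. This is an assumed model of source-size-class behavior, not this crate's actual selection logic; the function name and both bounds are hypothetical.

```rust
/// Picks a window log for a hinted payload size.
///
/// `level_window_log` is the window log the level would use without a hint,
/// and `min_window_log` is a lower bound; both are assumptions of this sketch.
fn window_log_for_hint(size_hint: u64, level_window_log: u32, min_window_log: u32) -> u32 {
    // Smallest `n` such that 2^n >= size_hint.
    let needed = if size_hint <= 1 {
        0
    } else {
        64 - (size_hint - 1).leading_zeros()
    };
    // Never go below the minimum, never exceed what the level asked for.
    needed.max(min_window_log).min(level_window_log)
}
```

Under these assumptions, a 100 000-byte payload needs a 2^17 window, so a level that would otherwise use a 2^21 window can allocate much smaller tables.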

Source

pub fn heap_size(&self) -> usize

Total heap bytes this compressor’s allocations hold, excluding the inline struct: the match-finder tables / history / recycled buffers and the primed-dictionary snapshot (via the matcher), the retained Huffman tables (active + recycled spare), the retained dictionary content, the cached dictionary entropy tables (literals Huffman + LL/ML/OF FSE), and the per-block sidecar buffers. Lets a context report its true footprint through ZSTD_sizeof_CCtx.

Source

pub fn compress(&mut self)

Compress the uncompressed data from the provided source as one Zstd frame and write it to the provided drain.

This will repeatedly call Read::read on the source to fill up blocks until the source returns 0 on the read call. All compressed blocks are buffered in memory so that the frame header can include the Frame_Content_Size field (which requires knowing the total uncompressed size). The entire frame — header, blocks, and optional checksum — is then written to the drain at the end. This means peak memory usage is O(compressed_size).

To avoid endlessly encoding from a potentially endless source (like a network socket), you can use the Read::take function.

Per-frame setup values are resolved by Self::prepare_frame and consumed by the block loop and Self::finish_frame. This lets the owned compress() and the borrowed one-shot path share the exact same reset / dict-prime / entropy-seed setup and frame tail.
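The read loop and the effect of Read::take can be sketched with the standard library alone. This is an illustration of the behavior described above, not the crate's implementation; `bounded`, `fill_block`, and `count_blocks` are hypothetical helpers.

```rust
use std::io::{self, Read};

/// Caps an arbitrary (possibly endless) reader at `limit` bytes.
/// The result is the kind of value that could be handed to `set_source`.
fn bounded<R: Read>(source: R, limit: u64) -> io::Take<R> {
    source.take(limit)
}

/// Fills `block` by calling `read` repeatedly until the block is full
/// or the source reports end of input by returning 0.
fn fill_block<R: Read>(source: &mut R, block: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < block.len() {
        match source.read(&mut block[filled..])? {
            0 => break,
            n => filled += n,
        }
    }
    Ok(filled)
}

/// Counts how many blocks of at most `block_size` bytes the source yields.
fn count_blocks<R: Read>(mut source: R, block_size: usize) -> io::Result<usize> {
    let mut block = vec![0u8; block_size];
    let mut blocks = 0;
    while fill_block(&mut source, &mut block)? > 0 {
        blocks += 1;
    }
    Ok(blocks)
}
```

Without the `take` limit, a source such as `std::io::repeat` never returns 0, and the loop would not terminate.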

Source

pub fn last_frame_emit_info(&self) -> Option<&FrameEmitInfo>

Available on crate feature lsm only.

Layout of the most recently emitted frame.

Returns None if compress has not been called yet on this compressor. After a successful compress() the returned FrameEmitInfo describes the frame header range, every emitted block’s offset / size / type, and the optional trailing content-checksum range — all in frame-absolute byte offsets matching the bytes written to the drain.

Behind the lsm Cargo feature.

Source

pub fn enable_per_block_checksums(&mut self)

Available on crate features hash and lsm only.

Opt in to per-block XXH64 checksum computation during compress. Default off; zero cost when disabled. The captured digests are accessible via last_frame_block_checksums.

One checksum is emitted per physical FrameBlock written to the drain: 1:1 cardinality with last_frame_emit_info’s blocks vector. On the post-split optimization path (Level 16-22 with large window) the per-partition decompressed range is hashed inside the partition loop so the digest count still matches the emitted block count. The decoder collects per-physical-block digests on the same granularity, so element-wise equality holds round-trip.

Behind all(feature = "lsm", feature = "hash") — the XXH64 primitive lives behind the hash feature, so this method only compiles when both are enabled.

Source

pub fn last_frame_block_checksums(&self) -> Option<&[u32]>

Available on crate features hash and lsm only.

Per-block XXH64 (low 32 bits) digests captured during the most recent compress() call. None unless enable_per_block_checksums was called before compress().

Behind all(feature = "lsm", feature = "hash").
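Because the digests are 1:1 with the emitted blocks, a round-trip check reduces to an element-wise comparison. The sketch below assumes the decoder-side digests are available as a second `&[u32]` slice (how they are obtained is not covered here); the function name is hypothetical.

```rust
/// Returns the index of the first block whose digest differs between the
/// encoder-side and decoder-side lists, or `None` when they agree.
/// A length mismatch is reported at the index where the shorter list ends.
fn first_mismatch(encoder: &[u32], decoder: &[u32]) -> Option<usize> {
    encoder
        .iter()
        .zip(decoder)
        .position(|(e, d)| e != d)
        .or_else(|| {
            (encoder.len() != decoder.len()).then(|| encoder.len().min(decoder.len()))
        })
}
```

The encoder-side slice would come from `last_frame_block_checksums()` after a call to `compress()`.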

Source

pub fn source_mut(&mut self) -> Option<&mut R>

Get a mutable reference to the source

Source

pub fn drain_mut(&mut self) -> Option<&mut W>

Get a mutable reference to the drain

Source

pub fn source(&self) -> Option<&R>

Get a reference to the source

Source

pub fn drain(&self) -> Option<&W>

Get a reference to the drain

Source

pub fn take_source(&mut self) -> Option<R>

Retrieve the source

Source

pub fn take_drain(&mut self) -> Option<W>

Retrieve the drain

Source

pub fn replace_matcher(&mut self, match_generator: M) -> M

Before calling FrameCompressor::compress you can replace the matcher

Source

pub fn set_compression_level( &mut self, compression_level: CompressionLevel, ) -> CompressionLevel

Before calling FrameCompressor::compress you can replace the compression level.

This also clears any fine-grained parameter overrides installed via set_parameters: reverting to a bare level means plain level-based tuning, not the previous frame’s customized strategy / LDM / log overrides. To keep overriding, call set_parameters again with the new base level.

Source

pub fn compression_level(&self) -> CompressionLevel

Get the current compression level

Source

pub fn set_dictionary( &mut self, dictionary: Dictionary, ) -> Result<Option<EncoderDictionary>, DictionaryDecodeError>

Attach a pre-parsed dictionary to be used for subsequent compressions.

In compressed modes, the dictionary id is written only when the active matcher supports dictionary priming. Uncompressed mode and non-priming matchers ignore the attached dictionary at encode time.

Source

pub fn set_dictionary_from_bytes( &mut self, raw_dictionary: &[u8], ) -> Result<Option<EncoderDictionary>, DictionaryDecodeError>

Parse and attach a serialized dictionary blob.

Parses with the encoder-only path (skips the FSE/HUF decode lookup-table build the encoder never reads); the entropy ENCODER tables — and thus the emitted frame — are identical to a full parse.

Source

pub fn set_encoder_dictionary( &mut self, dictionary: EncoderDictionary, ) -> Result<Option<EncoderDictionary>, DictionaryDecodeError>

Attach an already-parsed EncoderDictionary without reparsing a raw blob.

Accepts an EncoderDictionary produced once via EncoderDictionary::from_bytes / EncoderDictionary::from_dictionary or handed back by Self::clear_dictionary / the set_dictionary* return value, so callers can reattach or reuse a prepared dictionary across compressions without re-running the dictionary parse each time. Returns the previously-attached dictionary, if any.

Source

pub fn clear_dictionary(&mut self) -> Option<EncoderDictionary>

Remove the attached dictionary, returning it as an EncoderDictionary.

Auto Trait Implementations§

§

impl<R = &'static [u8], W = Vec<u8>, M = MatchGeneratorDriver> !Freeze for FrameCompressor<R, W, M>

§

impl<R, W, M> RefUnwindSafe for FrameCompressor<R, W, M>

§

impl<R, W, M> Send for FrameCompressor<R, W, M>
where R: Send, W: Send, M: Send,

§

impl<R, W, M> Sync for FrameCompressor<R, W, M>
where R: Sync, W: Sync, M: Sync,

§

impl<R, W, M> Unpin for FrameCompressor<R, W, M>
where R: Unpin, W: Unpin, M: Unpin,

§

impl<R, W, M> UnsafeUnpin for FrameCompressor<R, W, M>

§

impl<R, W, M> UnwindSafe for FrameCompressor<R, W, M>
where R: UnwindSafe, W: UnwindSafe, M: UnwindSafe,

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.