pub struct FrameDecoder { /* private fields */ }
Low level Zstandard decoder that can be used to decompress frames with fine control over when and how many bytes are decoded.
This decoder is able to decode frames only partially and gives control over how many bytes/blocks will be decoded at a time (so you don’t have to decode a 10GB file into memory all at once). It reads bytes as needed from a provided source and can be read from to collect partial results.
If you just want to read the whole frame with an io::Read without manually calling FrameDecoder::decode_blocks, you can use the provided crate::decoding::StreamingDecoder, which wraps this FrameDecoder.
Workflow is as follows:
use structured_zstd::decoding::BlockDecodingStrategy;
use std::io::{Read, Write};
// no_std environments can use the crate's own Read traits instead:
// use structured_zstd::io::{Read, Write};
fn decode_this(mut file: impl Read) {
    // Create a new decoder
    let mut frame_dec = structured_zstd::decoding::FrameDecoder::new();
    // Scratch buffer with a non-zero length (4096 is an arbitrary choice);
    // read() fills at most result.len() bytes per call
    let mut result = vec![0u8; 4096];
    // Use reset or init to make the decoder ready to decode the frame from the io::Read
    frame_dec.reset(&mut file).unwrap();
    // Loop until the frame has been decoded completely
    while !frame_dec.is_finished() {
        // decode (roughly) 1024 bytes per call
        frame_dec.decode_blocks(&mut file, BlockDecodingStrategy::UptoBytes(1024)).unwrap();
        // read from the decoder to collect bytes from the internal buffer
        let bytes_read = frame_dec.read(result.as_mut_slice()).unwrap();
        // then do something with it
        do_something(&result[0..bytes_read]);
    }
    // handle the last chunk of data
    while frame_dec.can_collect() > 0 {
        let x = frame_dec.read(result.as_mut_slice()).unwrap();
        do_something(&result[0..x]);
    }
}
fn do_something(data: &[u8]) {
    std::io::stdout().write_all(data).unwrap();
}
Implementations§
impl FrameDecoder
pub fn new() -> FrameDecoder
Creates a new decoder without allocating anything yet. init()/reset() allocate all needed buffers the first time this decoder is used; on later calls they just reset those buffers with no further allocations.
pub fn enable_per_block_checksums(&mut self)
Available on crate features lsm and hash only.
Opt in to per-block XXH64 verification during decode.
Default off; zero cost when disabled. Each block’s decompressed
bytes are XXH64-hashed (low 32 bits) and appended to
Self::computed_block_checksums as the decode progresses.
Callers compare the captured digests against externally-stored
expected values (e.g. from a per-block sidecar in the
containing application protocol).
Behind all(feature = "lsm", feature = "hash") — the XXH64
primitive lives behind the hash feature, so this method
only compiles when both are enabled.
pub fn computed_block_checksums(&self) -> &[u32]
Available on crate features lsm and hash only.
Per-block XXH64 (low 32 bits) digests captured during the
current frame’s decode. Empty unless
Self::enable_per_block_checksums was called before
Self::decode_all / Self::reset.
Reset at the start of every new frame.
Behind all(feature = "lsm", feature = "hash").
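Comparing the captured digests with the externally stored values is left to the caller. The following is a minimal sketch of that comparison using only the standard library. `low32` and `first_mismatch` are hypothetical helper names, not part of this crate, and the expected digests are assumed to come from the application's own sidecar.

```rust
/// Truncate a 64-bit XXH64 digest to its low 32 bits, the width stored per block.
/// (Hypothetical helper, not part of the crate.)
fn low32(digest: u64) -> u32 {
    (digest & 0xFFFF_FFFF) as u32
}

/// Return the index of the first block whose digest differs, or whose
/// counterpart is missing. `None` means both lists agree completely.
/// (Hypothetical helper, not part of the crate.)
fn first_mismatch(computed: &[u32], expected: &[u32]) -> Option<usize> {
    let common = computed.len().min(expected.len());
    for i in 0..common {
        if computed[i] != expected[i] {
            return Some(i);
        }
    }
    if computed.len() != expected.len() {
        // One side has extra blocks: report the first unmatched index.
        Some(common)
    } else {
        None
    }
}
```

In this sketch a caller would pass the slice returned by computed_block_checksums() as `computed` once the frame is finished, and treat any `Some(index)` as a corrupted or truncated block.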
pub fn expect_dict_id(&mut self, expected: Option<u32>)
Available on crate feature lsm only.
Pin the expected Dictionary_ID for the next frame.
When expected is set, Self::init / Self::reset
validate it against the parsed frame header BEFORE any
block decode work runs. A mismatch returns
crate::decoding::errors::FrameDecoderError::UnexpectedDictId
before any block decode and before any output is produced.
Scratch buffer allocation / reservation for the decode
pipeline happens during frame-header parsing, which is
already complete when this validation fires — the cost of
scratch sizing is paid even on a mismatched header. The
guarantee is “no block decode, no XXH64 init, no partial
output”, not “zero allocation”.
Some(0) is treated as “no dictionary expected”: a frame
whose header omits the optional Dictionary_ID field
(flag value 0) passes the check; a frame that carries an
explicit non-zero id fails.
None (default) disables the check.
Primary use case: post-AEAD-decrypt sanity check in
wire-format consumers (e.g. lsm-tree’s encrypted block
format pins the dict_id baked into the AAD against the
inner zstd frame’s dict_id to defeat dict-substitution
attacks).
NOT a replacement for AEAD authentication. NOT the same
semantic as donor ZSTD_d_windowLogMax (which is a
ceiling-style limit, separate concern).
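The three cases for `expected` (None, Some(0), Some(id)) can be modelled as a small pure function. This is a sketch of the documented semantics only, not the crate's implementation: `DictIdCheck` and `check_dict_id` are hypothetical names, and treating an omitted header field as id 0 for non-zero expectations is an assumption.

```rust
/// Outcome of comparing a pinned expectation with the parsed header.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum DictIdCheck {
    Pass,
    Mismatch { expected: u32, found: Option<u32> },
}

/// `expected` mirrors the argument of `expect_dict_id`; `header_id` is the
/// Dictionary_ID from the frame header, `None` when the field is omitted.
fn check_dict_id(expected: Option<u32>, header_id: Option<u32>) -> DictIdCheck {
    match expected {
        // No expectation pinned: the check is disabled.
        None => DictIdCheck::Pass,
        Some(want) => {
            // Assumption: an omitted field counts as id 0 ("no dictionary").
            let effective = header_id.unwrap_or(0);
            if effective == want {
                DictIdCheck::Pass
            } else {
                DictIdCheck::Mismatch { expected: want, found: header_id }
            }
        }
    }
}
```

Under this model, `check_dict_id(Some(0), None)` passes while `check_dict_id(Some(0), Some(7))` reports a mismatch, matching the two Some(0) cases described above.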
pub fn expect_window_descriptor(&mut self, expected: Option<u8>)
Available on crate feature lsm only.
Pin the expected raw Window_Descriptor byte (RFC 8878
§3.1.1.1.2 layout: (exp << 3) | mantissa) for the next
frame.
When expected is set, Self::init / Self::reset
validate it against the parsed frame header BEFORE any
block decode work runs. A mismatch returns
crate::decoding::errors::FrameDecoderError::UnexpectedWindowDescriptor.
Single-segment frames omit the Window_Descriptor byte
from the wire entirely. Setting an expectation while
receiving a single-segment frame fails the check with
found: None — there is no on-wire byte to match against,
which is reported explicitly rather than silently passing.
None (default) disables the check.
Byte-exact equality, NOT a ceiling. Donor
ZSTD_d_windowLogMax is a separate ceiling-style limit
available through the C FFI surface; this method is for
strict equality validation against a pinned expectation
(e.g. lsm-tree’s wire format pins the window descriptor
from the AAD to defeat decompression-bomb-swap attacks).
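To make the byte layout concrete, the sketch below builds and splits a raw Window_Descriptor byte and derives the window size from it using the formula in RFC 8878. The function names are hypothetical helpers written for this illustration; none of them are part of this crate.

```rust
/// Build the raw byte from its two fields: (exp << 3) | mantissa.
fn make_descriptor(exponent: u8, mantissa: u8) -> u8 {
    assert!(exponent < 32 && mantissa < 8, "fields are 5 and 3 bits wide");
    (exponent << 3) | mantissa
}

/// Split a raw Window_Descriptor byte into (exponent, mantissa).
fn split_descriptor(byte: u8) -> (u8, u8) {
    (byte >> 3, byte & 0b111)
}

/// Window_Size in bytes, following RFC 8878:
/// windowLog = 10 + Exponent, windowBase = 1 << windowLog,
/// windowAdd = (windowBase / 8) * Mantissa.
fn window_size(byte: u8) -> u64 {
    let (exponent, mantissa) = split_descriptor(byte);
    let window_log = 10 + u64::from(exponent);
    let window_base = 1u64 << window_log;
    let window_add = (window_base / 8) * u64::from(mantissa);
    window_base + window_add
}
```

Because the check is byte-exact, a frame whose descriptor encodes a smaller window than the pinned byte still fails; only a value equal to `make_descriptor(exp, mantissa)` for the pinned fields would pass.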
pub fn set_magicless(&mut self, magicless: bool)
Enable or disable magicless frame format
(ZSTD_f_zstd1_magicless). When set to true, subsequent
init / reset calls expect the frame header to begin
directly with the frame-header descriptor — no 4-byte magic
number prefix. Default false. Must match the encoder’s
magicless setting; the format is unambiguous only when the
caller knows it out-of-band.
Note: magicless mode also disables skippable-frame detection.
The 0x184D2A50..=0x184D2A5F skippable-frame magic range is
only recognised when the 4-byte magic prefix is consumed, so
decode_all / init / reset will treat a skippable frame
at the head of a magicless stream as a malformed frame header
(bad descriptor / window-size error) instead of skipping it.
Mixed-format streams that interleave skippable frames must be
pre-split by the caller; set_magicless(true) is only safe
when the entire stream is known to be magicless zstd frames.
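The pre-splitting mentioned above relies on looking at the 4-byte little-endian magic prefix, which is exactly what magicless frames lack. A minimal sketch of that classification for regular (non-magicless) input follows; `Prefix` and `classify_prefix` are hypothetical names, not crate API.

```rust
/// Magic number of a regular Zstandard frame (stored little-endian on the wire).
const ZSTD_MAGIC: u32 = 0xFD2F_B528;

/// What the first four bytes of a buffer announce. (Hypothetical type.)
#[derive(Debug, PartialEq)]
enum Prefix {
    ZstdFrame,
    /// Skippable frame; the value is the low nibble of the magic (0..=15).
    Skippable(u8),
    Other,
}

/// Classify the leading magic number, or return `None` if fewer than
/// four bytes are available.
fn classify_prefix(data: &[u8]) -> Option<Prefix> {
    if data.len() < 4 {
        return None;
    }
    let magic = u32::from_le_bytes([data[0], data[1], data[2], data[3]]);
    Some(match magic {
        ZSTD_MAGIC => Prefix::ZstdFrame,
        0x184D_2A50..=0x184D_2A5F => Prefix::Skippable((magic & 0xF) as u8),
        _ => Prefix::Other,
    })
}
```

A magicless frame starts directly with the frame-header descriptor, so this function would report `Prefix::Other` (or a false positive) for it; the caller has to know the format out-of-band.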
pub fn init(&mut self, source: impl Read) -> Result<(), FrameDecoderError>
init() allocates all needed buffers the first time this decoder is used; on later calls it just resets those buffers with no further allocations.
Note that all bytes currently in the decode buffer from any previous frame will be lost. Collect them with collect()/collect_to_writer() first.
Equivalent to reset().
pub fn init_with_dict_handle(
    &mut self,
    source: impl Read,
    dict: &DictionaryHandle,
) -> Result<(), FrameDecoderError>
Initialize the decoder for a new frame using a pre-parsed dictionary handle.
If the frame header has a dictionary ID, this validates it against
dict.id() and returns FrameDecoderError::DictIdMismatch on mismatch.
If the header omits the optional dictionary ID, this still applies the provided dictionary handle.
§Warning
This method always applies dict unless the frame header contains a
non-matching dictionary ID. Callers must only use this API when they
already know the frame was encoded with the provided dictionary, even if
the frame header omits the dictionary ID or encodes an explicit
dictionary ID of 0.
Passing a dictionary for a frame that was not encoded with it can silently corrupt the decoded output.
pub fn reset(&mut self, source: impl Read) -> Result<(), FrameDecoderError>
reset() allocates all needed buffers the first time this decoder is used; on later calls it just resets those buffers with no further allocations.
Note that all bytes currently in the decode buffer from any previous frame will be lost. Collect them with collect()/collect_to_writer() first.
Equivalent to init().
pub fn reset_with_dict_handle(
    &mut self,
    source: impl Read,
    dict: &DictionaryHandle,
) -> Result<(), FrameDecoderError>
Reset this decoder for a new frame using a pre-parsed dictionary handle.
If the frame header has a dictionary ID, this validates it against
dict.id() and returns FrameDecoderError::DictIdMismatch on mismatch.
If the header omits the optional dictionary ID, this still applies the provided dictionary handle.
§Warning
This method always applies dict unless the frame header contains a
non-matching dictionary ID. Callers must only use this API when they
already know the frame was encoded with the provided dictionary, even if
the frame header omits the dictionary ID or encodes an explicit
dictionary ID of 0.
Passing a dictionary for a frame that was not encoded with it can silently corrupt the decoded output.
pub fn add_dict(&mut self, dict: Dictionary) -> Result<(), FrameDecoderError>
Add a dictionary that can be selected dynamically by frame dictionary ID.
Returns FrameDecoderError::DictAlreadyRegistered if the ID is already
registered (either as owned or shared).
pub fn add_dict_from_bytes(
    &mut self,
    raw_dictionary: &[u8],
) -> Result<(), FrameDecoderError>
Parse and add a serialized dictionary blob.
pub fn add_dict_handle(
    &mut self,
    dict: DictionaryHandle,
) -> Result<(), FrameDecoderError>
Available on target_has_atomic=ptr only.
Add a pre-parsed dictionary handle for reuse across decoders.
This API is available on targets with pointer-width atomics
(target_has_atomic = "ptr").
Returns FrameDecoderError::DictAlreadyRegistered if the ID is already
registered (either as owned or shared).
pub fn force_dict(&mut self, dict_id: u32) -> Result<(), FrameDecoderError>
pub fn content_size(&self) -> u64
Returns how many bytes the frame contains after decompression
pub fn get_checksum_from_data(&self) -> Option<u32>
Returns the checksum that was read from the data. Only available after all bytes have been read. It is the last 4 bytes of a zstd-frame
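For orientation, the on-wire position of that value can be sketched without the decoder at all. The helper below is hypothetical (not crate API) and assumes the slice ends exactly at the end of a single frame whose header has the content-checksum flag set; Zstandard stores the field little-endian.

```rust
/// Read the trailing 4-byte little-endian checksum of a frame.
/// Assumes `frame` ends exactly at the frame boundary and that the frame
/// actually carries a content checksum. (Hypothetical helper.)
fn trailing_checksum(frame: &[u8]) -> Option<u32> {
    // Fewer than four bytes cannot hold the field.
    let start = frame.len().checked_sub(4)?;
    let tail = &frame[start..];
    Some(u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]))
}
```

A caller that has both values would compare this against the digest computed during decoding and reject the frame on inequality.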
pub fn get_calculated_checksum(&self) -> Option<u32>
Available on crate feature hash only.
Returns the checksum that was calculated while decoding. Only a sensible value after all decoded bytes have been collected/read from the FrameDecoder.
pub fn bytes_read_from_source(&self) -> u64
Counter for how many bytes have been consumed while decoding the frame
pub fn is_finished(&self) -> bool
Whether the current frame's last block has been decoded yet. If this returns true you can call the drain* functions to get all content (the read() function will drain automatically if this returns true).
pub fn blocks_decoded(&self) -> usize
Counter for how many blocks have already been decoded
pub fn decode_blocks(
    &mut self,
    source: impl Read,
    strat: BlockDecodingStrategy,
) -> Result<bool, FrameDecoderError>
Decodes blocks from a reader. The FrameDecoder must have been initialized first. The strategy influences how many blocks are decoded before the function returns, which matters if you want to manage memory consumption carefully. If you don’t care about that, you can choose the strategy “All” and have all blocks of the frame decoded into the buffer.
pub fn collect(&mut self) -> Option<Vec<u8>>
Collect bytes while retaining window_size bytes as long as decoding is still going on. Once the frame has been fully decoded (is_finished() == true), it collects all remaining bytes.
pub fn collect_to_writer(&mut self, w: impl Write) -> Result<usize, Error>
Collect bytes while retaining window_size bytes as long as decoding is still going on. Once the frame has been fully decoded (is_finished() == true), it collects all remaining bytes.
pub fn can_collect(&self) -> usize
How many bytes can currently be collected from the decode buffer. While decoding is going on, this is lower than the actual decode buffer size because window_size bytes need to be retained for decoding. Once the frame has been fully decoded (is_finished() == true), it reports all remaining bytes.
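The retention rule can be written down as a tiny model. This is only an illustration of the behaviour described above, under the assumption that exactly window_size bytes are held back; `collectable` is a hypothetical function and not how the crate computes the value internally.

```rust
/// Model of the documented rule: while the frame is unfinished, the last
/// `window_size` bytes stay in the buffer; afterwards everything is available.
/// (Hypothetical helper, not the crate's implementation.)
fn collectable(buffered: usize, window_size: usize, finished: bool) -> usize {
    if finished {
        buffered
    } else {
        // Nothing can be collected until more than one window is buffered.
        buffered.saturating_sub(window_size)
    }
}
```

For example, with 5000 buffered bytes and a 4096-byte window, the model yields 904 collectable bytes mid-frame and 5000 once the frame is finished.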
pub fn decode_from_to(
    &mut self,
    source: &[u8],
    target: &mut [u8],
) -> Result<(usize, usize), FrameDecoderError>
Decodes as many blocks as possible from the source slice and reads from the decode buffer into the target slice. The source slice may contain only part of a frame but must contain at least one full block to make progress.
By all means use decode_blocks if you have an io::Read available. This function exists only for compatibility with other decompressors that serve an old-style C API.
Returns (read, written). If read == 0, the source did not contain a full block, and further calls with the same input will not make any progress!
Note that no kind of block can be bigger than 128 KiB. To be safe, use at least 128*1024 (max block content size) + 3 (block_header size) + 18 (max frame_header size) bytes as your source buffer.
You may call this function with an empty source after all bytes have been decoded. This is equivalent to calling decoder.read(&mut target).
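The buffer-size recommendation above works out to a single constant. The sketch below spells out the arithmetic; the constant and function names are hypothetical and chosen for this example only.

```rust
/// Largest possible block content, per the note above (128 KiB).
const MAX_BLOCK_CONTENT_SIZE: usize = 128 * 1024;
/// Size of a block header in bytes.
const BLOCK_HEADER_SIZE: usize = 3;
/// Largest possible frame header in bytes.
const MAX_FRAME_HEADER_SIZE: usize = 18;

/// Smallest source buffer that always holds a frame header plus one full block.
const MIN_SAFE_SOURCE_LEN: usize =
    MAX_BLOCK_CONTENT_SIZE + BLOCK_HEADER_SIZE + MAX_FRAME_HEADER_SIZE;

/// Allocate a zeroed source buffer of the recommended minimum size.
fn new_source_buffer() -> Vec<u8> {
    vec![0u8; MIN_SAFE_SOURCE_LEN]
}
```

The sum is 131093 bytes; a caller feeding decode_from_to from a fixed-size buffer smaller than this risks a call that returns read == 0 indefinitely.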
pub fn decode_all(
    &mut self,
    input: &[u8],
    output: &mut [u8],
) -> Result<usize, FrameDecoderError>
Decode multiple frames into the output slice.
input must contain an exact number of frames. Skippable frames are allowed and will be
skipped during decode.
output must be large enough to hold the decompressed data. If you don’t know
how large the output will be, use FrameDecoder::decode_blocks instead.
This calls FrameDecoder::init, and all bytes currently in the decoder will be lost.
Returns the number of bytes written to output.
pub fn decode_all_with_skippable_visitor<F>(
    &mut self,
    input: &[u8],
    output: &mut [u8],
    visitor: F,
) -> Result<usize, FrameDecoderError>
Available on crate feature lsm only.
Decode multiple frames into the output slice, invoking visitor
for every skippable frame encountered before advancing past it.
input must contain an exact number of frames. Skippable frames
(RFC 8878 §3.1.2 magic numbers 0x184D2A50..=0x184D2A5F) are
allowed and will be both visited AND skipped: the visitor gets
(magic_variant, payload) where magic_variant is the low
nibble of the magic (magic - 0x184D2A50, range 0..=15) and
payload is a borrowed slice of the on-wire payload bytes (the
skippable frame’s Frame_Size field worth of data) into
input — no allocation.
The visitor sees skippable frames in stream order; interleaved
regular zstd frames continue to decompress into output exactly
as decode_all does.
output must be large enough to hold the decompressed data.
Returns the number of bytes written to output.
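To show where the (magic_variant, payload) pair comes from, here is a standalone sketch that parses one skippable frame from the head of a byte slice: a 4-byte little-endian magic, a 4-byte little-endian Frame_Size, then the payload. `parse_skippable` is a hypothetical helper for illustration and not part of this crate.

```rust
/// Parse one skippable frame at the start of `input`.
/// Returns (magic_variant, payload, rest) or `None` if `input` does not
/// start with a complete skippable frame. (Hypothetical helper.)
fn parse_skippable(input: &[u8]) -> Option<(u8, &[u8], &[u8])> {
    if input.len() < 8 {
        return None;
    }
    let magic = u32::from_le_bytes([input[0], input[1], input[2], input[3]]);
    if !(0x184D_2A50..=0x184D_2A5F).contains(&magic) {
        return None;
    }
    // Frame_Size counts only the payload bytes that follow the 8-byte header.
    let size = u32::from_le_bytes([input[4], input[5], input[6], input[7]]) as usize;
    let end = 8usize.checked_add(size)?;
    // `get` returns None when the payload is truncated.
    let payload = input.get(8..end)?;
    let variant = (magic - 0x184D_2A50) as u8;
    Some((variant, payload, &input[end..]))
}
```

Both returned slices borrow from `input`, mirroring the no-allocation property described above.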
§Example
use structured_zstd::decoding::errors::FrameDecoderError;
use structured_zstd::decoding::FrameDecoder;
// `decode_collecting` is an illustrative wrapper; `input` holds the concatenated frames
fn decode_collecting(input: &[u8]) -> Result<(usize, Vec<(u8, Vec<u8>)>), FrameDecoderError> {
    let mut decoder = FrameDecoder::new();
    let mut output = vec![0u8; 1024];
    let mut collected: Vec<(u8, Vec<u8>)> = Vec::new();
    let n = decoder.decode_all_with_skippable_visitor(
        input,
        &mut output,
        |variant, payload| collected.push((variant, payload.to_vec())),
    )?;
    Ok((n, collected))
}
pub fn decode_all_with_dict_handle(
    &mut self,
    input: &[u8],
    output: &mut [u8],
    dict: &DictionaryHandle,
) -> Result<usize, FrameDecoderError>
Decode multiple frames into the output slice using a pre-parsed dictionary handle.
input must contain an exact number of frames. Skippable frames are allowed and will be
skipped during decode.
output must be large enough to hold the decompressed data. If you don’t know
how large the output will be, use FrameDecoder::decode_blocks instead.
This calls FrameDecoder::init_with_dict_handle, and all bytes currently in the
decoder will be lost.
§Warning
Each decoded frame is initialized with dict, even when a frame header
omits the optional dictionary ID. Callers must only use this API when
they already know the input frames were encoded with the provided
dictionary; otherwise decoded output can be silently corrupted.
pub fn decode_all_with_dict_bytes(
    &mut self,
    input: &[u8],
    output: &mut [u8],
    raw_dictionary: &[u8],
) -> Result<usize, FrameDecoderError>
Decode multiple frames into the output slice using a serialized dictionary.
§Warning
Each decoded frame is initialized with the parsed dictionary, even when a frame header omits the optional dictionary ID. Callers must only use this API when they already know the input frames were encoded with that dictionary; otherwise decoded output can be silently corrupted.
pub fn decode_all_to_vec(
    &mut self,
    input: &[u8],
    output: &mut Vec<u8>,
) -> Result<(), FrameDecoderError>
Decode multiple frames into the extra capacity of the output vector.
input must contain an exact number of frames.
output must have enough extra capacity to hold the decompressed data.
This function reserves an additional WILDCOPY_OVERLENGTH
bytes on top of the caller’s capacity so the per-frame direct
decode path stays eligible — that may grow the vector by up
to that fixed amount via Vec::reserve. It will NOT grow
further to fit the decompressed payload itself; the caller’s
pre-allocated capacity must already cover the data. If you
don’t know how large the output will be, use
FrameDecoder::decode_blocks instead.
This calls FrameDecoder::init, and all bytes currently in the decoder will be lost.
The length of the output vector is updated to include the decompressed data.
The length is not changed if an error occurs. The
WILDCOPY_OVERLENGTH slack is internal — output.len() on
return is the actual decompressed size, NOT the inflated
capacity. Callers who pre-sized the Vec with
Vec::with_capacity(fcs) see no functional change beyond
the small one-time capacity bump.
Trait Implementations§
impl Default for FrameDecoder
impl Read for FrameDecoder
Read bytes from the decode_buffer that are no longer needed. While the frame is not yet finished, this retains window_size bytes; otherwise it drains the buffer completely.
fn read(&mut self, target: &mut [u8]) -> Result<usize, Error>
fn read_vectored(&mut self, bufs: &mut [IoSliceMut<'_>]) -> Result<usize, Error>
fn is_read_vectored(&self) -> bool
fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<usize, Error>
fn read_to_string(&mut self, buf: &mut String) -> Result<usize, Error>
fn read_exact(&mut self, buf: &mut [u8]) -> Result<(), Error>
fn read_buf(&mut self, buf: BorrowedCursor<'_>) -> Result<(), Error>
fn read_buf_exact(&mut self, cursor: BorrowedCursor<'_>) -> Result<(), Error>
fn by_ref(&mut self) -> &mut Self
where
    Self: Sized,