pub struct ZstdDictionary { /* private fields */ }
Pre-trained zstd dictionary for improved compression of small blocks.
Zstd dictionaries significantly improve compression ratios for blocks in the 4–64 KiB range typical of LSM-trees, especially when data has recurring patterns (e.g., structured keys, repeated prefixes, JSON/MessagePack values).
The dictionary is identified by a 32-bit ID derived from its content (truncated xxh3 hash). This ID is stored alongside compressed blocks so readers can detect dictionary mismatches.
§Example
use lsm_tree::ZstdDictionary;
let samples: &[u8] = &training_data;
let dict = ZstdDictionary::new(samples);
Implementations§
impl ZstdDictionary
pub fn new(raw: &[u8]) -> Self
Creates a new dictionary handle from raw bytes.
raw may be either:
- A finalized zstd dictionary: bytes starting with the magic 37 A4 30 EC (as produced by zstd --train; accessible via ZstdDictionary::raw for persistence and interop). The backend parses it with the full entropy-table decoder.
- A raw content dictionary: arbitrary bytes used as LZ77 history (no magic header). Useful when the caller controls the training data and does not need the full entropy-table overhead.
Both forms are accepted by [CompressionProvider::compress_with_dict]
and [CompressionProvider::decompress_with_dict].
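The distinction between the two forms comes down to a four-byte prefix check. The following is a minimal sketch of that check, not the crate's actual implementation: the names ZSTD_DICT_MAGIC, DictKind, and classify_dictionary are hypothetical and exist only for illustration.

```rust
/// Magic prefix of a finalized zstd dictionary, as listed above
/// (the value 0xEC30A437 stored in little-endian byte order).
const ZSTD_DICT_MAGIC: [u8; 4] = [0x37, 0xA4, 0x30, 0xEC];

/// Hypothetical classification of dictionary bytes (illustration only).
#[derive(Debug, PartialEq, Eq)]
pub enum DictKind {
    /// Starts with the magic header; carries entropy tables.
    Finalized,
    /// No magic header; the bytes serve purely as LZ77 history.
    RawContent,
}

/// Hypothetical helper: inspects the first four bytes of `raw`.
/// Inputs shorter than four bytes can never match the magic,
/// so they are treated as raw content.
pub fn classify_dictionary(raw: &[u8]) -> DictKind {
    if raw.starts_with(&ZSTD_DICT_MAGIC) {
        DictKind::Finalized
    } else {
        DictKind::RawContent
    }
}
```

Under these assumptions, bytes produced by zstd --train would classify as Finalized, while an arbitrary sample buffer would classify as RawContent.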
The handle stores the full 64-bit xxh3 hash of raw internally.
Self::id returns the lower 32 bits for external consumers
(config validation, frame header); id64 (crate-internal) exposes the
full fingerprint for use as a cache key.
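The relationship between the two fingerprints is a plain truncation. The sketch below illustrates it on an arbitrary 64-bit value rather than a real xxh3 hash, to stay dependency-free; truncate_fingerprint is a hypothetical name, not part of the crate's API.

```rust
/// Hypothetical helper: derives the 32-bit public ID from a 64-bit
/// fingerprint by keeping only the lower 32 bits.
pub fn truncate_fingerprint(id64: u64) -> u32 {
    // Casting u64 to u32 with `as` discards the upper 32 bits.
    id64 as u32
}
```

Two dictionaries whose 64-bit fingerprints differ only in the upper half would share a 32-bit ID, which is presumably why the full value is kept for the internal cache key.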
pub fn id(&self) -> u32
Returns a 32-bit fingerprint derived from the dictionary content.
The fingerprint is the lower 32 bits of the xxh3-64 hash of the raw
dictionary bytes. It is stable for a given byte sequence and is
intended for config validation (matching a CompressionType::ZstdDict
dict_id field against the supplied ZstdDictionary) and external
interop.
The value may theoretically be 0 (probability ≈ 1/2³²). Backends
that embed a dict ID in the zstd frame header (where id=0 is reserved)
are responsible for clamping to at least 1 themselves. Config
validation is unaffected: both sides derive the ID from the same bytes
and therefore agree even in the zero case.
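A backend that writes the ID into a frame header might apply the clamp described above as follows. This is a sketch under that assumption; frame_dict_id is a hypothetical name and the crate's backends may handle the zero case differently.

```rust
/// Hypothetical helper: maps a dictionary ID to a value that is safe to
/// embed in a zstd frame header, where 0 is reserved.
pub fn frame_dict_id(id: u32) -> u32 {
    // Only 0 is affected; every other ID passes through unchanged.
    id.max(1)
}
```

Note that the clamp makes IDs 0 and 1 indistinguishable in the frame header, so it belongs only on the frame-writing path; config validation should keep comparing the unclamped values.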
Trait Implementations§
impl Clone for ZstdDictionary
Available on zstd_any only.
impl Debug for ZstdDictionary
Available on zstd_any only.
impl Eq for ZstdDictionary
Available on zstd_any only.
impl PartialEq for ZstdDictionary
Available on zstd_any only.
Two dictionaries are equal when their full 64-bit xxh3 fingerprints agree.
Equality is defined by the 64-bit id field; hash collisions between
dictionaries with different raw bytes are theoretically possible but
extremely unlikely given the xxh3-64 collision probability.
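Fingerprint-based equality can be sketched with a stand-in type. The struct DictHandle and its fields id64 and raw are assumptions made for illustration; the real ZstdDictionary keeps its fields private and its layout may differ.

```rust
/// Hypothetical stand-in for a dictionary handle (illustration only).
#[derive(Debug, Clone)]
pub struct DictHandle {
    /// Full 64-bit fingerprint of `raw` (assumed precomputed).
    pub id64: u64,
    /// Raw dictionary bytes.
    pub raw: Vec<u8>,
}

impl PartialEq for DictHandle {
    /// Compares fingerprints only; the raw bytes are never inspected,
    /// so the comparison is O(1) regardless of dictionary size.
    fn eq(&self, other: &Self) -> bool {
        self.id64 == other.id64
    }
}

// Equality on a u64 is reflexive, symmetric, and transitive,
// so the marker trait `Eq` is sound here.
impl Eq for DictHandle {}
```

The trade-off is the one stated above: two handles with different bytes but a colliding fingerprint would compare equal, in exchange for constant-time comparison.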
Auto Trait Implementations§
impl Freeze for ZstdDictionary
impl RefUnwindSafe for ZstdDictionary
impl Send for ZstdDictionary
impl Sync for ZstdDictionary
impl Unpin for ZstdDictionary
impl UnsafeUnpin for ZstdDictionary
impl UnwindSafe for ZstdDictionary
Blanket Implementations§
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
impl<T> CloneToUninit for T where
T: Clone,
impl<Q, K> Equivalent<K> for Q
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.