Crate structured_zstd

Pure-Rust Zstandard codec with a production-grade decoder, dictionary handle reuse, and an actively-improved encoder.

The crate ships:

decoding — RFC 8878 decoder (decoding::StreamingDecoder, decoding::FrameDecoder, dictionary-backed paths via decoding::DictionaryHandle).
encoding — frame compressor, streaming encoder, named and numeric compression levels (encoding::CompressionLevel).
dictionary (feature dict_builder) — COVER / FastCOVER training plus raw-to-finalized dictionary helpers.

No FFI, no cmake, no system zstd. no_std builds are supported by disabling the default std feature.

§CPU kernel features

The decode hot paths ship per-CPU-tier SIMD kernels. With std the tier is chosen at runtime (CPU-feature detection, cached on first use); on no_std it is chosen at compile time from cfg(target_feature).

Each tier is gated by a cargo feature: kernel_scalar, kernel_sse2, kernel_bmi2, kernel_avx2, kernel_vbmi2 (x86) and kernel_neon, kernel_sve (aarch64). All are enabled by default, which yields a universal binary that picks the best available tier as described above. The feature chain mirrors the ISA dependency: kernel_avx2 implies kernel_bmi2, which implies kernel_sse2; kernel_sve implies kernel_neon. The scalar kernel is always compiled, so any subset is valid, and a flag is inert on architectures it doesn’t apply to.

Constrained targets can shrink the binary by trimming tiers. For example, --no-default-features --features kernel_scalar compiles out the per-tier SIMD kernel dispatch, its BMI2/AVX2/VBMI2/NEON trampolines, and the explicit SSE2/NEON intrinsics in the small fixed-size copy primitives, all of which are gated on the matching kernel_* feature. The kernel_* features control only the crate’s own explicit SIMD; they do not constrain the compiler’s autovectorizer, which may still emit vector instructions from ordinary scalar code regardless of the enabled tiers.
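The runtime-versus-compile-time selection described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's dispatcher: detect_tier, runtime_tier, and compile_time_tier are hypothetical names, and the tier strings only echo the feature list.

```rust
use std::sync::OnceLock;

// Hypothetical probe: the tier names echo the feature list above, but this
// function is an illustration, not the crate's dispatcher.
fn detect_tier() -> &'static str {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            return "avx2";
        }
        if std::is_x86_feature_detected!("bmi2") {
            return "bmi2";
        }
        if std::is_x86_feature_detected!("sse2") {
            return "sse2";
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return "neon";
        }
    }
    // Fallback that is always available.
    "scalar"
}

// Runtime path (std): detection runs on first use and is cached afterwards.
fn runtime_tier() -> &'static str {
    static TIER: OnceLock<&'static str> = OnceLock::new();
    *TIER.get_or_init(detect_tier)
}

// Compile-time path (no_std style): the answer is fixed by `target_feature`.
const fn compile_time_tier() -> &'static str {
    if cfg!(target_feature = "avx2") {
        "avx2"
    } else if cfg!(target_feature = "sse2") {
        "sse2"
    } else if cfg!(target_feature = "neon") {
        "neon"
    } else {
        "scalar"
    }
}

fn main() {
    println!("runtime tier: {}", runtime_tier());
    println!("compile-time tier: {}", compile_time_tier());
}
```

The two paths can legitimately disagree: a binary built for a baseline target may still find a higher tier on the host at runtime.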

The packaged README is included below for the docs.rs landing page; the API anchors above link straight into the per-module documentation.

§structured-zstd

Pure-Rust Zstandard codec with a production-grade decoder, dictionary handle reuse, and an actively-improved encoder. Builds with plain cargo — no cmake, no system zstd, no FFI. no_std ready for embedded.

§Quick start

cargo add structured-zstd

use structured_zstd::encoding::{compress_to_vec, CompressionLevel};

let compressed = compress_to_vec(&b"hello world"[..], CompressionLevel::from_level(7));

For no_std builds disable the default features:

cargo add structured-zstd --no-default-features

The decoder ships per-CPU-tier SIMD kernels, each behind a cargo feature: kernel_scalar, kernel_sse2, kernel_bmi2, kernel_avx2, kernel_vbmi2 (x86) and kernel_neon, kernel_sve (aarch64). All are on by default; the tier is picked at runtime with std, or at compile time from target_feature on no_std. The scalar kernel is always compiled (it is the mandatory fallback), so kernel_scalar is a marker that gates no code; disabling the SIMD tiers is what trims the binary. A scalar-only build (--no-default-features, or equivalently naming the marker explicitly) compiles out the per-tier SIMD kernel dispatch, its BMI2/AVX2/VBMI2/NEON trampolines, and the explicit SSE2/NEON intrinsics in the small fixed-size copy primitives, all gated on the matching kernel_* feature. These features control the crate’s own explicit SIMD only; the compiler’s autovectorizer may still emit vector instructions from ordinary scalar code regardless. A scalar-only build with the marker named explicitly:

cargo add structured-zstd --no-default-features --features kernel_scalar

Release notes for every version live in zstd/CHANGELOG.md (maintained by release-plz).

§Status

§Decoder — production-ready

Complete RFC 8878 implementation, including dictionary-backed streams, raw / RLE / compressed blocks, and the full Zstandard frame format with optional content checksums.
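For orientation, the start of a Zstandard frame as laid out in RFC 8878 can be inspected with a few lines of standard-library Rust. This is a minimal sketch, not the crate's decoder: DescriptorFlags and parse_frame_start are hypothetical names, and only the magic number and the Frame_Header_Descriptor byte are examined.

```rust
/// Magic number that opens every Zstandard frame (RFC 8878), stored little-endian.
const ZSTD_MAGIC: u32 = 0xFD2F_B528;

/// Flags from the Frame_Header_Descriptor byte (illustrative type, not the crate's).
#[derive(Debug, PartialEq)]
struct DescriptorFlags {
    frame_content_size_flag: u8, // bits 7-6
    single_segment: bool,        // bit 5
    content_checksum: bool,      // bit 2
    dictionary_id_flag: u8,      // bits 1-0
}

/// Returns the descriptor flags if `bytes` starts with a Zstandard magic number.
fn parse_frame_start(bytes: &[u8]) -> Option<DescriptorFlags> {
    if bytes.len() < 5 {
        return None;
    }
    let magic = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    if magic != ZSTD_MAGIC {
        return None;
    }
    let descriptor = bytes[4];
    Some(DescriptorFlags {
        frame_content_size_flag: descriptor >> 6,
        single_segment: (descriptor & 0x20) != 0,
        content_checksum: (descriptor & 0x04) != 0,
        dictionary_id_flag: descriptor & 0x03,
    })
}

fn main() {
    // Magic number followed by a descriptor with Single_Segment and
    // Content_Checksum set.
    let frame_start = [0x28, 0xB5, 0x2F, 0xFD, 0b0010_0100];
    println!("{:?}", parse_frame_start(&frame_start));
}
```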

§Encoder — full level range, active parity work

All standard compression levels are wired and produce valid Zstandard frames decodable by both this crate and upstream C zstd:

Named presets: Fastest (≈1), Default (≈3), Better (≈7), Best (≈11)
Numeric levels: 0..=22 and negative ultra-fast levels via CompressionLevel::from_level(n) — C zstd-compatible numbering
Fine-grained parameters: override individual knobs (windowLog, hashLog, chainLog, searchLog, minMatch, targetLength, strategy) and activate long-distance matching via CompressionParameters::builder(...), the drop-in equivalent of C zstd’s ZSTD_CCtx_setParameter surface
Streaming encoder via std::io::Write
Dictionary compression with the same dictionary format C zstd consumes
Frame Content Size — FrameCompressor writes FCS automatically; StreamingEncoder requires set_pledged_content_size() before the first write
Content checksums opt-in

The encoder is undergoing an architectural rewrite — see #111 for the roadmap.

§Dictionary training

Behind the dict_builder feature flag, the dictionary module can:

build raw dictionaries with COVER (create_raw_dict_from_source)
build raw dictionaries with FastCOVER (create_fastcover_raw_dict_from_source)
finalize raw content into the full zstd dictionary format (finalize_raw_dict)
train + finalize in one pure-Rust flow (create_fastcover_dict_from_source)

Internal: compression strategy backends

Level range	Strategy	Backend
1-2	`Fast`	`Simple` matcher
3-4	`Dfast`	`Dfast` two-tier hash
5	`Greedy`	`Row` matcher (`lazy_depth=0`)
6-15	`Lazy` / `Lazy2`	`HashChain` (`lazy_depth=1` or `2`)
16-17	`BtOpt`	`HashChain` candidates + `btopt` price parser
18	`BtUltra`	`HashChain` candidates + `btultra` price parser
19-22	`BtUltra2`	`HashChain` candidates + `btultra2` dual-profile parse

The level → strategy column matches donor ZSTD_defaultCParameters[0] at zstd/lib/compress/clevels.h:25-50 (srcSize > 256 KiB tier). Donor routes greedy/lazy/lazy2 through its row-based matchfinder when windowLog > 14; we route Greedy through the row matcher (matches donor) but Lazy/Lazy2 through the hash-chain matcher — an intentional architectural difference, not an oversight.
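The level → strategy column above can be read as a simple lookup. The sketch below restates the table as a match; StrategyRow and strategy_for_level are hypothetical names, the Lazy / Lazy2 split inside levels 6-15 is left undivided because the table does not give it, and levels outside 1..=22 return None because the table does not cover them.

```rust
/// One row of the table above (illustrative enum, not the crate's `Strategy`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum StrategyRow {
    Fast,
    Dfast,
    Greedy,
    LazyOrLazy2, // the table does not say where Lazy ends and Lazy2 begins
    BtOpt,
    BtUltra,
    BtUltra2,
}

/// Looks up the table row for a positive compression level.
fn strategy_for_level(level: i32) -> Option<StrategyRow> {
    match level {
        1..=2 => Some(StrategyRow::Fast),
        3..=4 => Some(StrategyRow::Dfast),
        5 => Some(StrategyRow::Greedy),
        6..=15 => Some(StrategyRow::LazyOrLazy2),
        16..=17 => Some(StrategyRow::BtOpt),
        18 => Some(StrategyRow::BtUltra),
        19..=22 => Some(StrategyRow::BtUltra2),
        // Level 0 and negative levels are not listed in the table.
        _ => None,
    }
}

fn main() {
    for level in [1, 3, 5, 7, 16, 18, 19] {
        println!("level {level}: {:?}", strategy_for_level(level));
    }
}
```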

§Performance

Per-merge benchmarks publish to GitHub Pages: structured-world.github.io/structured-zstd/dev/bench.

The CI matrix covers x86_64-linux-gnu, i686-linux-gnu, and x86_64-musl; the dashboard exposes per-target / stage / scenario / level filtering. The encoder architecture rewrite (#111) is the active surface for compression-side work; the public benchmark report tracks the delta vs upstream C zstd over time. A dedicated dashboard section also tracks the WebAssembly build (simd128 + scalar) against the most popular npm wasm zstd, @bokuweb/zstd-wasm, over time.

See BENCHMARKS.md for the methodology — small payloads, entropy extremes, a 100 MiB large-stream scenario, repository corpus fixtures, and optional local Silesia corpora.

§Usage

§Compression

use structured_zstd::encoding::{compress_to_vec, CompressionLevel};

let data: &[u8] = b"hello world";
// Named level
let compressed = compress_to_vec(data, CompressionLevel::Fastest);
// Numeric level (C zstd compatible: 0 = default, 1-22, negative for ultra-fast)
let compressed = compress_to_vec(data, CompressionLevel::from_level(7));

use structured_zstd::encoding::{CompressionLevel, StreamingEncoder};
use std::io::Write;

let mut out = Vec::new();
let mut encoder = StreamingEncoder::new(&mut out, CompressionLevel::Fastest);
encoder.write_all(b"hello ")?;
encoder.write_all(b"world")?;
encoder.finish()?;

§Fine-grained parameters

Override individual compression knobs (the drop-in equivalent of C zstd’s ZSTD_CCtx_setParameter). Every knob left unset inherits the base level’s default, so a parameter set that overrides nothing reproduces plain level-based compression. Long-distance matching is off at every level preset and is activated only here:

use structured_zstd::encoding::{
    compress_with_parameters, CompressionLevel, CompressionParameters, Strategy,
};

let data: &[u8] = b"hello world";
let params = CompressionParameters::builder(CompressionLevel::Level(19))
    .window_log(22)
    .strategy(Strategy::Btultra2)
    .enable_long_distance_matching(true)
    .build()
    .expect("parameters within bounds");

let compressed = compress_with_parameters(data, &params);

Each parameter’s valid range is queryable via CParameter::bounds() (the analogue of ZSTD_cParam_getBounds); the builder validates every set knob.
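The inherit-unless-set behaviour and per-knob validation can be illustrated with a small standalone builder. This is a sketch of the pattern only: Params, Builder, OutOfBounds, and resolve are hypothetical, just two knobs are modelled, and the defaults and bounds are placeholder values rather than the crate's or C zstd's.

```rust
/// Resolved parameter set (two knobs only, for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Params {
    window_log: u32,
    search_log: u32,
}

/// Error for a knob that was set outside its placeholder bounds.
#[derive(Debug, PartialEq)]
struct OutOfBounds {
    knob: &'static str,
    value: u32,
}

/// Builder that remembers which knobs were explicitly set.
struct Builder {
    base: Params,
    window_log: Option<u32>,
    search_log: Option<u32>,
}

/// An unset knob inherits the base value; a set knob is validated.
fn resolve(
    knob: &'static str,
    set: Option<u32>,
    inherited: u32,
    min: u32,
    max: u32,
) -> Result<u32, OutOfBounds> {
    match set {
        None => Ok(inherited),
        Some(value) if value >= min && value <= max => Ok(value),
        Some(value) => Err(OutOfBounds { knob, value }),
    }
}

impl Builder {
    fn new(base: Params) -> Self {
        Builder { base, window_log: None, search_log: None }
    }

    fn window_log(mut self, value: u32) -> Self {
        self.window_log = Some(value);
        self
    }

    fn search_log(mut self, value: u32) -> Self {
        self.search_log = Some(value);
        self
    }

    fn build(self) -> Result<Params, OutOfBounds> {
        // Placeholder bounds, chosen only for this example.
        Ok(Params {
            window_log: resolve("window_log", self.window_log, self.base.window_log, 10, 31)?,
            search_log: resolve("search_log", self.search_log, self.base.search_log, 1, 30)?,
        })
    }
}

fn main() {
    let base = Params { window_log: 20, search_log: 5 };
    println!("{:?}", Builder::new(base).build());
    println!("{:?}", Builder::new(base).window_log(22).search_log(6).build());
    println!("{:?}", Builder::new(base).window_log(99).build());
}
```

Keeping each override as an Option is what lets an empty builder reproduce the base parameters exactly.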

§Decompression

use structured_zstd::decoding::StreamingDecoder;
use structured_zstd::io::Read;

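// Placeholder: substitute the bytes of a real Zstandard frame.
// An empty buffer is not a valid frame, so decoding it would fail.
let compressed_data: Vec<u8> = vec![];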
let mut source: &[u8] = &compressed_data;
let mut decoder = StreamingDecoder::new(&mut source).unwrap();

let mut result = Vec::new();
decoder.read_to_end(&mut result).unwrap();

§Dictionary-backed decompression

use structured_zstd::decoding::{DictionaryHandle, FrameDecoder, StreamingDecoder};
use structured_zstd::io::Read;

// Placeholders: substitute a real compressed frame and the dictionary it was compressed with.
let compressed: Vec<u8> = vec![];
let dict_bytes: Vec<u8> = vec![];
let mut output = vec![0u8; 1024];

// Parse dictionary once, then reuse handle.
let handle = DictionaryHandle::decode_dict(&dict_bytes).unwrap();
let mut decoder = FrameDecoder::new();
let _written = decoder
    .decode_all_with_dict_handle(compressed.as_slice(), &mut output, &handle)
    .unwrap();

// Compatibility path: pass raw dictionary bytes directly.
let mut decoder = FrameDecoder::new();
let _written = decoder
    .decode_all_with_dict_bytes(compressed.as_slice(), &mut output, &dict_bytes)
    .unwrap();

// Streaming helpers exist for both handle- and bytes-based paths.
let mut source: &[u8] = &compressed;
let mut stream = StreamingDecoder::new_with_dictionary_handle(&mut source, &handle).unwrap();
let mut sink = Vec::new();
stream.read_to_end(&mut sink).unwrap();

§Storage-format extensions

Behind the lsm Cargo feature (default off), structured-zstd exposes a typed SkippableFrame API (structured_zstd::skippable) for storage-format authors who need to interleave application metadata with zstd data, plus a block-subset partial decoder.

FrameDecoder::decode_blocks_partial(src, start_block, end_block, resume, emit_resume) decodes only the inner blocks covering a requested range (skipping the trailing ones) and preserves the clean prefix on a corrupt block. FrameEmitInfo::decompressed_byte_range(block_index) returns the decompressed byte range of a given block, so a range query can locate which inner blocks cover a target byte window.

For incremental / resumable decoding, pass emit_resume = true to capture a ResumeState (cross-block entropy tables, repcode history, and next-block coordinates) in PartialDecode::resume_state. Feed it back via the resume argument (ResumeInput { window_prime, state }) to continue from a later block without re-decompressing the prefix, even across a dropped (cold) decoder.

Enable on the command line:

cargo add structured-zstd --features lsm

or in Cargo.toml:

[dependencies]
structured-zstd = { version = "0", features = ["lsm"] }

The ecosystem registry of allocated skippable-frame magic variants and the allocation policy live in docs/SKIPPABLE_MAGIC_ALLOCATIONS.md.
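As background for the skippable-frame API, the wire layout from RFC 8878 is small enough to sketch by hand: a little-endian magic number in the range 0x184D2A50 to 0x184D2A5F, a little-endian 32-bit size, then the user data. The helpers below are hypothetical and use only the standard library; they are not the crate's SkippableFrame type.

```rust
/// Lowest skippable-frame magic number; the low nibble selects the variant.
const SKIPPABLE_MAGIC_BASE: u32 = 0x184D_2A50;

/// Serializes `user_data` as a skippable frame with the given variant (0..=15).
fn write_skippable(variant: u8, user_data: &[u8]) -> Option<Vec<u8>> {
    if variant > 0x0F || user_data.len() > u32::MAX as usize {
        return None;
    }
    let magic = SKIPPABLE_MAGIC_BASE | u32::from(variant);
    let mut out = Vec::with_capacity(8 + user_data.len());
    out.extend_from_slice(&magic.to_le_bytes());
    out.extend_from_slice(&(user_data.len() as u32).to_le_bytes());
    out.extend_from_slice(user_data);
    Some(out)
}

/// Parses a skippable frame, returning its variant and user data.
fn parse_skippable(bytes: &[u8]) -> Option<(u8, &[u8])> {
    if bytes.len() < 8 {
        return None;
    }
    let magic = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    if (magic & 0xFFFF_FFF0) != SKIPPABLE_MAGIC_BASE {
        return None;
    }
    let size = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]) as usize;
    // Reject frames whose declared size runs past the end of the input.
    let end = 8usize.checked_add(size)?;
    let user_data = bytes.get(8..end)?;
    Some(((magic & 0x0F) as u8, user_data))
}

fn main() {
    let frame = write_skippable(3, b"index-v1").expect("variant and size in range");
    println!("{:?}", parse_skippable(&frame));
}
```

A decoder that does not understand the payload can still skip it by reading the size field, which is what makes these frames safe to interleave with compressed data.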

§WebAssembly / npm

JavaScript / TypeScript consumers can use the codec from npm — no native addons, no build step:

npm install @structured-world/structured-zstd

import { compress, decompress } from "@structured-world/structured-zstd";
const framed = await compress(new TextEncoder().encode("hello"), 19);
const plain = await decompress(framed);

The package ships two WebAssembly payloads — one built with the simd128 SIMD tier, one scalar — and selects the fast one at runtime from the host engine’s capabilities. Pure ESM, strict TypeScript types. Frames interoperate with native zstd. Source lives in zstd-wasm/; see the package README.

§Project relationship

Maintained fork of KillingSpark/zstd-rs (ruzstd) by the Structured World Foundation. We sync periodically with upstream but maintain an independent development trajectory focused on the CoordiNode database engine’s per-label dictionary needs.

§Support the project

USDT (TRC-20): TFDsezHa1cBkoeZT5q2T49Wp66K8t2DmdA

§License

Apache License 2.0. Contributions will be published under the same Apache 2.0 license.

Re-exports§

pub use io_std as io; (feature std)

Modules§

decoding: RFC 8878 Zstandard decoder.
dictionary (feature dict_builder): Code for creating a separate content dictionary.
encoding: Zstandard encoder — frame compression, streaming, dictionary support.
io_std (feature std): Re-exports of std traits or local reimplementations if std is not available.
skippable (feature lsm): Typed Rust API for zstd skippable frames (RFC 8878 §3.1).

Constants§

MIN_TARGET_BLOCK_SIZE: Smallest accepted block-size target (upstream ZSTD_TARGETCBLOCKSIZE_MIN); below this the per-block header overhead dominates any latency benefit. Re-exported at the crate root as the single source of truth shared by the Rust setters (set_target_block_size) and the C ABI parameter surface, whose parameter bounds import it from there.
WILDCOPY_OVERLENGTH: SIMD wildcopy overshoot slack carried by every decoder backend (currently 32 bytes). Sized so the AVX2 chunked kernel in simd_copy::copy_bytes_overshooting (32-byte stride on x86-64) can fire on tail copies near the end of a fixed-capacity output buffer. Donor zstd’s WILDCOPY_OVERLENGTH is also 32 bytes today; this matches that contract.
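The role of the overshoot slack can be shown with a safe, scalar stand-in for a chunked copy. This is only a sketch of the idea: copy_overshooting here is a hypothetical function, not the crate's simd_copy kernel, and SLACK and STRIDE simply reuse the 32-byte figures quoted above.

```rust
const SLACK: usize = 32; // mirrors the 32-byte overshoot value quoted above
const STRIDE: usize = 32; // chunk width of the illustrative copy loop

/// Copies at least `len` bytes in whole STRIDE-sized chunks and returns the
/// number of bytes actually written (`len` rounded up to a full stride).
fn copy_overshooting(dst: &mut [u8], src: &[u8], len: usize) -> usize {
    let rounded = (len + STRIDE - 1) / STRIDE * STRIDE;
    // The caller must reserve enough slack on both buffers for the last chunk.
    assert!(rounded <= dst.len() && rounded <= src.len(), "missing slack");
    for offset in (0..rounded).step_by(STRIDE) {
        dst[offset..offset + STRIDE].copy_from_slice(&src[offset..offset + STRIDE]);
    }
    rounded
}

fn main() {
    let src: Vec<u8> = (0..64).collect();
    let len = 40;
    // Logical capacity of 40 bytes plus the slack the chunked loop may touch.
    let mut dst = vec![0u8; len + SLACK];
    let written = copy_overshooting(&mut dst, &src, len);
    println!("requested {len}, wrote {written}, overshoot {}", written - len);
}
```

Because the loop never writes more than STRIDE - 1 bytes past the requested length, a slack equal to the stride is always sufficient.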

Functions§

active_cpu_kernel_name: Name of the CPU kernel tier this process selected for the entropy / sequence hot paths, for diagnostics and benchmark/dashboard reporting (see cpu_kernel::active_cpu_kernel_name). Decode (literals + FSE sequence decode) and encode (entropy) share this dispatch (see #247). The name is returned as a stable lowercase string; its value is what the runtime CPU-feature detection (or compile-time target_feature on no_std) actually resolves to on this machine, so a dashboard can attribute a measurement to the kernel that produced it.