Search compressed data without full decompression.
ziftsieve extracts literal bytes from compressed blocks and builds Bloom
filters over them. This allows skipping decompression for blocks that
provably cannot contain a search pattern.
§What this crate does
This crate provides a high-performance streaming partial parser for compressed data.
Instead of fully decompressing streams (which requires resolving all back-references
and dictionaries), ziftsieve extracts only the raw literal bytes and
constructs per-block Bloom filters over them.
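To make "extracting only the raw literal bytes" concrete, here is a minimal sketch for a single raw LZ4 block. It is not ziftsieve's implementation; the function name `lz4_block_literals` is hypothetical. It follows the published LZ4 block layout (a token byte, optional length extension bytes, the literal run, then a 2-byte match offset) and copies the literal runs while skipping every match without resolving it.

```rust
/// Collect only the literal bytes of a raw LZ4 block, skipping every match.
/// Hypothetical helper for illustration; returns `None` if the block ends
/// in the middle of a sequence.
fn lz4_block_literals(block: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < block.len() {
        let token = block[pos];
        pos += 1;

        // High nibble: literal length. 15 means extension bytes follow,
        // each added to the length, until a byte other than 255 is read.
        let mut lit_len = (token >> 4) as usize;
        if lit_len == 15 {
            loop {
                let b = *block.get(pos)?;
                pos += 1;
                lit_len += b as usize;
                if b != 255 {
                    break;
                }
            }
        }

        // Copy the literal run verbatim.
        let end = pos.checked_add(lit_len)?;
        out.extend_from_slice(block.get(pos..end)?);
        pos = end;

        // The final sequence of a block stops after its literals.
        if pos == block.len() {
            break;
        }

        // Skip the 2-byte little-endian match offset without resolving it.
        pos = pos.checked_add(2)?;
        if pos > block.len() {
            return None;
        }

        // Low nibble: match length. Only the extension bytes need skipping,
        // because the matched bytes themselves are never materialized.
        if token & 0x0F == 15 {
            loop {
                let b = *block.get(pos)?;
                pos += 1;
                if b != 255 {
                    break;
                }
            }
        }
    }
    Some(out)
}
```

Because matches are skipped rather than copied, the work is proportional to the compressed size and no output window has to be maintained.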
§Why use it
By indexing literals into a Bloom filter, tools can quickly scan massive compressed archives (such as PCAPs, database dumps, or logs) and skip full decompression for any block that provably does not contain the target byte pattern. For large-scale data ingestion and security scanning, this can yield order-of-magnitude speedups when most blocks are ruled out.
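The skip decision can be sketched as a Bloom filter over fixed-width n-grams of each block's literals. Everything in this sketch is an assumption made for illustration, not the crate's internals: the `BlockFilter` type, the 4-byte n-gram width, the filter size, and the hashing scheme. It also only models the filter step on literal bytes and does not address patterns that straddle a back-reference.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Illustrative parameters, not ziftsieve's actual configuration.
const NGRAM: usize = 4; // bytes per indexed n-gram
const BITS: usize = 1 << 16; // bits per block filter
const HASHES: u64 = 3; // bit positions set per n-gram

/// Hypothetical per-block Bloom filter over literal n-grams.
struct BlockFilter {
    bits: Vec<u64>,
}

/// Derive the bit positions for one n-gram using double hashing.
fn bit_positions(gram: &[u8]) -> impl Iterator<Item = usize> {
    let mut hasher = DefaultHasher::new();
    gram.hash(&mut hasher);
    let h1 = hasher.finish();
    let h2 = (h1 >> 32) | 1; // odd step so successive probes differ
    (0..HASHES).map(move |k| (h1.wrapping_add(k.wrapping_mul(h2)) % BITS as u64) as usize)
}

impl BlockFilter {
    /// Index every n-gram of a block's literal bytes.
    fn from_literals(literals: &[u8]) -> Self {
        let mut filter = BlockFilter { bits: vec![0; BITS / 64] };
        for gram in literals.windows(NGRAM) {
            for i in bit_positions(gram) {
                filter.bits[i / 64] |= 1u64 << (i % 64);
            }
        }
        filter
    }

    /// `false` means at least one n-gram of the pattern was never indexed.
    /// `true` means "maybe": Bloom filters can report false positives.
    /// Patterns shorter than `NGRAM` have no n-grams and are never ruled out.
    fn may_contain(&self, pattern: &[u8]) -> bool {
        pattern.windows(NGRAM).all(|gram| {
            bit_positions(gram).all(|i| (self.bits[i / 64] & (1u64 << (i % 64))) != 0)
        })
    }
}

/// Indices of blocks that still need decompression and verification.
fn candidate_blocks(filters: &[BlockFilter], pattern: &[u8]) -> Vec<usize> {
    filters
        .iter()
        .enumerate()
        .filter(|(_, f)| f.may_contain(pattern))
        .map(|(i, _)| i)
        .collect()
}
```

A block is skipped only when the filter answers `false`; every `true` answer still has to be confirmed by decompressing that block, since a Bloom filter never produces false negatives but can produce false positives.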
§How to get started in 3 lines
use ziftsieve::{CompressedIndexBuilder, CompressionFormat};
let index = CompressedIndexBuilder::new(CompressionFormat::Lz4).build_from_bytes(b"...").unwrap();
if !index.candidate_blocks(b"my_secret").is_empty() { /* decompress and verify */ }
§Supported Formats
- Gzip: Supports standard .gz files and DEFLATE streams.
- LZ4: Supports both the LZ4 frame format and raw block format.
- Snappy: Supports the Snappy framing format (common in database logs).
- Zstd: Supports Zstandard frames.
Each format is available as an optional crate feature.
Re-exports§
pub use builder::CompressedIndexBuilder;
pub use builder::StreamingIndexBuilder;
pub use extract::extract_from_bytes;
pub use extract::CompressedBlock;
pub use index::BloomStats;
pub use index::CompressedIndex;
Modules§
- bloom
- Bloom filter implementation for fast set membership testing.
- builder
- Builders for constructing compressed indexes.
- detect
- Logic for identifying and describing compression formats.
- extract
- Orchestration of literal extraction across supported formats.
- index
- Core index structure for compressed literal search.
- lz4
- LZ4 literal extraction without full decompression.
Enums§
- CompressionFormat
- Compression formats supported for literal extraction.
- ZiftError
- Errors returned while parsing compressed data or building indexes.