vortex_btrblocks/lib.rs
// SPDX-License-Identifier: Apache-2.0
// SPDX-FileCopyrightText: Copyright the Vortex contributors

#![deny(missing_docs)]

//! Vortex's [BtrBlocks]-inspired adaptive compression framework.
//!
//! This crate provides a multi-level compression system that selects a compression scheme for
//! each array based on the characteristics of its data. The compressor analyzes an array to
//! choose an encoding strategy, and supports cascaded compression, in which the output of one
//! encoding is compressed again by another.
//!
//! # Key Features
//!
//! - **Adaptive Compression**: Automatically selects the best compression scheme based on data
//!   patterns.
//! - **Unified Scheme Trait**: A single [`Scheme`] trait covers all data types (integers, floats,
//!   strings, etc.) with a [`SchemeId`] for identity.
//! - **Cascaded Encoding**: Multiple compression layers can be applied for optimal results.
//! - **Statistical Analysis**: Uses data sampling and statistics to predict compression ratios.
//! - **Recursive Structure Handling**: Compresses nested structures like structs and lists.
//!
//! # How It Works
//!
//! [`BtrBlocksCompressor::compress()`] takes an `&ArrayRef` and returns an `ArrayRef` that may
//! use a different encoding. It first canonicalizes the input, then dispatches by type.
//! Primitives and strings go through `choose_and_compress`, which evaluates every enabled
//! [`Scheme`] and picks the one with the best compression ratio. Compound types like structs
//! and lists recurse into their fields and elements.
//!
//! Each `Scheme` implementation declares whether it [`matches`](Scheme::matches) a given
//! canonical form and, if so, estimates the compression ratio (often by compressing a ~1%
//! sample). There is no dynamic registry; the set of schemes is fixed at build time via
//! [`ALL_SCHEMES`].
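//!
//! The selection step can be sketched in isolation. The block below is a minimal, self-contained
//! illustration and not this crate's implementation: `ToyScheme`, `Constant`, `RunLength`, and
//! `choose_best` are hypothetical names, the real [`Scheme`] trait has a different signature,
//! and the ratios here are computed over the whole input rather than estimated from a sample.
//! Returning `None` when no estimate exceeds 1.0 is also an assumption of the sketch.
//!
//! ```rust
//! /// Hypothetical stand-in for a compression scheme over `i32` values.
//! trait ToyScheme {
//!     fn name(&self) -> &'static str;
//!     /// Whether the scheme applies to this input at all.
//!     fn matches(&self, values: &[i32]) -> bool;
//!     /// Estimated ratio of uncompressed size to compressed size.
//!     fn expected_ratio(&self, values: &[i32]) -> f64;
//! }
//!
//! struct Constant;
//! struct RunLength;
//!
//! impl ToyScheme for Constant {
//!     fn name(&self) -> &'static str {
//!         "constant"
//!     }
//!     fn matches(&self, values: &[i32]) -> bool {
//!         !values.is_empty() && values.iter().all(|v| *v == values[0])
//!     }
//!     fn expected_ratio(&self, values: &[i32]) -> f64 {
//!         // One stored value replaces `values.len()` values.
//!         values.len() as f64
//!     }
//! }
//!
//! impl ToyScheme for RunLength {
//!     fn name(&self) -> &'static str {
//!         "run_length"
//!     }
//!     fn matches(&self, values: &[i32]) -> bool {
//!         !values.is_empty()
//!     }
//!     fn expected_ratio(&self, values: &[i32]) -> f64 {
//!         // Each run is stored as a (value, length) pair, i.e. two slots.
//!         let runs = 1 + values.windows(2).filter(|w| w[0] != w[1]).count();
//!         values.len() as f64 / (2 * runs) as f64
//!     }
//! }
//!
//! /// Evaluates every matching scheme and returns the name of the one with the highest
//! /// estimated ratio, or `None` if no scheme is expected to shrink the input.
//! fn choose_best(schemes: &[&dyn ToyScheme], values: &[i32]) -> Option<&'static str> {
//!     schemes
//!         .iter()
//!         .filter(|s| s.matches(values))
//!         .map(|s| (s.name(), s.expected_ratio(values)))
//!         .filter(|(_, ratio)| *ratio > 1.0)
//!         .max_by(|a, b| a.1.total_cmp(&b.1))
//!         .map(|(name, _)| name)
//! }
//! ```
//!
//! In this sketch, six copies of the same value select `constant` (ratio 6.0 against 3.0 for
//! `run_length`), while `[1, 2, 3, 4]` selects nothing because run-length encoding would
//! double its size.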
//!
//! Schemes can produce arrays that are themselves further compressed (e.g. FoR then BitPacking),
//! up to [`MAX_CASCADE`] (3) layers deep. Descendant exclusion rules keyed on [`SchemeId`]
//! prevent the same scheme from being applied twice in a chain.
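//!
//! The depth limit and the exclusion rule can be illustrated with a simplified planner. This is
//! a sketch under stated assumptions, not the crate's logic: `ToyId`, `MAX_DEPTH`, and
//! `plan_chain` are hypothetical, the candidates for each layer are supplied by the caller
//! instead of being derived from data, and exclusion is modelled as "skip any scheme already in
//! the chain", which may be coarser than the real rules.
//!
//! ```rust
//! /// Local constant mirroring the documented limit of 3; not the crate's `MAX_CASCADE`.
//! const MAX_DEPTH: usize = 3;
//!
//! /// Hypothetical scheme identifiers.
//! #[derive(Clone, Copy, Debug, PartialEq, Eq)]
//! enum ToyId {
//!     FoR,
//!     BitPacking,
//!     Dict,
//!     RunEnd,
//! }
//!
//! /// Picks one scheme per layer, in preference order, never reusing a scheme that an
//! /// ancestor layer already applied and never exceeding `MAX_DEPTH` layers.
//! fn plan_chain(candidates_per_layer: &[&[ToyId]]) -> Vec<ToyId> {
//!     let mut chain: Vec<ToyId> = Vec::new();
//!     for candidates in candidates_per_layer.iter().take(MAX_DEPTH) {
//!         // First candidate that no ancestor layer has used.
//!         let next = candidates.iter().copied().find(|id| !chain.contains(id));
//!         match next {
//!             Some(id) => chain.push(id),
//!             // Nothing applicable at this layer: stop cascading.
//!             None => break,
//!         }
//!     }
//!     chain
//! }
//! ```
//!
//! Given the layers `[FoR, Dict]`, `[FoR, BitPacking]`, `[BitPacking, RunEnd]`, `[Dict]`, the
//! sketch yields `FoR`, `BitPacking`, `RunEnd`: each repeated scheme is skipped, and the fourth
//! layer is never considered because the chain is already three layers deep.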
//!
//! # Example
//!
//! ```rust
//! use vortex_btrblocks::{BtrBlocksCompressor, BtrBlocksCompressorBuilder, Scheme, SchemeExt};
//! use vortex_btrblocks::schemes::integer::IntDictScheme;
//!
//! // Default compressor with all schemes enabled.
//! let compressor = BtrBlocksCompressor::default();
//!
//! // Remove specific schemes using the builder.
//! let compressor = BtrBlocksCompressorBuilder::default()
//!     .exclude_schemes([IntDictScheme.id()])
//!     .build();
//! ```
//!
//! [BtrBlocks]: https://www.cs.cit.tum.de/fileadmin/w00cfj/dis/papers/btrblocks.pdf

mod builder;
mod canonical_compressor;
/// Compression scheme implementations.
pub mod schemes;

// Btrblocks-specific exports.
pub use builder::ALL_SCHEMES;
pub use builder::BtrBlocksCompressorBuilder;
pub use canonical_compressor::BtrBlocksCompressor;
pub use schemes::patches::compress_patches;

// Re-export framework types from vortex-compressor for backwards compatibility.
pub use vortex_compressor::CascadingCompressor;
pub use vortex_compressor::ctx::CompressorContext;
pub use vortex_compressor::ctx::MAX_CASCADE;
pub use vortex_compressor::scheme::Scheme;
pub use vortex_compressor::scheme::SchemeExt;
pub use vortex_compressor::scheme::SchemeId;
pub use vortex_compressor::stats::ArrayAndStats;
pub use vortex_compressor::stats::BoolStats;
pub use vortex_compressor::stats::FloatStats;
pub use vortex_compressor::stats::GenerateStatsOptions;
pub use vortex_compressor::stats::IntegerStats;
pub use vortex_compressor::stats::StringStats;