vortex_btrblocks/lib.rs
// SPDX-License-Identifier: Apache-2.0
// SPDX-FileCopyrightText: Copyright the Vortex contributors

#![deny(missing_docs)]

//! Vortex's [BtrBlocks]-inspired adaptive compression framework.
//!
//! This crate provides a multi-level compressor that selects an encoding for each array based
//! on the characteristics of its data. The compressor analyzes an array to choose an encoding
//! strategy and supports cascaded compression, in which the output of one encoding is itself
//! compressed by another.
//!
//! # Key Features
//!
//! - **Adaptive Compression**: Selects a compression scheme automatically based on data patterns
//! - **Type-Specific Compressors**: Specialized compression for integers, floats, strings, and temporal data
//! - **Cascaded Encoding**: Several compression layers can be stacked on the same data
//! - **Statistical Analysis**: Uses data sampling and statistics to estimate compression ratios
//! - **Recursive Structure Handling**: Compresses nested structures like structs and lists
//!
//! # How It Works
//!
//! [`BtrBlocksCompressor::compress()`] takes an `&ArrayRef` and returns an `ArrayRef` that may
//! use a different encoding. It first canonicalizes the input, then dispatches by type.
//! Primitive and string arrays go to a type-specific `Compressor` (integer, float, or string).
//! Compound types like structs and lists recurse into their fields and elements.
//!
//! Each type-specific compressor holds a static list of `Scheme` implementations (e.g.
//! BitPacking, ALP, Dict); there is no dynamic registry. The compressor evaluates each scheme by
//! compressing a ~1% sample and measuring the compression ratio, then picks the scheme with the
//! best ratio. See `SchemeExt` for details on how sampling works.
//!
//! Schemes can produce arrays that are themselves further compressed (e.g. FoR then BitPacking),
//! up to `MAX_CASCADE` (3) layers deep. An `Excludes` set prevents the same scheme from being
//! applied twice in a chain.
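/*!
The selection loop described above can be sketched in a few lines. This is a toy model under
stated assumptions, not this crate's implementation: `Code`, `sample`, `bits_needed`,
`estimate_bits`, `choose`, and `pick_scheme` are hypothetical names, a strided sample stands in
for the real sampling logic, only two schemes are modeled, and a plain slice plays the role of
the `Excludes` set. Sizes are estimated in bits instead of being measured by compressing.

```rust
/// Upper bound on cascaded layers, mirroring the `MAX_CASCADE` (3) described above.
const MAX_CASCADE: usize = 3;

/// Hypothetical scheme identifiers; the real crate defines its own codes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Code {
    FrameOfReference,
    BitPacking,
}

/// Assumption: every 100th value serves as a ~1% sample.
fn sample(values: &[u32]) -> Vec<u32> {
    values.iter().copied().step_by(100).collect()
}

/// Number of bits needed to represent `v` (0 for `v == 0`).
fn bits_needed(v: u32) -> u32 {
    32 - v.leading_zeros()
}

/// Estimated size in bits of `values` encoded with `code`, including cascaded children.
fn estimate_bits(code: Code, values: &[u32], depth: usize, excludes: &[Code]) -> usize {
    match code {
        Code::BitPacking => {
            // Every value is stored at the width of the largest one (at least 1 bit).
            let width = values.iter().copied().max().map_or(0, bits_needed).max(1);
            values.len() * width as usize
        }
        Code::FrameOfReference => {
            // Subtract the minimum, then let another scheme compress the residuals.
            let min = values.iter().copied().min().unwrap_or(0);
            let residuals: Vec<u32> = values.iter().map(|v| v - min).collect();
            // Exclude FoR in the child so it is not applied twice in one chain.
            let mut child_excludes = excludes.to_vec();
            child_excludes.push(Code::FrameOfReference);
            let (_, child_bits) = choose(&residuals, depth + 1, &child_excludes);
            32 + child_bits // 32 bits to store the reference value
        }
    }
}

/// Returns the cheapest scheme and its estimated size, or `None` if raw storage wins.
fn choose(values: &[u32], depth: usize, excludes: &[Code]) -> (Option<Code>, usize) {
    let raw_bits = values.len() * 32;
    let mut best = (None, raw_bits);
    if depth >= MAX_CASCADE {
        return best; // cascade budget exhausted: store the values as they are
    }
    for code in [Code::FrameOfReference, Code::BitPacking] {
        if excludes.contains(&code) {
            continue;
        }
        let bits = estimate_bits(code, values, depth, excludes);
        if bits < best.1 {
            best = (Some(code), bits);
        }
    }
    best
}

/// Evaluates every scheme on the sample only and returns the winner.
fn pick_scheme(values: &[u32]) -> Option<Code> {
    choose(&sample(values), 0, &[]).0
}
```

Under these assumptions, `pick_scheme` returns `Some(Code::FrameOfReference)` for the values
`1_000_000..1_001_000`, because the residuals bit-pack more tightly than the raw values. It
returns `Some(Code::BitPacking)` for `0..1000`, where subtracting the minimum gains nothing,
and `None` for an empty slice, where no scheme beats raw storage.
*/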
//!
//! # Example
//!
//! ```rust
//! use vortex_btrblocks::{BtrBlocksCompressor, BtrBlocksCompressorBuilder, IntCode};
//! use vortex_array::DynArray;
//!
//! // Default compressor with all schemes enabled
//! let compressor = BtrBlocksCompressor::default();
//!
//! // Configure with builder to exclude specific schemes
//! let compressor = BtrBlocksCompressorBuilder::default()
//!     .exclude_int([IntCode::Dict])
//!     .build();
//! ```
//!
//! [BtrBlocks]: https://www.cs.cit.tum.de/fileadmin/w00cfj/dis/papers/btrblocks.pdf

pub use compressor::float::FloatCode;
use compressor::float::FloatCompressor;
pub use compressor::integer::IntCode;
use compressor::integer::IntCompressor;
pub use compressor::string::StringCode;
use compressor::string::StringCompressor;

mod builder;
mod canonical_compressor;
mod compressor;
mod ctx;
mod sample;
mod scheme;
mod stats;

pub use builder::BtrBlocksCompressorBuilder;
pub use canonical_compressor::BtrBlocksCompressor;
pub use canonical_compressor::CanonicalCompressor;
use compressor::Compressor;
use compressor::CompressorExt;
use compressor::MAX_CASCADE;
pub use compressor::integer::IntegerStats;
pub use compressor::integer::dictionary::dictionary_encode as integer_dictionary_encode;
use ctx::CompressorContext;
use ctx::Excludes;
use scheme::Scheme;
use scheme::SchemeExt;
pub use stats::CompressorStats;
pub use stats::GenerateStatsOptions;