// SPDX-License-Identifier: Apache-2.0
// SPDX-FileCopyrightText: Copyright the Vortex contributors
//! Vortex's [BtrBlocks]-inspired adaptive compression framework.
//!
//! This crate provides a multi-level compression system that adaptively selects compression
//! schemes based on data characteristics. The compressor analyzes each array to choose an
//! encoding strategy and supports cascaded compression, where several encoding layers are
//! stacked on top of one another.
//!
//! # Key Features
//!
//! - **Adaptive Compression**: Automatically selects the best compression scheme based on data
//!   patterns.
//! - **Unified Scheme Trait**: A single [`Scheme`] trait covers all data types (integers, floats,
//!   strings, etc.) with a [`SchemeId`] for identity.
//! - **Cascaded Encoding**: Multiple compression layers can be applied for optimal results.
//! - **Statistical Analysis**: Uses data sampling and statistics to predict compression ratios.
//! - **Recursive Structure Handling**: Compresses nested structures like structs and lists.
//!
//! # How It Works
//!
//! [`BtrBlocksCompressor::compress()`] takes an `&ArrayRef` plus a mutable execution context and
//! returns an `ArrayRef` that may use a different encoding. It first canonicalizes the input, then
//! dispatches by type. Primitives and strings go through `choose_and_compress`, which evaluates
//! every enabled [`Scheme`] and picks the one with the best estimated compression ratio. Compound
//! types like structs and lists recurse into their fields and elements.
//!
//! Each `Scheme` implementation declares whether it [`matches`](Scheme::matches) a given
//! canonical form and, if so, estimates the compression ratio (often by compressing a ~1%
//! sample). There is no dynamic registry — the set of schemes is fixed at build time via
//! [`ALL_SCHEMES`].
//!
//! Schemes can produce arrays that are themselves further compressed (e.g. FoR then BitPacking),
//! up to [`MAX_CASCADE`] (3) layers deep. Descendant exclusion rules based on [`SchemeId`] prevent
//! the same scheme from being applied twice in a chain.
//!
//! # Example
//!
//! ```rust
//! use vortex_btrblocks::{BtrBlocksCompressor, BtrBlocksCompressorBuilder, Scheme, SchemeExt};
//! use vortex_btrblocks::schemes::integer::IntDictScheme;
//!
//! // Default compressor with all schemes enabled.
//! let compressor = BtrBlocksCompressor::default();
//!
//! // Remove specific schemes using the builder.
//! let compressor = BtrBlocksCompressorBuilder::default()
//!     .exclude_schemes([IntDictScheme.id()])
//!     .build();
//! ```
//!
//! [BtrBlocks]: https://www.cs.cit.tum.de/fileadmin/w00cfj/dis/papers/btrblocks.pdf
/// Compression scheme implementations.
// Re-export framework types from vortex-compressor for backwards compatibility.
// Btrblocks-specific exports.
pub use ALL_SCHEMES;
pub use BtrBlocksCompressorBuilder;
pub use BtrBlocksCompressor;
pub use compress_patches;
pub use CascadingCompressor;
pub use CompressorContext;
pub use MAX_CASCADE;
pub use Scheme;
pub use SchemeExt;
pub use SchemeId;
pub use ArrayAndStats;
pub use BoolStats;
pub use FloatStats;
pub use GenerateStatsOptions;
pub use IntegerStats;
pub use StringStats;