Struct q_compress::Prefix
#[non_exhaustive]
pub struct Prefix<T: NumberLike> {
pub count: usize,
pub code: Vec<bool>,
pub lower: T,
pub upper: T,
pub run_len_jumpstart: Option<usize>,
pub gcd: T::Unsigned,
}
A pairing of a Huffman code with a numerical range.
Quantile Compression works by splitting the distribution of numbers
into ranges and associating a Huffman code (a short sequence of bits)
with each range.
The combination of these pieces of information, plus a couple others,
is called a Prefix.
When compressing a number, the compressor finds the prefix containing
it, then writes out its Huffman code, optionally the number of
consecutive repetitions of that number if run_len_jumpstart is
available, and then the exact offset within the range for the number.
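The lookup-and-offset steps described above can be sketched as follows. This is only an illustration, not the crate's implementation: `SimplePrefix`, `find_prefix`, `encode_symbol`, and `example_prefixes` are hypothetical names, the type is fixed to `i64`, and run-length handling and bit packing are left out.

```rust
/// Hypothetical, simplified stand-in for `Prefix<i64>`; not the crate's type.
#[derive(Debug, Clone, PartialEq)]
pub struct SimplePrefix {
    pub code: Vec<bool>, // Huffman code for this range
    pub lower: i64,      // lower bound of the range
    pub upper: i64,      // upper bound of the range (inclusive)
}

/// Finds the first prefix whose inclusive range contains `x`.
pub fn find_prefix(prefixes: &[SimplePrefix], x: i64) -> Option<&SimplePrefix> {
    prefixes.iter().find(|p| p.lower <= x && x <= p.upper)
}

/// Returns the Huffman code of the containing prefix together with the
/// offset of `x` from that prefix's lower bound, or `None` if no prefix
/// contains `x`.
pub fn encode_symbol(prefixes: &[SimplePrefix], x: i64) -> Option<(Vec<bool>, u64)> {
    let p = find_prefix(prefixes, x)?;
    // Wrapping subtraction followed by a cast yields the correct unsigned
    // distance even when the signed difference would overflow.
    let offset = x.wrapping_sub(p.lower) as u64;
    Some((p.code.clone(), offset))
}

/// A small made-up set of prefixes used for illustration only.
pub fn example_prefixes() -> Vec<SimplePrefix> {
    vec![
        SimplePrefix { code: vec![false], lower: 0, upper: 9 },
        SimplePrefix { code: vec![true, false], lower: 10, upper: 99 },
        SimplePrefix { code: vec![true, true], lower: 100, upper: 999 },
    ]
}
```

With these made-up prefixes, `encode_symbol(&example_prefixes(), 42)` selects the range from 10 to 99 and reports an offset of 32 from its lower bound.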
Fields (Non-exhaustive)
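This struct is marked as non-exhaustive. Non-exhaustive structs could have additional fields added in future. Therefore, non-exhaustive structs cannot be constructed in external crates using the traditional Struct { .. } syntax; cannot be matched against without a wildcard ..; and struct update syntax will not work.
count: usize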
The count of numbers in the chunk that fall into this Prefix’s range. Not available in wrapped mode.
code: Vec<bool>
The Huffman code for this prefix. Collectively, all the prefixes for a chunk form a binary search tree (BST) over these Huffman codes. The BST over Huffman codes is different from the BST over numerical ranges.
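Because Huffman codes are prefix-free, at most one code can match the front of a bit stream. The sketch below illustrates that property with a plain linear scan rather than the tree walk the crate's description implies; `match_code` is a hypothetical helper, not part of q_compress.

```rust
/// Returns the index of the code that `bits` starts with, along with the
/// number of bits that code consumes. Returns `None` if no code matches,
/// for example when the stream ends partway through a code.
pub fn match_code(codes: &[Vec<bool>], bits: &[bool]) -> Option<(usize, usize)> {
    codes
        .iter()
        // Skip empty codes so that a match always consumes at least one bit.
        .position(|code| !code.is_empty() && bits.starts_with(code.as_slice()))
        .map(|i| (i, codes[i].len()))
}
```

For the codes `[0]`, `[1, 0]`, and `[1, 1]`, a stream beginning `1, 0, ...` matches index 1 and consumes two bits. A real decoder would more likely walk a tree one bit at a time, which avoids rescanning every code.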
lower: T
The lower bound for this prefix’s numerical range.
upper: T
The upper bound (inclusive) for this prefix’s numerical range.
run_len_jumpstart: Option<usize>
A parameter used for the most common prefix in a sparse distribution.
For instance, if 90% of a chunk’s numbers are exactly 7, then the
prefix for the range [7, 7] will have a run_len_jumpstart.
The jumpstart value tunes the varint encoding of the number of
consecutive repetitions of the prefix.
gcd: T::Unsigned
The greatest common divisor of all numbers belonging to this prefix (in the data type’s corresponding unsigned integer).
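As a rough illustration of how a GCD over unsigned values can be computed, here is a minimal sketch using Euclid's algorithm. `gcd_pair` and `gcd_all` are hypothetical helpers fixed to `u64`; exactly which unsigned values the crate feeds in (for example, raw numbers or offsets from `lower`) is not specified here and is left as an assumption.

```rust
/// Euclid's algorithm on two unsigned integers. By convention, the GCD of
/// 0 and `b` is `b`.
pub fn gcd_pair(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let r = a % b;
        a = b;
        b = r;
    }
    a
}

/// Folds `gcd_pair` over a slice. Returns 0 for an empty slice or a slice
/// containing only zeros.
pub fn gcd_all(values: &[u64]) -> u64 {
    values.iter().fold(0, |acc, &v| gcd_pair(acc, v))
}
```

If every value in a range is a multiple of 10, `gcd_all` returns a multiple of 10, so only the quotient needs to be stored for each number rather than the full offset.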