ALP (Adaptive Lossless floating-Point) codec for f64 columns.
Most real-world float metrics originate as fixed-point decimals (e.g.,
23.5, 99.99). ALP finds the optimal power-of-10 multiplier that
converts them to integers losslessly: 23.5 × 10 = 235 round-trips
exactly via 235 / 10 = 23.5. The resulting integers have small bit-widths,
so they compress 3-6x better than Gorilla and suit SIMD-friendly bit-packing.
Values that don’t round-trip exactly (true arbitrary doubles, NaN, Inf) are stored as exceptions — their original f64 bits are preserved separately.
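The round-trip test described above can be sketched roughly as follows. This is a minimal illustration, not this module's implementation: `try_encode` is a hypothetical helper, it uses a single power-of-10 factor only (no separate factor index), and the `9.0e15` magnitude cutoff is an assumption chosen to stay below 2^53 so that every accepted integer is exactly representable as an f64.

```rust
/// Hypothetical helper (not part of this module's API): try to encode `v`
/// as an integer using the factor 10^e. Returns `None` when the value does
/// not round-trip and would have to be stored as an exception.
fn try_encode(v: f64, e: u32) -> Option<i64> {
    // NaN and +/-Inf can never be represented as a scaled integer.
    if !v.is_finite() {
        return None;
    }
    let factor = 10f64.powi(e as i32);
    let scaled = (v * factor).round();
    // Assumed cutoff: stay below 2^53 so the integer is exact in f64.
    if scaled.abs() >= 9.0e15 {
        return None;
    }
    let n = scaled as i64;
    // Lossless only if decoding reproduces the original bit pattern exactly.
    // Comparing bits (not values) also turns -0.0 into an exception.
    if ((n as f64) / factor).to_bits() == v.to_bits() {
        Some(n)
    } else {
        None
    }
}
```

With this sketch, `try_encode(23.5, 1)` yields `Some(235)`, while a value such as `0.1 + 0.2` fails the check at exponent 1 and would be recorded as an exception.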
Wire format:
[4 bytes] total value count (LE u32)
[1 byte] exponent e (power of 10: factor = 10^e, range 0-18)
[1 byte] factor index f (combined encode/decode factor)
[4 bytes] exception count (LE u32)
[exception_count × 12 bytes] exceptions: (index: u32, value_bits: u64)
[N bytes] FastLanes-packed encoded integers (FOR + bit-pack)
Reference: “ALP: Adaptive Lossless floating-Point Compression” (Afroozeh et al., SIGMOD 2023)
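To make the wire format concrete, here is a hedged sketch of reading the fixed 10-byte header and the exception table. `AlpHeader` and `parse_prefix` are hypothetical names, not functions of this module. The sketch assumes that each 12-byte exception is laid out as the index followed by the value bits with no padding, and that both are little-endian like the count fields. It leaves the FastLanes-packed payload untouched and simply returns it as a byte slice.

```rust
/// Hypothetical view of the fixed header fields listed in the wire format.
#[derive(Debug, PartialEq)]
struct AlpHeader {
    value_count: u32,
    exponent: u8,
    factor_index: u8,
    exception_count: u32,
}

const HEADER_LEN: usize = 10; // 4 + 1 + 1 + 4 bytes
const EXCEPTION_LEN: usize = 12; // u32 index + u64 value bits

/// Read a little-endian u32 starting at `at`. The caller checks bounds.
fn read_u32_le(buf: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([buf[at], buf[at + 1], buf[at + 2], buf[at + 3]])
}

/// Hypothetical parser: returns the header, the exceptions as
/// (index, original f64) pairs, and the remaining packed payload bytes.
/// Returns `None` if the buffer is too short for what the header claims.
fn parse_prefix(buf: &[u8]) -> Option<(AlpHeader, Vec<(u32, f64)>, &[u8])> {
    if buf.len() < HEADER_LEN {
        return None;
    }
    let header = AlpHeader {
        value_count: read_u32_le(buf, 0),
        exponent: buf[4],
        factor_index: buf[5],
        exception_count: read_u32_le(buf, 6),
    };
    // Checked arithmetic guards against overflow on hostile counts.
    let exc_bytes = (header.exception_count as usize).checked_mul(EXCEPTION_LEN)?;
    let exc_end = HEADER_LEN.checked_add(exc_bytes)?;
    if buf.len() < exc_end {
        return None;
    }
    let mut exceptions = Vec::with_capacity(header.exception_count as usize);
    for chunk in buf[HEADER_LEN..exc_end].chunks_exact(EXCEPTION_LEN) {
        let index = read_u32_le(chunk, 0);
        let mut bits = [0u8; 8];
        bits.copy_from_slice(&chunk[4..12]);
        // Assumed little-endian; from_bits restores the exact original f64.
        exceptions.push((index, f64::from_bits(u64::from_le_bytes(bits))));
    }
    Some((header, exceptions, &buf[exc_end..]))
}
```

Storing exceptions by raw bit pattern is what lets NaN payloads and infinities survive decoding unchanged; a decoder would overwrite the unpacked value at each exception index with the preserved f64.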
Functions§
- alp_encodability
- Check what fraction of values can be ALP-encoded at the best exponent.
- decode
- Decode ALP-compressed bytes back to f64 values.
- encode
- Encode a slice of f64 values using ALP compression.