§Byte Stream Split (BSS) Miniblock Format
Byte Stream Split is a data transformation technique optimized for floating-point data compression. It improves compression ratios by reorganizing data to group similar byte patterns together.
§How It Works
BSS splits floating-point values by byte position, creating separate streams for each byte position across all values. This transformation exploits the fact that floating-point data often has patterns in specific byte positions (e.g., similar exponents or mantissa patterns).
§Example
Input data (f32): [1.0, 2.0, 3.0, 4.0]
In little-endian bytes:
- 1.0 = [00, 00, 80, 3F]
- 2.0 = [00, 00, 00, 40]
- 3.0 = [00, 00, 40, 40]
- 4.0 = [00, 00, 80, 40]
After BSS transformation:
- Byte stream 0: [00, 00, 00, 00] (all first bytes)
- Byte stream 1: [00, 00, 00, 00] (all second bytes)
- Byte stream 2: [80, 00, 40, 80] (all third bytes)
- Byte stream 3: [3F, 40, 40, 40] (all fourth bytes)
Output: [00, 00, 00, 00, 00, 00, 00, 00, 80, 00, 40, 80, 3F, 40, 40, 40]
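The example above can be reproduced with a few lines of standalone Rust. This is a minimal sketch of the transformation and its inverse for f32 only; the function names `bss_encode_f32` and `bss_decode_f32` are hypothetical and are not part of this module's API, and the sketch ignores chunking entirely.

```rust
/// Hypothetical helper: split little-endian f32 values into four byte
/// streams and concatenate them (stream 0 first, stream 3 last).
fn bss_encode_f32(values: &[f32]) -> Vec<u8> {
    let n = values.len();
    let mut out = vec![0u8; n * 4];
    for (i, v) in values.iter().enumerate() {
        // Byte `b` of value `i` goes to position `i` of stream `b`.
        for (b, byte) in v.to_le_bytes().iter().enumerate() {
            out[b * n + i] = *byte;
        }
    }
    out
}

/// Hypothetical helper: inverse of `bss_encode_f32`.
fn bss_decode_f32(bytes: &[u8]) -> Vec<f32> {
    assert!(bytes.len() % 4 == 0, "input must hold four equal-length streams");
    let n = bytes.len() / 4;
    (0..n)
        .map(|i| {
            // Gather byte `i` from each of the four streams.
            f32::from_le_bytes([bytes[i], bytes[n + i], bytes[2 * n + i], bytes[3 * n + i]])
        })
        .collect()
}
```

Calling `bss_encode_f32(&[1.0, 2.0, 3.0, 4.0])` yields the 16 output bytes listed above, and `bss_decode_f32` on that result returns the original four values. The output is always exactly as long as the input, since bytes are only moved.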
§Compression Benefits
BSS itself doesn’t compress data - it reorders it. The compression benefit comes when BSS is combined with general-purpose compression (e.g., LZ4):
- Timestamps: Sequential timestamps have similar high-order bytes
- Sensor data: Readings often vary in a small range, sharing exponent bits
- Financial data: Prices may cluster around certain values
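To see why the reordering helps a downstream compressor, one rough proxy is the longest run of identical bytes before and after the split. The sketch below uses only the standard library and does not invoke LZ4 or any real compressor; the helpers `raw_le_bytes`, `split_streams_f32`, and `longest_run` are hypothetical names introduced for illustration, and run length is only an approximation of compressibility.

```rust
/// Hypothetical helper: the plain little-endian byte layout of the values.
fn raw_le_bytes(values: &[f32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(values.len() * 4);
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Hypothetical helper: the byte-stream-split layout of the same values.
fn split_streams_f32(values: &[f32]) -> Vec<u8> {
    let n = values.len();
    let mut out = vec![0u8; n * 4];
    for (i, v) in values.iter().enumerate() {
        for (b, byte) in v.to_le_bytes().iter().enumerate() {
            out[b * n + i] = *byte;
        }
    }
    out
}

/// Length of the longest run of identical consecutive bytes.
fn longest_run(bytes: &[u8]) -> usize {
    let mut best = 0;
    let mut current = 0;
    let mut prev: Option<u8> = None;
    for &b in bytes {
        if Some(b) == prev {
            current += 1;
        } else {
            current = 1;
            prev = Some(b);
        }
        best = best.max(current);
    }
    best
}
```

For the sensor-like readings `[20.0, 20.5, 21.0, 21.5]`, the raw layout has a longest run of 2 bytes, while the split layout has a run of 8 zero bytes followed later by 4 identical exponent bytes. Longer runs and repeated patterns are what general-purpose compressors exploit.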
§Supported Types
- 32-bit floating point (f32)
- 64-bit floating point (f64)
§Chunk Handling
- Maximum chunk size depends on data type:
  - f32: 1024 values (4 KiB per chunk)
  - f64: 512 values (4 KiB per chunk)
- All chunks share a single global buffer
- Every chunk except the last contains a power-of-2 number of values
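The chunk limits follow from a fixed 4096-byte budget divided by the value width. The sketch below shows that arithmetic and one plausible way to partition a value count into chunks. It assumes every non-last chunk is filled to the maximum, which satisfies the power-of-2 rule but is not confirmed to be how the encoder actually sizes chunks; `CHUNK_BYTES`, `max_values_per_chunk`, and `chunk_lengths` are hypothetical names.

```rust
/// Assumed byte budget per chunk (4096 bytes, from the limits listed above).
const CHUNK_BYTES: usize = 4096;

/// Hypothetical helper: maximum number of values in one chunk.
fn max_values_per_chunk(bytes_per_value: usize) -> usize {
    CHUNK_BYTES / bytes_per_value
}

/// Hypothetical helper: split `num_values` into chunk lengths, assuming
/// every chunk except the last is filled to the maximum.
fn chunk_lengths(num_values: usize, bytes_per_value: usize) -> Vec<usize> {
    let max = max_values_per_chunk(bytes_per_value);
    let mut lengths = Vec::new();
    let mut remaining = num_values;
    while remaining > 0 {
        let take = remaining.min(max);
        lengths.push(take);
        remaining -= take;
    }
    lengths
}
```

With these assumptions, `max_values_per_chunk(4)` is 1024 and `max_values_per_chunk(8)` is 512, and 2500 f32 values split into chunks of 1024, 1024, and 452 values. Only the final chunk may hold a count that is not a power of 2.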
Structs§
- ByteStreamSplitDecompressor - Byte Stream Split decompressor
- ByteStreamSplitEncoder - Byte Stream Split encoder for floating-point values