Module split_comparison

Analyzes compression efficiency of different field arrangements in bit-packed structures.

Compares compression metrics between different field groupings, primarily focusing on interleaved vs. separated layouts (e.g., RGBRGBRGB vs. RRRGGGBBB).

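§Core Types

  • SplitComparisonResult: the result of comparing two groups of fields
  • FieldComparisonMetrics: statistics for the individual fields that make up a group
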
§Example

split_groups:
  - name: colors
    group_1: [colors]                    # RGBRGBRGB
    group_2: [color_r, color_g, color_b] # RRRGGGBBB
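
The two layouts named in the comments above can be illustrated with a minimal sketch. It assumes 8-bit, byte-aligned channels for simplicity (fields in a real bit-packed structure need not be byte-aligned), and deinterleave_rgb is a hypothetical helper, not part of this module:

```rust
/// Converts interleaved RGB bytes (RGBRGBRGB) into a separated layout (RRRGGGBBB).
///
/// Illustration only: assumes one byte per channel. Any trailing bytes that do
/// not form a complete RGB triple are ignored.
fn deinterleave_rgb(interleaved: &[u8]) -> Vec<u8> {
    let pixels = interleaved.len() / 3;
    let mut separated = vec![0u8; pixels * 3];

    for (i, px) in interleaved.chunks_exact(3).enumerate() {
        separated[i] = px[0]; // R plane
        separated[pixels + i] = px[1]; // G plane
        separated[2 * pixels + i] = px[2]; // B plane
    }

    separated
}
```

Both layouts hold the same bytes in a different order, which is why the two groups can be compared on equal terms.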

Use make_split_comparison_result to generate comparison metrics for two field arrangements.

Each comparison tracks:

  • Entropy and LZ matches (data redundancy measures)
  • Sizes (original, estimated compressed, actual zstd-compressed)
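
As a rough illustration of the entropy measure, the sketch below computes the Shannon entropy of a byte stream in bits per byte. This is the textbook definition; whether the module computes entropy in exactly this way is an assumption, and the function name is hypothetical:

```rust
/// Shannon entropy of `data` in bits per byte (0.0 to 8.0).
///
/// Lower values indicate more redundancy, so the data should compress better
/// under an entropy coder. Returns 0.0 for empty input.
fn shannon_entropy_bits_per_byte(data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }

    // Count how often each byte value occurs.
    let mut counts = [0usize; 256];
    for &byte in data {
        counts[byte as usize] += 1;
    }

    // Sum -p * log2(p) over every byte value that actually occurs.
    let total = data.len() as f64;
    counts
        .iter()
        .filter(|&&count| count > 0)
        .map(|&count| {
            let p = count as f64 / total;
            -p * p.log2()
        })
        .sum()
}
```

Running the same function over each arrangement of the data gives one simple way to see which grouping carries less redundancy per byte. Entropy ignores repeated sequences, which is why LZ matches are tracked as a separate measure.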

§Usage Notes

  • Ensure compared groups have equal total bits
  • Field ordering can significantly impact compression
  • Running zstd compression dominates the time taken by a comparison
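
The first note can be checked before running a comparison. The sketch below is a hypothetical pre-check: the (name, bits) tuples and both function names are assumptions made for illustration, not types or functions from this module:

```rust
/// Total width in bits of a group described as (field name, width in bits) pairs.
fn total_bits(group: &[(&str, u32)]) -> u32 {
    group.iter().map(|&(_, bits)| bits).sum()
}

/// Returns true when both groups cover the same number of bits,
/// so that their compression metrics describe the same amount of data.
fn groups_have_equal_bits(group_1: &[(&str, u32)], group_2: &[(&str, u32)]) -> bool {
    total_bits(group_1) == total_bits(group_2)
}
```

For the colors example, a 24-bit colors field matches three 8-bit fields color_r, color_g and color_b, so the groups are comparable.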

Structs§

FieldComparisonMetrics
Statistics for the individual fields that make up a combined group or split.
SplitComparisonResult
The result of comparing two arbitrary groups of fields based on the schema.

Functions§

make_split_comparison_result
Calculates the compression statistics of two splits (of the same data) and returns them as a SplitComparisonResult object. This can also be used for generic two-way comparisons.