Adaptive selection representations for TidyView.
A TidyView describes a subset of rows from a base DataFrame. Until v2,
the subset was stored as a single BitMask regardless of density. The
adaptive engine picks one of five deterministic representations based on
result density, so that sparse predicates do not pay the cost of a full
bitscan and dense predicates retain the existing fast O(nrows/64) path.
§Modes
- Empty: no rows selected (zero allocation)
- All: every row selected (zero allocation; nrows stored)
- SelectionVector: ascending Vec<u32> of selected row indices (sparse)
- VerbatimMask: backing BitMask for the dense path (≥30% density)
- Hybrid: chunked, locally classified per HYBRID_CHUNK_SIZE-row block. Active for mid-density results when nrows >= 2 * HYBRID_CHUNK_SIZE.
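As a rough illustration of how the five modes could be told apart, the sketch below classifies a result by its selected-row count. It is not the crate's implementation: the `Mode` tag, the `classify` function, and the `sparse_per_mille` cutoff are hypothetical names introduced here, and the choice to fall back to a selection vector for small mid-density inputs is an assumption. Only the 30% dense threshold and the `nrows >= 2 * HYBRID_CHUNK_SIZE` condition come from the list above.

```rust
/// Stand-in for the crate's `HYBRID_CHUNK_SIZE` (documented as 4096 rows).
const HYBRID_CHUNK_SIZE: u64 = 4096;

/// Hypothetical data-free tag; the real `AdaptiveSelection` arms carry payloads.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Empty,
    All,
    SelectionVector,
    VerbatimMask,
    Hybrid,
}

/// Picks a mode from `count` selected rows out of `nrows`, using integers only.
/// `sparse_per_mille` is an ASSUMED cutoff (selected rows per 1000), not a documented value.
fn classify(count: u64, nrows: u64, sparse_per_mille: u64) -> Mode {
    if count == 0 {
        return Mode::Empty;
    }
    if count == nrows {
        return Mode::All;
    }
    // count / nrows >= 30%  <=>  count * 10 >= nrows * 3 (no floating point).
    if count * 10 >= nrows * 3 {
        return Mode::VerbatimMask;
    }
    // Mid-density results go to Hybrid only when there are at least two chunks.
    let mid_density = count * 1000 >= nrows * sparse_per_mille;
    if mid_density && nrows >= 2 * HYBRID_CHUNK_SIZE {
        return Mode::Hybrid;
    }
    // ASSUMPTION: everything else is stored as a selection vector.
    Mode::SelectionVector
}
```

With an assumed cutoff of 10 per mille, 2,000 selected rows out of 10,000 would land in `Hybrid`, while 5 out of 1,000 would land in `SelectionVector`.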
§Determinism
- Iteration order is always ascending row index for every arm.
- Density classification uses pure integer arithmetic; thresholds are bit-stable across platforms.
- intersect/union produce a single deterministic arm choice for any pair of inputs.
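The ascending-order guarantee can be illustrated for a bitmask-backed arm. The sketch below walks the set bits of a word slice from lowest to highest, so it yields the same sequence a sorted index vector would. The function name `mask_indices` is hypothetical, and the layout (bit `i` of word `k` stands for row `64 * k + i`) is an assumption; the actual BitMask layout is not described here.

```rust
/// Returns the positions of set bits in ascending order.
/// ASSUMED layout: bit `i` of `words[k]` represents row `64 * k + i`.
fn mask_indices(words: &[u64]) -> Vec<u32> {
    let mut out = Vec::new();
    for (word_idx, &word) in words.iter().enumerate() {
        let mut w = word;
        while w != 0 {
            // The lowest set bit is the next row index within this word.
            let bit = w.trailing_zeros();
            out.push(word_idx as u32 * 64 + bit);
            // Clear the lowest set bit and continue.
            w &= w - 1;
        }
    }
    out
}
```

Because words are visited in order and bits within a word are extracted lowest-first, the output is sorted without any extra pass.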
Enums§
- AdaptiveSelection: A row-selection representation chosen adaptively by density.
- HybridChunk: A per-chunk selection state. Each chunk represents HYBRID_CHUNK_SIZE rows (the final chunk may be partial; see HybridChunk::partial_size).
- HybridInner: Per-chunk iteration state.
- SelectionIndices: An ascending iterator over selected row indices. One variant per AdaptiveSelection arm so the hot path stays monomorphic and inlines.
Constants§
- HYBRID_CHUNK_SIZE: Hybrid chunk size in rows. 4096 rows = 64 u64 words per dense chunk = 512 B.
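
The chunk arithmetic above can be checked with a few constants. The sketch below is illustrative only: `WORDS_PER_CHUNK`, `BYTES_PER_DENSE_CHUNK`, `chunk_count`, and `last_chunk_len` are hypothetical helpers, not items exported by this module, and the constant is redefined locally so the block stands alone.

```rust
/// Local copy of the documented value: 4096 rows per hybrid chunk.
const HYBRID_CHUNK_SIZE: usize = 4096;

/// One bit per row, 64 bits per u64 word.
const WORDS_PER_CHUNK: usize = HYBRID_CHUNK_SIZE / 64;

/// 8 bytes per u64 word.
const BYTES_PER_DENSE_CHUNK: usize = WORDS_PER_CHUNK * 8;

/// Number of chunks needed to cover `nrows` rows (ceiling division).
fn chunk_count(nrows: usize) -> usize {
    (nrows + HYBRID_CHUNK_SIZE - 1) / HYBRID_CHUNK_SIZE
}

/// Row count of the final chunk, which may be partial.
fn last_chunk_len(nrows: usize) -> usize {
    if nrows == 0 {
        return 0;
    }
    match nrows % HYBRID_CHUNK_SIZE {
        0 => HYBRID_CHUNK_SIZE, // nrows is an exact multiple: last chunk is full
        rem => rem,
    }
}
```

For a 10,000-row frame this gives three chunks, the last holding 10,000 − 2 × 4096 = 1,808 rows.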