Module adaptive_selection

Adaptive selection representations for TidyView.

A TidyView describes a subset of rows from a base DataFrame. Until v2, the subset was stored as a single BitMask regardless of density. The adaptive engine picks one of five deterministic representations based on result density, so that sparse predicates do not pay the cost of a full bitscan and dense predicates retain the existing fast O(nrows/64) path.

§Modes

  • Empty — no rows selected (zero allocation)
  • All — every row selected (zero allocation; nrows stored)
  • SelectionVector — ascending Vec<u32> of selected row indices (sparse)
  • VerbatimMask — backing BitMask for the dense path (≥30% density)
  • Hybrid — chunked; each HYBRID_CHUNK_SIZE-row block is classified independently. Active for mid-density results when nrows >= 2 * HYBRID_CHUNK_SIZE.
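
The mode list above can be read as a small decision procedure. The following is a minimal sketch of that reading, not the crate's actual classifier: the `Mode` enum, the `classify` function, and in particular `ASSUMED_SPARSE_PER_MILLE` are hypothetical names and values introduced for illustration (the documentation states only the 30% dense cutoff and the `2 * HYBRID_CHUNK_SIZE` minimum for Hybrid). The fallback to a mask for mid-density results on small frames is likewise an assumption.

```rust
/// Chunk size as documented for this module.
const HYBRID_CHUNK_SIZE: usize = 4096;
/// Dense cutoff from the Modes list: 30% density.
const DENSE_PERCENT: u64 = 30;
/// ASSUMPTION: illustrative sparse cutoff (1.0%); the real value is not documented here.
const ASSUMED_SPARSE_PER_MILLE: u64 = 10;

/// Hypothetical stand-in for the five arms, without their payloads.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Empty,
    All,
    SelectionVector,
    VerbatimMask,
    Hybrid,
}

/// Pick a mode from the number of selected rows and the total row count.
/// Only integer comparisons are used, so the result does not depend on
/// floating-point rounding.
fn classify(selected: usize, nrows: usize) -> Mode {
    debug_assert!(selected <= nrows);
    let (k, n) = (selected as u64, nrows as u64);
    if k == 0 {
        return Mode::Empty;
    }
    if k == n {
        return Mode::All;
    }
    // k / n >= 30%  is equivalent to  100 * k >= 30 * n.
    if 100 * k >= DENSE_PERCENT * n {
        return Mode::VerbatimMask;
    }
    // k / n < cutoff  is equivalent to  1000 * k < cutoff_per_mille * n.
    if 1000 * k < ASSUMED_SPARSE_PER_MILLE * n {
        return Mode::SelectionVector;
    }
    // Mid-density: Hybrid only when there are at least two full chunks.
    if nrows >= 2 * HYBRID_CHUNK_SIZE {
        Mode::Hybrid
    } else {
        // ASSUMPTION: small mid-density results fall back to a plain mask.
        Mode::VerbatimMask
    }
}
```

Under these assumed cutoffs, 1,000 selected rows out of 10,000 would land in Hybrid, while the same 10% density on a 1,000-row frame would stay a mask because the frame is smaller than two chunks.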

§Determinism

  • Iteration order is always ascending row index for every arm.
  • Density classification uses pure integer arithmetic; thresholds are bit-stable across platforms.
  • intersect/union produce a single deterministic arm choice for any pair of inputs.
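
To illustrate how a set operation can preserve the ascending-order guarantee, here is a minimal sketch of intersecting two sparse selections held as ascending `u32` slices. The function name `intersect_sorted` is hypothetical, and this two-pointer merge is a standard technique rather than a description of how the crate's `intersect` is implemented.

```rust
use std::cmp::Ordering;

/// Intersect two ascending, duplicate-free index lists.
/// The output is ascending because both cursors only move forward.
fn intersect_sorted(a: &[u32], b: &[u32]) -> Vec<u32> {
    let mut out = Vec::with_capacity(a.len().min(b.len()));
    let (mut i, mut j) = (0, 0);
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                out.push(a[i]);
                i += 1;
                j += 1;
            }
        }
    }
    out
}
```

Because the result depends only on the two inputs and involves no hashing or floating-point work, repeated runs on any platform yield the same vector in the same order.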

Enums§

AdaptiveSelection
A row-selection representation chosen adaptively by density.
HybridChunk
A per-chunk selection state. Each chunk represents HYBRID_CHUNK_SIZE rows (the final chunk may be partial — see HybridChunk::partial_size).
HybridInner
Per-chunk iteration state.
SelectionIndices
An ascending iterator over selected row indices. One variant per AdaptiveSelection arm so the hot path stays monomorphic and inlines.
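
The "one variant per arm" design of SelectionIndices can be sketched as an enum iterator whose `next` dispatches with a single `match`. The type `IndicesSketch`, its field names, and the helper `mask_indices` below are hypothetical and cover only four simplified arms (the Hybrid arm is omitted); the real enum's variants and payloads may differ.

```rust
use std::ops::Range;

/// Hypothetical, simplified counterpart of an enum-based index iterator.
enum IndicesSketch<'a> {
    Empty,
    All(Range<u32>),
    Vector(std::slice::Iter<'a, u32>),
    Mask { words: &'a [u64], word_idx: usize, current: u64 },
}

impl<'a> Iterator for IndicesSketch<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        match self {
            IndicesSketch::Empty => None,
            IndicesSketch::All(range) => range.next(),
            IndicesSketch::Vector(it) => it.next().copied(),
            IndicesSketch::Mask { words, word_idx, current } => {
                // Skip exhausted words until one with a set bit is found.
                while *current == 0 {
                    *word_idx += 1;
                    if *word_idx >= words.len() {
                        return None;
                    }
                    *current = words[*word_idx];
                }
                let bit = current.trailing_zeros();
                *current &= *current - 1; // clear the lowest set bit
                Some(*word_idx as u32 * 64 + bit)
            }
        }
    }
}

/// Start iterating over the set bits of a mask stored as 64-bit words.
fn mask_indices(words: &[u64]) -> IndicesSketch<'_> {
    IndicesSketch::Mask {
        words,
        word_idx: 0,
        current: words.first().copied().unwrap_or(0),
    }
}
```

An enum avoids the dynamic dispatch of a `Box<dyn Iterator>`, which is one plausible reading of the "monomorphic" remark above: the compiler sees every arm's body and can inline it at the call site.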

Constants§

HYBRID_CHUNK_SIZE
Hybrid chunk size in rows. 4096 rows = 64 u64 words per dense chunk = 512 B.
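
The arithmetic behind that figure, together with the chunk count for a given frame, can be checked with a short sketch. Only `HYBRID_CHUNK_SIZE` and its value come from the documentation; the derived constants and the helpers `chunk_count` and `last_chunk_rows` are hypothetical names used for illustration.

```rust
/// Chunk size as documented for this module.
const HYBRID_CHUNK_SIZE: usize = 4096;

/// 4096 rows / 64 bits per word = 64 words per dense chunk.
const WORDS_PER_DENSE_CHUNK: usize = HYBRID_CHUNK_SIZE / u64::BITS as usize;

/// 64 words * 8 bytes per word = 512 bytes per dense chunk.
const BYTES_PER_DENSE_CHUNK: usize = WORDS_PER_DENSE_CHUNK * std::mem::size_of::<u64>();

/// Number of chunks needed to cover `nrows`, rounding up.
fn chunk_count(nrows: usize) -> usize {
    (nrows + HYBRID_CHUNK_SIZE - 1) / HYBRID_CHUNK_SIZE
}

/// Rows covered by the final chunk, which may be partial.
fn last_chunk_rows(nrows: usize) -> usize {
    if nrows == 0 {
        return 0;
    }
    match nrows % HYBRID_CHUNK_SIZE {
        0 => HYBRID_CHUNK_SIZE,
        rem => rem,
    }
}
```

For a 10,000-row frame this gives three chunks, the last covering 1,808 rows.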