Function analyze_batch 

pub fn analyze_batch(batch: &RecordBatch) -> ColumnStatistics
Compute column statistics from a single RecordBatch.

row_count and null_count are exact. min_value and max_value are computed in a single pass over the column. distinct_count is tracked with a HashSet and is exact as long as the number of distinct values stays within EXACT_NDV_CAP; above the cap it is None. Callers that need an approximate NDV in that case should re-run via analyze_record_batches with a larger memory budget.
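
The behaviour described above can be illustrated with a minimal, standalone sketch. It is not this crate's implementation: ColumnSummary and summarize are hypothetical names, the column is modelled as a plain slice of Option<T> rather than a RecordBatch column, the cap is passed in as a parameter because the real value of EXACT_NDV_CAP is not stated here, and the choice to exclude nulls from the distinct count is an assumption.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical per-column summary; NOT the crate's `ColumnStatistics`.
#[derive(Debug, PartialEq)]
pub struct ColumnSummary<T> {
    pub row_count: usize,
    pub null_count: usize,
    pub min_value: Option<T>,
    pub max_value: Option<T>,
    /// `None` once the number of distinct values exceeds the cap.
    pub distinct_count: Option<usize>,
}

/// Walks the column once, tracking nulls, min, max, and a capped set of
/// distinct values. `ndv_cap` stands in for `EXACT_NDV_CAP` (assumed).
pub fn summarize<T>(values: &[Option<T>], ndv_cap: usize) -> ColumnSummary<T>
where
    T: Ord + Hash + Clone,
{
    let mut null_count = 0;
    let mut min_value: Option<&T> = None;
    let mut max_value: Option<&T> = None;
    // `Some(set)` while the distinct count is still exact, `None` after overflow.
    let mut seen: Option<HashSet<&T>> = Some(HashSet::new());

    for value in values {
        let Some(v) = value else {
            null_count += 1;
            continue;
        };
        if min_value.map_or(true, |m| v < m) {
            min_value = Some(v);
        }
        if max_value.map_or(true, |m| v > m) {
            max_value = Some(v);
        }
        let over_cap = match seen.as_mut() {
            Some(set) => {
                set.insert(v);
                set.len() > ndv_cap
            }
            None => false,
        };
        if over_cap {
            // Drop the set to free memory; the count is no longer exact.
            seen = None;
        }
    }

    ColumnSummary {
        row_count: values.len(),
        null_count,
        min_value: min_value.cloned(),
        max_value: max_value.cloned(),
        distinct_count: seen.map(|set| set.len()),
    }
}
```

In this sketch, a column with exactly ndv_cap distinct values still reports an exact count. The set is discarded only when one more distinct value arrives, which keeps memory use bounded by the cap plus one entry.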