
Module aggregate


SQL Aggregation Executor

Hash-aggregation operator for GROUP BY and aggregate functions.

Supported aggregates: COUNT(*), COUNT(col), COUNT(DISTINCT col), SUM, AVG, MIN, MAX, MEDIAN, and STDDEV (sample standard deviation with an n-1 denominator, matching R’s sd() and DuckDB’s stddev).
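
To make the n-1 convention concrete, here is a minimal sketch of a sample standard deviation over nullable inputs. It is not this module's code: the function name `sample_stddev`, the use of `Option<f64>` to model NULL, the choice of Welford's single-pass update, and the decision to return `None` when fewer than two non-NULL values are present are all assumptions made for illustration.

```rust
/// Hypothetical helper: sample standard deviation (n-1 denominator).
/// `None` models SQL NULL and is skipped. Returning `None` for fewer
/// than two non-NULL values is an assumption, not confirmed by the module.
fn sample_stddev(values: &[Option<f64>]) -> Option<f64> {
    let mut n: u64 = 0;
    let mut mean = 0.0_f64;
    let mut m2 = 0.0_f64; // running sum of squared deviations from the mean

    // `flatten` drops the `None` entries, so NULLs never touch the state.
    for v in values.iter().flatten() {
        n += 1;
        let delta = v - mean;
        mean += delta / n as f64;
        m2 += delta * (v - mean); // Welford's single-pass update
    }

    if n < 2 {
        None
    } else {
        Some((m2 / (n - 1) as f64).sqrt())
    }
}
```

Dividing by `n - 1` rather than `n` is what distinguishes the sample estimator from the population one; a single-pass update is used here only because it fits the accumulator-per-row model described on this page.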

Pipeline

input rows (post-WHERE)
  └─> group keys evaluated per row ──> hash table of group states
        └─> accumulators updated per row
              └─> finalize: one synthesized row per group
                    └─> HAVING filter ─> ORDER BY ─> OFFSET/LIMIT ─> projection
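
The stages above can be sketched in a few lines. This is an illustration under stated assumptions, not the module's implementation: `GroupState`, `hash_aggregate`, the `(key, amount)` row shape, and the `min_count` stand-in for a HAVING predicate are all hypothetical, and only COUNT(*) and SUM are modeled.

```rust
use std::collections::HashMap;

/// Hypothetical per-group accumulator state for COUNT(*) and SUM(amount).
#[derive(Default)]
struct GroupState {
    count: u64,
    sum: Option<f64>, // stays None until a non-NULL input arrives
}

/// Sketch of: group keys -> hash table -> accumulate -> finalize -> HAVING -> ORDER BY.
/// `min_count` plays the role of `HAVING COUNT(*) >= min_count` (assumed predicate).
fn hash_aggregate(
    rows: &[(&str, Option<f64>)],
    min_count: u64,
) -> Vec<(String, u64, Option<f64>)> {
    let mut groups: HashMap<String, GroupState> = HashMap::new();

    // Evaluate the group key per row and update that group's accumulators.
    for (key, amount) in rows {
        let state = groups.entry(key.to_string()).or_default();
        state.count += 1; // COUNT(*) counts every row, NULL or not
        if let Some(x) = amount {
            state.sum = Some(state.sum.unwrap_or(0.0) + x);
        }
    }

    // Finalize: one synthesized row per group, then apply the HAVING filter.
    let mut out: Vec<(String, u64, Option<f64>)> = groups
        .into_iter()
        .map(|(k, s)| (k, s.count, s.sum))
        .filter(|(_, count, _)| *count >= min_count)
        .collect();

    // ORDER BY key; needed because HashMap iteration order is unspecified.
    out.sort_by(|a, b| a.0.cmp(&b.0));
    out
}
```

Filtering after finalization is what lets HAVING refer to aggregate results, which is why it sits after the hash table stage rather than alongside WHERE.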

Semantics notes:

  • NULL inputs are skipped by all aggregates except COUNT(*) (SQL standard).
  • An ungrouped aggregate over zero rows yields exactly one row (COUNT = 0, other aggregates NULL); a grouped aggregate over zero rows yields zero rows.
  • Non-aggregate SELECT columns that are not in GROUP BY resolve to the first value seen in the group (lenient mode, like SQLite / MySQL with ONLY_FULL_GROUP_BY disabled).
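
The first two notes can be checked with a small sketch. The names `AggRow`, `aggregate_ungrouped`, and `aggregate_grouped` are hypothetical, `Option<i64>` stands in for a nullable column, and the code models only COUNT(*), COUNT(col), and SUM(col); it is not the module's own logic.

```rust
use std::collections::BTreeMap;

/// Hypothetical finalized row: (COUNT(*), COUNT(col), SUM(col)).
type AggRow = (u64, u64, Option<i64>);

/// Ungrouped aggregate: always yields exactly one row, even for empty input.
fn aggregate_ungrouped(col: &[Option<i64>]) -> Vec<AggRow> {
    let count_star = col.len() as u64; // COUNT(*) includes NULL rows
    let non_null: Vec<i64> = col.iter().flatten().copied().collect();
    // SUM over no non-NULL inputs is NULL, not 0.
    let sum: Option<i64> = if non_null.is_empty() {
        None
    } else {
        Some(non_null.iter().sum())
    };
    vec![(count_star, non_null.len() as u64, sum)]
}

/// Grouped aggregate: one row per group seen, so empty input yields zero rows.
fn aggregate_grouped(rows: &[(i64, Option<i64>)]) -> Vec<(i64, AggRow)> {
    let mut groups: BTreeMap<i64, Vec<Option<i64>>> = BTreeMap::new();
    for (key, value) in rows {
        groups.entry(*key).or_default().push(*value);
    }
    groups
        .into_iter()
        .map(|(key, values)| (key, aggregate_ungrouped(&values)[0]))
        .collect()
}
```

The asymmetry follows from where rows come from: the ungrouped case has one implicit group that exists regardless of input, while the grouped case only creates a group when a row produces its key.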

Enums

AggFn
Recognized aggregate functions.

Functions

compare_values
Total ordering across SochValue for grouping/sorting.
execute_aggregate
Execute aggregation over materialized input rows (already WHERE-filtered).
is_aggregate_query
Returns true if the SELECT needs the aggregation operator.
render_expr_name
Human-readable name for an expression, used for output column naming and canonical aggregate keys.