Aggregate Function packages for DataFusion.
This crate contains a collection of aggregate function packages for DataFusion, implemented using the extension API. Users may want to choose which functions are available, both to limit the binary size of their application and to use dialect-specific implementations of functions (e.g. Spark vs. Postgres).
Each package is implemented as a separate module, activated by a feature flag.
§Available Packages
See the list of modules in this crate for available packages.
§Using A Package
You can register all functions in all packages using the register_all function.
Each package also exports an expr_fn submodule to help create Exprs that invoke functions using a fluent style.
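As a rough illustration of what "fluent style" means here, the sketch below models the idea with toy stand-ins built only on the standard library. The Expr enum, col, lit, and the expr_fn module in it are simplified assumptions that borrow DataFusion's names for readability; they are not the crate's real types or signatures. It builds the expression sum(x) > 10 by chaining calls.

```rust
use std::fmt;

/// Toy stand-in for an expression tree (NOT DataFusion's real `Expr`).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(i64),
    Aggregate { name: &'static str, arg: Box<Expr> },
    Gt(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Fluent comparison: builds `self > other`.
    pub fn gt(self, other: Expr) -> Expr {
        Expr::Gt(Box::new(self), Box::new(other))
    }
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            Expr::Column(name) => write!(f, "{}", name),
            Expr::Literal(value) => write!(f, "{}", value),
            Expr::Aggregate { name, arg } => write!(f, "{}({})", name, arg),
            Expr::Gt(left, right) => write!(f, "{} > {}", left, right),
        }
    }
}

/// Toy helper: a reference to a column.
pub fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

/// Toy helper: an integer literal.
pub fn lit(value: i64) -> Expr {
    Expr::Literal(value)
}

/// Hypothetical mirror of a package's `expr_fn` submodule: one free
/// function per aggregate, each returning an expression that invokes it.
pub mod expr_fn {
    use super::Expr;

    pub fn sum(arg: Expr) -> Expr {
        Expr::Aggregate { name: "sum", arg: Box::new(arg) }
    }

    pub fn max(arg: Expr) -> Expr {
        Expr::Aggregate { name: "max", arg: Box::new(arg) }
    }
}

/// Builds `sum(x) > 10` by chaining calls, with no manual tree construction.
pub fn example() -> Expr {
    expr_fn::sum(col("x")).gt(lit(10))
}
```

Calling example().to_string() on this toy model yields sum(x) > 10. The point of the pattern is that callers compose expressions from small functions and methods instead of building enum variants by hand.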
§Implementing A New Package
To add a new package to this crate, follow the model of the existing packages. The high-level steps are:
- Create a new module with the appropriate AggregateUDF implementations.
- Use the macros in the macros module to create standard entry points.
- Add a new feature to Cargo.toml, with any optional dependencies.
- Use the make_package! macro to expose the module when the feature is enabled.
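The steps above can be sketched with a small toy model that uses only the standard library. Everything in it is a hypothetical stand-in: the Accumulator trait, the AggregateFn struct, the boxed helper, the toy_entry_point! macro, and the "product" package are illustrative assumptions, and they do not reproduce the real AggregateUDF API, the crate's actual macros, or make_package!. The sketch only shows the overall shape: a module with an implementation, a macro-generated entry point, and a registration function that collects the enabled packages.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

/// Stand-in for the state an aggregate keeps while consuming rows.
pub trait Accumulator {
    fn update(&mut self, value: f64);
    fn evaluate(&self) -> f64;
}

/// Stand-in for `AggregateUDF`: a name plus a factory for fresh accumulators.
pub struct AggregateFn {
    pub name: &'static str,
    pub new_accumulator: fn() -> Box<dyn Accumulator>,
}

/// Builds a boxed accumulator from any `Default` accumulator type.
pub fn boxed<A: Accumulator + Default + 'static>() -> Box<dyn Accumulator> {
    Box::new(A::default())
}

/// Hypothetical macro (NOT one of the crate's real macros): generates a
/// standard entry-point function that returns a shared handle.
macro_rules! toy_entry_point {
    ($fn_name:ident, $sql_name:expr, $acc:ty) => {
        pub fn $fn_name() -> Arc<AggregateFn> {
            Arc::new(AggregateFn {
                name: $sql_name,
                new_accumulator: boxed::<$acc>,
            })
        }
    };
}

// Step 1: a new module holding the aggregate implementation.
// Steps 3 and 4: in a real crate this module would be compiled only when a
// Cargo feature is enabled (the feature name "product" is hypothetical).
pub mod product {
    use super::{boxed, Accumulator, AggregateFn};
    use std::sync::Arc;

    /// Running product of all values seen so far.
    pub struct ProductAccumulator(f64);

    impl Default for ProductAccumulator {
        fn default() -> Self {
            // The multiplicative identity is the empty product.
            ProductAccumulator(1.0)
        }
    }

    impl Accumulator for ProductAccumulator {
        fn update(&mut self, value: f64) {
            self.0 *= value;
        }

        fn evaluate(&self) -> f64 {
            self.0
        }
    }

    // Step 2: generate the standard entry point with the macro.
    toy_entry_point!(product_udaf, "product", ProductAccumulator);
}

/// Registers every function from every enabled toy package.
pub fn register_all(registry: &mut BTreeMap<&'static str, Arc<AggregateFn>>) {
    for function in vec![product::product_udaf()] {
        registry.insert(function.name, function);
    }
}
```

In this model, a caller invokes register_all on an empty map, looks up "product", and creates an accumulator through the stored factory. Keeping each package in its own module with one entry point is what makes it straightforward to switch packages on and off.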
Modules§
- approx_distinct
  Defines physical expressions that can be evaluated at runtime during query execution
- approx_median
  Defines physical expressions for APPROX_MEDIAN that can be evaluated at runtime during query execution
- approx_percentile_cont
- approx_percentile_cont_with_weight
- array_agg
  ARRAY_AGG aggregate implementation: ArrayAgg
- average
  Defines Avg & Mean aggregate & accumulators
- bit_and_or_xor
  Defines BitAnd, BitOr, BitXor and BitXor DISTINCT aggregate accumulators
- bool_and_or
  Defines physical expressions that can be evaluated at runtime during query execution
- correlation
  Correlation: correlation sample aggregations.
- count
- covariance
  CovarianceSample: covariance sample aggregations.
- expr_fn
  Fluent-style API for creating Exprs
- first_last
  Defines the FIRST_VALUE/LAST_VALUE aggregations.
- grouping
  Defines physical expressions that can be evaluated at runtime during query execution
- hyperloglog
  HyperLogLog
- macros
- median
- min_max
  Max and MaxAccumulator accumulator for the max function; Min and MinAccumulator accumulator for the min function
- nth_value
  Defines the NTH_VALUE aggregate expression, which may specify an ordering requirement, that can be evaluated at runtime during query execution
- planner
  SQL planning extensions like AggregateFunctionPlanner
- regr
  Defines physical expressions that can be evaluated at runtime during query execution
- stddev
  Defines physical expressions that can be evaluated at runtime during query execution
- string_agg
  StringAgg accumulator for the string_agg function
- sum
  Defines SUM and SUM DISTINCT aggregate accumulators
- variance
  VarianceSample: variance sample aggregations. VariancePopulation: variance population aggregations.
Macros§
Functions§
- all_default_aggregate_functions
  Returns all default aggregate functions
- register_all
  Registers all enabled packages with a FunctionRegistry