Aggregate Function packages for DataFusion.
This crate contains a collection of aggregate function packages for DataFusion, implemented using the extension API. Users may wish to choose which functions are available, both to limit the binary size of their application and to use dialect-specific implementations of functions (e.g. Spark vs. Postgres).
Each package is implemented as a separate module, activated by a feature flag.
§Available Packages
See the list of modules in this crate for available packages.
§Using A Package
You can register all functions in all packages using the register_all function.
Each package also exports an expr_fn submodule to help create Exprs that invoke functions using a fluent style.
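The fluent style can be pictured with a small, self-contained sketch. Everything in it (DemoExpr, col, and this stand-in expr_fn module) is a hypothetical model built on the standard library only, not DataFusion's actual Expr type or this crate's real expr_fn module. The real function names and signatures should be checked against the expr_fn module documentation.

```rust
/// Stand-in for an expression tree (hypothetical; not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
pub enum DemoExpr {
    /// A reference to a named column.
    Column(String),
    /// A call to an aggregate function.
    Aggregate {
        func: &'static str,
        args: Vec<DemoExpr>,
        distinct: bool,
    },
}

impl DemoExpr {
    /// Marks an aggregate call as DISTINCT; other expressions are returned unchanged.
    pub fn distinct(self) -> DemoExpr {
        match self {
            DemoExpr::Aggregate { func, args, .. } => DemoExpr::Aggregate {
                func,
                args,
                distinct: true,
            },
            other => other,
        }
    }

    /// Renders the expression as SQL-like text, for inspection only.
    pub fn to_sql(&self) -> String {
        match self {
            DemoExpr::Column(name) => name.clone(),
            DemoExpr::Aggregate { func, args, distinct } => {
                let args: Vec<String> = args.iter().map(DemoExpr::to_sql).collect();
                let prefix = if *distinct { "DISTINCT " } else { "" };
                format!("{}({}{})", func, prefix, args.join(", "))
            }
        }
    }
}

/// Builds a column reference (hypothetical helper).
pub fn col(name: &str) -> DemoExpr {
    DemoExpr::Column(name.to_string())
}

/// Stand-in for a package's `expr_fn` submodule: one builder per function.
pub mod expr_fn {
    use super::DemoExpr;

    /// Builds an expression that invokes the `sum` aggregate.
    pub fn sum(arg: DemoExpr) -> DemoExpr {
        DemoExpr::Aggregate { func: "sum", args: vec![arg], distinct: false }
    }

    /// Builds an expression that invokes the `count` aggregate.
    pub fn count(arg: DemoExpr) -> DemoExpr {
        DemoExpr::Aggregate { func: "count", args: vec![arg], distinct: false }
    }
}
```

With these stand-ins, `expr_fn::sum(col("a"))` builds the call as a value, and `expr_fn::count(col("b")).distinct()` chains a modifier onto it, which is the sense in which the style is fluent.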
§Implementing A New Package
To add a new package to this crate, follow the model of existing packages. The high-level steps are:
- Create a new module with the appropriate AggregateUDF implementations.
- Use the macros in macros to create standard entry points.
- Add a new feature to Cargo.toml, with any optional dependencies.
- Use the make_package! macro to expose the module when the feature is enabled.
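The steps above can be sketched with a standard-library-only model of the pattern: one module per package, a macro-generated entry point, and a function that registers every compiled-in package. All names here (DemoAggregate, DemoRegistry, demo_entry_point!, demo_register_all) are hypothetical stand-ins, not this crate's real AggregateUDF, FunctionRegistry, or macros. Feature gating is shown only in comments, since it depends on Cargo.toml.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for an aggregate UDF (hypothetical; the real `AggregateUDF`
/// carries signatures, accumulators, and more).
pub struct DemoAggregate {
    pub name: &'static str,
    pub eval: fn(&[f64]) -> f64,
}

/// Stand-in for a function registry keyed by function name (hypothetical).
#[derive(Default)]
pub struct DemoRegistry {
    functions: HashMap<&'static str, Arc<DemoAggregate>>,
}

impl DemoRegistry {
    pub fn register(&mut self, udaf: Arc<DemoAggregate>) {
        self.functions.insert(udaf.name, udaf);
    }

    pub fn get(&self, name: &str) -> Option<Arc<DemoAggregate>> {
        self.functions.get(name).cloned()
    }

    pub fn len(&self) -> usize {
        self.functions.len()
    }
}

/// Hypothetical macro that generates a standard entry point: a function
/// returning the UDF instance for a package.
macro_rules! demo_entry_point {
    ($entry:ident, $name:literal, $eval:expr) => {
        pub fn $entry() -> Arc<DemoAggregate> {
            Arc::new(DemoAggregate { name: $name, eval: $eval })
        }
    };
}

// In a real crate this module would sit behind an attribute such as
// `#[cfg(feature = "demo_sum")]`, with the feature declared in Cargo.toml.
pub mod demo_sum {
    use super::DemoAggregate;
    use std::sync::Arc;

    fn sum_impl(values: &[f64]) -> f64 {
        values.iter().sum()
    }

    demo_entry_point!(sum_udaf, "demo_sum", sum_impl);
}

pub mod demo_max {
    use super::DemoAggregate;
    use std::sync::Arc;

    fn max_impl(values: &[f64]) -> f64 {
        values.iter().copied().fold(f64::NEG_INFINITY, f64::max)
    }

    demo_entry_point!(max_udaf, "demo_max", max_impl);
}

/// Registers every package that is compiled in. With feature gating, each
/// entry would be included only when its feature is enabled.
pub fn demo_register_all(registry: &mut DemoRegistry) {
    for udaf in [demo_sum::sum_udaf(), demo_max::max_udaf()] {
        registry.register(udaf);
    }
}
```

Keeping each package behind its own entry point means the registration function is the only place that needs to know which packages exist, so adding a package touches one module and one list.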
Modules§
- approx_distinct 
- Defines physical expressions that can be evaluated at runtime during query execution
- approx_median
- Defines physical expressions for APPROX_MEDIAN that can be evaluated at runtime during query execution
- approx_percentile_cont
- approx_percentile_cont_with_weight
- array_agg
- ARRAY_AGG aggregate implementation: ArrayAgg
- average
- Defines Avg & Mean aggregate & accumulators
- bit_and_or_xor
- Defines BitAnd, BitOr, BitXor and BitXor DISTINCT aggregate accumulators
- bool_and_or
- Defines physical expressions that can be evaluated at runtime during query execution
- correlation
- Correlation: correlation sample aggregations.
- count
- covariance
- CovarianceSample: covariance sample aggregations.
- expr_fn
- Fluent-style API for creating Exprs
- first_last 
- Defines the FIRST_VALUE/LAST_VALUE aggregations.
- grouping
- Defines physical expressions that can be evaluated at runtime during query execution
- hyperloglog
- HyperLogLog
- macros
- median
- min_max
- Max and MaxAccumulator accumulator for the max function; Min and MinAccumulator accumulator for the min function
- nth_value
- Defines NTH_VALUE aggregate expression, which may specify an ordering requirement, that can be evaluated at runtime during query execution
- planner
- SQL planning extensions like AggregateFunctionPlanner
- regr
- Defines physical expressions that can be evaluated at runtime during query execution
- stddev
- Defines physical expressions that can be evaluated at runtime during query execution
- string_agg
- StringAgg accumulator for the string_agg function
- sum
- Defines SUM and SUM DISTINCT aggregate accumulators
- variance
- VarianceSample: variance sample aggregations. VariancePopulation: variance population aggregations.
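The difference between the sample and population variants listed for variance can be illustrated with a small streaming accumulator. This is a sketch using Welford's online algorithm, which is an assumption chosen for illustration and not necessarily what this crate implements. DemoVarianceAccumulator and its methods are hypothetical names.

```rust
/// Hypothetical streaming variance accumulator using Welford's algorithm.
#[derive(Debug, Default, Clone)]
pub struct DemoVarianceAccumulator {
    count: u64,
    mean: f64,
    /// Running sum of squared deviations from the current mean.
    m2: f64,
}

impl DemoVarianceAccumulator {
    /// Folds one value into the running state.
    pub fn update(&mut self, value: f64) {
        self.count += 1;
        let delta = value - self.mean;
        self.mean += delta / self.count as f64;
        self.m2 += delta * (value - self.mean);
    }

    /// Sample variance: divides by n - 1, undefined for fewer than two values.
    pub fn sample(&self) -> Option<f64> {
        if self.count < 2 {
            None
        } else {
            Some(self.m2 / (self.count - 1) as f64)
        }
    }

    /// Population variance: divides by n, undefined for zero values.
    pub fn population(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.m2 / self.count as f64)
        }
    }
}
```

The two results share the same running state and differ only in the final divisor, which is why a single accumulator can back both aggregations.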
Macros§
Functions§
- all_default_aggregate_functions
- Returns all default aggregate functions
- register_all 
- Registers all enabled packages with a FunctionRegistry