Trait datafusion_expr::AggregateUDFImpl

source ·
pub trait AggregateUDFImpl: Debug + Send + Sync {
    // Required methods
    fn as_any(&self) -> &dyn Any;
    fn name(&self) -> &str;
    fn signature(&self) -> &Signature;
    fn return_type(&self, arg_types: &[DataType]) -> Result<DataType>;
    fn accumulator(&self, arg: &DataType) -> Result<Box<dyn Accumulator>>;
    fn state_type(&self, return_type: &DataType) -> Result<Vec<DataType>>;

    // Provided methods
    fn groups_accumulator_supported(&self) -> bool { ... }
    fn create_groups_accumulator(&self) -> Result<Box<dyn GroupsAccumulator>> { ... }
    fn aliases(&self) -> &[String] { ... }
}
Expand description

Trait for implementing AggregateUDF.

This trait exposes the full API for implementing user defined aggregate functions and can be used to implement any function.

See advanced_udaf.rs for a full example with complete implementation and AggregateUDF for other available options.

§Basic Example

#[derive(Debug, Clone)]
struct GeoMeanUdf {
  signature: Signature
};

impl GeoMeanUdf {
  fn new() -> Self {
    Self {
      signature: Signature::uniform(1, vec![DataType::Float64], Volatility::Immutable)
     }
  }
}

/// Implement the AggregateUDFImpl trait for GeoMeanUdf
impl AggregateUDFImpl for GeoMeanUdf {
   fn as_any(&self) -> &dyn Any { self }
   fn name(&self) -> &str { "geo_mean" }
   fn signature(&self) -> &Signature { &self.signature }
   fn return_type(&self, args: &[DataType]) -> Result<DataType> {
     if !matches!(args.get(0), Some(&DataType::Float64)) {
       return plan_err!("add_one only accepts Float64 arguments");
     }
     Ok(DataType::Float64)
   }
   // This is the accumulator factory; DataFusion uses it to create new accumulators.
   fn accumulator(&self, _arg: &DataType) -> Result<Box<dyn Accumulator>> { unimplemented!() }
   fn state_type(&self, _return_type: &DataType) -> Result<Vec<DataType>> {
       Ok(vec![DataType::Float64, DataType::UInt32])
   }
}

// Create a new AggregateUDF from the implementation
let geometric_mean = AggregateUDF::from(GeoMeanUdf::new());

// Call the function `geo_mean(col)`
let expr = geometric_mean.call(vec![col("a")]);

Required Methods§

source

fn as_any(&self) -> &dyn Any

Returns this object as an Any trait object

source

fn name(&self) -> &str

Returns this function’s name

source

fn signature(&self) -> &Signature

Returns the function’s Signature for information about what input types are accepted and the function’s Volatility.

source

fn return_type(&self, arg_types: &[DataType]) -> Result<DataType>

What DataType will be returned by this function, given the types of the arguments

source

fn accumulator(&self, arg: &DataType) -> Result<Box<dyn Accumulator>>

Return a new Accumulator that aggregates values for a specific group during query execution.

source

fn state_type(&self, return_type: &DataType) -> Result<Vec<DataType>>

Return the type used to serialize the Accumulator’s intermediate state. See Accumulator::state() for more details

Provided Methods§

source

fn groups_accumulator_supported(&self) -> bool

If the aggregate expression has a specialized GroupsAccumulator implementation. If this returns true, [Self::create_groups_accumulator] will be called.

source

fn create_groups_accumulator(&self) -> Result<Box<dyn GroupsAccumulator>>

Return a specialized GroupsAccumulator that manages state for all groups.

For maximum performance, a GroupsAccumulator should be implemented in addition to Accumulator.

source

fn aliases(&self) -> &[String]

Returns any aliases (alternate names) for this function.

Note: aliases should only include names other than Self::name. Defaults to [] (no aliases)

Implementors§