Trait udf::BasicUdf

pub trait BasicUdf: Sized {
    type Returns<'a>
       where Self: 'a;

    // Required methods
    fn init(
        cfg: &UdfCfg<Init>,
        args: &ArgList<'_, Init>
    ) -> Result<Self, String>;
    fn process<'a>(
        &'a mut self,
        cfg: &UdfCfg<Process>,
        args: &ArgList<'_, Process>,
        error: Option<NonZeroU8>
    ) -> Result<Self::Returns<'a>, ProcessError>;
}

This trait specifies the functions needed for a standard (non-aggregate) UDF.

Implement this on any struct in order to create a UDF. That struct can either be empty (usually the case for simple functions), or contain data that will be shared among all the UDF functions.

If the UDF is basic (non-aggregate), the process is:

  • Caller (SQL server) calls init() with basic argument information
  • init() function (defined here) validates the arguments, performs any configuration that is needed, and constructs and returns the Self struct
  • For each row, the caller calls process(...) with the relevant arguments
  • process() function (defined here) accepts an instance of self (created during init) and updates it as needed, and produces a result for that row
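
The call order above can be sketched without the real crate. The block below is a minimal illustration only: `MiniUdf`, `MiniError`, `SumArgs`, and `run_query` are hypothetical stand-ins (no `UdfCfg`, no `ArgList`, no error flag), and `run_query` merely plays the role of the SQL server.

```rust
/// Stand-in for the crate's error type; an assumption for this sketch.
#[derive(Debug)]
struct MiniError;

/// Simplified stand-in for the real trait: arguments are plain i64 slices.
trait MiniUdf: Sized {
    type Returns<'a>
    where
        Self: 'a;

    fn init(arg_count: usize) -> Result<Self, String>;
    fn process<'a>(&'a mut self, args: &[i64]) -> Result<Self::Returns<'a>, MiniError>;
}

/// Hypothetical UDF that sums its arguments and counts the rows it has seen.
struct SumArgs {
    rows_seen: u64,
}

impl MiniUdf for SumArgs {
    type Returns<'a> = i64;

    // Called once: validate, then build the state shared by later calls.
    fn init(arg_count: usize) -> Result<Self, String> {
        if arg_count == 0 {
            return Err("sum_args needs at least one argument".to_owned());
        }
        Ok(SumArgs { rows_seen: 0 })
    }

    // Called once per row: update the state and produce that row's result.
    fn process<'a>(&'a mut self, args: &[i64]) -> Result<Self::Returns<'a>, MiniError> {
        self.rows_seen += 1;
        Ok(args.iter().sum())
    }
}

/// Plays the role of the SQL server: one init() call, then process() per row.
fn run_query(rows: &[Vec<i64>]) -> Result<(Vec<i64>, u64), String> {
    let arg_count = rows.first().map_or(0, |r| r.len());
    let mut udf = SumArgs::init(arg_count)?;
    let mut out = Vec::new();
    for row in rows {
        let value = udf.process(row).map_err(|_| "process failed".to_owned())?;
        out.push(value);
    }
    // Dropping `udf` at the end of this function stands in for deinit().
    Ok((out, udf.rows_seen))
}
```

In this sketch the struct is created exactly once and then mutated across rows, which is why `process` takes `&mut self`.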

The UDF specification also calls out a deinit() function to deallocate any memory, but this is not needed here (handled by this wrapper).

Required Associated Types§


type Returns<'a> where Self: 'a

This type represents the return type of the UDF function.

There are a lot of options, with some rules to follow. Warning! tedious explanation below, just skip to the next section if you don’t need the details.

  • f64 (real), i64 (integer), and [u8] (string/blob) are the three fundamental types
  • Any return type can be wrapped as Option<something> if the result is potentially nullable
  • There is no meaningful difference between String, Vec<u8>, str, and [u8] - return whichever is most convenient (following the below rules). Any of these types is acceptable for returning string or decimal types.
  • Out of these buffer options, prefer returning &'static str or &'static [u8] where possible. These are usable when only returning const/static values.
  • “Owned allocated” types (String, Vec<u8>) are the next preference for buffer types, and can be used whenever
  • If you have an owned type that updates itself, you can store the relevant String or Vec<u8> in your struct and return a &'a str or &'a [u8] that references them. This is useful for something like a concat function that updates its result string with each call (GATs allow this to work).
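
As a rough illustration of the rules above, the sketch below uses a hypothetical stand-in trait, `Produces`, that has the same generic associated type shape as `Returns<'a>` but is not part of the crate. The four structs are invented for the example and show a static, an owned, a self-updating, and a nullable return type.

```rust
/// Stand-in trait (an assumption) mirroring only the `Returns<'a>` GAT.
trait Produces {
    type Returns<'a>
    where
        Self: 'a;

    fn process<'a>(&'a mut self, input: &str) -> Self::Returns<'a>;
}

/// Static value: nothing is allocated, so `&'static str` is enough.
struct Greeting;

impl Produces for Greeting {
    type Returns<'a> = &'static str;

    fn process<'a>(&'a mut self, _input: &str) -> Self::Returns<'a> {
        "hello"
    }
}

/// Owned value: each call builds a fresh, independent String.
struct Upper;

impl Produces for Upper {
    type Returns<'a> = String;

    fn process<'a>(&'a mut self, input: &str) -> Self::Returns<'a> {
        input.to_uppercase()
    }
}

/// Self-updating value: the buffer lives in the struct and the return
/// value borrows from it for the lifetime 'a.
struct Concat {
    buf: String,
}

impl Produces for Concat {
    type Returns<'a> = &'a str;

    fn process<'a>(&'a mut self, input: &str) -> Self::Returns<'a> {
        self.buf.push_str(input);
        &self.buf
    }
}

/// Nullable value: `None` stands for SQL NULL.
struct ParseInt;

impl Produces for ParseInt {
    type Returns<'a> = Option<i64>;

    fn process<'a>(&'a mut self, input: &str) -> Self::Returns<'a> {
        input.trim().parse().ok()
    }
}
```

The `Concat` case is the one that needs the lifetime parameter: without `Returns<'a>` being generic over `'a`, the borrowed buffer could not be returned.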

Choosing a type may seem tricky at first, but anything that successfully compiles will likely work. The flow chart below helps clarify some of the decision making:

    Desired                Use Option<T> if the result may be null
  Return Type

╭─────────────╮
│   integer   ├─> i64 / Option<i64>
╰─────────────╯
╭─────────────╮
│    float    ├─> f64 / Option<f64>
╰─────────────╯
                  ╭───────────╮
╭─────────────╮   │  static   ├─> &'static str / Option<&'static str>
│ utf8 string ├─> │           │
╰─────────────╯   │           │   ╭───────────────╮
                  │  dynamic  ├─> │  independent  ├─> String / Option<String>
                  ╰───────────╯   │               │
                                  │ self-updating ├─> &'a str / Option<&'a str>
                                  ╰───────────────╯
╭─────────────╮   ╭───────────╮
│  non utf8   │   │  static   ├─> &'static [u8] / Option<&'static [u8]>
│ string/blob ├─> │           │
╰─────────────╯   │           │   ╭───────────────╮
                  │  dynamic  ├─> │  independent  ├─> Vec<u8> / Option<Vec<u8>>
                  ╰───────────╯   │               │
                                  │ self-updating ├─> &'a [u8] / Option<&'a [u8]>
                                  ╰───────────────╯

Required Methods§


fn init(cfg: &UdfCfg<Init>, args: &ArgList<'_, Init>) -> Result<Self, String>

This is the initialization function

It is expected that this function do the following:

  • Check that arguments are the proper type
  • Check whether the arguments are const and have a usable value (can provide some optimizations)

If your function is not able to work with the given arguments, return a helpful error message explaining why. The maximum error size is MYSQL_ERRMSG_SIZE (512) bytes; longer messages are truncated.

MySQL recommends keeping these error messages under 80 characters to fit in a terminal, but personally I’d prefer a helpful message over something useless that fits in one line.

Error handling options are limited in all other functions, so make sure you check thoroughly for any possible errors that may arise, to the best of your ability. These may include:

  • Incorrect argument quantity or position
  • Incorrect argument types
  • Values that are maybe_null() when you cannot accept them
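
The checks listed above might look roughly like the following. This is a sketch under stated assumptions: `ArgKind`, `ArgInfo`, and `validate_args` are hypothetical stand-ins for whatever the real `ArgList` exposes at init time, and the "exactly two non-null integers" rule is invented for the example.

```rust
/// Hypothetical, simplified type tag for one argument.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ArgKind {
    Int,
    Real,
    Str,
}

/// Hypothetical description of one argument as seen at init time.
struct ArgInfo {
    kind: ArgKind,
    maybe_null: bool,
}

/// Validates arguments for an imaginary UDF that needs two non-null integers.
/// Each failure returns a message that says what was expected and what was found.
fn validate_args(args: &[ArgInfo]) -> Result<(), String> {
    // Incorrect argument quantity
    if args.len() != 2 {
        return Err(format!("expected 2 arguments (int, int); got {}", args.len()));
    }
    for (i, arg) in args.iter().enumerate() {
        // Incorrect argument type
        if arg.kind != ArgKind::Int {
            return Err(format!(
                "argument {} must be an integer; got {:?}",
                i + 1,
                arg.kind
            ));
        }
        // Nullable value that this function cannot accept
        if arg.maybe_null {
            return Err(format!(
                "argument {} may be NULL, which is not supported",
                i + 1
            ));
        }
    }
    Ok(())
}
```

Returning `Result<(), String>` here mirrors the `String` error of `init`, so an `init` implementation could forward the message with `?` before constructing `Self`.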

fn process<'a>( &'a mut self, cfg: &UdfCfg<Process>, args: &ArgList<'_, Process>, error: Option<NonZeroU8> ) -> Result<Self::Returns<'a>, ProcessError>

Process the actual values and return a result

If you are unfamiliar with Rust, don’t worry too much about the 'a you see thrown around a lot. It is a lifetime annotation and more or less says, “self lives at least as long as my return type does so I can return a reference to it, but args may not last as long so I cannot return a reference to that”.

  • args: Iterable list of arguments of the Process type
  • error: This is only applicable when using aggregate functions and can otherwise be ignored. If using aggregate functions, this provides the current error value as described in AggregateUdf::add().

Return Value§

Assuming success, this function must return something of type Self::Returns. This will be the value for the row (standard functions) or for the entire group (aggregate functions).


If there is some sort of unrecoverable problem at this point, just return a ProcessError. This will make the SQL server return NULL. As mentioned, there really aren’t any good error handling options at this point other than that, so try to catch all possible errors in BasicUdf::init.

ProcessError is just an empty type.
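
To make the two outcomes concrete, here is a small sketch that does not use the crate itself. `ProcessError` below is a local stand-in for the crate's empty type, and `add_row` and `to_sql_value` are hypothetical helpers; the second one only imitates the "error becomes NULL" behavior described above.

```rust
/// Local stand-in for the crate's empty `ProcessError` type.
#[derive(Debug, PartialEq)]
struct ProcessError;

/// Hypothetical per-row function: a NULL input gives a NULL result,
/// while integer overflow is treated as an unrecoverable problem.
fn add_row(a: Option<i64>, b: Option<i64>) -> Result<Option<i64>, ProcessError> {
    match (a, b) {
        (Some(x), Some(y)) => x.checked_add(y).map(Some).ok_or(ProcessError),
        _ => Ok(None),
    }
}

/// Imitates what the caller ends up seeing: an error collapses to NULL.
fn to_sql_value(res: Result<Option<i64>, ProcessError>) -> Option<i64> {
    res.unwrap_or(None)
}
```

Because the error carries no message, a legitimate NULL and a failure look the same to the SQL user in this sketch, which is one more reason to reject bad inputs in `init` where a message can be returned.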

Object Safety§

This trait is not object safe.