Attribute Macro multiversion::multiversion

#[multiversion]

Provides function multiversioning.

Versions are tried in order, and the first one whose target matches is called. The function tagged by the attribute is the generic implementation, which does not require any specific architecture or features.

Helper attributes

  • #[clone]
    • Clones the function for the specified target.
    • Arguments:
      • target: the target specification of the clone
  • #[specialize]
    • Specializes the function for the specified target with another function.
    • Arguments:
      • target: the target specification of the specialization
      • fn: path to the function specializing the tagged function
      • unsafe (optional): indicates that the specialization function is unsafe, but safe to call for this target. Functions tagged with the target attribute must be unsafe, so setting unsafe = true asserts that the safety contract is fulfilled and the function is safe to call on the specified target. If the function is unsafe for any other reason, remember to mark the tagged function unsafe as well.
  • #[crate_path]
    • Specifies the location of the multiversion crate (useful for re-exporting).
    • Arguments:
      • path: the path to the multiversion crate

Examples

Cloning

The following compiles square three times, once for each target and once for the generic target. Calling square selects the appropriate version at runtime.

use multiversion::multiversion;

#[multiversion]
#[clone(target = "[x86|x86_64]+avx")]
#[clone(target = "x86+sse")]
fn square(x: &mut [f32]) {
    for v in x {
        *v *= *v
    }
}

Specialization

This example creates a function where_am_i that prints the name of the CPU feature it was dispatched for, or "generic" if no specialization matched.

use multiversion::multiversion;

fn where_am_i_avx() {
    println!("avx");
}

fn where_am_i_sse() {
    println!("sse");
}

fn where_am_i_neon() {
    println!("neon");
}

#[multiversion]
#[specialize(target = "[x86|x86_64]+avx", fn = "where_am_i_avx")]
#[specialize(target = "x86+sse", fn = "where_am_i_sse")]
#[specialize(target = "[arm|aarch64]+neon", fn = "where_am_i_neon")]
fn where_am_i() {
    println!("generic");
}

Making target_feature functions safe

This example is the same as the one above, but the specialized functions are unsafe functions tagged with the target attribute. Note that the where_am_i function is still safe, because each specialized function is only called on CPUs that support its target.

use multiversion::{multiversion, target};

#[target("[x86|x86_64]+avx")]
unsafe fn where_am_i_avx() {
    println!("avx");
}

#[target("x86+sse")]
unsafe fn where_am_i_sse() {
    println!("sse");
}

#[target("[arm|aarch64]+neon")]
unsafe fn where_am_i_neon() {
    println!("neon");
}

#[multiversion]
#[specialize(target = "[x86|x86_64]+avx", fn = "where_am_i_avx", unsafe = true)]
#[specialize(target = "x86+sse", fn = "where_am_i_sse", unsafe = true)]
#[specialize(target = "[arm|aarch64]+neon", fn = "where_am_i_neon", unsafe = true)]
fn where_am_i() {
    println!("generic");
}

Static dispatching

The multiversion attribute allows functions called inside the function to be statically dispatched. Additionally, functions created with this attribute can themselves be statically dispatched. See static dispatching for more information.

Conditional compilation

The multiversion attribute supports conditional compilation with the #[target_cfg] helper attribute. See conditional compilation for more information.

Function name mangling

The functions created by this macro are mangled as {ident}_{features}_version, where ident is the name of the multiversioned function, and features is either default (for the default version with no features enabled) or the list of features, sorted alphabetically. Dots (.) in the feature names are removed.

The following creates two functions, foo_avx_sse41_version and foo_default_version.

#[multiversion::multiversion]
#[clone(target = "[x86|x86_64]+sse4.1+avx")]
fn foo() {}

#[multiversion::target("[x86|x86_64]+sse4.1+avx")]
unsafe fn call_foo_avx() {
    foo_avx_sse41_version();
}

fn call_foo_default() {
    foo_default_version();
}
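
The naming rule can be sketched as a small helper. This only illustrates the rule as described above and is not the macro's actual code: mangled_name is a hypothetical function, joining the features with underscores is inferred from the foo_avx_sse41_version example, and removing dots before sorting is an assumption.

```rust
/// Hypothetical helper that builds `{ident}_{features}_version`.
/// An empty feature list stands for the default version.
fn mangled_name(ident: &str, features: &[&str]) -> String {
    if features.is_empty() {
        return format!("{}_default_version", ident);
    }
    // Remove dots from each feature name (e.g. "sse4.1" becomes "sse41").
    let mut cleaned: Vec<String> = features.iter().map(|f| f.replace('.', "")).collect();
    // Sort alphabetically so the name does not depend on the order in the target string.
    cleaned.sort();
    format!("{}_{}_version", ident, cleaned.join("_"))
}
```

With this helper, the features sse4.1 and avx for foo give foo_avx_sse41_version regardless of the order in which they are listed.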

Implementation details

The function version dispatcher consists of a function selector and an atomic function pointer. Initially, the function pointer points to the function selector. On the first invocation, the selector chooses an implementation, stores a pointer to it in the atomic function pointer for later use, and then passes control to the chosen function. Subsequent calls go directly to the chosen function without invoking the function selector.

Some comments on the benefits of this implementation:

  • The function selector is normally invoked only once. Subsequent calls are reduced to an atomic load and an indirect function call (for non-generic, non-async functions). Generic and async functions cannot be stored in the atomic function pointer, which may result in additional branches.
  • Calls from multiple threads do not contend. Two threads may call the same function before function selection has completed, in which case each thread invokes the function selector, but the atomic ensures that these are synchronized correctly.
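
The dispatch scheme described above can be sketched by hand using only the standard library. This is a simplified illustration and not the code the macro generates: every name here (SquareFn, SQUARE_PTR, square_selector, and so on) is hypothetical, only an AVX version is provided, and the Relaxed memory ordering is an assumption.

```rust
use std::mem;
use std::sync::atomic::{AtomicPtr, Ordering};

// Signature shared by every version of the (hypothetical) function.
type SquareFn = fn(&mut [f32]);

// Generic version: no architecture or feature requirements.
fn square_default(x: &mut [f32]) {
    for v in x.iter_mut() {
        *v *= *v;
    }
}

// AVX version, only compiled on x86 and x86_64.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx")]
unsafe fn square_avx(x: &mut [f32]) {
    for v in x.iter_mut() {
        *v *= *v;
    }
}

// Safe wrapper so the AVX version fits `SquareFn`.
// It must only be handed out after AVX has been detected.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn square_avx_wrapper(x: &mut [f32]) {
    unsafe { square_avx(x) }
}

// Chooses the first version whose target matches the running CPU.
fn select() -> SquareFn {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx") {
            return square_avx_wrapper;
        }
    }
    square_default
}

// The atomic function pointer initially points to the selector.
static SQUARE_PTR: AtomicPtr<()> = AtomicPtr::new(square_selector as SquareFn as *mut ());

// The selector: picks a version, caches it, then passes control to it.
fn square_selector(x: &mut [f32]) {
    let chosen = select();
    SQUARE_PTR.store(chosen as *mut (), Ordering::Relaxed);
    chosen(x)
}

// Public entry point: an atomic load followed by an indirect call.
pub fn square(x: &mut [f32]) {
    let ptr = SQUARE_PTR.load(Ordering::Relaxed);
    // SAFETY: SQUARE_PTR only ever holds pointers created from `SquareFn` values.
    let f: SquareFn = unsafe { mem::transmute::<*mut (), SquareFn>(ptr) };
    f(x)
}
```

The first call to square goes through square_selector; later calls load the cached pointer and jump straight to the chosen version. If two threads race on the first call, both run the selector and store the same result, so the duplicate store is harmless.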