Module tune

Autotune module

§Autotuning

Autotuning benchmarks several candidate kernels, or one kernel under different comptime parameters, and selects the fastest one for a given input. Kernels must implement TuneFn (see below).

§Example

#[derive(AutotuneKey)]
struct KernelKey {
    size: u32
}

fn run_kernel_tuned(lhs: Tensor, rhs: Tensor) -> Tensor {
    static TUNER: LocalTuner<String, KernelKey> = local_tuner!();
     
    let tunables = TUNER.init(|| {
        TunableSet::new(KernelKey::new, |_key, lhs, rhs| (lhs.clone(), rhs.clone()))
            .with(Tunable::new(kernel_1))
            .with(Tunable::new(kernel_2.ok()))
            .with(Tunable::new(kernel_3))
    });
    
    // No trailing semicolon: the output of the selected kernel is the return value.
    TUNER.execute("hello".to_string(), &lhs.client, &tunables, (lhs, rhs))
}
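
To make the idea concrete, the following is a minimal, self-contained sketch of what an autotuner does conceptually: benchmark every candidate once per key, cache the winner, and reuse it afterwards. It is not this module's implementation. `ToyTuner`, `slow_sum`, and `fast_sum` are hypothetical names, the key is a plain hashable value, and timing uses a single wall-clock run rather than a proper benchmark.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

/// Hypothetical toy tuner: remembers, per key, the index of the fastest candidate.
struct ToyTuner<K> {
    cache: HashMap<K, usize>,
}

impl<K: Hash + Eq> ToyTuner<K> {
    fn new() -> Self {
        Self { cache: HashMap::new() }
    }

    /// Runs the cached winner for `key`. On a cache miss, every candidate is
    /// timed once on a clone of the input and the fastest index is stored.
    /// Assumes `candidates` is non-empty.
    fn execute<I: Clone, O>(&mut self, key: K, candidates: &[fn(I) -> O], input: I) -> O {
        let cached = self.cache.get(&key).copied();
        let index = match cached {
            Some(index) => index,
            None => {
                let mut best = (0, Duration::MAX);
                for (i, candidate) in candidates.iter().enumerate() {
                    let start = Instant::now();
                    let _ = candidate(input.clone());
                    let elapsed = start.elapsed();
                    if elapsed < best.1 {
                        best = (i, elapsed);
                    }
                }
                self.cache.insert(key, best.0);
                best.0
            }
        };
        candidates[index](input)
    }
}

/// Deliberately slow candidate, so the benchmark has a clear loser.
fn slow_sum(values: Vec<u64>) -> u64 {
    std::thread::sleep(Duration::from_millis(20));
    values.iter().sum()
}

/// Fast candidate computing the same result.
fn fast_sum(values: Vec<u64>) -> u64 {
    values.iter().sum()
}
```

Inputs are cloned for each benchmark run, which is why the real API asks for cloneable inputs; the final call consumes the original input.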

§Tunable

TuneFn is implemented automatically for all functions and closures that take a set of cloneable inputs and return a Result<Out, impl Into<AutotuneError>>. If a kernel does not return a Result, call kernel_fn.ok() to wrap its output in Ok and turn it into a tunable.
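
The `.ok()` adapter can be pictured with a small stand-alone sketch. Everything here is hypothetical and simplified to one-argument functions: `ToyError` stands in for the error type, `ToyTune` for the tunable trait, and `OkMap` for the wrapper that `.ok()` produces. The crate's real traits carry more generics than this.

```rust
/// Hypothetical stand-in for an autotune error type.
#[derive(Debug, PartialEq)]
struct ToyError(String);

/// Hypothetical minimal tunable interface: run on an input, possibly fail.
trait ToyTune<A, O> {
    fn run(&self, input: A) -> Result<O, ToyError>;
}

/// Functions that already return `Result` are usable directly.
impl<A, O, F: Fn(A) -> Result<O, ToyError>> ToyTune<A, O> for F {
    fn run(&self, input: A) -> Result<O, ToyError> {
        (self)(input)
    }
}

/// Wrapper produced by `.ok()`: holds an infallible function.
struct OkMap<F>(F);

/// The wrapper runs the inner function and wraps its output in `Ok`.
impl<A, O, F: Fn(A) -> O> ToyTune<A, O> for OkMap<F> {
    fn run(&self, input: A) -> Result<O, ToyError> {
        Ok((self.0)(input))
    }
}

/// Extension trait that adds `.ok()` to every one-argument function.
trait OkExt<A, O>: Fn(A) -> O + Sized {
    fn ok(self) -> OkMap<Self> {
        OkMap(self)
    }
}

impl<A, O, F: Fn(A) -> O> OkExt<A, O> for F {}

/// Infallible kernel stand-in: needs `.ok()` before it can be tuned.
fn double(x: u32) -> u32 {
    x * 2
}

/// Fallible kernel stand-in: usable as is.
fn checked_half(x: u32) -> Result<u32, ToyError> {
    if x % 2 == 0 {
        Ok(x / 2)
    } else {
        Err(ToyError("odd input".to_string()))
    }
}
```

With these definitions, `double.ok().run(4)` and `checked_half.run(4)` go through the same `ToyTune::run` interface, which is the point of the adapter.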

§Implementation details

Several patterns are combined to make every valid tunable function usable as a TuneFn. Tunable functions don’t implement TuneFn directly; they implement IntoTuneFn instead. The reason is that the Rust trait resolver can’t detect that traits like Fn(A, B) and Fn(A) are mutually exclusive, so implementing TuneFn for both would produce conflicting implementations. To solve this, a Marker generic is used. It holds a dummy type (like IsFunction) together with the function pointer type equivalent to the signature (which is a type, not a trait), and this lets the trait resolver identify the implementations as distinct. Because different kinds of tunable functions carry different Marker generics, the IntoTuneFn trait is also needed to erase the marker: only Tunable::new takes the marker as a generic, and it erases it by calling IntoTuneFn::into_tunable. The same technique is used for KeyGenerator and InputGenerator.
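
The marker technique can be reproduced in isolation. The sketch below is an illustration under simplifying assumptions, not the crate's code: `Run` plays the role of the marker-free trait, `IntoRun<Marker>` the role of the conversion trait, and `make` the role of the constructor that erases the marker. All of these names, and the fixed `i32` signatures, are hypothetical.

```rust
/// Marker-free trait that the rest of the system works with.
trait Run {
    fn run(&self, a: i32, b: i32) -> i32;
}

/// Conversion trait. The `Marker` generic only exists to keep impls distinct.
trait IntoRun<Marker> {
    fn into_run(self) -> Box<dyn Run>;
}

/// Dummy type stored in the marker.
struct IsFunction;

/// Adapters for the two supported signatures.
struct Unary<F>(F);
struct Binary<F>(F);

impl<F: Fn(i32) -> i32> Run for Unary<F> {
    fn run(&self, a: i32, _b: i32) -> i32 {
        (self.0)(a)
    }
}

impl<F: Fn(i32, i32) -> i32> Run for Binary<F> {
    fn run(&self, a: i32, b: i32) -> i32 {
        (self.0)(a, b)
    }
}

// Without the marker these two blanket impls would be rejected as
// conflicting. The markers contain different function pointer types, so the
// implemented traits `IntoRun<..>` are different and coherence accepts both.
impl<F: Fn(i32) -> i32 + 'static> IntoRun<(IsFunction, fn(i32) -> i32)> for F {
    fn into_run(self) -> Box<dyn Run> {
        Box::new(Unary(self))
    }
}

impl<F: Fn(i32, i32) -> i32 + 'static> IntoRun<(IsFunction, fn(i32, i32) -> i32)> for F {
    fn into_run(self) -> Box<dyn Run> {
        Box::new(Binary(self))
    }
}

/// Only this constructor mentions the marker; the returned value does not.
fn make<Marker>(f: impl IntoRun<Marker>) -> Box<dyn Run> {
    f.into_run()
}

/// A plain function works the same way as a closure.
fn add(a: i32, b: i32) -> i32 {
    a + b
}
```

Callers never name the marker: in `make(|a: i32| a + 1)` or `make(add)` the compiler infers it from the one impl that applies. Closure parameters need explicit types here, since the `IntoRun` bound gives the compiler no signature to infer them from.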

The last set of traits is AsFunctionTunable and AsFunctionTunableResult. All tunable functions implement these traits directly, which makes it possible to annotate function-like tunables specifically, for example to override the name or to wrap the return type in Ok (AsFunctionTunable::ok). They also improve error messages through #[diagnostic::on_unimplemented(...)].
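
As an illustration of the diagnostic attribute, the sketch below attaches a custom message to a hypothetical `ToyTunable` trait. The trait, the `register` function, and the message text are all assumptions made for this example; only the `#[diagnostic::on_unimplemented]` attribute itself (available since Rust 1.78) is a real language feature.

```rust
/// Hypothetical trait with a tailored "not implemented" diagnostic.
#[diagnostic::on_unimplemented(
    message = "`{Self}` cannot be used as a toy tunable",
    label = "not a function of the form `Fn(u32) -> Result<u32, String>`",
    note = "for an infallible function, wrap its return value in `Ok` first"
)]
trait ToyTunable {
    fn run(&self, input: u32) -> Result<u32, String>;
}

/// Every fallible function with the expected signature is a toy tunable.
impl<F: Fn(u32) -> Result<u32, String>> ToyTunable for F {
    fn run(&self, input: u32) -> Result<u32, String> {
        (self)(input)
    }
}

/// Accepts anything implementing the trait and runs it once.
fn register(tunable: impl ToyTunable, input: u32) -> Result<u32, String> {
    tunable.run(input)
}

/// Fallible function that satisfies the bound.
fn half(x: u32) -> Result<u32, String> {
    if x % 2 == 0 {
        Ok(x / 2)
    } else {
        Err("odd input".to_string())
    }
}

// Passing an infallible closure does not compile, and the error shows the
// custom message, label, and note instead of the generic trait-bound text:
// register(|x: u32| x + 1, 1);
```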

Macros§

local_tuner
Create a local tuner with the provided name.

Structs§

AutotuneOutcome
The measured outcome for a given autotune invocation.
FunctionInputGenerator
An input generator implemented by an Fn
FunctionKeyGenerator
A key generator implemented by an Fn
FunctionTunable
Tunable implemented as a function or closure
FunctionTunableResultMap
Tunable implemented as a function or closure that returns a plain value, wrapped in Ok.
LocalTuner
A local tuner allows creating a tuner for a specific key that can differ from the server key.
Tunable
A tunable wraps a function that can be included in multiple groups.
TunableSet
Groups operations of the same type for autotune
TuneBenchmark
A benchmark that runs on server handles
TuneGroup
A tune group encapsulates a priority that can be calculated based on an autotune key.
Tuner
Executes autotune benchmarking and caching

Enums§

AutotuneError
Error from running autotune.
TuneCacheResult
Result of a cache lookup attempt

Traits§

AsFunctionTunable
A function that can be turned into a tunable.
AsFunctionTunableResult
An infallible function that can be turned into a tunable.
AutotuneKey
Trait alias with support for persistent caching
AutotuneOutput
The trait to be implemented by an autotune output.
FunctionInputGen
A function that can be turned into an input generator for Inputs
FunctionKeygen
An Fn that can act as a key generator
InputGenerator
A function that generates the input for autotuning passes
IntoInputGenerator
Something that can be turned into an input generator
IntoKeyGenerator
Something that can be turned into a key generator
IntoTuneFn
Something that can be turned into a Tunable
KeyGenerator
A generator that creates a key for a given set of inputs
TuneFn
A tunable entry in a tunable set

Functions§

anchor
Anchor a number to a power of the provided base.
compute_checksum
Default checksum for an operation set