Autotune module
§Autotuning
Autotuning allows running different kernels or comptime parameters to find the fastest one for any given input. Kernels must implement TuneFn (see below).
§Example
#[derive(AutotuneKey)]
struct KernelKey {
    size: u32,
}

fn run_kernel_tuned(lhs: Tensor, rhs: Tensor) -> Tensor {
    static TUNER: LocalTuner<String, KernelKey> = local_tuner!();

    // Build the set of candidate kernels once; later calls reuse it.
    let tunables = TUNER.init(|| {
        TunableSet::new(KernelKey::new, |_key, lhs, rhs| (lhs.clone(), rhs.clone()))
            .with(Tunable::new(kernel_1))
            // kernel_2 returns a plain value, so `.ok()` wraps it in `Ok`.
            .with(Tunable::new(kernel_2.ok()))
            .with(Tunable::new(kernel_3))
    });

    // No trailing semicolon: the result of the selected kernel is returned.
    TUNER.execute("hello".to_string(), &lhs.client, &tunables, (lhs, rhs))
}
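To make the idea behind the example concrete, here is a minimal, self-contained sketch of what an autotuner does conceptually: on the first call for a key it times every candidate and remembers the fastest, and on later calls it goes straight to the remembered one. This is not how this module is implemented; `ToyTuner`, `Candidate`, `sum_slow`, and `sum_fast` are hypothetical names, and the sketch assumes a non-empty candidate list and a single timing run per candidate.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

/// Hypothetical candidate signature: every kernel takes the same input.
type Candidate = fn(&[f32]) -> f32;

/// Toy tuner: remembers the index of the fastest candidate per key.
struct ToyTuner<K> {
    fastest: HashMap<K, usize>,
}

impl<K: Hash + Eq> ToyTuner<K> {
    fn new() -> Self {
        Self { fastest: HashMap::new() }
    }

    /// Benchmarks all candidates the first time `key` is seen, then runs
    /// the cached winner. Assumes `candidates` is not empty.
    fn execute(&mut self, key: K, candidates: &[Candidate], input: &[f32]) -> f32 {
        let index = *self.fastest.entry(key).or_insert_with(|| {
            let mut best = (0usize, Duration::MAX);
            for (i, candidate) in candidates.iter().enumerate() {
                let start = Instant::now();
                let _ = candidate(input);
                let elapsed = start.elapsed();
                if elapsed < best.1 {
                    best = (i, elapsed);
                }
            }
            best.0
        });
        candidates[index](input)
    }
}

/// Deliberately slow candidate, so the benchmark has a clear loser.
fn sum_slow(xs: &[f32]) -> f32 {
    std::thread::sleep(Duration::from_millis(5));
    xs.iter().sum()
}

/// Fast candidate computing the same result.
fn sum_fast(xs: &[f32]) -> f32 {
    xs.iter().sum()
}
```

In this sketch the key plays the role of `KernelKey`: inputs that map to the same key share one benchmark result.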
§Tunable
TuneFn is implemented automatically for all functions and closures that take a set of cloneable inputs and return a Result<Out, impl Into<AutotuneError>>. If the kernel does not return a Result, use kernel_fn.ok() to wrap it in Ok and turn it into a tunable.
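The difference between a fallible kernel and an infallible one wrapped in Ok can be sketched without the library. The adapter below is a free function rather than a method, and every name in it (`TuneError`, `wrap_ok`, `Kernel`, the two kernels, `run_all`) is a hypothetical stand-in, not the crate's API. It also shows why inputs need to be cloneable: each candidate receives its own copy.

```rust
/// Hypothetical error type standing in for `AutotuneError`.
#[derive(Debug, PartialEq)]
struct TuneError(String);

/// Uniform, fallible signature shared by all candidates in this sketch.
type Kernel = Box<dyn Fn(Vec<f32>, Vec<f32>) -> Result<Vec<f32>, TuneError>>;

/// Adapts an infallible two-argument function by wrapping its output in `Ok`.
fn wrap_ok<A, B, Out>(f: impl Fn(A, B) -> Out) -> impl Fn(A, B) -> Result<Out, TuneError> {
    move |a, b| Ok(f(a, b))
}

/// A kernel that can fail: element-wise addition with a length check.
fn fallible_add(lhs: Vec<f32>, rhs: Vec<f32>) -> Result<Vec<f32>, TuneError> {
    if lhs.len() != rhs.len() {
        return Err(TuneError("length mismatch".to_string()));
    }
    Ok(lhs.iter().zip(rhs.iter()).map(|(a, b)| a + b).collect())
}

/// A kernel that cannot fail: element-wise product over the common prefix.
fn infallible_mul(lhs: Vec<f32>, rhs: Vec<f32>) -> Vec<f32> {
    lhs.iter().zip(rhs.iter()).map(|(a, b)| a * b).collect()
}

/// Runs every candidate on its own clone of the inputs.
fn run_all(lhs: &[f32], rhs: &[f32]) -> Vec<Result<Vec<f32>, TuneError>> {
    let mut candidates: Vec<Kernel> = Vec::new();
    candidates.push(Box::new(fallible_add));
    candidates.push(Box::new(wrap_ok(infallible_mul)));
    candidates
        .iter()
        .map(|kernel| kernel(lhs.to_vec(), rhs.to_vec()))
        .collect()
}
```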
§Implementation details
To implement TuneFn for all valid tunable functions, a set of patterns is employed. Tunable functions don't implement TuneFn directly; they implement IntoTuneFn instead. The reason is that the Rust trait resolver can't detect that traits like Fn(A, B) and Fn(A) are mutually exclusive, so implementing TuneFn for both would cause conflicting implementations. To solve this problem, a Marker generic is employed. It stores a dummy type (like IsFunction) along with the equivalent function pointer of the signature (which is a type, not a trait), allowing the trait resolver to correctly identify the implementations as distinct. However, since different kinds of TuneFn will have different Marker generics, the IntoTuneFn trait is needed to erase the marker.
This way, only Tunable::new requires the marker as a generic, which it then erases by calling IntoTuneFn::into_tunable.
The same technique is used for KeyGenerator and InputGenerator.
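The marker pattern described above can be reproduced in a few lines. In this sketch, `ErasedFn` plays the role of Tunable and `IntoErased` the role of IntoTuneFn; both are hypothetical stand-ins, and the signatures are fixed to `i32` to keep the example short. Implementing a marker-free trait for both `F: Fn(i32) -> i32` and `F: Fn(i32, i32) -> i32` would be rejected as overlapping, while the two impls below are accepted because their marker types differ.

```rust
/// Dummy type that tags the marker as "plain function or closure".
struct IsFunction;

/// Marker-free result of the conversion (stand-in for `Tunable`).
struct ErasedFn {
    arity: usize,
    call: Box<dyn Fn(i32) -> i32>,
}

/// Conversion trait carrying the marker (stand-in for `IntoTuneFn`).
trait IntoErased<Marker> {
    fn into_erased(self) -> ErasedFn;
}

// The marker holds a function pointer *type*, so the two impls below
// target different traits and never conflict.
impl<F> IntoErased<(IsFunction, fn(i32) -> i32)> for F
where
    F: Fn(i32) -> i32 + 'static,
{
    fn into_erased(self) -> ErasedFn {
        ErasedFn { arity: 1, call: Box::new(self) }
    }
}

impl<F> IntoErased<(IsFunction, fn(i32, i32) -> i32)> for F
where
    F: Fn(i32, i32) -> i32 + 'static,
{
    fn into_erased(self) -> ErasedFn {
        let f = self;
        // For the demo, a binary function is called with the same value twice.
        ErasedFn { arity: 2, call: Box::new(move |x| f(x, x)) }
    }
}

impl ErasedFn {
    /// The only place the marker appears as a generic; it is inferred
    /// from the argument and is gone from the returned type.
    fn new<Marker>(f: impl IntoErased<Marker>) -> Self {
        f.into_erased()
    }
}

fn double(x: i32) -> i32 {
    x * 2
}

fn add(a: i32, b: i32) -> i32 {
    a + b
}
```

Callers write `ErasedFn::new(double)` or `ErasedFn::new(add)` without naming a marker: only one impl has bounds that the argument satisfies, so the compiler infers the marker on its own.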
The last set of traits is AsFunctionTunable and AsFunctionTunableResult. These traits are implemented directly by all tunable functions and allow function-like tunables to be annotated specifically, for things like overriding the name or wrapping the return type in Ok (AsFunctionTunable::ok). They also help with error messages, which is done using #[diagnostic::on_unimplemented(...)].
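As an illustration of how #[diagnostic::on_unimplemented(...)] improves error messages, the sketch below attaches a custom message to a toy trait. The trait, its message text, and the helper functions are hypothetical and are not the attributes used in this crate; the attribute itself requires Rust 1.78 or later.

```rust
/// Toy trait with a custom "not implemented" diagnostic.
#[diagnostic::on_unimplemented(
    message = "`{Self}` cannot be used as a tunable",
    label = "not a tunable function",
    note = "tunables must return `Result`; wrap infallible functions in `Ok` first"
)]
trait ToyTunable {
    fn describe(&self) -> &'static str;
}

/// Only function pointers with this exact fallible signature qualify here.
impl ToyTunable for fn(i32) -> Result<i32, String> {
    fn describe(&self) -> &'static str {
        "fallible function"
    }
}

/// Accepts anything implementing the toy trait.
fn register<T: ToyTunable>(tunable: T) -> &'static str {
    tunable.describe()
}

fn checked_double(x: i32) -> Result<i32, String> {
    x.checked_mul(2).ok_or_else(|| "overflow".to_string())
}

// `register(42_u8)` would not compile, and the compiler would report the
// custom message, label, and note above instead of the generic
// "trait bound is not satisfied" text.
```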
Macros§
- local_tuner - Create a local tuner with the provided name.
Structs§
- AutotuneOutcome - The measured outcome for a given autotune invocation.
- FunctionInputGenerator - An input generator implemented by an Fn.
- FunctionKeyGenerator - A key generator implemented by an Fn.
- FunctionTunable - Tunable implemented as a function or closure.
- FunctionTunableResultMap - Tunable implemented as a function or closure that returns a plain value, wrapped in Ok.
- LocalTuner - A local tuner allows creating a tuner for a specific key that can be different from the server key.
- Tunable - A tunable wraps a function that can be included in multiple groups.
- TunableSet - Groups operations of the same type for autotune.
- TuneBenchmark - A benchmark that runs on server handles.
- TuneGroup - A tune group encapsulates a priority that can be calculated based on an autotune key.
- Tuner - Executes autotune benchmarking and caching.
Enums§
- AutotuneError - Error from running autotune.
- TuneCacheResult - Result of the cache try.
Traits§
- AsFunctionTunable - A function that can be turned into a tunable.
- AsFunctionTunableResult - An infallible function that can be turned into a tunable.
- AutotuneKey - Trait alias with support for persistent caching.
- AutotuneOutput - The trait to be implemented by an autotune output.
- FunctionInputGen - A function that can be turned into an input generator for Inputs.
- FunctionKeygen - An Fn that can act as a key generator.
- InputGenerator - A function that generates the input for autotuning passes.
- IntoInputGenerator - Something that can be turned into an input generator.
- IntoKeyGenerator - Something that can be turned into a key generator.
- IntoTuneFn - Something that can be turned into a Tunable.
- KeyGenerator - A generator that creates a key for a given set of inputs.
- TuneFn - A tunable entry in a tunable set.
Functions§
- anchor - Anchor a number to a power of the provided base.
- compute_checksum - Default checksum for an operation set.
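One plausible reading of "anchor a number to a power of the provided base" is rounding a value up to the nearest power of that base, so that many nearby input sizes map to the same autotune key. The sketch below implements that reading only; `anchor_up` is a hypothetical helper, and the real `anchor` function may take different parameters or round differently.

```rust
/// Rounds `x` up to the smallest power of `base` that is >= `x`.
/// Hypothetical helper; assumes `base >= 2`.
fn anchor_up(x: usize, base: usize) -> usize {
    assert!(base >= 2, "base must be at least 2");
    let mut power = 1usize;
    while power < x {
        // Saturate instead of overflowing for very large inputs.
        power = power.saturating_mul(base);
    }
    power
}
```

With base 2, sizes 65 through 128 all anchor to 128, which would let them share one benchmark result under a key built from the anchored value.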