//! # Autotuning
//!
//! Autotuning benchmarks several kernels, or the same kernel with different comptime parameters,
//! and selects the fastest one for a given input. Kernels must implement
//! [`TuneFn`](crate::tune::TuneFn) (see the `Tunable` section below).
//!
//! # Example
//!
//! ```ignore
//! #[derive(AutotuneKey)]
//! struct KernelKey {
//!     size: u32
//! }
//!
//! fn run_kernel_tuned(lhs: Tensor, rhs: Tensor) -> Tensor {
//!     static TUNER: LocalTuner<String, KernelKey> = local_tuner!();
//!
//!     let tunables = TUNER.init(|| {
//!         TunableSet::new(KernelKey::new, |_key, lhs, rhs| (lhs.clone(), rhs.clone()))
//!             .with(Tunable::new(kernel_1))
//!             .with(Tunable::new(kernel_2.ok()))
//!             .with(Tunable::new(kernel_3))
//!     });
//!
//!     TUNER.execute("hello".to_string(), &lhs.client, &tunables, (lhs, rhs));
//! }
//! ```
//!
//! # Tunable
//!
//! [`TuneFn`](crate::tune::TuneFn) is implemented automatically for all functions and closures
//! that take a set of cloneable inputs, and return a `Result<Out, impl Into<AutotuneError>>`. If the
//! kernel does not return a [`Result`], use `kernel_fn.ok()` to wrap it in `Ok` and turn it into a
//! tunable.
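//!
//! The wrapping performed by `.ok()` can be pictured with a minimal, standalone sketch. It does
//! not use this crate's types: `OkExt`, `double` and `wrapped_double` are hypothetical names, the
//! error type is assumed to be `String`, and the real method lives on
//! [`AsFunctionTunable`](crate::tune::AsFunctionTunable).
//!
//! ```rust
//! /// Hypothetical extension trait standing in for the crate's `ok` method.
//! trait OkExt<A, R> {
//!     /// Wraps the function so that it returns `Ok(value)` instead of `value`.
//!     fn ok(self) -> Box<dyn Fn(A) -> Result<R, String>>;
//! }
//!
//! // Blanket implementation for every one-argument function or closure.
//! impl<F, A, R> OkExt<A, R> for F
//! where
//!     F: Fn(A) -> R + 'static,
//!     A: 'static,
//!     R: 'static,
//! {
//!     fn ok(self) -> Box<dyn Fn(A) -> Result<R, String>> {
//!         Box::new(move |input: A| -> Result<R, String> { Ok((self)(input)) })
//!     }
//! }
//!
//! /// An infallible stand-in "kernel".
//! fn double(x: i32) -> i32 {
//!     x * 2
//! }
//!
//! /// Calls the wrapped kernel, which now has a `Result` return type.
//! fn wrapped_double(x: i32) -> Result<i32, String> {
//!     (double.ok())(x)
//! }
//! ```
//!
//! In this sketch, `wrapped_double(21)` evaluates to `Ok(42)`.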
//!
//! ## Implementation details
//!
//! A set of patterns is used to implement `TuneFn` for all valid tunable functions.
//! Tunable functions don't implement `TuneFn` directly; they implement `IntoTuneFn` instead. The
//! reason is that the Rust trait resolver can't detect that traits like `Fn(A, B)`
//! and `Fn(A)` are mutually exclusive, so implementing `TuneFn` for both would
//! cause conflicting implementations. To solve this problem, a `Marker` generic is used. It
//! holds a dummy type (like `IsFunction`) along with the function pointer type equivalent to the
//! signature (which is a type, not a trait), allowing the trait resolver to identify
//! the implementations as distinct. However, since different kinds of `TuneFn` have different
//! `Marker` generics, the `IntoTuneFn` trait is needed to erase the marker.
//! This way, only [`Tunable::new`](crate::tune::Tunable::new) requires the
//! marker as a generic, which it then erases by calling
//! [`IntoTuneFn::into_tunable`](crate::tune::IntoTuneFn::into_tunable).
//! The same technique is used for [`KeyGenerator`](crate::tune::KeyGenerator) and
//! [`InputGenerator`](crate::tune::InputGenerator).
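//!
//! The marker pattern described above can be sketched in isolation. This is a simplified
//! illustration under stated assumptions, not the crate's actual definitions: the traits and
//! `Tunable` below are local stand-ins that only reuse the names, inputs are fixed to `i32`
//! slices, and `OneArg`, `TwoArgs`, `double`, `add` and `run_all` are hypothetical helpers.
//!
//! ```rust
//! /// Erased interface that the tuner would store (simplified stand-in).
//! trait TuneFn {
//!     fn execute(&self, inputs: &[i32]) -> i32;
//! }
//!
//! /// Dummy marker type; it is never constructed.
//! struct IsFunction;
//!
//! /// Conversion trait. `Marker` only exists to keep the impls below distinct.
//! trait IntoTuneFn<Marker> {
//!     fn into_tunable(self) -> Box<dyn TuneFn>;
//! }
//!
//! // Wrapper types, so `TuneFn` itself is never implemented directly for `F`.
//! struct OneArg<F>(F);
//! struct TwoArgs<F>(F);
//!
//! impl<F: Fn(i32) -> i32> TuneFn for OneArg<F> {
//!     fn execute(&self, inputs: &[i32]) -> i32 {
//!         (self.0)(inputs[0])
//!     }
//! }
//!
//! impl<F: Fn(i32, i32) -> i32> TuneFn for TwoArgs<F> {
//!     fn execute(&self, inputs: &[i32]) -> i32 {
//!         (self.0)(inputs[0], inputs[1])
//!     }
//! }
//!
//! // Without the marker these two blanket impls would be rejected as overlapping.
//! // The function pointer types differ, so the trait parameters are distinct types.
//! impl<F: Fn(i32) -> i32 + 'static> IntoTuneFn<(IsFunction, fn(i32) -> i32)> for F {
//!     fn into_tunable(self) -> Box<dyn TuneFn> {
//!         Box::new(OneArg(self))
//!     }
//! }
//!
//! impl<F: Fn(i32, i32) -> i32 + 'static> IntoTuneFn<(IsFunction, fn(i32, i32) -> i32)> for F {
//!     fn into_tunable(self) -> Box<dyn TuneFn> {
//!         Box::new(TwoArgs(self))
//!     }
//! }
//!
//! /// Only `new` mentions `Marker`; the stored value is fully erased.
//! struct Tunable {
//!     func: Box<dyn TuneFn>,
//! }
//!
//! impl Tunable {
//!     fn new<F: IntoTuneFn<Marker>, Marker>(func: F) -> Self {
//!         Tunable { func: func.into_tunable() }
//!     }
//! }
//!
//! fn double(x: i32) -> i32 {
//!     x * 2
//! }
//!
//! fn add(a: i32, b: i32) -> i32 {
//!     a + b
//! }
//!
//! /// Stores functions of different arity in one list and runs each of them.
//! /// `inputs` must hold at least two values, since `add` reads two.
//! fn run_all(inputs: &[i32]) -> Vec<i32> {
//!     let tunables = vec![Tunable::new(double), Tunable::new(add)];
//!     tunables.iter().map(|t| t.func.execute(inputs)).collect()
//! }
//! ```
//!
//! Because the marker is inferred at the call to `Tunable::new`, callers never name it, and
//! both functions end up behind the same `Box<dyn TuneFn>` type.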
//!
//! The last set of traits are [`AsFunctionTunable`](crate::tune::AsFunctionTunable) and
//! [`AsFunctionTunableResult`](crate::tune::AsFunctionTunableResult). These traits are implemented
//! directly by all tunable functions and let us annotate function-like tunables specifically,
//! for example to override the name or to wrap the return type in `Ok`
//! ([`AsFunctionTunable::ok`](crate::tune::AsFunctionTunable::ok)). They also improve error
//! messages through
//! [`#[diagnostic::on_unimplemented(...)]`](https://doc.rust-lang.org/reference/attributes/diagnostics.html#the-diagnosticon_unimplemented-attribute).
pub use *;
pub use *;
pub use *;
pub use *;
pub use *;
pub use *;
pub use AutotuneOutput;
pub use *;
pub use *;
pub use *;
pub use *;