#![allow(clippy::needless_doctest_main)]
//! This crate provides the [`multiversion`] attribute for implementing function multiversioning.
//!
//! Many CPU architectures have a variety of instruction set extensions that provide additional
//! functionality. Common examples are single instruction, multiple data (SIMD) extensions such as
//! SSE and AVX on x86/x86-64 and NEON on ARM/AArch64. When available, these extended features can
//! provide significant speed improvements to some functions. These optional features cannot be
//! haphazardly compiled into programs: executing an unsupported instruction will result in a
//! crash.
//!
//! **Function multiversioning** is the practice of compiling multiple versions of a function
//! with various features enabled and safely detecting which version to use at runtime.
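//!
//! As a rough illustration of the idea, the sketch below multiversions a function by hand using
//! only the standard library. It is not the code this crate generates; the names `sum`,
//! `sum_generic`, and `sum_avx2` are made up for the example, and AVX2 is an arbitrary choice of
//! feature.
//!
//! ```rust
//! // Baseline version, compiled with whatever features the build target enables.
//! fn sum_generic(x: &[f32]) -> f32 {
//!     x.iter().sum()
//! }
//!
//! // Same body, compiled with AVX2 enabled. It is `unsafe` because calling it on a CPU
//! // without AVX2 is undefined behavior.
//! #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
//! #[target_feature(enable = "avx2")]
//! unsafe fn sum_avx2(x: &[f32]) -> f32 {
//!     x.iter().sum()
//! }
//!
//! // Hand-written dispatcher: detect at runtime, then call the best available version.
//! fn sum(x: &[f32]) -> f32 {
//!     #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
//!     {
//!         if is_x86_feature_detected!("avx2") {
//!             // SAFETY: AVX2 support was confirmed by the check above.
//!             return unsafe { sum_avx2(x) };
//!         }
//!     }
//!     sum_generic(x)
//! }
//! ```
//!
//! Writing this by hand for every function and every target quickly becomes repetitive, which
//! is the boilerplate the [`multiversion`] attribute is meant to remove.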
//!
//! # Cargo features
//! There is one cargo feature, `std`, enabled by default. When enabled, [`multiversion`] will
//! use CPU feature detection at runtime to dispatch the appropriate function. With this feature
//! disabled, only compile-time function dispatch using `#[cfg(target_feature)]` is available,
//! which allows the crate to be used in `#![no_std]` crates.
//!
//! # Capabilities
//! The intention of this crate is to allow nearly any function to be multiversioned.
//! The following cases are not supported:
//! * functions that use `self` or `Self`
//! * `impl Trait` return types (arguments are fine)
//!
//! If any other functions do not work, please file an issue on GitHub.
//!
//! # Target specification strings
//! Targets are specified as a combination of architecture (as specified in [`target_arch`]) and
//! feature (as specified in [`target_feature`]).
//!
//! A target can be specified as:
//! * `"arch"`
//! * `"arch+feature"`
//! * `"arch+feature1+feature2"`
//!
//! A particular CPU can also be specified with a slash:
//! * `"arch/cpu"`
//! * `"arch/cpu+feature"`
//!
//! The following are some valid target specification strings:
//! * `"x86"` (matches the `"x86"` architecture)
//! * `"x86_64+avx+avx2"` (matches the `"x86_64"` architecture with the `"avx"` and `"avx2"`
//!   features)
//! * `"x86_64/x86-64-v2"` (matches the `"x86_64"` architecture with the `"x86-64-v2"` CPU)
//! * `"x86/i686+avx"` (matches the `"x86"` architecture with the `"i686"` CPU and `"avx"`
//!   feature)
//! * `"arm+neon"` (matches the `"arm"` architecture with the `"neon"` feature)
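//!
//! To make the grammar above concrete, here is a minimal sketch that splits a specification
//! string into its parts. `parse_target` is a hypothetical helper written for this example, not
//! part of this crate, and the crate's own parsing and validation may differ (for instance, this
//! sketch does not check that the architecture or features actually exist).
//!
//! ```rust
//! // Hypothetical helper: returns (architecture, optional CPU, features).
//! fn parse_target(spec: &str) -> (&str, Option<&str>, Vec<&str>) {
//!     // Everything before the first `+` is `arch` or `arch/cpu`; the rest are features.
//!     let mut parts = spec.split('+');
//!     let head = parts.next().unwrap_or("");
//!     let features: Vec<&str> = parts.collect();
//!     // A slash, if present, separates the architecture from the CPU.
//!     match head.split_once('/') {
//!         Some((arch, cpu)) => (arch, Some(cpu), features),
//!         None => (head, None, features),
//!     }
//! }
//! ```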
//!
//! A complete list of available target features and CPUs is available in the [`target-features`
//! crate documentation](target_features::docs).
//!
//! [`target`]: attr.target.html
//! [`multiversion`]: attr.multiversion.html
//! [`target_arch`]: https://doc.rust-lang.org/reference/conditional-compilation.html#target_arch
//! [`target_feature`]: https://doc.rust-lang.org/reference/conditional-compilation.html#target_feature

/// Provides function multiversioning.
///
/// The annotated function is compiled multiple times, once for each target, and the
/// best target is selected at runtime.
///
/// Options:
/// * `targets`
///   * Takes a list of targets, such as `targets("x86_64+avx2", "x86_64+sse4.1")`.
///   * Target priority is first to last. The first matching target is used.
///   * May also take a special value `targets = "simd"` to automatically multiversion for common
///     SIMD target features.
/// * `attrs`
///   * Takes a list of attributes to attach to each target clone function.
/// * `dispatcher`
///   * Selects the preferred dispatcher. Defaults to `default`.
///     * `default`: If the `std` feature is enabled, uses either `direct` or `indirect`,
///       whichever is expected to be faster. If the `std` feature is not enabled, uses `static`.
///     * `static`: Detects features at compile time from the enabled target features.
///     * `indirect`: Detects features at runtime, and dispatches with an indirect function call.
///       Cannot be used for generic functions, `async` functions, or functions that take or
///       return an `impl Trait`. This is usually the default.
///     * `direct`: Detects features at runtime, and dispatches with direct function calls. This
///       is the default on functions that do not support indirect dispatch, or in the presence of
///       indirect branch exploit mitigations such as retpolines.
///
/// # Example
/// The `square` function below is a good candidate for optimization using SIMD.
/// The following compiles `square` three times, once for each listed target and once for the
/// generic target. Calling `square` selects the appropriate version at runtime.
///
/// ```
/// use multiversion::multiversion;
///
/// #[multiversion(targets("x86_64+avx", "x86+sse"))]
/// fn square(x: &mut [f32]) {
///     for v in x {
///         *v *= *v
///     }
/// }
/// ```
///
/// This example is similar, but targets all supported SIMD instruction sets (not just the two
/// shown above):
///
/// ```
/// use multiversion::multiversion;
///
/// #[multiversion(targets = "simd")]
/// fn square(x: &mut [f32]) {
///     for v in x {
///         *v *= *v
///     }
/// }
/// ```
///
/// # Notes on dispatcher performance
///
/// ### Feature detection is performed only once
/// The `direct` and `indirect` dispatchers perform function selection on the first invocation.
/// This is implemented with a static atomic variable containing the selected function.
///
/// This implementation has a few benefits:
/// * The function selector is typically only invoked once. Subsequent calls are reduced to an
///   atomic load.
/// * If called from multiple threads, there is no contention. More than one thread may perform
///   feature detection, but the atomic ensures these are synchronized correctly.
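///
/// The general pattern can be sketched with the standard library alone. This is an illustration
/// under simplifying assumptions, not the code the macro generates: `select`, `SELECTED`,
/// `SELECT_CALLS`, and `SumFn` are names invented for the example, and the selector here always
/// returns the generic version instead of inspecting the CPU.
///
/// ```rust
/// use std::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};
///
/// type SumFn = fn(&[f32]) -> f32;
///
/// fn sum_generic(x: &[f32]) -> f32 {
///     x.iter().sum()
/// }
///
/// // Counts selector runs, only so the example can show that selection happens once.
/// static SELECT_CALLS: AtomicUsize = AtomicUsize::new(0);
///
/// // Stand-in for runtime feature detection; a real selector would inspect the CPU here.
/// fn select() -> SumFn {
///     SELECT_CALLS.fetch_add(1, Ordering::Relaxed);
///     sum_generic
/// }
///
/// // Null until the first call, then holds the selected function pointer.
/// static SELECTED: AtomicPtr<()> = AtomicPtr::new(std::ptr::null_mut());
///
/// fn sum(x: &[f32]) -> f32 {
///     let mut ptr = SELECTED.load(Ordering::Relaxed);
///     if ptr.is_null() {
///         // Nothing cached yet: select and publish. Threads that race here may each run
///         // the selector, but they all store the same value, so no lock is needed.
///         ptr = select() as *mut ();
///         SELECTED.store(ptr, Ordering::Relaxed);
///     }
///     // SAFETY: every non-null value stored in `SELECTED` was cast from a `SumFn`.
///     let f = unsafe { std::mem::transmute::<*mut (), SumFn>(ptr) };
///     f(x)
/// }
/// ```
///
/// After the first call, `sum` costs one relaxed atomic load plus an indirect call.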
///
/// ### Dispatcher elision
/// If the optimal set of features is already known to exist at compile time, the entire dispatcher
/// is elided. For example, if the highest priority target requires `avx512f` and the function is
/// compiled with `RUSTFLAGS=-Ctarget-cpu=skylake-avx512`, the function is not multiversioned and
/// the highest priority target is used.
///
/// [`target`]: attr.target.html
/// [`multiversion`]: attr.multiversion.html
pub use multiversion_macros::multiversion;

/// Provides a less verbose equivalent to the `cfg(target_arch)` and `target_feature` attributes.
///
/// A function tagged with `#[target("x86_64+avx+avx2")]`, for example, is equivalent to a
/// function tagged with each of:
/// * `#[cfg(target_arch = "x86_64")]`
/// * `#[target_feature(enable = "avx")]`
/// * `#[target_feature(enable = "avx2")]`
///
/// The [`target`] attribute is intended to be used in tandem with the [`multiversion`] attribute
/// to produce hand-written multiversioned functions.
///
/// [`target`]: attr.target.html
/// [`multiversion`]: attr.multiversion.html
pub use multiversion_macros::target;

/// Inherit the `target_feature` attributes of the selected target in a multiversioned function.
///
/// # Example
/// ```
/// use multiversion::{multiversion, inherit_target};
///
/// #[multiversion(targets = "simd")]
/// fn select_sum() -> unsafe fn(x: &mut[f32]) -> f32 {
///     #[inherit_target]
///     unsafe fn sum(x: &mut[f32]) -> f32 {
///         x.iter().sum()
///     }
///     sum as unsafe fn(&mut[f32]) -> f32
/// }
/// ```
pub use multiversion_macros::inherit_target;

/// Information related to the current target.
pub mod target {
    // used by docs
    #[allow(unused)]
    use super::*;

    /// Get the selected target in a multiversioned function.
    ///
    /// Returns the selected target as a [`Target`].
    ///
    /// This macro only works in a function marked with [`multiversion`].
    ///
    /// # Example
    /// ```
    /// use multiversion::{multiversion, target::selected_target};
    ///
    /// #[multiversion(targets = "simd")]
    /// fn foo() {
    ///     if selected_target!().supports_feature_str("avx") {
    ///         println!("AVX detected");
    ///     } else {
    ///         println!("AVX not detected");
    ///     }
    /// }
    /// ```
    pub use multiversion_macros::selected_target;

    /// Equivalent to `#[cfg]`, but considers `target_feature`s detected at runtime.
    ///
    /// This macro only works in a function marked with [`multiversion`].
    pub use multiversion_macros::target_cfg;

    /// Equivalent to `#[cfg_attr]`, but considers `target_feature`s detected at runtime.
    ///
    /// This macro only works in a function marked with [`multiversion`].
    pub use multiversion_macros::target_cfg_attr;

    /// Match the selected target.
    ///
    /// Matching is done at compile time, as if by `#[cfg]`. Target matching considers both
    /// detected features and statically-enabled features. Arms that do not match are not
    /// compiled.
    ///
    /// This macro only works in a function marked with [`multiversion`].
    ///
    /// # Example
    /// ```
    /// use multiversion::{multiversion, target::match_target};
    ///
    /// #[multiversion(targets = "simd")]
    /// fn foo() {
    ///     match_target! {
    ///         "x86_64+avx" => println!("x86-64 with AVX"),
    ///         "aarch64+neon" => println!("AArch64 with Neon"),
    ///         _ => println!("another architecture"),
    ///     }
    /// }
    /// ```
    pub use multiversion_macros::match_target;

    /// Equivalent to `cfg!`, but considers `target_feature`s detected at runtime.
    ///
    /// This macro only works in a function marked with [`multiversion`].
    pub use multiversion_macros::target_cfg_f;

    #[doc(hidden)]
    pub use multiversion_macros::{
        match_target_impl, target_cfg_attr_impl, target_cfg_f_impl, target_cfg_impl,
    };

    #[doc(no_inline)]
    pub use target_features::Target;
}

#[doc(hidden)]
pub use target_features;