This crate aims to implement minimally costly optimization barriers for
every architecture that has asm!() support (currently x86(_64), 32-bit ARM,
AArch64 and RISC-V, more possible on nightly via the asm_experimental_arch
unstable feature).
You can use these barriers to prevent the compiler from optimizing out selected redundant or unnecessary computations in situations where such optimization is undesirable. The most typical usage scenario is microbenchmarking, but there might also be applications to cryptography or low-level development, where optimization must also be controlled.
§Implementations
The barriers will be implemented for any type from core/std that either…
- Can be shoved into CPU registers, with “natural” target registers dictated by normal ABI calling conventions.
- Can be losslessly converted back and forth to a set of values that have this property, in a manner that is easily optimized out.
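As a rough illustration of the two cases above, the sketch below builds a register-level identity barrier for usize out of a comment-only asm!() block, then extends it to a newtype by converting to usize and back. The names hide_usize, Index and hide_index are hypothetical and are not part of this crate's API; the crate's real implementation may differ, and the sketch only builds on targets where asm!() is available.

```rust
use std::arch::asm;

/// Hypothetical register-level barrier: returns `x` unchanged, but the
/// compiler must assume the output register may hold any value.
#[inline(always)]
fn hide_usize(mut x: usize) -> usize {
    // SAFETY: the template is only an assembly comment, so no instruction
    // runs. The options promise no memory access, no stack use and no
    // flag changes, which holds trivially for an empty block.
    unsafe {
        asm!(
            "/* {0} */",
            inout(reg) x,
            options(pure, nomem, nostack, preserves_flags)
        );
    }
    x
}

/// Hypothetical newtype that is not a primitive itself, but converts
/// losslessly to and from one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Index(usize);

/// Barrier for the newtype: unwrap, pass through the register barrier,
/// then rewrap. The conversions are trivial for the optimizer to remove.
#[inline(always)]
fn hide_index(i: Index) -> Index {
    Index(hide_usize(i.0))
}

fn main() {
    // Without the barrier, the compiler could fold this loop to a constant.
    let mut sum = 0usize;
    for i in 0..10 {
        sum += hide_usize(i);
    }
    println!("sum = {sum}, index = {:?}", hide_index(Index(7)));
}
```

The value never leaves its register here, which is why this kind of barrier is cheap compared to one that goes through memory.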
Any type which is not directly supported can still be subjected to an
optimization barrier by taking a reference to it and subjecting that
reference to an optimization barrier, at the cost of causing the value to
be spilled to memory. If the nightly default_impl feature is enabled, the
crate will provide a default Pessimize impl that does this for you.
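The by-reference fallback can be sketched as follows. This is a minimal illustration under assumptions, not this crate's code: assume_read_via_ref and Samples are hypothetical names, and the asm!() options are one plausible choice. Because the block is not marked nomem, the compiler must assume it may read memory through the pointer, so the value has to live in memory with up-to-date contents.

```rust
use std::arch::asm;

/// Hypothetical by-reference barrier for any sized type.
#[inline(always)]
fn assume_read_via_ref<T>(value: &T) {
    // Erase the pointee type so the operand is always a thin pointer.
    let ptr = value as *const T as *const ();
    // SAFETY: the template is only an assembly comment, so nothing is
    // executed. `readonly` states that the block writes no memory, while
    // still letting the compiler assume that memory may be read.
    unsafe {
        asm!(
            "/* {0} */",
            in(reg) ptr,
            options(readonly, nostack, preserves_flags)
        );
    }
}

/// Hypothetical type that is too large to fit in a general-purpose register.
struct Samples {
    data: [u64; 16],
}

fn main() {
    let samples = Samples { data: [3; 16] };
    // The array must be materialized in memory before this call.
    assume_read_via_ref(&samples);
    let total: u64 = samples.data.iter().sum();
    println!("total = {total}");
}
```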
You can tell which types implement Pessimize on your compiler target by
running cargo doc and checking the implementor list of Pessimize and
BorrowPessimize.
To implement Pessimize for your own types, you should consider
implementing PessimizeCast and BorrowPessimize, which make the job a
bit easier. Pessimize is automatically implemented for any type that
implements BorrowPessimize.
§Semantics
For pointer-like entities, optimization barriers other than hide can
have the side-effect of causing the compiler to assume that global and
thread-local variables might have been accessed, with semantics similar to
those of the pointer itself. This will reduce applicable compiler
optimizations for such variables, so the use of hide should be favored
whenever global or thread-local variables are used (or you don’t know if
they are used).
In general, barriers other than hide have more avenues for surprising
behavior (see their documentation for details), so you should strive to do
what you want with hide if possible, and only reach for other barriers
where the extra expressive power of these primitives is truly needed.
While the barriers will accept zero-sized types such as PhantomData, they
will only be effective for those that access global or thread-local state,
like std::alloc::System does. That is because without such external state,
zero-sized objects do not own or provide access to any information, so the
compiler can trivially infer that the optimization barrier cannot read or
modify any internal state. Implementations of Pessimize on such types are
only provided to ease automatic derivation of Pessimize for aggregate types
like tuples (and hopefully custom structs too in the future).
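The point about zero-sized types holding no information can be checked with the standard library alone. The short sketch below uses a hypothetical helper, is_zero_sized, and does not touch this crate's API; it only shows that PhantomData, std::alloc::System and a tuple built from zero-sized members store no bytes that a barrier could read or modify.

```rust
use std::alloc::System;
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical helper: true if `T` occupies no storage at all.
fn is_zero_sized<T>() -> bool {
    size_of::<T>() == 0
}

fn main() {
    // PhantomData carries only type-level information.
    assert!(is_zero_sized::<PhantomData<u64>>());
    // System is zero-sized too, but it acts on global allocator state,
    // which is the case where a barrier can still have an effect.
    assert!(is_zero_sized::<System>());
    // A tuple made only of zero-sized members is itself zero-sized.
    assert!(is_zero_sized::<((), PhantomData<String>)>());
    // An ordinary value type, by contrast, does own internal state.
    assert!(!is_zero_sized::<u8>());
    println!("all size checks passed");
}
```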
The documentation of the top-level functions (hide, assume_read, consume,
assume_accessed and assume_accessed_imut) contains more details on the
optimization barrier that each of them implements.
§When to use this crate
You should consider using this crate over core::hint::black_box, or
third-party cousins thereof, because…
- It works on stable Rust
- It has a better-defined API contract with stronger guarantees (unlike core::hint::black_box, where “do nothing” is a valid implementation).
- It exposes finer-grained operations, which clarify your code’s intent and reduce harmful side-effects.
The main drawbacks of this crate’s approach are that…
- It only works on selected hardware architectures (though they are the ones on which you are most likely to run benchmarks, and it should get better over time as more inline assembly architectures get stabilized).
- It needs a lot of tricky unsafe code.
Modules§
- arch: Hardware-specific functionality
Traits§
- BorrowPessimize: Extract references to Pessimize values from references to Self (Pessimize impl helper)
- Pessimize: Optimization barriers provided by this crate
- PessimizeCast: Convert Self back and forth to a Pessimize impl (Pessimize impl helper)
Functions§
- assume_accessed: Force the compiler to assume that any data transitively reachable via a pointer/reference has been read, and modified if Rust rules allow for it.
- assume_accessed_imut: Variant of assume_accessed for internally mutable types
- assume_globals_accessed: Assume that all global and thread-local variables have been read and modified
- assume_globals_read: Assume that all global and thread-local variables have been read
- assume_read: Force the compiler to assume that a value, and data transitively reachable via that value (for pointers/refs), is being used if Rust rules allow for it.
- consume: Like assume_read, but by value
- hide: Re-emit the input value as its output (identity function), but force the compiler to assume that it is a completely different value.
- impl_assume_accessed_via_extract_pessimized: Implementation of BorrowPessimize::assume_accessed_impl for types where there is a way to get a T::Pessimized from an &mut T
- impl_assume_accessed_via_extract_self: Implementation of BorrowPessimize::assume_accessed_impl for types where there is a cheap way to extract the inner T from an &mut T
- impl_with_pessimize_via_copy: Implementation of BorrowPessimize::with_pessimize for Copy types