pub struct Strategy2x4;
An inner loop implementation strategy using 2 parallel instances of the schema accumulator with a manual inner loop unroll of 4.
Trait Implementations
impl MainLoop for Strategy2x4
unsafe fn main<S, L, R, A>(
    loader: &Loader<S, L, R, A>,
    trip_count: usize,
    epilogues: usize,
) -> S::Accumulator
where
    A: Architecture,
    S: SIMDSchema<L, R, A>,
The implementation here has a global unroll of 4, but the unroll factor of the main loop is actually 8.
There is a single peeled iteration at the end that handles the last group of 4 if needed.
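The loop shape described above can be illustrated with a scalar sketch. This is not the crate's implementation: `WIDTH`, `Lanes`, `accumulate`, and `dot_2x4` are hypothetical names, plain arrays stand in for SIMD vectors, and the reading of `trip_count` as the number of full 4-vector blocks and `epilogues` as the number of leftover full vectors is an assumption. With a block size of 4 and an assumed width of 8, one block covers 4 * 8 = 32 elements, and one main-loop iteration covers two blocks (64 elements), split across two accumulators.

```rust
const WIDTH: usize = 8; // assumed SIMD width (lanes per vector)
const BLOCK_SIZE: usize = 4; // vectors per block, matching the documented constant

type Lanes = [f32; WIDTH]; // scalar stand-in for one SIMD register

// Hypothetical accumulate step: acc += a * b, lane by lane, for one vector.
fn accumulate(acc: &mut Lanes, a: &[f32], b: &[f32], vector_index: usize) {
    let base = vector_index * WIDTH;
    for lane in 0..WIDTH {
        acc[lane] += a[base + lane] * b[base + lane];
    }
}

// Dot product using two accumulators, four vectors each per main-loop iteration.
fn dot_2x4(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let vectors = a.len() / WIDTH;
    let trip_count = vectors / BLOCK_SIZE; // assumed: number of full 4-vector blocks
    let epilogues = vectors % BLOCK_SIZE; // assumed: leftover full vectors

    let mut acc0: Lanes = [0.0; WIDTH];
    let mut acc1: Lanes = [0.0; WIDTH];
    let mut v = 0; // index of the next unprocessed vector

    // Main loop: 8 vectors per iteration, alternating between the accumulators.
    for _ in 0..trip_count / 2 {
        accumulate(&mut acc0, a, b, v);
        accumulate(&mut acc1, a, b, v + 1);
        accumulate(&mut acc0, a, b, v + 2);
        accumulate(&mut acc1, a, b, v + 3);
        accumulate(&mut acc0, a, b, v + 4);
        accumulate(&mut acc1, a, b, v + 5);
        accumulate(&mut acc0, a, b, v + 6);
        accumulate(&mut acc1, a, b, v + 7);
        v += 2 * BLOCK_SIZE;
    }

    // Peeled iteration: one remaining block of 4 when trip_count is odd.
    if trip_count % 2 == 1 {
        accumulate(&mut acc0, a, b, v);
        accumulate(&mut acc1, a, b, v + 1);
        accumulate(&mut acc0, a, b, v + 2);
        accumulate(&mut acc1, a, b, v + 3);
        v += BLOCK_SIZE;
    }

    // Epilogues: fewer than BLOCK_SIZE full vectors remain.
    for i in 0..epilogues {
        accumulate(&mut acc0, a, b, v + i);
    }

    // Combine both accumulators, then handle the scalar tail.
    let mut total = 0.0f32;
    for lane in 0..WIDTH {
        total += acc0[lane] + acc1[lane];
    }
    for i in vectors * WIDTH..a.len() {
        total += a[i] * b[i];
    }
    total
}
```

Keeping two independent accumulators shortens the dependency chain between consecutive accumulate steps, which is the usual motivation for this kind of layout; whether that holds for a given schema and architecture is something to measure rather than assume.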
const BLOCK_SIZE: usize = 4
The effective amount of unrolling (in terms of SIMD vectors) performed by this
kernel. For example, if
BLOCK_SIZE = 4 and the SIMD width is 8, then each iteration
of the main loop will process 4 * 8 = 32 elements.
Auto Trait Implementations
impl Freeze for Strategy2x4
impl RefUnwindSafe for Strategy2x4
impl Send for Strategy2x4
impl Sync for Strategy2x4
impl Unpin for Strategy2x4
impl UnsafeUnpin for Strategy2x4
impl UnwindSafe for Strategy2x4
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.