pub struct PipelinedSingleTableU32U8Lookup<'a> { /* private fields */ }
Pipelined single table lookup kernel - u32 to u8 lookup table kernel with prefetch pipelining
This version pipelines prefetch operations with the actual lookup work to hide memory latency. The algorithm works as follows:
- Read values from current chunk addresses
- Prefetch next chunk addresses while processing current values
- Call SIMD lookup function on current values
- Loop to next chunk
This pipelining allows memory prefetch latency to be hidden behind computation work.
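The steps above can be sketched as follows. This is a minimal illustration, not the crate's implementation: the names `pipelined_lookup`, `prefetch`, and `CHUNK` are hypothetical, the SIMD lookup is replaced by a plain scalar loop, and the prefetch uses the `_mm_prefetch` intrinsic on x86_64 only (a no-op elsewhere). The chunk size of 16 is taken from the `lookup_func` description.

```rust
/// Number of u32 values handled per pipeline step (hypothetical constant).
const CHUNK: usize = 16;

/// Hint the CPU to pull `table[idx]` into cache. No-op on non-x86_64 targets.
#[inline(always)]
fn prefetch(table: &[u8], idx: u32) {
    #[cfg(target_arch = "x86_64")]
    {
        use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};
        // wrapping_add only computes an address and never dereferences it;
        // a prefetch is a hint and does not fault on a bad address.
        let p = table.as_ptr().wrapping_add(idx as usize) as *const i8;
        unsafe { _mm_prefetch::<_MM_HINT_T0>(p) };
    }
    #[cfg(not(target_arch = "x86_64"))]
    {
        let _ = (table, idx);
    }
}

/// Look up every value in `table`, appending the results to `out`.
/// Assumption: every value is a valid index; otherwise the indexing panics.
fn pipelined_lookup(table: &[u8], values: &[u32], out: &mut Vec<u8>) {
    out.reserve(values.len());
    let mut chunks = values.chunks(CHUNK).peekable();
    while let Some(current) = chunks.next() {
        // Issue prefetches for the next chunk before working on this one,
        // so the memory loads overlap with the lookup work below.
        if let Some(next) = chunks.peek() {
            for &v in next.iter() {
                prefetch(table, v);
            }
        }
        // Scalar stand-in for the SIMD lookup on the current chunk.
        for &v in current {
            out.push(table[v as usize]);
        }
    }
}
```

In this sketch the first chunk is never prefetched, so it pays the full memory latency; only later chunks can benefit from the overlap.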
Results (end of November 2025): on Intel machines this gives a slight advantage, perhaps up to 5%. Regular scalar reads, even against large lookup tables with random access, already perform well enough that the pipelining doesn’t help much.
Implementations§
impl<'a> PipelinedSingleTableU32U8Lookup<'a>
pub fn new(lookup_table: &'a [u8]) -> Self
pub fn lookup_func<F>(&self, values: &[u32], f: &mut F)
Pipelined lookup function that prefetches the next chunk while processing the current one
The pipelining strategy:
- Process chunks of 16 u32 values at a time
- For each chunk: prefetch next chunk addresses, then process current chunk
- This hides prefetch latency behind the lookup computation work
pub fn lookup_into_vec(&self, values: &[u32], buffer: &mut Vec<u8>)
Convenience function which does lookup and writes the results into a Vec
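One way such a convenience wrapper could sit on top of a callback-based lookup is sketched below. Everything here is an assumption for illustration: `lookup_chunks` is a hypothetical stand-in (without prefetching) for a callback-driven lookup, the callback is assumed to receive each chunk of results as a `&[u8]`, and the sketch clears the buffer before writing, which the real function may or may not do.

```rust
/// Hypothetical stand-in for a callback-based lookup: hands each chunk of
/// up to 16 results to `f`. No prefetching; only the calling shape matters.
fn lookup_chunks<F: FnMut(&[u8])>(table: &[u8], values: &[u32], f: &mut F) {
    let mut results = [0u8; 16];
    for chunk in values.chunks(16) {
        for (slot, &v) in results.iter_mut().zip(chunk) {
            // Assumption: every value is a valid index into `table`.
            *slot = table[v as usize];
        }
        f(&results[..chunk.len()]);
    }
}

/// Sketch of a Vec-filling wrapper: reuse the caller's buffer and append
/// each chunk of results as the callback receives it.
fn lookup_into_vec_sketch(table: &[u8], values: &[u32], buffer: &mut Vec<u8>) {
    buffer.clear(); // assumption: old contents are discarded
    buffer.reserve(values.len());
    lookup_chunks(table, values, &mut |chunk: &[u8]| buffer.extend_from_slice(chunk));
}
```

Taking the `Vec` by mutable reference lets a caller reuse one allocation across many lookups instead of allocating a fresh result vector each time.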
Trait Implementations§
impl<'a> Clone for PipelinedSingleTableU32U8Lookup<'a>
fn clone(&self) -> PipelinedSingleTableU32U8Lookup<'a>
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations§
impl<'a> Freeze for PipelinedSingleTableU32U8Lookup<'a>
impl<'a> RefUnwindSafe for PipelinedSingleTableU32U8Lookup<'a>
impl<'a> Send for PipelinedSingleTableU32U8Lookup<'a>
impl<'a> Sync for PipelinedSingleTableU32U8Lookup<'a>
impl<'a> Unpin for PipelinedSingleTableU32U8Lookup<'a>
impl<'a> UnwindSafe for PipelinedSingleTableU32U8Lookup<'a>
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
unsafe fn clone_to_uninit(&self, dest: *mut u8)
🔬This is a nightly-only experimental API. (clone_to_uninit)