pub struct SimdDualTableU32U8LookupV2<'a> { /* private fields */ }
Dual-table lookup kernel: a u32-to-u8 lookup table kernel with a custom SIMD function for combining the results. To reduce cache thrashing, it internally uses the single-table kernel to write the first table's results to a local temporary buffer, then reads the second table only where the first table returned a nonzero result. Sequencing the lookups this way minimizes the number of reads from the second table.
The user is responsible for generating the lookup tables, so this kernel can serve different use cases, including CASE..WHEN and bitmasking/filtering. NOTE: this is not as efficient as the newer CascadingTableU32U8Lookup kernel. TODO: deprecate and remove this kernel; it has no real advantages over the others.
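Since table generation is left to the user, the following is a minimal sketch of how a CASE..WHEN style table might be built. The helper name `build_case_table` and its parameters are hypothetical and not part of this crate; the only assumption taken from the description above is that a zero entry means "no match".

```rust
/// Hypothetical helper (not part of this crate): builds a u8 lookup table
/// with `size` entries. Each `(index, code)` pair sets `table[index] = code`;
/// every other entry stays 0, which the kernel treats as "no match".
///
/// Panics if any index is >= `size`.
fn build_case_table(size: usize, cases: &[(u32, u8)]) -> Vec<u8> {
    let mut table = vec![0u8; size];
    for &(index, code) in cases {
        table[index as usize] = code;
    }
    table
}
```

A table built this way could then be borrowed as `&[u8]` and passed to `new` as either `lookup_table1` or `lookup_table2`.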
Implementations§
impl<'a> SimdDualTableU32U8LookupV2<'a>
pub fn new(lookup_table1: &'a [u8], lookup_table2: &'a [u8]) -> Self
pub fn lookup_func<F>(&mut self, values1: &[u32], values2: &[u32], f: &mut F)
Given two equal-length slices of u32 indices, looks up each index and calls the user-supplied function on the assembled u8x16 results.
- lookup_table1 is used for the first slice, lookup_table2 is used for the second slice.
- Only if the u8 from the first lookup table is nonzero, will the second lookup table be read.
- The user function is passed (lookedup_values1: u8x16, lookedup_values2: u8x16, start_idx: usize), where start_idx is 0 for the first chunk call, 16 for the next one, etc.
- If the slices do not divide evenly into 16-item chunks, the remainder is handled by filling the missing lanes of the u8x16 with zeroes. The lookup therefore assumes that a zero value is effectively a no-op.
The lookup function is passed these arguments: (lookedup_values1: u8x16, lookedup_values2: u8x16, num_bytes)
- num_bytes: usually 16, but may be less for the last/remainder chunk.
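The behavior described above can be modeled with a scalar sketch. This is not the crate's SIMD implementation: the function name `dual_lookup_scalar` is hypothetical, plain `[u8; 16]` arrays stand in for `u8x16`, and the third callback argument is assumed to be the chunk's starting index (a variant passing `num_bytes` would pass the chunk length instead).

```rust
/// Scalar model (hypothetical, not the crate's code) of the dual-table lookup.
/// `[u8; 16]` stands in for `u8x16`; the third callback argument is assumed
/// to be the starting index of the chunk (0, 16, 32, ...).
fn dual_lookup_scalar<F>(
    table1: &[u8],
    table2: &[u8],
    values1: &[u32],
    values2: &[u32],
    mut f: F,
) where
    F: FnMut([u8; 16], [u8; 16], usize),
{
    assert_eq!(values1.len(), values2.len(), "index slices must have equal length");

    let chunks = values1.chunks(16).zip(values2.chunks(16));
    for (chunk_no, (c1, c2)) in chunks.enumerate() {
        // Lanes past the end of a remainder chunk stay zero.
        let mut out1 = [0u8; 16];
        let mut out2 = [0u8; 16];
        for (lane, (&i1, &i2)) in c1.iter().zip(c2.iter()).enumerate() {
            out1[lane] = table1[i1 as usize];
            // The second table is read only when the first result is nonzero.
            if out1[lane] != 0 {
                out2[lane] = table2[i2 as usize];
            }
        }
        f(out1, out2, chunk_no * 16);
    }
}
```

In this model, a callback that combines the two result arrays (for example with a lane-wise AND for filtering) sees zero in both arrays for every padded lane, which is why zero needs to behave as a no-op in the combining function.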
Trait Implementations§
impl<'a> Clone for SimdDualTableU32U8LookupV2<'a>
fn clone(&self) -> SimdDualTableU32U8LookupV2<'a>
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations§
impl<'a> Freeze for SimdDualTableU32U8LookupV2<'a>
impl<'a> RefUnwindSafe for SimdDualTableU32U8LookupV2<'a>
impl<'a> Send for SimdDualTableU32U8LookupV2<'a>
impl<'a> Sync for SimdDualTableU32U8LookupV2<'a>
impl<'a> Unpin for SimdDualTableU32U8LookupV2<'a>
impl<'a> UnwindSafe for SimdDualTableU32U8LookupV2<'a>
Blanket Implementations§
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where
T: Clone,
unsafe fn clone_to_uninit(&self, dest: *mut u8)
This is a nightly-only experimental API. (clone_to_uninit)