pub trait FilterSimd<'a, T> {
    // Required method
    fn filter_simd<F>(&self, f: F) -> SimdFilter<'a, T, F>
       where F: Fn(&T) -> bool + 'a;
}
Required Methods§
fn filter_simd<F>(&self, f: F) -> SimdFilter<'a, T, F>
Dyn Compatibility§
This trait is not dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety".
Implementations on Foreign Types§
impl<'a, T> FilterSimd<'a, T> for Iter<'a, T>
fn filter_simd<F>(&self, f: F) -> SimdFilter<'a, T, F>
This is the least optimized of all the functions. The current implementation relies on the matching elements being sparse.
This kind of pattern is fast:
[0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,0,0,0,0,0,0]
This kind of pattern is slow (similar to scalar speed):
[1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1]
The speedup comes from checking whether a chunk contains any wanted element, so chunks without a match can be skipped.
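The chunk check can be illustrated with a scalar sketch. This is not the crate's implementation: `filter_chunked` and `CHUNK` are hypothetical names, and the whole-chunk test here is an ordinary loop where a SIMD version would presumably evaluate all lanes at once.

```rust
/// Hypothetical chunk width; a real SIMD version would use the lane count.
const CHUNK: usize = 8;

/// Scalar illustration of chunk skipping (not the crate's code).
fn filter_chunked<T: Copy, F: Fn(&T) -> bool>(data: &[T], pred: F) -> Vec<T> {
    let mut out = Vec::new();
    for chunk in data.chunks(CHUNK) {
        // Whole-chunk test: does any element match?
        if !chunk.iter().any(|x| pred(x)) {
            continue; // no wanted element, skip the chunk
        }
        // At least one match: fall back to an element-by-element walk.
        out.extend(chunk.iter().copied().filter(|x| pred(x)));
    }
    out
}
```

With sparse matches most chunks take the `continue` branch. With dense matches every chunk pays for both the whole-chunk test and the element-by-element walk, which is consistent with the slow pattern above.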
(0..10000).collect_vec().iter().filter_simd(|x| *x % 100 == 0).collect::<Vec<i32>>()
is ~4x faster than scalar on x86 with AVX2.
(0..10000).collect_vec().iter().filter_simd(|x| *x % 10 == 0).collect::<Vec<i32>>()
is ~2x faster than scalar on x86 with AVX2.
(0..10000).collect_vec().iter().filter_simd(|x| *x % 1 == 0).collect::<Vec<i32>>()
is 30% slower than scalar on x86 with AVX2.
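The three cases differ only in how dense the matches are. The sketch below computes that density and the scalar result, assuming the scalar baseline is a plain `Iterator::filter`. `scalar_filter` and `match_density` are hypothetical helper names and not part of this crate.

```rust
/// Assumed scalar baseline: the standard-library filter.
fn scalar_filter(data: &[i32], modulus: i32) -> Vec<i32> {
    data.iter().copied().filter(|x| *x % modulus == 0).collect()
}

/// Fraction of elements kept by the `x % modulus == 0` predicate.
fn match_density(data: &[i32], modulus: i32) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    scalar_filter(data, modulus).len() as f64 / data.len() as f64
}
```

For `0..10000`, the moduli 100, 10 and 1 keep 1%, 10% and 100% of the elements respectively, which lines up with the reported speedups shrinking as matches become denser.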
Something like this works well on all patterns on x86: