pub struct ReduceKernel { /* private fields */ }
Reduction kernel for parallel reduction operations
Implementations§
impl ReduceKernel
pub fn with_workgroup_size(self, size: u32) -> Self
Sets the workgroup size and returns the updated kernel (builder style).
pub fn execute_u8(&self, _device: &GpuDevice, input: &[u8]) -> Result<Vec<u8>>
Execute the reduction operation on u8 data (CPU fallback).
§Output encoding
| Operation | Output format |
|---|---|
| Sum | 8-byte little-endian u64 |
| Min / Max | 1 byte |
| Mean | 4-byte little-endian f32 |
| MinMax | 2 bytes [min, max] |
| CountNonZero | 8-byte little-endian u64 |
| Histogram | 256 × 4-byte little-endian u32 counts |
§Arguments
- _device: GPU device (unused by the CPU fallback)
- input: Input data buffer
§Errors
The current CPU fallback is infallible; the Result return type is reserved for internal logic failures.
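The byte layouts in the output encoding table can be decoded with standard-library conversions. The following is a minimal sketch, not part of this crate: the helper names `decode_u64`, `decode_f32`, and `decode_histogram` are hypothetical, and each returns `None` when the buffer length does not match the documented size.

```rust
/// Hypothetical helper: decodes the 8-byte little-endian u64 produced for
/// Sum and CountNonZero.
fn decode_u64(bytes: &[u8]) -> Option<u64> {
    if bytes.len() != 8 {
        return None;
    }
    let mut arr = [0u8; 8];
    arr.copy_from_slice(bytes);
    Some(u64::from_le_bytes(arr))
}

/// Hypothetical helper: decodes the 4-byte little-endian f32 produced for Mean.
fn decode_f32(bytes: &[u8]) -> Option<f32> {
    if bytes.len() != 4 {
        return None;
    }
    let mut arr = [0u8; 4];
    arr.copy_from_slice(bytes);
    Some(f32::from_le_bytes(arr))
}

/// Hypothetical helper: decodes the 256 little-endian u32 bin counts produced
/// for Histogram (1024 bytes in total).
fn decode_histogram(bytes: &[u8]) -> Option<Vec<u32>> {
    if bytes.len() != 256 * 4 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

Min, Max, and MinMax need no decoding: the returned bytes are the values themselves.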
pub fn execute_f32(
    &self,
    _device: &GpuDevice,
    input: &[f32],
) -> Result<Vec<f32>>
Execute the reduction operation on f32 data (CPU fallback).
§Output encoding
| Operation | Output (Vec<f32>) |
|---|---|
| Sum | [total_sum] |
| Min / Max | [value] |
| Mean | [mean] |
| MinMax | [min, max] |
| CountNonZero | [count as f32] |
| Histogram | empty (not meaningful for f32) |
§Arguments
- _device: GPU device (unused by the CPU fallback)
- input: Input data buffer
§Errors
The current CPU fallback is infallible; the Result return type is reserved for internal logic failures.
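To make the f32 output table concrete, here is a standalone CPU reference sketch that produces the same shapes. It is an illustration under assumptions, not the crate's implementation: the `Op` enum is a hypothetical stand-in for `ReduceOp` (variant names taken from the table), `reduce_f32` is a hypothetical free function, and returning an empty vector for empty input is an assumption the documentation does not confirm.

```rust
/// Hypothetical stand-in for the crate's `ReduceOp`; variant names follow
/// the output encoding table.
#[derive(Clone, Copy, Debug)]
enum Op {
    Sum,
    Min,
    Max,
    Mean,
    MinMax,
    CountNonZero,
    Histogram,
}

/// Hypothetical reference reduction that mirrors the documented output shapes.
fn reduce_f32(op: Op, input: &[f32]) -> Vec<f32> {
    // Assumption: empty input yields an empty vector for every operation.
    if input.is_empty() {
        return Vec::new();
    }
    let min = input.iter().copied().fold(f32::INFINITY, f32::min);
    let max = input.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let sum: f32 = input.iter().sum();
    match op {
        Op::Sum => vec![sum],
        Op::Min => vec![min],
        Op::Max => vec![max],
        Op::Mean => vec![sum / input.len() as f32],
        Op::MinMax => vec![min, max],
        // The count is stored as f32, so very large counts lose precision.
        Op::CountNonZero => {
            vec![input.iter().filter(|v| **v != 0.0).count() as f32]
        }
        // Histogram is documented as empty for f32 data.
        Op::Histogram => Vec::new(),
    }
}
```

Because every result is carried in a `Vec<f32>`, callers that need an exact non-zero count for large inputs may prefer the u8 path, which returns the count as a u64.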
pub fn workgroup_size(&self) -> u32
Returns the workgroup size.
pub fn passes_required(&self, input_size: usize) -> u32
Calculates the number of reduction passes needed for an input of input_size elements.
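The documentation does not state how the pass count is derived. A common model for tree-style parallel reduction is that each pass collapses every workgroup-sized chunk into one partial result, so passes repeat until a single value remains. The sketch below implements that model only; `passes_for` is a hypothetical free function, and its treatment of inputs of 0 or 1 elements (zero passes) and of workgroup sizes below 2 are assumptions that may differ from `passes_required`.

```rust
/// Hypothetical model: number of passes when each pass reduces every
/// `workgroup_size`-element chunk to a single partial result.
fn passes_for(input_size: usize, workgroup_size: u32) -> u32 {
    // Assumption: a workgroup size below 2 cannot shrink the data,
    // so it is clamped to 2 to guarantee termination.
    let wg = workgroup_size.max(2) as usize;
    let mut remaining = input_size;
    let mut passes = 0;
    while remaining > 1 {
        // Ceiling division without risking overflow on very large inputs.
        remaining = remaining / wg + usize::from(remaining % wg != 0);
        passes += 1;
    }
    passes
}
```

Under this model, 1,000,000 elements with a workgroup size of 256 shrink to 3,907 partials, then 16, then 1, for three passes.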
pub fn estimate_flops(input_size: usize, operation: ReduceOp) -> u64
Estimates the number of floating-point operations needed to reduce input_size elements with the given operation.
Auto Trait Implementations§
impl Freeze for ReduceKernel
impl RefUnwindSafe for ReduceKernel
impl Send for ReduceKernel
impl Sync for ReduceKernel
impl Unpin for ReduceKernel
impl UnsafeUnpin for ReduceKernel
impl UnwindSafe for ReduceKernel
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.