pub struct CastSubBytePlan<TIn: DeviceRepr + Copy + 'static, TOut: DeviceRepr + Copy + 'static> { /* private fields */ }
Sub-byte cast plan.
TIn is the input element type, TOut is the output element type.
Both are bounded by baracuda_types::DeviceRepr only (not
baracuda_kernels_types::Element) so the dtype set can include
S4, U4, Fp8E4M3, Fp8E5M2, and Bool alongside the classic
fp / int element types.
Coverage: see the crate-level module docs for the full
supported (TIn, TOut) pair list. A select-time check rejects
any pair outside the explicit table with Error::Unsupported.
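The select-time check described above can be pictured as a lookup in a fixed table of (input, output) pairs. The following is a minimal sketch only: the `Dtype` and `CastError` enums, the `check_pair` function, and the three pairs in `SUPPORTED` are hypothetical placeholders, not the crate's actual types or its real coverage table (see the crate-level module docs for that).

```rust
/// Hypothetical dtype tag used only for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Dtype {
    S4,
    U4,
    Fp8E4M3,
    Fp8E5M2,
    Bool,
    F32,
    I32,
}

/// Hypothetical error type standing in for the crate's `Error::Unsupported`.
#[derive(Debug, PartialEq, Eq)]
enum CastError {
    Unsupported { from: Dtype, to: Dtype },
}

/// Placeholder table: these pairs are illustrative, not the real supported list.
const SUPPORTED: &[(Dtype, Dtype)] = &[
    (Dtype::S4, Dtype::F32),
    (Dtype::U4, Dtype::F32),
    (Dtype::F32, Dtype::Fp8E4M3),
];

/// Accept a pair only if it appears in the explicit table.
fn check_pair(from: Dtype, to: Dtype) -> Result<(), CastError> {
    if SUPPORTED.contains(&(from, to)) {
        Ok(())
    } else {
        Err(CastError::Unsupported { from, to })
    }
}
```

Keeping the table explicit means an unlisted pair fails early, at plan selection, rather than at launch time.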
Workspace: none.
Implementations
impl<TIn: DeviceRepr + Copy + 'static, TOut: DeviceRepr + Copy + 'static> CastSubBytePlan<TIn, TOut>
pub fn select(_stream: &Stream, desc: &CastSubByteDescriptor, _pref: PlanPreference) -> Result<Self>
Pick a kernel for desc.
pub fn can_implement(&self, args: &CastSubByteArgs<'_, TIn, TOut>) -> Result<()>
Validate args. Checks that the element counts (numel) agree and that the
buffers are large enough. For S4 / U4, two 4-bit values share one byte, so
the packed-buffer accounting collapses to numel / 2 bytes, which is
numel / 2 packed-slot elements at the buffer layer.
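The packed-buffer arithmetic can be illustrated with a small standalone sketch. All names here (`packed_nibble_bytes`, `check_packed_buffer`, `pack_u4_pair`, `unpack_u4_pair`, `sign_extend_s4`) are hypothetical helpers, not part of this crate. Rounding up for an odd numel and storing the first element in the low nibble are assumptions of the sketch; the plan's actual conventions may differ.

```rust
/// Bytes needed to hold `numel` 4-bit elements at two elements per byte.
/// Assumption: an odd `numel` rounds up to a whole byte.
fn packed_nibble_bytes(numel: usize) -> usize {
    numel / 2 + numel % 2
}

/// Reject a packed buffer that is too small for `numel` 4-bit elements.
fn check_packed_buffer(numel: usize, buffer_len_bytes: usize) -> Result<(), String> {
    let needed = packed_nibble_bytes(numel);
    if buffer_len_bytes < needed {
        Err(format!("need {needed} bytes, got {buffer_len_bytes}"))
    } else {
        Ok(())
    }
}

/// Pack two unsigned 4-bit values into one byte.
/// Assumption: the first element goes in the low nibble.
fn pack_u4_pair(lo: u8, hi: u8) -> u8 {
    (lo & 0x0F) | ((hi & 0x0F) << 4)
}

/// Split one packed byte back into (low nibble, high nibble).
fn unpack_u4_pair(byte: u8) -> (u8, u8) {
    (byte & 0x0F, byte >> 4)
}

/// Sign-extend a 4-bit two's-complement nibble to `i8`:
/// move the nibble to the top of the byte, then shift back arithmetically.
fn sign_extend_s4(nibble: u8) -> i8 {
    (((nibble & 0x0F) << 4) as i8) >> 4
}
```

For an even numel this reproduces the numel / 2 figure above: 8 elements occupy 4 bytes.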
pub fn workspace_size(&self) -> usize
Workspace size in bytes. Always 0.
pub fn precision_guarantee(&self) -> PrecisionGuarantee
Numerical guarantees for this plan’s kernel.