pub struct DynamicRangeQuantizeArgs<'a, TIn: Element, TOut: IntElement> {
    pub input: TensorRef<'a, TIn, 2>,
    pub scale_out: TensorMut<'a, TIn, 1>,
    pub output: TensorMut<'a, TOut, 2>,
}
Argument bundle for a dynamic_range_quantize launch.
Unlike super::QuantizePerTokenArgs, the caller does not supply
scale or zero_point; the kernel computes them from the runtime
dynamic range of the input. The plan writes the per-row scale [N]
into the caller-supplied scale_out buffer so that a downstream
dequantize step can use the same scale.
zero_point is implicit (0 for symmetric quantization) and is not materialized.
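To make the data flow concrete, here is a minimal CPU sketch of symmetric per-row dynamic-range quantization. It is not this crate's kernel. The function name `dynamic_range_quantize_ref`, the use of plain `f32` / `i8` slices instead of `TensorRef` / `TensorMut`, the `abs_max / 127` scale convention, and the handling of all-zero rows are all assumptions made for illustration; the real kernel may choose its scale and rounding differently.

```rust
/// Hypothetical CPU reference, not part of this crate.
/// `input` is row-major [n, d]; returns (scale[n], output[n * d]).
pub fn dynamic_range_quantize_ref(input: &[f32], n: usize, d: usize) -> (Vec<f32>, Vec<i8>) {
    assert_eq!(input.len(), n * d, "input must hold n * d elements");
    let mut scale_out = vec![0.0f32; n];
    let mut output = vec![0i8; n * d];
    for row in 0..n {
        let src = &input[row * d..(row + 1) * d];
        // Runtime dynamic range of this row: its largest magnitude.
        let abs_max = src.iter().fold(0.0f32, |m, &x| m.max(x.abs()));
        // Assumed convention: abs_max maps to 127. An all-zero row gets
        // scale 1.0 so the division below is well defined.
        let scale = if abs_max > 0.0 { abs_max / 127.0 } else { 1.0 };
        scale_out[row] = scale;
        for (dst, &x) in output[row * d..(row + 1) * d].iter_mut().zip(src) {
            // zero_point is implicitly 0, so quantization is scale-and-round only.
            *dst = (x / scale).round().clamp(-127.0, 127.0) as i8;
        }
    }
    (scale_out, output)
}
```

Under these assumptions a downstream dequantize step recovers an approximation of each element as `output[i] as f32 * scale_out[row]`, which is why the scale is written out while zero_point is not.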
Fields
input: TensorRef<'a, TIn, 2>
Input [N, D] in FP.
scale_out: TensorMut<'a, TIn, 1>
Per-row scale [N] in FP, written by the kernel.
output: TensorMut<'a, TOut, 2>
Output [N, D] in int.
Auto Trait Implementations
impl<'a, TIn, TOut> !UnwindSafe for DynamicRangeQuantizeArgs<'a, TIn, TOut>
impl<'a, TIn, TOut> Freeze for DynamicRangeQuantizeArgs<'a, TIn, TOut>
impl<'a, TIn, TOut> RefUnwindSafe for DynamicRangeQuantizeArgs<'a, TIn, TOut>
where
    TIn: RefUnwindSafe,
    TOut: RefUnwindSafe,
impl<'a, TIn, TOut> Send for DynamicRangeQuantizeArgs<'a, TIn, TOut>
impl<'a, TIn, TOut> Sync for DynamicRangeQuantizeArgs<'a, TIn, TOut>
impl<'a, TIn, TOut> Unpin for DynamicRangeQuantizeArgs<'a, TIn, TOut>
impl<'a, TIn, TOut> UnsafeUnpin for DynamicRangeQuantizeArgs<'a, TIn, TOut>
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.