
Struct SmoothQuantLinearArgs 

pub struct SmoothQuantLinearArgs<'a, TIn: Element, TWQ: IntElement> {
    pub act_q: TensorRef<'a, S8, 2>,
    pub weight_q: TensorRef<'a, TWQ, 2>,
    pub weight_scale: TensorRef<'a, TIn, 1>,
    pub output: TensorMut<'a, TIn, 2>,
    pub act_scale_scratch: TensorMut<'a, TIn, 1>,
}

Args bundle for a SmoothQuant linear launch.

act_scale_scratch is a caller-owned FP scratch buffer of shape [M]. It is used to broadcast the descriptor’s per-tensor act_scale into the per-row form that the underlying quantized_linear_w8a8 kernel consumes. The buffer is caller-owned so that it can be reused across launches without reallocation; consequently, the Plan’s workspace_size() returns 0.
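The broadcast described above can be pictured with a minimal host-side sketch. It assumes `f32` for the FP element type and a plain slice in place of `TensorMut`; the helper name `broadcast_act_scale` is hypothetical and not part of this crate, and the real plan presumably performs the fill on the device.

```rust
/// Hypothetical helper (not part of the crate): write one per-tensor
/// activation scale into every element of a caller-owned [M] scratch buffer,
/// producing the per-row form.
fn broadcast_act_scale(act_scale: f32, scratch: &mut [f32]) {
    // Every row shares the same per-tensor scale.
    scratch.fill(act_scale);
}

/// Illustrates reuse: the buffer is allocated once by the caller and
/// overwritten before each launch, so no per-launch allocation is needed.
fn reuse_across_launches(m: usize, scales: &[f32]) -> Vec<f32> {
    let mut scratch = vec![0.0f32; m]; // allocated once, owned by the caller
    for &scale in scales {
        broadcast_act_scale(scale, &mut scratch);
        // ... a launch that reads `scratch` would go here ...
    }
    scratch // holds the broadcast of the last scale (unchanged if `scales` is empty)
}
```

Because the caller owns the buffer, the same allocation can serve every launch that shares the row count M.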

Fields§

§act_q: TensorRef<'a, S8, 2>

Pre-quantized int8 activation [M, K].

§weight_q: TensorRef<'a, TWQ, 2>

Pre-smoothed-then-quantized int8 weight [N, K].

§weight_scale: TensorRef<'a, TIn, 1>

Per-output-channel weight scale [N] in FP.

§output: TensorMut<'a, TIn, 2>

FP output [M, N].

§act_scale_scratch: TensorMut<'a, TIn, 1>

Scratch buffer for the per-row broadcast of act_scale, shape [M], FP. Caller-owned and reusable across launches; the plan populates it before the matmul launch.
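To make the field shapes concrete, here is a CPU reference sketch of a W8A8 linear over row-major slices. It assumes the conventional dequantization `output[m][n] = act_scale[m] * weight_scale[n] * Σ_k act_q[m][k] * weight_q[n][k]`, which this page does not state explicitly; it also assumes `f32` for the FP type and `i8` for the weight element. The function name `w8a8_linear_ref` is hypothetical and is not the crate's kernel.

```rust
/// Hypothetical CPU reference (not the crate's kernel), assuming the
/// conventional W8A8 dequantization. All buffers are row-major.
fn w8a8_linear_ref(
    act_q: &[i8],           // [M, K] pre-quantized activation
    weight_q: &[i8],        // [N, K] pre-smoothed, quantized weight
    weight_scale: &[f32],   // [N] per-output-channel weight scale
    act_scale_rows: &[f32], // [M] per-row activation scale (the scratch contents)
    output: &mut [f32],     // [M, N] FP output
    m: usize,
    n: usize,
    k: usize,
) {
    assert_eq!(act_q.len(), m * k);
    assert_eq!(weight_q.len(), n * k);
    assert_eq!(weight_scale.len(), n);
    assert_eq!(act_scale_rows.len(), m);
    assert_eq!(output.len(), m * n);

    for row in 0..m {
        for col in 0..n {
            // Accumulate the int8 x int8 products in i32 to avoid overflow.
            let mut acc: i32 = 0;
            for i in 0..k {
                acc += act_q[row * k + i] as i32 * weight_q[col * k + i] as i32;
            }
            // Dequantize with the row's activation scale and the column's weight scale.
            output[row * n + col] = acc as f32 * act_scale_rows[row] * weight_scale[col];
        }
    }
}
```

Note that the weight is stored as [N, K], so each output column is the dot product of an activation row with a weight row, with no explicit transpose.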

Auto Trait Implementations§

impl<'a, TIn, TWQ> !UnwindSafe for SmoothQuantLinearArgs<'a, TIn, TWQ>

impl<'a, TIn, TWQ> Freeze for SmoothQuantLinearArgs<'a, TIn, TWQ>

impl<'a, TIn, TWQ> RefUnwindSafe for SmoothQuantLinearArgs<'a, TIn, TWQ>
where TWQ: RefUnwindSafe, TIn: RefUnwindSafe,

impl<'a, TIn, TWQ> Send for SmoothQuantLinearArgs<'a, TIn, TWQ>
where TWQ: Sync, TIn: Sync + Send,

impl<'a, TIn, TWQ> Sync for SmoothQuantLinearArgs<'a, TIn, TWQ>
where TWQ: Sync, TIn: Sync,

impl<'a, TIn, TWQ> Unpin for SmoothQuantLinearArgs<'a, TIn, TWQ>

impl<'a, TIn, TWQ> UnsafeUnpin for SmoothQuantLinearArgs<'a, TIn, TWQ>

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.