Struct SmoothQuantLinearDescriptor

#[non_exhaustive]
pub struct SmoothQuantLinearDescriptor {
    pub m: i32,
    pub n: i32,
    pub k: i32,
    pub act_scale: f32,
    pub activation_element: ElementKind,
    pub weight_element: ElementKind,
    pub output_element: ElementKind,
}

Descriptor for a SmoothQuant linear op.

The per-tensor activation scale lives in the descriptor rather than in the launch arguments because, in the SmoothQuant flow, it is part of the model’s frozen quantization metadata: it does not change between launches for the same layer.

Fields (Non-exhaustive)§

This struct is marked as non-exhaustive

Non-exhaustive structs could have additional fields added in future. Therefore, non-exhaustive structs cannot be constructed in external crates using the traditional Struct { .. } syntax; cannot be matched against without a wildcard ..; and struct update syntax will not work.

§m: i32

Number of token rows in the activation (and rows of the output).

§n: i32

Number of output channels (rows of weight_q, cols of output).

§k: i32

Inner reduction dim (cols of act_q and weight_q).

§act_scale: f32

Per-tensor activation scale produced by the offline SmoothQuant Python flow. Always f32 regardless of TIn — the underlying quantized_linear_w8a8 kernel does the scale multiply in float space irrespective of output dtype.

§activation_element: ElementKind

Activation integer element kind. Currently only S8 is wired up.

§weight_element: ElementKind

Weight integer element kind. Currently only S8 is wired up.

§output_element: ElementKind

Output FP element kind. Must match TIn::KIND.
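The shape conventions in the field docs above (act_q is m × k, weight_q is n × k, output is m × n) and the float-space multiply by act_scale can be illustrated with a scalar reference loop. This is only a sketch under stated assumptions: `reference_w8a8` is a hypothetical helper, not part of this crate, it assumes row-major storage, and it applies only the per-tensor activation scale. Any weight-side scaling that the real quantized_linear_w8a8 kernel may perform is deliberately left out because this page does not describe it.

```rust
/// Hypothetical scalar reference for an int8 x int8 linear op.
/// Assumes row-major layouts: act_q is m x k, weight_q is n x k,
/// and the returned output is m x n.
pub fn reference_w8a8(
    m: usize,
    n: usize,
    k: usize,
    act_q: &[i8],
    weight_q: &[i8],
    act_scale: f32,
) -> Vec<f32> {
    assert_eq!(act_q.len(), m * k, "act_q must hold m * k elements");
    assert_eq!(weight_q.len(), n * k, "weight_q must hold n * k elements");

    let mut out = vec![0.0f32; m * n];
    for row in 0..m {
        for col in 0..n {
            // Accumulate the integer dot product in i32 to avoid i8 overflow.
            let mut acc: i32 = 0;
            for i in 0..k {
                acc += act_q[row * k + i] as i32 * weight_q[col * k + i] as i32;
            }
            // The scale multiply happens in f32, independent of the
            // element kind the caller eventually stores the output as.
            out[row * n + col] = acc as f32 * act_scale;
        }
    }
    out
}
```

For example, with m = 1, n = 2, k = 2, activations [1, 2], weights [[3, 4], [5, 6]] and act_scale = 0.5, the integer dot products are 11 and 17, so the sketch returns [5.5, 8.5].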

Implementations§

impl SmoothQuantLinearDescriptor

pub fn new<TIn: Element>(m: i32, n: i32, k: i32, act_scale: f32) -> Self
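Because the struct is non-exhaustive, external crates have to go through `new` rather than a struct literal. The sketch below mocks the surrounding types locally so that it compiles on its own; it is not the crate's actual source. The `ElementKind` variants other than S8, the `F16Marker` type, and the body of `new` are assumptions, inferred from the field docs (S8 for both integer operands, output_element matching TIn::KIND).

```rust
// Local stand-ins for the crate's types; names marked "hypothetical"
// are not confirmed by this page.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ElementKind {
    S8,
    F16, // hypothetical variant
    F32, // hypothetical variant
}

pub trait Element {
    const KIND: ElementKind;
}

/// Hypothetical marker type standing in for a half-precision element.
pub struct F16Marker;

impl Element for F16Marker {
    const KIND: ElementKind = ElementKind::F16;
}

#[non_exhaustive]
#[derive(Clone, Copy, Debug)]
pub struct SmoothQuantLinearDescriptor {
    pub m: i32,
    pub n: i32,
    pub k: i32,
    pub act_scale: f32,
    pub activation_element: ElementKind,
    pub weight_element: ElementKind,
    pub output_element: ElementKind,
}

impl SmoothQuantLinearDescriptor {
    /// Assumed behavior: both integer operands default to S8 and the
    /// output element kind is taken from the type parameter.
    pub fn new<TIn: Element>(m: i32, n: i32, k: i32, act_scale: f32) -> Self {
        Self {
            m,
            n,
            k,
            act_scale,
            activation_element: ElementKind::S8,
            weight_element: ElementKind::S8,
            output_element: TIn::KIND,
        }
    }
}
```

Under these assumptions, `SmoothQuantLinearDescriptor::new::<F16Marker>(4, 8, 16, 0.05)` describes a layer with 4 token rows, 8 output channels, and a reduction dimension of 16. Since the type is Copy, one descriptor per layer can be built once from the frozen quantization metadata and reused across launches.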

Trait Implementations§

impl Clone for SmoothQuantLinearDescriptor

fn clone(&self) -> SmoothQuantLinearDescriptor

fn clone_from(&mut self, source: &Self)

impl Copy for SmoothQuantLinearDescriptor

impl Debug for SmoothQuantLinearDescriptor

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Auto Trait Implementations§

impl Freeze for SmoothQuantLinearDescriptor

impl RefUnwindSafe for SmoothQuantLinearDescriptor

impl Send for SmoothQuantLinearDescriptor

impl Sync for SmoothQuantLinearDescriptor

impl Unpin for SmoothQuantLinearDescriptor

impl UnsafeUnpin for SmoothQuantLinearDescriptor

impl UnwindSafe for SmoothQuantLinearDescriptor

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

impl<T> ToOwned for T where T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
