baracuda_kernels::quantize::dequantize_per_token

Struct DequantizePerTokenArgs

pub struct DequantizePerTokenArgs<'a, TIn: Element, TOut: IntElement> {
    pub input: TensorRef<'a, TOut, 2>,
    pub scale: TensorRef<'a, TIn, 1>,
    pub zero_point: TensorRef<'a, i32, 1>,
    pub output: TensorMut<'a, TIn, 2>,
}

Argument bundle for the per-token dequantization launch.

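The TIn / TOut type parameters mirror the forward (FW) plan so that the type vocabulary stays consistent: TIn is the floating-point (FP) type that the forward pass consumed and that the backward (BW) pass and the dequant produce; TOut is the integer storage type that the forward pass produced and that the dequant consumes. As a consequence, input is typed by TOut and output is typed by TIn.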
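The role swap described above can be sketched with hypothetical stand-in types. In this sketch plain slices replace TensorRef / TensorMut, the Element / IntElement bounds are omitted, and both struct names are invented for illustration; none of it is the crate's actual API.

```rust
// Hypothetical stand-ins: plain slices replace TensorRef / TensorMut, and the
// Element / IntElement bounds are not modelled. Only the parameter roles matter.

/// Forward-style bundle: consumes FP (TIn), produces int storage (TOut).
struct QuantizeArgsSketch<'a, TIn, TOut> {
    input: &'a [TIn],
    output: &'a mut [TOut],
}

/// Dequant-style bundle: same <TIn, TOut> pair, but the roles are swapped,
/// so `input` is typed by TOut and `output` by TIn.
struct DequantizeArgsSketch<'a, TIn, TOut> {
    input: &'a [TOut],
    output: &'a mut [TIn],
}
```

With TIn = f32 and TOut = i8, both bundles are written with the same `<f32, i8>` pair, which appears to be the consistency the description refers to.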

Fields§

§input: TensorRef<'a, TOut, 2>

Quantized input of shape [N, D], stored in the integer type TOut.

§scale: TensorRef<'a, TIn, 1>

Per-row (per-token) scale of shape [N], in the floating-point type TIn.

§zero_point: TensorRef<'a, i32, 1>

Per-row (per-token) zero-point of shape [N], as i32.

§output: TensorMut<'a, TIn, 2>

Dequantized output of shape [N, D], in the floating-point type TIn.
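To make the field shapes concrete, here is a minimal CPU reference sketch. It assumes the conventional affine formula `output[n][d] = (input[n][d] - zero_point[n]) * scale[n]`; this page does not state the formula the kernel uses, so treat it as an assumption. The function name `dequantize_per_token_ref` is hypothetical, and plain row-major slices with fixed `i8` / `f32` types stand in for the crate's tensor types.

```rust
/// CPU reference for per-token (per-row) affine dequantization.
/// ASSUMPTION: `(q - zero_point[row]) * scale[row]` is the conventional
/// affine scheme; the documented kernel may differ in detail.
fn dequantize_per_token_ref(
    input: &[i8],       // quantized values, row-major [N, D]
    scale: &[f32],      // per-row scale [N]
    zero_point: &[i32], // per-row zero-point [N]
    d: usize,           // row width D
) -> Vec<f32> {
    let n = scale.len();
    assert_eq!(zero_point.len(), n, "scale and zero_point must both have N entries");
    assert_eq!(input.len(), n * d, "input must hold N * D values");

    let mut output = Vec::with_capacity(n * d);
    for row in 0..n {
        for col in 0..d {
            // Widen to i32 before subtracting so the zero-point cannot overflow i8.
            let q = input[row * d + col] as i32;
            output.push((q - zero_point[row]) as f32 * scale[row]);
        }
    }
    output
}
```

Every element in a row shares that row's scale and zero-point, which is why those two fields are rank-1 tensors of length N while input and output are rank-2.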

Auto Trait Implementations§

impl<'a, TIn, TOut> !UnwindSafe for DequantizePerTokenArgs<'a, TIn, TOut>

impl<'a, TIn, TOut> Freeze for DequantizePerTokenArgs<'a, TIn, TOut>

impl<'a, TIn, TOut> RefUnwindSafe for DequantizePerTokenArgs<'a, TIn, TOut>
where TOut: RefUnwindSafe, TIn: RefUnwindSafe,

impl<'a, TIn, TOut> Send for DequantizePerTokenArgs<'a, TIn, TOut>
where TOut: Sync, TIn: Sync + Send,

impl<'a, TIn, TOut> Sync for DequantizePerTokenArgs<'a, TIn, TOut>
where TOut: Sync, TIn: Sync,

impl<'a, TIn, TOut> Unpin for DequantizePerTokenArgs<'a, TIn, TOut>

impl<'a, TIn, TOut> UnsafeUnpin for DequantizePerTokenArgs<'a, TIn, TOut>

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.