Struct IDynamicQuantizeLayer 

pub struct IDynamicQuantizeLayer { /* private fields */ }

A network layer to perform dynamic quantization.

This layer accepts a floating-point input tensor and computes the block scale factors needed to quantize the input’s data. It outputs the quantized tensor as its first output and the scale factors as its second output.

Use ILayer::setInput to add an input for the double-quantization scale factor.

Only symmetric quantization is supported. The input tensor for this layer must not be a scalar.

Do not inherit from this class, as doing so will break forward-compatibility of the API and ABI.
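
As a rough illustration of what "symmetric block quantization" means here, the sketch below computes one scale per block and divides each value by its block's scale. It is plain Rust and does not call this binding; `block_scales`, `scale_values`, and the `qmax` parameter are hypothetical names, and the layer's actual rounding and scale encoding are not modeled.

```rust
/// Hypothetical helper: one symmetric scale per block of `block` consecutive values.
/// `qmax` is the largest magnitude the target format can represent; it is an
/// assumption supplied by the caller (for example 6.0 for an E2M1 FP4 value).
fn block_scales(input: &[f32], block: usize, qmax: f32) -> Vec<f32> {
    assert!(block > 0 && input.len() % block == 0, "block must divide the input length");
    input
        .chunks(block)
        .map(|chunk| {
            // Symmetric quantization: only the largest absolute value matters,
            // there is no zero point.
            let amax = chunk.iter().fold(0.0_f32, |m, x| m.max(x.abs()));
            amax / qmax
        })
        .collect()
}

/// Divide each value by its block's scale. A real implementation would then
/// round the result to the nearest value representable in the target format.
fn scale_values(input: &[f32], block: usize, scales: &[f32]) -> Vec<f32> {
    input
        .iter()
        .enumerate()
        .map(|(i, x)| {
            let s = scales[i / block];
            if s == 0.0 { 0.0 } else { x / s }
        })
        .collect()
}
```

With a block of 2 and `qmax` of 6.0, the input `[1.0, -3.0, 0.5, 6.0]` yields scales `[0.5, 1.0]`, so every scaled value falls within `[-6.0, 6.0]`.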

Implementations

impl IDynamicQuantizeLayer

pub fn setToType(self: Pin<&mut IDynamicQuantizeLayer>, toType: DataType)

Set DynamicQuantizeLayer’s quantized output type.

  • toType The data type of the quantized output tensor.

If the network is strongly typed, setToType must be used to set the output type, and using setOutputType is an error. Otherwise, the types passed to setOutputType and setToType must be the same. Valid values for toType are DataType::kFP4 (NVFP4 quantization) and DataType::kFP8 (MXFP8 quantization).

See NetworkDefinitionCreationFlag::kSTRONGLY_TYPED
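
The interaction between setToType and setOutputType can be summarized as a small decision function. This is only a model of the rule as stated above, written in plain Rust; `QuantType` and `resolve_output_type` are hypothetical names and are not part of this binding.

```rust
/// Stand-in for the two quantized output types named above.
/// This is not the binding's `DataType`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QuantType {
    Fp4,
    Fp8,
}

/// Returns the effective quantized output type, or an error describing the conflict.
/// `output_type` models an optional call to `setOutputType`.
fn resolve_output_type(
    strongly_typed: bool,
    to_type: QuantType,
    output_type: Option<QuantType>,
) -> Result<QuantType, String> {
    match (strongly_typed, output_type) {
        // Strongly typed networks: any use of setOutputType is an error.
        (true, Some(_)) => {
            Err("setOutputType is an error on a strongly typed network".to_string())
        }
        // Otherwise the two setters must agree.
        (false, Some(t)) if t != to_type => Err(format!(
            "setOutputType ({:?}) and setToType ({:?}) disagree",
            t, to_type
        )),
        _ => Ok(to_type),
    }
}
```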

pub fn getToType(self: &IDynamicQuantizeLayer) -> DataType

Return DynamicQuantizeLayer’s quantized output type.

Returns the toType parameter set during layer creation or by setToType().

The return value is the type of the quantized output tensor. The default value is DataType::kFP4.

pub fn setScaleType(self: Pin<&mut IDynamicQuantizeLayer>, scaleType: DataType)

Set the data type of the scale factors used to quantize the data.

  • scaleType The scale factors data type.

Valid values are DataType::kFP8, DataType::kE8M0, and DataType::kFLOAT.

pub fn getScaleType(self: &IDynamicQuantizeLayer) -> DataType

Return the scale factors data type.

Returns the scaleType parameter set during layer creation or by setScaleType().

The return value is the type of the scale factors used to quantize the dynamic data. The default value is DataType::kFP8.

pub fn setAxis(self: Pin<&mut IDynamicQuantizeLayer>, axis: i32)

Set the axis along which block quantization occurs.

The axis must be the last or second-to-last dimension. The input’s shape along the axis must be constant.

See [getAxis()]
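
The axis constraint can be checked before calling setAxis. The helper below is a sketch in plain Rust; `is_valid_block_axis` is a hypothetical name, and it assumes a zero-based, non-negative axis (whether negative axes are accepted is not stated here).

```rust
/// Checks the documented constraint: the axis must be the last or
/// second-to-last dimension of the input.
/// Assumption: `axis` is zero-based and non-negative.
fn is_valid_block_axis(axis: i32, rank: usize) -> bool {
    if axis < 0 || rank == 0 {
        return false;
    }
    let axis = axis as usize;
    // In range, and at most one position away from the last dimension.
    axis < rank && axis + 2 >= rank
}
```

For a rank-4 input, axes 2 and 3 pass the check and axes 0 and 1 do not.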

pub fn getAxis(self: &IDynamicQuantizeLayer) -> i32

Get the axis along which blocking occurs.

See [setAxis()]

pub fn setBlockSize(self: Pin<&mut IDynamicQuantizeLayer>, size: i32)

Set the size of the quantization block.

Note: The block size must evenly divide the input’s extent along the blocked axis. Valid values are 16 (NVFP4 quantization) and 32 (MXFP8 quantization).

See [getBlockSize()]
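
The divisibility rule can be sketched as follows, in plain Rust with hypothetical helper names. The `scale_shape` function additionally assumes that the scale-factor output has the input's shape with the blocked axis divided by the block size; that shape rule is an assumption for illustration and is not stated on this page.

```rust
/// Number of blocks along the blocked axis, or `None` if `block_size` is not
/// one of the documented values (16 or 32) or does not evenly divide `extent`.
fn blocks_along_axis(extent: i64, block_size: i32) -> Option<i64> {
    let block = i64::from(block_size);
    if !matches!(block_size, 16 | 32) || extent <= 0 || extent % block != 0 {
        return None;
    }
    Some(extent / block)
}

/// Assumed shape of the scale-factor output: the input shape with the
/// blocked axis divided by the block size (an illustrative assumption).
fn scale_shape(input: &[i64], axis: usize, block_size: i32) -> Option<Vec<i64>> {
    let extent = *input.get(axis)?;
    let blocks = blocks_along_axis(extent, block_size)?;
    let mut shape = input.to_vec();
    shape[axis] = blocks;
    Some(shape)
}
```

Under that assumption, an input of shape `[2, 8, 64]` blocked along axis 2 with a block size of 16 would have scale factors of shape `[2, 8, 4]`.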

pub fn getBlockSize(self: &IDynamicQuantizeLayer) -> i32

Get the size of the quantization block.

See [setBlockSize()]

pub fn setBlockShape(self: Pin<&mut IDynamicQuantizeLayer>, blockShape: &Dims64)

Set the shape of the quantization block.

Note: The block shape rank must match the input rank. The default value is an empty Dims.

See [getBlockShape()]
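
A rank check along these lines could be run before calling setBlockShape. This is a sketch in plain Rust with hypothetical names; it treats an empty shape as the unset default and does not interpret the individual block-shape entries, whose meaning is not described here.

```rust
/// Outcome of comparing a block shape against the input rank.
#[derive(Debug, PartialEq, Eq)]
enum BlockShape {
    /// Empty dims: the documented default, treated here as "not set".
    Unset,
    /// Rank matches the input rank.
    Valid,
    /// Rank differs from the input rank.
    RankMismatch { expected: usize, found: usize },
}

/// Classifies `block_shape` against `input_rank` using only the rank rule.
fn classify_block_shape(block_shape: &[i64], input_rank: usize) -> BlockShape {
    if block_shape.is_empty() {
        BlockShape::Unset
    } else if block_shape.len() == input_rank {
        BlockShape::Valid
    } else {
        BlockShape::RankMismatch {
            expected: input_rank,
            found: block_shape.len(),
        }
    }
}
```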

pub fn getBlockShape(self: &IDynamicQuantizeLayer) -> Dims64

Get the shape of the quantization block.

The default value is an empty Dims.

See [setBlockShape()]

Trait Implementations

impl AsLayer for IDynamicQuantizeLayer

fn as_layer(&self) -> &ILayer

fn as_layer_pin_mut(&mut self) -> Pin<&mut ILayer>

impl AsLayerTyped for IDynamicQuantizeLayer

const TYPE: LayerType = LayerType::kDYNAMIC_QUANTIZE

impl AsRef<ILayer> for IDynamicQuantizeLayer

fn as_ref(self: &IDynamicQuantizeLayer) -> &ILayer

Converts this type into a shared reference of the (usually inferred) input type.

impl ExternType for IDynamicQuantizeLayer

type Id = (n, v, i, n, f, e, r, _1, (), I, D, y, n, a, m, i, c, Q, u, a, n, t, i, z, e, L, a, y, e, r)

A type-level representation of the type’s C++ namespace and type name.

type Kind = Opaque

impl MakeCppStorage for IDynamicQuantizeLayer

unsafe fn allocate_uninitialized_cpp_storage() -> *mut IDynamicQuantizeLayer

Allocates heap space for this type in C++ and returns a pointer to that space, but does not initialize that space (i.e. does not yet call a constructor).

unsafe fn free_uninitialized_cpp_storage(arg0: *mut IDynamicQuantizeLayer)

Frees a C++ allocation which has not yet had a constructor called.

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.