Struct QuantizationHardwareFeatures 

pub struct QuantizationHardwareFeatures {
    pub supports_int8_simd: bool,
    pub supports_int4_packed: bool,
    pub supports_vnni: bool,
    pub supports_dp4a: bool,
    pub supports_tensor_cores: bool,
    pub supports_mixed_precision: bool,
    pub max_parallel_ops: usize,
}
Hardware-specific quantization features available on the current device

This struct records which hardware capabilities are available for quantization operations, so the system can choose the best implementation the hardware supports.

Fields§

§supports_int8_simd: bool

Supports INT8 SIMD operations

Indicates whether the hardware can perform vectorized INT8 operations, which can substantially speed up quantized computations compared with scalar code.

§supports_int4_packed: bool

Supports packed INT4 operations

Indicates whether the hardware can efficiently handle sub-byte quantization formats such as INT4, where two 4-bit values are packed into a single byte.
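To make the packed layout concrete, here is a minimal sketch of storing two signed 4-bit values in one byte. The helper names `pack_int4` and `unpack_int4` and the nibble order (first value in the low nibble) are assumptions for illustration; they are not part of this crate's API.

```rust
/// Packs two signed 4-bit values (each in -8..=7) into one byte:
/// `lo` goes into the low nibble, `hi` into the high nibble.
/// (Hypothetical helper; the nibble order is an assumption.)
fn pack_int4(lo: i8, hi: i8) -> u8 {
    debug_assert!((-8..=7).contains(&lo) && (-8..=7).contains(&hi));
    ((lo as u8) & 0x0F) | (((hi as u8) & 0x0F) << 4)
}

/// Unpacks a byte into its two signed 4-bit values.
fn unpack_int4(byte: u8) -> (i8, i8) {
    // Move the low nibble into the top four bits, then use an arithmetic
    // right shift on i8 to sign-extend it.
    let lo = ((byte << 4) as i8) >> 4;
    // The high nibble is already in the top four bits.
    let hi = (byte as i8) >> 4;
    (lo, hi)
}
```

Packing halves the memory footprint relative to INT8, at the cost of the shift-and-mask work shown here when the hardware has no native support for the format.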

§supports_vnni: bool

Supports VNNI (Vector Neural Network Instructions)

Intel’s VNNI extension adds fused multiply-accumulate instructions for 8-bit and 16-bit integer dot products. These speed up neural network workloads and are particularly beneficial for quantized models.

§supports_dp4a: bool

Supports DP4A (4-element dot product and accumulate)

NVIDIA’s DP4A instruction computes the dot product of two vectors of four packed 8-bit integers and adds the result to a 32-bit accumulator in a single operation, which suits quantized matrix operations on CUDA devices.
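The arithmetic behind a DP4A-style operation can be written as a scalar reference in Rust. This is only a sketch of the signed variant; the names `dp4a_reference` and `pack_i8x4` are hypothetical and the lane order (lane 0 in the least significant byte) is an assumption.

```rust
/// Scalar reference for a signed DP4A-style operation: treats `a` and `b`
/// as four packed signed 8-bit lanes, multiplies them lane by lane, and
/// adds the four products to the 32-bit accumulator `acc`.
fn dp4a_reference(a: u32, b: u32, acc: i32) -> i32 {
    let a_lanes = a.to_le_bytes();
    let b_lanes = b.to_le_bytes();
    let mut sum = acc;
    for i in 0..4 {
        // Reinterpret each byte as a signed lane, then widen to i32.
        let x = a_lanes[i] as i8 as i32;
        let y = b_lanes[i] as i8 as i32;
        sum = sum.wrapping_add(x * y);
    }
    sum
}

/// Packs four i8 lanes into a u32, lane 0 in the least significant byte.
fn pack_i8x4(lanes: [i8; 4]) -> u32 {
    u32::from_le_bytes(lanes.map(|v| v as u8))
}
```

For example, with lanes `[1, -2, 3, -4]` and `[5, 6, -7, 8]` and an accumulator of 10, the products are 5, -12, -21, and -32, so the result is -50.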

§supports_tensor_cores: bool

Supports tensor core operations

Indicates whether the device has tensor cores, the specialized matrix units that some modern GPUs provide for mixed-precision and quantized neural network computations.

§supports_mixed_precision: bool

Supports mixed precision operations

Indicates whether the hardware can efficiently combine different quantization precisions within the same computation.

§max_parallel_ops: usize

Maximum number of parallel operations

The number of operations this hardware can usefully run in parallel, used for scheduling and batching decisions.
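As an illustration of how these flags might drive implementation choice, the sketch below picks an INT8 kernel from a local stand-in that mirrors the fields above. The `Int8Kernel` enum, the `select_int8_kernel` function, and the priority order are all assumptions made for this example, not behavior documented by the crate.

```rust
/// Local stand-in mirroring the documented fields, so the example is
/// self-contained.
#[allow(dead_code)]
#[derive(Debug, Clone, Default)]
struct QuantizationHardwareFeatures {
    supports_int8_simd: bool,
    supports_int4_packed: bool,
    supports_vnni: bool,
    supports_dp4a: bool,
    supports_tensor_cores: bool,
    supports_mixed_precision: bool,
    max_parallel_ops: usize,
}

/// Hypothetical set of INT8 kernel implementations.
#[derive(Debug, PartialEq, Eq)]
enum Int8Kernel {
    TensorCore,
    Dp4a,
    Vnni,
    Simd,
    Scalar,
}

/// Picks the most specialized kernel the hardware reports support for.
/// The priority order here is an assumption, not a measured ranking.
fn select_int8_kernel(hw: &QuantizationHardwareFeatures) -> Int8Kernel {
    if hw.supports_tensor_cores {
        Int8Kernel::TensorCore
    } else if hw.supports_dp4a {
        Int8Kernel::Dp4a
    } else if hw.supports_vnni {
        Int8Kernel::Vnni
    } else if hw.supports_int8_simd {
        Int8Kernel::Simd
    } else {
        Int8Kernel::Scalar
    }
}
```

Keeping a scalar fallback as the last branch means the selection always succeeds, even when no flag is set.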

Implementations§

impl QuantizationHardwareFeatures

pub fn detect_for_device(device: &Device) -> Self

Detect hardware features for the given device

Performs runtime detection of available hardware acceleration features and returns a capabilities structure.

§Arguments
  • device - The target device to analyze
§Returns

A QuantizationHardwareFeatures struct with detected capabilities
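The definition of `Device` and the internals of `detect_for_device` are not shown on this page, so the following is only a sketch of what runtime detection could look like for a host CPU. It uses the standard library's `is_x86_feature_detected!` macro and `std::thread::available_parallelism`. The `CpuQuantFeatures` struct, the `detect_cpu_features` function, and the choice of AVX2 as the INT8 SIMD baseline are assumptions.

```rust
/// Hypothetical, CPU-only subset of the capability flags.
#[derive(Debug, Clone, Default)]
struct CpuQuantFeatures {
    supports_int8_simd: bool,
    supports_vnni: bool,
    max_parallel_ops: usize,
}

/// Detects a few CPU capabilities at runtime. On non-x86 targets the SIMD
/// flags stay at their conservative default of `false`.
fn detect_cpu_features() -> CpuQuantFeatures {
    #[allow(unused_mut)]
    let mut features = CpuQuantFeatures {
        // Fall back to 1 if the parallelism cannot be queried.
        max_parallel_ops: std::thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(1),
        ..Default::default()
    };

    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // Assumption: AVX2 is treated as the baseline for INT8 SIMD.
        features.supports_int8_simd = is_x86_feature_detected!("avx2");
        features.supports_vnni = is_x86_feature_detected!("avx512vnni");
    }

    features
}
```

Detection of GPU features such as DP4A or tensor cores would need a query through the relevant driver API and is left out of this sketch.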

pub fn supports_dtype_efficiently(&self, dtype: &QuantizedDType) -> bool

Check if the hardware supports a specific quantization data type efficiently

§Arguments
  • dtype - The quantization data type to check
§Returns

true if the hardware can efficiently process this data type
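The variants of `QuantizedDType` and the exact rule this method applies are not shown here. A plausible shape for such a check, using a hypothetical `SketchDType` enum and a trimmed stand-in for the feature flags, might be:

```rust
/// Trimmed local stand-in for the capability flags used in this sketch.
#[derive(Debug, Clone, Default)]
struct HardwareFlags {
    supports_int8_simd: bool,
    supports_int4_packed: bool,
    supports_vnni: bool,
    supports_dp4a: bool,
}

/// Hypothetical data types; the real `QuantizedDType` variants may differ.
#[derive(Debug, Clone, Copy)]
enum SketchDType {
    Int8,
    Int4,
}

/// Assumed rule: a data type counts as efficient when at least one matching
/// acceleration path is available.
fn supports_dtype_efficiently(hw: &HardwareFlags, dtype: SketchDType) -> bool {
    match dtype {
        SketchDType::Int8 => hw.supports_int8_simd || hw.supports_vnni || hw.supports_dp4a,
        SketchDType::Int4 => hw.supports_int4_packed,
    }
}
```

A caller could use a check like this to fall back to a wider data type when the preferred one has no accelerated path.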

pub fn optimal_block_size(&self) -> usize

Get the optimal block size for parallel operations

Returns the recommended block size for batching operations based on hardware characteristics and parallelism capabilities.
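The heuristic this method uses is not documented here. Purely as an illustration of how a block size could be derived from parallelism and vector width, the sketch below scales a per-operation lane count by the degree of parallelism, clamps the result, and rounds it up to a power of two. The function name `block_size_sketch`, the lane counts, and the 64 to 4096 bounds are all invented for this example.

```rust
/// Illustrative block-size heuristic (not the crate's actual formula).
///
/// `max_parallel_ops` plays the role of the documented field of the same
/// name; `has_wide_simd` is a hypothetical summary of the SIMD flags.
fn block_size_sketch(max_parallel_ops: usize, has_wide_simd: bool) -> usize {
    // Assumed number of elements one operation processes at a time.
    let lanes: usize = if has_wide_simd { 64 } else { 16 };
    max_parallel_ops
        .max(1)
        .saturating_mul(lanes) // avoid overflow on very large inputs
        .clamp(64, 4096) // keep blocks within assumed sane bounds
        .next_power_of_two() // power-of-two sizes simplify alignment
}
```

Clamping before rounding keeps the result within the upper bound, because 4096 is itself a power of two.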

pub fn performance_ranking(&self) -> Vec<QuantizationScheme>

Get the performance preference ranking for quantization schemes

Returns quantization schemes ordered by expected performance on this hardware, with the fastest schemes first.
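One simple way to produce such an ordering is a stable sort that moves hardware-accelerated schemes to the front. The sketch below assumes a hypothetical `SketchScheme` enum and a trimmed stand-in for the feature flags; the real `QuantizationScheme` variants and ranking logic may differ.

```rust
/// Trimmed local stand-in for the capability flags used in this sketch.
#[derive(Debug, Clone, Default)]
struct HardwareFlags {
    supports_int8_simd: bool,
    supports_int4_packed: bool,
    supports_mixed_precision: bool,
}

/// Hypothetical schemes; the real `QuantizationScheme` may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchScheme {
    Int8,
    Int4,
    Mixed,
}

/// Assumed mapping from a scheme to the flag that accelerates it.
fn is_accelerated(hw: &HardwareFlags, scheme: SketchScheme) -> bool {
    match scheme {
        SketchScheme::Int8 => hw.supports_int8_simd,
        SketchScheme::Int4 => hw.supports_int4_packed,
        SketchScheme::Mixed => hw.supports_mixed_precision,
    }
}

/// Returns the schemes with accelerated ones first. The sort is stable, so
/// the initial order acts as the tie-break.
fn performance_ranking_sketch(hw: &HardwareFlags) -> Vec<SketchScheme> {
    let mut schemes = vec![SketchScheme::Int8, SketchScheme::Int4, SketchScheme::Mixed];
    // `false` sorts before `true`, so accelerated schemes come first.
    schemes.sort_by_key(|&scheme| !is_accelerated(hw, scheme));
    schemes
}
```

A caller would typically iterate over the returned list and take the first scheme that also meets its accuracy requirements.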

Trait Implementations§

impl Clone for QuantizationHardwareFeatures

fn clone(&self) -> QuantizationHardwareFeatures

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for QuantizationHardwareFeatures

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for QuantizationHardwareFeatures

fn default() -> Self

Conservative default hardware features

Returns a conservative set of capabilities that should work on any hardware without advanced acceleration features.
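The exact default values are not listed on this page. The sketch below assumes that every feature flag defaults to `false` and that `max_parallel_ops` defaults to 1, using a local stand-in named `FeaturesSketch`. It also shows how struct update syntax can enable a single capability on top of a conservative baseline.

```rust
/// Local stand-in mirroring the documented fields.
#[derive(Debug, Clone, PartialEq, Eq)]
struct FeaturesSketch {
    supports_int8_simd: bool,
    supports_int4_packed: bool,
    supports_vnni: bool,
    supports_dp4a: bool,
    supports_tensor_cores: bool,
    supports_mixed_precision: bool,
    max_parallel_ops: usize,
}

impl Default for FeaturesSketch {
    /// Assumed conservative defaults: no acceleration, serial execution.
    fn default() -> Self {
        Self {
            supports_int8_simd: false,
            supports_int4_packed: false,
            supports_vnni: false,
            supports_dp4a: false,
            supports_tensor_cores: false,
            supports_mixed_precision: false,
            max_parallel_ops: 1,
        }
    }
}

/// Starts from the conservative defaults and enables only INT8 SIMD.
fn features_with_int8_simd() -> FeaturesSketch {
    FeaturesSketch {
        supports_int8_simd: true,
        ..Default::default()
    }
}
```

Starting from the defaults and overriding individual fields is a convenient pattern for tests that need a specific, reproducible capability set.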

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V