pub struct QuantizationHardwareFeatures {
pub supports_int8_simd: bool,
pub supports_int4_packed: bool,
pub supports_vnni: bool,
pub supports_dp4a: bool,
pub supports_tensor_cores: bool,
pub supports_mixed_precision: bool,
pub max_parallel_ops: usize,
}
Hardware-specific quantization features available on the current device
This struct encapsulates the hardware capabilities available for quantization operations, enabling the system to choose optimal implementations based on what the hardware supports.
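As a rough illustration of capability-based selection, the sketch below picks an INT8 kernel from a set of feature flags. The `Features` struct is a trimmed stand-in for `QuantizationHardwareFeatures` so the example compiles on its own. The `Int8Kernel` enum, the `select_int8_kernel` function, and the priority order are assumptions made for illustration and are not part of the documented API.

```rust
/// Stand-in carrying a subset of the documented fields (assumption: the real
/// struct is `QuantizationHardwareFeatures`; this copy keeps the sketch self-contained).
#[derive(Clone, Debug)]
pub struct Features {
    pub supports_int8_simd: bool,
    pub supports_vnni: bool,
    pub supports_dp4a: bool,
}

/// Hypothetical kernel choices, named here only for the example.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Int8Kernel {
    Dp4a,
    Vnni,
    Simd,
    Scalar,
}

/// Picks the most specialized kernel the flags allow, falling back to scalar code.
/// The priority order is an assumption, not something the crate documents.
pub fn select_int8_kernel(features: &Features) -> Int8Kernel {
    if features.supports_dp4a {
        Int8Kernel::Dp4a
    } else if features.supports_vnni {
        Int8Kernel::Vnni
    } else if features.supports_int8_simd {
        Int8Kernel::Simd
    } else {
        Int8Kernel::Scalar
    }
}
```

In real code the flags would presumably come from `detect_for_device` rather than being filled in by hand.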
Fields§
supports_int8_simd: bool
Supports INT8 SIMD operations
Indicates whether the hardware can perform vectorized INT8 operations, which significantly accelerates quantized computations.
supports_int4_packed: bool
Supports packed INT4 operations
Some hardware can efficiently handle sub-byte quantization formats like INT4, where multiple values are packed into single bytes.
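To make the packing idea concrete, here is a minimal sketch that stores two signed 4-bit values in one byte and recovers them again. The function names and the low-nibble-first layout are assumptions for illustration. The crate's actual packed INT4 format is not described here and may differ.

```rust
/// Packs two signed 4-bit values (each in -8..=7) into one byte.
/// Assumed layout: `lo` in bits 0..=3, `hi` in bits 4..=7.
pub fn pack_int4(lo: i8, hi: i8) -> u8 {
    debug_assert!((-8..=7).contains(&lo) && (-8..=7).contains(&hi));
    ((lo as u8) & 0x0F) | (((hi as u8) & 0x0F) << 4)
}

/// Unpacks a byte into its (low, high) signed 4-bit values.
pub fn unpack_int4(byte: u8) -> (i8, i8) {
    // Move the low nibble into the top four bits, then use an arithmetic
    // right shift on i8 to sign-extend it.
    let lo = ((byte << 4) as i8) >> 4;
    // The high nibble is already in the top four bits.
    let hi = (byte as i8) >> 4;
    (lo, hi)
}
```

Packing this way halves the memory footprint relative to one value per byte, at the cost of the shift and mask work shown above when the hardware has no native sub-byte support.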
supports_vnni: bool
Supports VNNI (Vector Neural Network Instructions)
Intel’s VNNI instructions provide hardware acceleration for neural network workloads, particularly beneficial for quantized models.
supports_dp4a: bool
Supports DP4A (4-element dot product and accumulate)
NVIDIA’s DP4A instruction computes the dot product of two vectors of four packed 8-bit integers and adds the result to a 32-bit accumulator in a single operation, which suits quantized matrix operations on CUDA devices.
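The arithmetic behind a DP4A-style operation can be written as a scalar reference, which is what software would fall back to when the flag is false. This is a sketch assuming signed 8-bit lanes and a wrapping 32-bit accumulator. The name `dp4a_reference` is hypothetical and not part of the crate.

```rust
/// Scalar reference for a 4-element dot product with accumulate:
/// returns acc + a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3].
/// Each lane is widened to i32 before multiplying, so the products cannot overflow.
pub fn dp4a_reference(a: [i8; 4], b: [i8; 4], acc: i32) -> i32 {
    a.iter()
        .zip(b.iter())
        .fold(acc, |sum, (&x, &y)| sum.wrapping_add(i32::from(x) * i32::from(y)))
}
```

For example, `dp4a_reference([1, 2, 3, 4], [5, 6, 7, 8], 10)` evaluates to 10 + 5 + 12 + 21 + 32 = 80.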
supports_tensor_cores: bool
Supports tensor core operations
Modern GPUs include specialized tensor cores for mixed-precision and quantized neural network computations.
supports_mixed_precision: bool
Supports mixed precision operations
Hardware capability to efficiently mix different quantization precisions within the same computation.
max_parallel_ops: usize
Maximum number of parallel operations
The optimal number of parallel operations for this hardware, used for scheduling and batching decisions.
Implementations§
impl QuantizationHardwareFeatures
pub fn detect_for_device(device: &Device) -> Self
pub fn supports_dtype_efficiently(&self, dtype: &QuantizedDType) -> bool
pub fn optimal_block_size(&self) -> usize
Get the optimal block size for parallel operations
Returns the recommended block size for batching operations based on hardware characteristics and parallelism capabilities.
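One way a caller might use the returned block size is to split a buffer into fixed-size blocks and process each block independently. The sketch below assumes the block size is passed in as a plain `usize`. The `block_sums` helper is hypothetical and only demonstrates the batching pattern, not how the crate schedules work internally.

```rust
/// Splits `values` into blocks of `block_size` elements and reduces each block
/// to an i32 sum. The final block may be shorter than the others.
pub fn block_sums(values: &[i8], block_size: usize) -> Vec<i32> {
    // `chunks` panics on a zero size, so clamp to at least one element per block.
    let block_size = block_size.max(1);
    values
        .chunks(block_size)
        .map(|block| block.iter().map(|&v| i32::from(v)).sum())
        .collect()
}
```

Because each block is reduced on its own, the per-block closure is the natural unit to hand to a thread pool or a device queue.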
pub fn performance_ranking(&self) -> Vec<QuantizationScheme>
Get the performance preference ranking for quantization schemes
Returns quantization schemes ordered by expected performance on this hardware, with the fastest schemes first.
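A fastest-first ranking is typically consumed by taking the first entry that also satisfies some other constraint, such as an accuracy requirement. The sketch below assumes a hypothetical `Scheme` enum in place of `QuantizationScheme`, whose variants are not listed on this page. The `pick_scheme` helper is likewise an illustration only.

```rust
/// Hypothetical stand-in for `QuantizationScheme`; the real variants may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Scheme {
    Int4,
    Int8,
    Fp16,
}

/// Returns the first scheme in `ranking` (fastest first) that the caller allows,
/// or `None` when no ranked scheme is acceptable.
pub fn pick_scheme(ranking: &[Scheme], allowed: &[Scheme]) -> Option<Scheme> {
    ranking.iter().copied().find(|scheme| allowed.contains(scheme))
}
```

With a ranking of `[Int4, Int8, Fp16]` and an allowed set of `[Int8, Fp16]`, the helper returns `Some(Scheme::Int8)`.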
Trait Implementations§
impl Clone for QuantizationHardwareFeatures
fn clone(&self) -> QuantizationHardwareFeatures
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for QuantizationHardwareFeatures
Auto Trait Implementations§
impl Freeze for QuantizationHardwareFeatures
impl RefUnwindSafe for QuantizationHardwareFeatures
impl Send for QuantizationHardwareFeatures
impl Sync for QuantizationHardwareFeatures
impl Unpin for QuantizationHardwareFeatures
impl UnsafeUnpin for QuantizationHardwareFeatures
impl UnwindSafe for QuantizationHardwareFeatures
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.