pub struct QuantizedMatmulIdParams {
    pub m: u32,
    pub k: u32,
    pub n: u32,
    pub group_size: u32,
    pub bits: u32,
    pub n_expert_used: u32,
    pub num_experts: u32,
}
Parameters describing the expert-routed quantized matmul dimensions.
Fields
m: u32
Number of input rows (tokens).
k: u32
Inner dimension (shared between input and weight).
n: u32
Number of output columns per expert.
group_size: u32
Number of consecutive values sharing one scale/bias pair.
bits: u32
Quantization bit width (4, 6, or 8).
n_expert_used: u32
Number of experts each token is routed to (top-k).
num_experts: u32
Total number of experts in the weight tensor.
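As a rough illustration of how these fields relate, the sketch below fills in a parameter set and runs a few sanity checks. The struct is redefined locally so the example compiles on its own. `validate`, `groups_per_row`, and `example_params` are hypothetical helpers and not part of the documented API. The bit-width check follows the field description above. The divisibility and top-k checks are assumptions about sensible inputs, not documented requirements.

```rust
/// Local mirror of the documented struct, redefined so this sketch is self-contained.
#[derive(Clone, Copy, Debug)]
pub struct QuantizedMatmulIdParams {
    pub m: u32,
    pub k: u32,
    pub n: u32,
    pub group_size: u32,
    pub bits: u32,
    pub n_expert_used: u32,
    pub num_experts: u32,
}

/// Hypothetical helper: basic consistency checks on a parameter set.
fn validate(p: &QuantizedMatmulIdParams) -> Result<(), String> {
    // Documented: the bit width is 4, 6, or 8.
    if !matches!(p.bits, 4 | 6 | 8) {
        return Err(format!("unsupported bit width: {}", p.bits));
    }
    // Assumption: k splits evenly into groups of `group_size` values.
    if p.group_size == 0 || p.k % p.group_size != 0 {
        return Err(format!("k = {} is not a multiple of group_size = {}", p.k, p.group_size));
    }
    // Assumption: a token is routed to at least one and at most `num_experts` experts.
    if p.n_expert_used == 0 || p.n_expert_used > p.num_experts {
        return Err(format!("n_expert_used = {} is out of range 1..={}", p.n_expert_used, p.num_experts));
    }
    Ok(())
}

/// Hypothetical helper: number of scale/bias groups along the inner dimension.
/// Assumes `validate` has already accepted `p`.
fn groups_per_row(p: &QuantizedMatmulIdParams) -> u32 {
    p.k / p.group_size
}

/// Illustrative values only; they are not taken from any particular model.
fn example_params() -> QuantizedMatmulIdParams {
    QuantizedMatmulIdParams {
        m: 4,
        k: 4096,
        n: 14336,
        group_size: 64,
        bits: 4,
        n_expert_used: 2,
        num_experts: 8,
    }
}

fn main() {
    let p = example_params();
    validate(&p).expect("example parameters should be consistent");
    println!("{:?}", p);
    println!("groups per row: {}", groups_per_row(&p));
}
```

Checking the parameters before dispatch keeps malformed shapes from reaching the kernel, where they would be harder to diagnose.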
Trait Implementations
impl Clone for QuantizedMatmulIdParams
fn clone(&self) -> QuantizedMatmulIdParams
Returns a duplicate of the value.
1.0.0 · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for QuantizedMatmulIdParams
impl Copy for QuantizedMatmulIdParams
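Because the struct is `Copy`, passing it by value leaves the original usable. A minimal sketch of what that means in practice follows, again using a local mirror of the struct. `with_bits` and `base_params` are hypothetical helpers introduced only for illustration, and the field values are arbitrary.

```rust
/// Local mirror of the documented struct, with the same derived traits.
#[derive(Clone, Copy, Debug)]
pub struct QuantizedMatmulIdParams {
    pub m: u32,
    pub k: u32,
    pub n: u32,
    pub group_size: u32,
    pub bits: u32,
    pub n_expert_used: u32,
    pub num_experts: u32,
}

/// Hypothetical helper: takes the parameters by value and returns a modified copy.
fn with_bits(mut p: QuantizedMatmulIdParams, bits: u32) -> QuantizedMatmulIdParams {
    p.bits = bits;
    p
}

/// Illustrative values only.
fn base_params() -> QuantizedMatmulIdParams {
    QuantizedMatmulIdParams {
        m: 1,
        k: 2048,
        n: 2048,
        group_size: 32,
        bits: 4,
        n_expert_used: 2,
        num_experts: 8,
    }
}

fn main() {
    let base = base_params();
    // `base` is copied into the call, so it is still valid afterwards.
    let eight_bit = with_bits(base, 8);
    println!("base:      {:?}", base);
    println!("eight-bit: {:?}", eight_bit);
}
```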
Auto Trait Implementations
impl Freeze for QuantizedMatmulIdParams
impl RefUnwindSafe for QuantizedMatmulIdParams
impl Send for QuantizedMatmulIdParams
impl Sync for QuantizedMatmulIdParams
impl Unpin for QuantizedMatmulIdParams
impl UnsafeUnpin for QuantizedMatmulIdParams
impl UnwindSafe for QuantizedMatmulIdParams
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.