Struct ModelConfig 

pub struct ModelConfig {
    pub name: String,
    pub num_parameters: u64,
    pub num_active_parameters: Option<u64>,
    pub num_layers: u32,
    pub hidden_dim: u32,
    pub num_heads: u32,
    pub num_kv_heads: Option<u32>,
    pub max_seq_len: u32,
    pub sliding_window: Option<u32>,
    pub num_sliding_layers: Option<u32>,
    pub kv_cache_bytes_per_token: u64,
}

Fields§

§name: String

Model name

§num_parameters: u64

Total number of parameters in the model, including inactive experts in MoE models

§num_active_parameters: Option<u64>

Active parameters used during inference (for MoE models with sparse activation). If not specified, defaults to num_parameters (dense models).

§num_layers: u32

Number of transformer layers

§hidden_dim: u32

Hidden dimension

§num_heads: u32

Number of attention heads

§num_kv_heads: Option<u32>

Number of KV heads (for GQA/MQA). If not specified, defaults to num_heads (MHA)

§max_seq_len: u32

Maximum sequence length supported

§sliding_window: Option<u32>

Sliding window size for sliding window attention layers (None = no sliding window). Applies only to layers marked as using sliding window attention.

§num_sliding_layers: Option<u32>

Number of layers using sliding window attention (the rest use full attention). If not specified, defaults to 0 (all layers use full attention).

§kv_cache_bytes_per_token: u64

KV cache size per token in bytes, summed over all layers (both formulas include the num_layers factor).

For GQA: 2 * num_kv_heads * head_dim * bytes_per_param * num_layers

For MHA: 2 * num_heads * head_dim * bytes_per_param * num_layers
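The two formulas above collapse into one once num_kv_heads falls back to num_heads. The following is a minimal sketch, not the crate's implementation: the free function kv_cache_bytes_per_token is hypothetical, and head_dim = hidden_dim / num_heads is an assumption that this page does not state.

```rust
/// Bytes of KV cache needed per token, summed over all layers.
///
/// Hypothetical helper mirroring the documented formula.
/// Assumption: head_dim = hidden_dim / num_heads.
pub fn kv_cache_bytes_per_token(
    num_layers: u32,
    hidden_dim: u32,
    num_heads: u32,
    num_kv_heads: Option<u32>,
    bytes_per_param: u32,
) -> u64 {
    // MHA fallback: with no explicit KV head count, every head keeps its own K/V.
    let kv_heads = u64::from(num_kv_heads.unwrap_or(num_heads));
    let head_dim = u64::from(hidden_dim / num_heads);
    // Factor 2: one key vector and one value vector per KV head.
    2 * kv_heads * head_dim * u64::from(bytes_per_param) * u64::from(num_layers)
}
```

For a 32-layer model with hidden_dim 4096, 32 heads, 8 KV heads, and 2 bytes per parameter, this gives 2 * 8 * 128 * 2 * 32 = 131072 bytes per token; with num_kv_heads left as None the same model needs 524288 bytes per token.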

Implementations§

impl ModelConfig

pub fn active_parameters(&self) -> u64

Get the number of active parameters (defaults to total parameters for dense models)
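The documented fallback can be sketched with Option::unwrap_or. The ParamCounts struct below is a hypothetical stand-in carrying only the two fields involved; the real method body is not shown on this page and may differ.

```rust
/// Hypothetical stand-in for the two ModelConfig fields this method reads.
pub struct ParamCounts {
    pub num_parameters: u64,
    pub num_active_parameters: Option<u64>,
}

impl ParamCounts {
    /// Active parameter count, falling back to the total for dense models.
    pub fn active_parameters(&self) -> u64 {
        self.num_active_parameters.unwrap_or(self.num_parameters)
    }
}
```

A dense model leaves num_active_parameters as None and gets its total count back; an MoE model sets Some(n) and gets n.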

pub fn compute_kv_cache_size(&mut self, bytes_per_param: u32)

Calculate and set the KV cache size per token. For models with sliding window attention, this calculates an average based on typical usage.

pub fn with_kv_cache_size(self, bytes_per_param: u32) -> Self

Builder-style variant of compute_kv_cache_size: takes the config by value and returns it with the KV cache size pre-computed
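The two signatures suggest a common Rust pairing: a mutating method plus a consuming builder that delegates to it. The sketch below illustrates that pattern on a hypothetical MiniConfig with a reduced field set. It assumes head_dim = hidden_dim / num_heads and ignores the sliding window averaging mentioned above, whose details this page does not give.

```rust
/// Hypothetical reduced config used only to illustrate the method pairing.
#[derive(Debug, Clone, Default)]
pub struct MiniConfig {
    pub num_layers: u32,
    pub hidden_dim: u32,
    pub num_heads: u32,
    pub num_kv_heads: Option<u32>,
    pub kv_cache_bytes_per_token: u64,
}

impl MiniConfig {
    /// In-place variant: overwrites kv_cache_bytes_per_token.
    /// Assumption: head_dim = hidden_dim / num_heads; no sliding window averaging.
    pub fn compute_kv_cache_size(&mut self, bytes_per_param: u32) {
        let kv_heads = u64::from(self.num_kv_heads.unwrap_or(self.num_heads));
        let head_dim = u64::from(self.hidden_dim / self.num_heads);
        self.kv_cache_bytes_per_token =
            2 * kv_heads * head_dim * u64::from(bytes_per_param) * u64::from(self.num_layers);
    }

    /// Builder variant: takes the config by value and returns it with the field set.
    pub fn with_kv_cache_size(mut self, bytes_per_param: u32) -> Self {
        self.compute_kv_cache_size(bytes_per_param);
        self
    }
}
```

The builder form lets a config be constructed and finalized in one expression, for example a struct literal followed by .with_kv_cache_size(2), while the mutating form suits a config that already exists as a mutable binding.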

pub fn kv_cache_size_for_sequence(&self, seq_len: u32) -> u64

Calculate total KV cache size for a sequence, accounting for sliding window
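One plausible way to account for the sliding window is to let full attention layers cache every token while sliding layers cache at most sliding_window tokens. The free function below is a hypothetical sketch under that assumption. It also assumes kv_cache_bytes_per_token holds the full attention figure summed over all layers, so dividing by num_layers yields a per-layer cost; the crate's actual method may compute this differently.

```rust
/// Total KV cache bytes for a sequence of `seq_len` tokens.
///
/// Hypothetical sketch: full-attention layers cache all tokens, sliding
/// layers cache at most `sliding_window` tokens.
/// Assumption: `kv_cache_bytes_per_token` is summed over all layers.
pub fn kv_cache_size_for_sequence(
    kv_cache_bytes_per_token: u64,
    num_layers: u32,
    sliding_window: Option<u32>,
    num_sliding_layers: Option<u32>,
    seq_len: u32,
) -> u64 {
    if num_layers == 0 {
        return 0;
    }
    let per_layer = kv_cache_bytes_per_token / u64::from(num_layers);
    // Documented default: no sliding layers unless specified.
    let sliding_layers = num_sliding_layers.unwrap_or(0).min(num_layers);
    let full_layers = num_layers - sliding_layers;
    // Without a window, sliding layers behave like full attention layers.
    let sliding_tokens = match sliding_window {
        Some(window) => seq_len.min(window),
        None => seq_len,
    };
    let token_layers = u64::from(full_layers) * u64::from(seq_len)
        + u64::from(sliding_layers) * u64::from(sliding_tokens);
    per_layer * token_layers
}
```

With 131072 bytes per token over 32 layers, a window of 4096, 16 sliding layers, and a sequence of 8192 tokens, this gives 4096 * (16 * 8192 + 16 * 4096) = 805306368 bytes, compared with 1073741824 bytes when every layer uses full attention.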

Trait Implementations§

impl Clone for ModelConfig

fn clone(&self) -> ModelConfig

Returns a duplicate of the value.
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for ModelConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for ModelConfig

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,