pub struct ModelConfig {
    pub model_id: ModelId,
    pub model_path: String,
    pub model_type: ModelType,
    pub dtype: DataType,
    pub device: Device,
    pub max_batch_size: usize,
    pub max_sequence_length: usize,
    pub tensor_parallel_size: Option<usize>,
    pub pipeline_parallel_size: Option<usize>,
    pub quantization: Option<QuantizationConfig>,
    pub use_flash_attention: bool,
    pub use_paged_attention: bool,
    pub enable_cuda_graphs: bool,
    pub extra_config: HashMap<String, Value>,
}
Model configuration for runtime
Fields
model_id: ModelId
    Model identifier
model_path: String
    Path to model files
model_type: ModelType
    Model type/architecture
dtype: DataType
    Data type to use for inference
device: Device
    Target device
max_batch_size: usize
    Maximum batch size
max_sequence_length: usize
    Maximum sequence length
tensor_parallel_size: Option<usize>
    Tensor parallelism size
pipeline_parallel_size: Option<usize>
    Pipeline parallelism size
quantization: Option<QuantizationConfig>
    Quantization configuration
use_flash_attention: bool
    Use flash attention if available
use_paged_attention: bool
    Use paged attention for KV cache
enable_cuda_graphs: bool
    Enable CUDA graphs for optimization
extra_config: HashMap<String, Value>
    Additional configuration parameters
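As a rough illustration of how the fields above fit together, the sketch below builds a reduced mirror of the struct. Everything beyond the documented field names is an assumption: `ModelConfigSketch`, the `Device` and `DataType` variants, the `world_size` helper, the use of `String` in place of `Value`, and the `example_key` entry are all hypothetical, since this page does not show how those types are defined. Treating an unset parallelism size as 1 is likewise an assumption, not documented behavior.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins: this page names Device and DataType but does not
// show their variants, so the ones below are invented for illustration.
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum Device {
    Cpu,
    Cuda(usize),
}

#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum DataType {
    F16,
    F32,
}

// Reduced mirror of ModelConfig: documented field names, stand-in types,
// and only a subset of the 14 fields.
#[allow(dead_code)]
#[derive(Clone, Debug)]
struct ModelConfigSketch {
    model_path: String,
    dtype: DataType,
    device: Device,
    max_batch_size: usize,
    max_sequence_length: usize,
    tensor_parallel_size: Option<usize>,
    pipeline_parallel_size: Option<usize>,
    use_flash_attention: bool,
    // The real field is HashMap<String, Value>; String stands in for Value here.
    extra_config: HashMap<String, String>,
}

/// Number of devices implied by the parallelism settings.
/// Assumption: an unset (None) size behaves like a size of 1.
fn world_size(cfg: &ModelConfigSketch) -> usize {
    cfg.tensor_parallel_size.unwrap_or(1) * cfg.pipeline_parallel_size.unwrap_or(1)
}

/// Builds an example configuration with placeholder values.
fn sample_config() -> ModelConfigSketch {
    let mut extra_config = HashMap::new();
    // Placeholder key and value; the real accepted keys are not documented here.
    extra_config.insert("example_key".to_string(), "example_value".to_string());

    ModelConfigSketch {
        model_path: "/path/to/model".to_string(),
        dtype: DataType::F16,
        device: Device::Cuda(0),
        max_batch_size: 8,
        max_sequence_length: 4096,
        tensor_parallel_size: Some(2),
        pipeline_parallel_size: None,
        use_flash_attention: true,
        extra_config,
    }
}

fn main() {
    let cfg = sample_config();
    println!("{:?} needs {} device(s)", cfg.device, world_size(&cfg));
}
```

The two `Option<usize>` fields let a caller leave parallelism unspecified rather than forcing an explicit 1, which is why the helper needs a fallback at all.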
Implementations
Trait Implementations
impl Clone for ModelConfig
    fn clone(&self) -> ModelConfig
        Returns a duplicate of the value.
    fn clone_from(&mut self, source: &Self)
        Performs copy-assignment from source.
impl Debug for ModelConfig
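Since `ModelConfig` implements `Clone` and `Debug`, one common pattern is to keep a base configuration and derive variants from it. The sketch below shows that pattern on a hypothetical three-field stand-in (`ConfigSketch`, `base_config`, and `with_batch_size` are illustrative names, not part of this crate). `PartialEq` is derived only so the values can be compared; this page does not list it for `ModelConfig`.

```rust
use std::collections::HashMap;

// Hypothetical stand-in with a few of the documented field names.
// PartialEq is added for comparison only.
#[derive(Clone, Debug, PartialEq)]
struct ConfigSketch {
    model_path: String,
    max_batch_size: usize,
    // String stands in for the real Value type.
    extra_config: HashMap<String, String>,
}

/// Builds a base configuration with placeholder values.
fn base_config() -> ConfigSketch {
    ConfigSketch {
        model_path: "/path/to/model".to_string(),
        max_batch_size: 8,
        extra_config: HashMap::new(),
    }
}

/// Clones the base and overrides one field; the base is left untouched.
fn with_batch_size(base: &ConfigSketch, max_batch_size: usize) -> ConfigSketch {
    let mut cfg = base.clone();
    cfg.max_batch_size = max_batch_size;
    cfg
}

fn main() {
    let base = base_config();
    let large = with_batch_size(&base, 32);

    // clone_from overwrites an existing value with a copy of the source.
    let mut scratch = large.clone();
    scratch.clone_from(&base);

    // Debug formatting prints every field, which is handy for logging.
    println!("{:?}", large);
    println!("{:?}", scratch);
}
```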
impl<'de> Deserialize<'de> for ModelConfig
    fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
    where
        __D: Deserializer<'de>,
        Deserialize this value from the given Serde deserializer.
Auto Trait Implementations
impl Freeze for ModelConfig
impl RefUnwindSafe for ModelConfig
impl Send for ModelConfig
impl Sync for ModelConfig
impl Unpin for ModelConfig
impl UnsafeUnpin for ModelConfig
impl UnwindSafe for ModelConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
    fn borrow_mut(&mut self) -> &mut T
        Mutably borrows from an owned value.