pub struct Config {
    pub models: HashMap<String, ModelConfig>,
    pub policy: PolicyConfig,
    pub port: u16,
    pub metrics_port: u16,
    pub admin_port: Option<u16>,
    pub vllm_command: String,
    pub checkpoint: Option<CheckpointConfig>,
    pub warmup: bool,
}
Top-level configuration
Fields
models: HashMap<String, ModelConfig>
Models to manage
policy: PolicyConfig
Switch policy configuration
port: u16
Proxy port
metrics_port: u16
Metrics port (0 to disable)
admin_port: Option<u16>
Admin/control API port (None to disable)
vllm_command: String
Command used to spawn vLLM processes (default: “vllm”). Can be overridden for testing with mock-vllm.
checkpoint: Option<CheckpointConfig>
CRIU/CUDA checkpoint configuration. Required when any model uses the CudaSuspend or Checkpoint process strategy.
warmup: bool
Whether to warm up all models before accepting traffic.
When enabled, the daemon sequentially starts each model, runs a warmup inference, then sleeps it using the model’s configured eviction policy. After warmup, every model is in its warm sleeping state, so the first real request triggers a fast wake rather than a cold start.
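The warmup sequence described above can be sketched as a simple sequential loop. This is only an illustration under stated assumptions: `Model`, `ModelState`, and `warm_up_all` are hypothetical stand-ins, not types or functions from this crate, and the real daemon presumably drives actual vLLM processes rather than flipping an in-memory state.

```rust
/// Hypothetical lifecycle states; the crate's real state type is not shown here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ModelState {
    Stopped,
    Running,
    Sleeping,
}

/// Hypothetical in-memory stand-in for a managed model.
struct Model {
    name: String,
    state: ModelState,
    warmed: bool,
}

impl Model {
    fn new(name: &str) -> Self {
        Model { name: name.to_string(), state: ModelState::Stopped, warmed: false }
    }

    /// Cold start: bring the model up.
    fn start(&mut self) {
        self.state = ModelState::Running;
    }

    /// Run one warmup inference; only valid while the model is running.
    fn warmup_inference(&mut self) -> Result<(), String> {
        if self.state != ModelState::Running {
            return Err(format!("model '{}' is not running", self.name));
        }
        self.warmed = true;
        Ok(())
    }

    /// Sleep the model (standing in for its configured eviction policy).
    fn sleep(&mut self) {
        self.state = ModelState::Sleeping;
    }
}

/// Warm up models strictly one at a time: start, run an inference, then sleep.
fn warm_up_all(models: &mut [Model]) -> Result<(), String> {
    for model in models.iter_mut() {
        model.start();
        model.warmup_inference()?;
        model.sleep();
    }
    Ok(())
}
```

After `warm_up_all` returns `Ok`, every model in the slice is in the `Sleeping` state with `warmed` set, which corresponds to the warm sleeping state from which the first real request only needs a wake.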
Implementations
impl Config
pub fn validate(&self)
Validate configuration, warning about common misconfigurations.
Checks that models using CudaSuspend or Checkpoint process strategies with tensor parallelism have the required vLLM flags.
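A check of this kind might look roughly like the sketch below. It is not the crate's implementation. `ModelSketch`, its fields, the `Other` variant, and `collect_warnings` are hypothetical, and because this page does not name the required vLLM flags, the sketch takes them as a parameter instead of guessing. It also returns the warnings as strings so they can be inspected, whereas the real `validate` returns nothing. The first check mirrors the requirement stated for the `checkpoint` field; whether `validate` itself performs it is not stated here.

```rust
use std::collections::HashMap;

/// CudaSuspend and Checkpoint are named in the docs; `Other` is a placeholder
/// for any remaining strategies, which this page does not list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProcessStrategy {
    CudaSuspend,
    Checkpoint,
    Other,
}

/// Hypothetical, simplified view of a model's configuration.
struct ModelSketch {
    strategy: ProcessStrategy,
    tensor_parallel: u32,
    extra_args: Vec<String>,
}

/// Collect human-readable warnings instead of failing hard.
/// `required_flags` is supplied by the caller because the actual flags are
/// not documented on this page.
fn collect_warnings(
    models: &HashMap<String, ModelSketch>,
    checkpoint_configured: bool,
    required_flags: &[&str],
) -> Vec<String> {
    let mut warnings = Vec::new();

    // Sort names so the warning order is deterministic.
    let mut names: Vec<&String> = models.keys().collect();
    names.sort();

    for name in names {
        let model = &models[name];
        let uses_checkpointing = matches!(
            model.strategy,
            ProcessStrategy::CudaSuspend | ProcessStrategy::Checkpoint
        );
        if !uses_checkpointing {
            continue;
        }
        if !checkpoint_configured {
            warnings.push(format!(
                "model '{name}': strategy {:?} needs a checkpoint configuration",
                model.strategy
            ));
        }
        if model.tensor_parallel > 1 {
            for flag in required_flags {
                if !model.extra_args.iter().any(|arg| arg.as_str() == *flag) {
                    warnings.push(format!(
                        "model '{name}': strategy {:?} with tensor parallelism expects flag {flag}",
                        model.strategy
                    ));
                }
            }
        }
    }
    warnings
}
```

Returning a list rather than an error matches the "warn, do not fail" framing of `validate`: the caller can log each entry and continue starting up.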
pub fn build_onwards_targets(&self) -> Result<Targets>
Build onwards Targets from model configs