pub struct Orchestrator { /* private fields */ }
Orchestrator manages vLLM process lifecycle
Implementations
impl Orchestrator
pub fn new(configs: HashMap<String, ModelConfig>) -> Self
Create a new orchestrator with the given model configurations
pub fn with_command(
    configs: HashMap<String, ModelConfig>,
    vllm_command: String,
) -> Self
Create a new orchestrator with a custom command for spawning processes
This is useful for testing with mock-vllm
pub fn with_options(
    configs: HashMap<String, ModelConfig>,
    vllm_command: String,
    checkpoint_config: Option<CheckpointConfig>,
) -> Self
Create a new orchestrator with full options including checkpoint config
pub async fn process_state(&self, model: &str) -> Option<ProcessState>
Get the current state of a model’s process
pub fn registered_models(&self) -> Vec<String>
Get all registered model names
pub fn model_port(&self, model: &str) -> Option<u16>
Get the configured port for a model
pub fn model_path(&self, model: &str) -> Option<String>
Get the configured model path (HuggingFace ID or local path) for a model
pub fn eviction_policy_for(&self, model: &str) -> Option<EvictionPolicy>
Get the effective eviction policy for a model. A runtime override takes precedence over the configured policy.
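The precedence rule can be illustrated with a minimal sketch. It does not use the crate's internals: `Policies`, its two maps, and the two-variant `EvictionPolicy` below are hypothetical stand-ins chosen only to show "override first, then config".

```rust
use std::collections::HashMap;

/// Simplified stand-in for the crate's `EvictionPolicy` (assumption: the
/// real type is richer than two variants).
#[derive(Clone, Copy, Debug, PartialEq)]
enum EvictionPolicy {
    KeepRunning,
    Stop,
}

/// Hypothetical holder for the two policy sources.
struct Policies {
    /// Policies taken from the model configuration.
    configured: HashMap<String, EvictionPolicy>,
    /// Policies set at runtime, e.g. via something like `set_eviction_policy`.
    overrides: HashMap<String, EvictionPolicy>,
}

impl Policies {
    /// Returns the override if one exists, otherwise the configured policy,
    /// and `None` when neither map has an entry for `model`.
    fn effective(&self, model: &str) -> Option<EvictionPolicy> {
        self.overrides
            .get(model)
            .or_else(|| self.configured.get(model))
            .copied()
    }
}
```

In this sketch an override never mutates the configured value, so dropping the override would restore the configured policy.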
pub fn is_checkpointed(&self, model: &str) -> bool
Check if a model is currently in the Checkpointed state
pub fn set_eviction_policy(&self, model: &str, policy: EvictionPolicy)
Override the eviction policy for a model at runtime
pub async fn ensure_running(&self, model: &str) -> Result<(), OrchestratorError>
Ensure a model’s process is running and ready
This will:
- Start the process if not started
- Wait for it to become healthy
- Return once the model is ready to serve requests
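The three steps above can be sketched as a small synchronous state machine. This is an illustration under stated assumptions, not the orchestrator's code: the real method is async, and the `ProcessState` variants, the `is_healthy` probe, and the `max_polls` limit here are all hypothetical.

```rust
/// Simplified stand-in for `ProcessState` (assumption: the real enum has
/// different or additional variants).
#[derive(Clone, Copy, Debug, PartialEq)]
enum ProcessState {
    NotStarted,
    Starting,
    Running,
}

/// Start the process if needed, then poll a health probe until it passes
/// or `max_polls` attempts have been used up.
fn ensure_running<F>(
    state: &mut ProcessState,
    mut is_healthy: F,
    max_polls: u32,
) -> Result<(), String>
where
    F: FnMut() -> bool,
{
    // Already serving: nothing to do.
    if *state == ProcessState::Running {
        return Ok(());
    }
    // Step 1: start the process if it has not been started yet.
    // A real orchestrator would spawn the vLLM command here.
    if *state == ProcessState::NotStarted {
        *state = ProcessState::Starting;
    }
    // Step 2: wait for the process to become healthy.
    for _ in 0..max_polls {
        if is_healthy() {
            // Step 3: mark the model ready and return.
            *state = ProcessState::Running;
            return Ok(());
        }
    }
    Err(format!("model not healthy after {} polls", max_polls))
}
```

A production version would sleep between polls and bound the wait by time rather than by a poll count.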
pub async fn wake_model(&self, model: &str) -> Result<(), OrchestratorError>
Wake a model from sleep
pub async fn set_checkpointed(
    &self,
    model: &str,
    images_dir: PathBuf,
    eviction: EvictionPolicy,
) -> Result<(), OrchestratorError>
Set a model’s state to Checkpointed (for the CLI --restore-detached flag and the checkpoint_path config).
Used to indicate a pre-existing checkpoint on disk so that
wake_model will run CRIU restore instead of starting fresh.
pub async fn sleep_model(
    &self,
    model: &str,
    eviction: EvictionPolicy,
) -> Result<(), OrchestratorError>
Put a model to sleep using the given eviction policy.
The sleep sequence is:
- Apply weight strategy (vLLM sleep API if Offload/Discard)
- Apply process strategy (CudaSuspend, Checkpoint, Stop, or KeepRunning)
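The ordering of the two steps can be sketched as a function that turns a policy into a list of actions. The variant names Offload, Discard, CudaSuspend, Checkpoint, Stop, and KeepRunning come from the description above; the `Retain` variant, the field names `weights` and `process`, the `Step` enum, and `sleep_plan` itself are assumptions made for illustration.

```rust
/// Weight handling. `Retain` is a hypothetical name for "leave weights alone".
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum WeightStrategy {
    Retain,
    Offload,
    Discard,
}

/// Process handling, using the four strategies named in the docs.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum ProcessStrategy {
    KeepRunning,
    CudaSuspend,
    Checkpoint,
    Stop,
}

/// Stand-in for `EvictionPolicy` (assumption: it pairs the two strategies).
#[derive(Clone, Copy, Debug, PartialEq)]
struct EvictionPolicy {
    weights: WeightStrategy,
    process: ProcessStrategy,
}

/// One action in the sleep sequence.
#[derive(Debug, PartialEq)]
enum Step {
    CallSleepApi(WeightStrategy),
    ApplyProcess(ProcessStrategy),
}

/// Build the ordered steps: weights first, then the process strategy.
fn sleep_plan(policy: EvictionPolicy) -> Vec<Step> {
    let mut steps = Vec::new();
    // The vLLM sleep API is only involved for Offload or Discard.
    if matches!(
        policy.weights,
        WeightStrategy::Offload | WeightStrategy::Discard
    ) {
        steps.push(Step::CallSleepApi(policy.weights));
    }
    // The process strategy is always applied, even if it is KeepRunning.
    steps.push(Step::ApplyProcess(policy.process));
    steps
}
```

Separating planning from execution in this way keeps the ordering easy to test without a running vLLM process.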
pub async fn force_sleep(&self, model: &str, eviction: EvictionPolicy)
Force a model to sleep, escalating to Stop if the initial policy fails.
This is a guaranteed-cleanup method: it logs errors but never returns Err.
Used to clean up partially-woken models that hold GPU memory.
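The escalate-and-swallow behavior can be sketched as follows. This is a synchronous illustration only: the two-variant `Policy` enum and the `try_sleep` callback are hypothetical, and skipping the retry when the requested policy is already Stop is a choice of this sketch, not something the docs state.

```rust
/// Simplified stand-in for the eviction policy (assumption).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Policy {
    CudaSuspend,
    Stop,
}

/// Try the requested policy; on failure, log and escalate to `Stop`.
/// Errors are logged to stderr and never returned to the caller.
fn force_sleep<F>(requested: Policy, mut try_sleep: F)
where
    F: FnMut(Policy) -> Result<(), String>,
{
    if let Err(err) = try_sleep(requested) {
        eprintln!("sleep with {:?} failed: {}", requested, err);
        // Escalate only if the failed attempt was not already Stop.
        if requested != Policy::Stop {
            if let Err(err) = try_sleep(Policy::Stop) {
                eprintln!("escalation to Stop failed: {}", err);
            }
        }
    }
}
```

Because the function returns `()`, callers on a cleanup path can invoke it without handling a `Result`.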