Module autoscale_controller

Expand description

AutoscaleController - Connects autoscaling decisions to container scaling

This module provides an AutoscaleController that bridges the scheduler’s autoscaling logic with the agent’s ServiceManager to automatically scale services based on resource utilization.

It drives three orthogonal control loops over the same metrics tick:

Horizontal (replica count) — the original adaptive autoscaler that grows/shrinks replicas to hit CPU / memory / RPS targets.
Scale-to-zero (Phase 2) — when an adaptive service declares idle_window and min == 0, the controller reaps every replica after the service has been idle (no meaningful CPU / RPS) for the window. The proxy activator wakes it again by calling AutoscaleController::mark_active on the next inbound request.
Vertical (right-sizing, Phase 3) — when an adaptive service declares a vertical block, a VpaEngine observes per-replica usage and emits CPU-millis / memory-MiB recommendations. In Recommend mode they are logged; in Auto mode they are applied via Runtime::update_container_resources, with a rolling restart fallback when the runtime cannot live-update a running container’s cgroup.

§Architecture

┌────────────────────────────────────────────────────────────────────┐
│                     AutoscaleController                            │
│  ┌─────────────────┐  ┌────────────┐  ┌──────────────────┐       │
│  │ CgroupsMetrics  │  │ Autoscaler │  │ ServiceManager   │       │
│  │    Source       │──│  + VpaEngine│──│  (scaling)       │       │
│  └─────────────────┘  └────────────┘  └──────────────────┘       │
└────────────────────────────────────────────────────────────────────┘

§Example

use zlayer_agent::autoscale_controller::AutoscaleController;
use zlayer_agent::{ServiceManager, RuntimeConfig, create_runtime};
use std::sync::Arc;
use std::time::Duration;

// Create runtime and service manager
let runtime = create_runtime(RuntimeConfig::Mock).await?;
let manager = Arc::new(ServiceManager::new(runtime.clone()));

// Create autoscale controller
let controller = AutoscaleController::new(
    manager.clone(),
    runtime.clone(),
    Duration::from_secs(10),
);

// Register services with adaptive scaling
controller.register_service("api", &scale_spec, 2).await;

// Run the autoscaling loop (in background)
let handle = tokio::spawn(async move {
    controller.run_loop().await
});

// Later, shutdown
controller.shutdown();

Structs§

AutoscaleController: Controller that connects autoscaling decisions to actual container scaling
VpaEngine: Per-container vertical-pod-autoscaler engine.
VpaRecommendation: A vertical right-sizing recommendation for a single container: the target CPU allotment in millicores and the target memory in MiB.

Constants§

DEFAULT_AUTOSCALE_INTERVAL: Default autoscaling evaluation interval

Functions§

has_adaptive_scaling: Check if any service in a deployment has adaptive scaling

Module autoscale_controller

Module autoscale_controller Copy item path

§Architecture

§Example

Structs§

Constants§

Functions§

Module autoscale_controller