Module autoscale_controller

AutoscaleController - Connects autoscaling decisions to container scaling

This module provides an AutoscaleController that bridges the scheduler’s autoscaling logic with the agent’s ServiceManager to automatically scale services based on resource utilization.

It drives three orthogonal control loops over the same metrics tick:

  1. Horizontal (replica count) — the original adaptive autoscaler that grows/shrinks replicas to hit CPU / memory / RPS targets.
  2. Scale-to-zero (Phase 2) — when an adaptive service declares idle_window and min == 0, the controller reaps every replica after the service has been idle (no meaningful CPU / RPS) for the window. The proxy activator wakes it again by calling AutoscaleController::mark_active on the next inbound request.
  3. Vertical (right-sizing, Phase 3) — when an adaptive service declares a vertical block, a VpaEngine observes per-replica usage and emits CPU-millis / memory-MiB recommendations. In Recommend mode they are logged; in Auto mode they are applied via Runtime::update_container_resources, with a rolling restart fallback when the runtime cannot live-update a running container’s cgroup.
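The first two loops can be pictured as one pure decision made per metrics tick. The sketch below is illustrative only: `TickInput`, `Decision`, and `decide` are hypothetical names and are not part of zlayer_agent. The proportional rule (desired = ceil(current × observed / target)) is the common HPA-style formula, used here as an assumption; the real autoscaler may combine CPU, memory, and RPS differently.

```rust
use std::time::Duration;

/// Hypothetical per-tick input; every field name is an assumption.
pub struct TickInput {
    pub replicas: u32,
    pub min: u32,
    pub max: u32,
    /// Observed average CPU utilization across replicas, in percent.
    pub cpu_percent: f64,
    /// Target CPU utilization, in percent.
    pub cpu_target: f64,
    /// Time since the service last saw meaningful activity.
    pub idle_for: Duration,
    /// `Some(window)` when the service declares an idle window.
    pub idle_window: Option<Duration>,
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    Hold,
    ScaleTo(u32),
}

pub fn decide(t: &TickInput) -> Decision {
    // Scale-to-zero: requires a declared window, min == 0, and a fully
    // elapsed idle period.
    if let Some(window) = t.idle_window {
        if t.min == 0 && t.replicas > 0 && t.idle_for >= window {
            return Decision::ScaleTo(0);
        }
    }
    // A service at zero is woken by an activity signal, not by this rule.
    if t.replicas == 0 || t.cpu_target <= 0.0 {
        return Decision::Hold;
    }
    // Horizontal: proportional rule, kept to at least one replica while
    // the service is active and capped at max.
    let raw = (f64::from(t.replicas) * t.cpu_percent / t.cpu_target).ceil();
    let lo = t.min.max(1);
    let hi = t.max.max(lo);
    let desired = (raw as u32).clamp(lo, hi);
    if desired == t.replicas {
        Decision::Hold
    } else {
        Decision::ScaleTo(desired)
    }
}
```

Keeping the decision free of I/O makes each rule testable without a runtime. The vertical loop is left out here because it adjusts per-container resources rather than the replica count.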

§Architecture

┌───────────────────────────────────────────────────────────────┐
│                      AutoscaleController                      │
│  ┌─────────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│  │ CgroupsMetrics  │  │ Autoscaler   │  │ ServiceManager   │  │
│  │    Source       │──│ + VpaEngine  │──│  (scaling)       │  │
│  └─────────────────┘  └──────────────┘  └──────────────────┘  │
└───────────────────────────────────────────────────────────────┘

§Example

use std::sync::Arc;
use std::time::Duration;

use zlayer_agent::autoscale_controller::AutoscaleController;
use zlayer_agent::{create_runtime, RuntimeConfig, ServiceManager};

// Create runtime and service manager
let runtime = create_runtime(RuntimeConfig::Mock).await?;
let manager = Arc::new(ServiceManager::new(runtime.clone()));

// Create the autoscale controller. It is wrapped in an Arc so that both the
// background task and the caller can hold it (this assumes run_loop and
// shutdown take &self).
let controller = Arc::new(AutoscaleController::new(
    manager.clone(),
    runtime.clone(),
    Duration::from_secs(10),
));

// Register a service with adaptive scaling.
// `scale_spec` is the service's scale specification, built elsewhere.
controller.register_service("api", &scale_spec, 2).await;

// Run the autoscaling loop in the background. The task gets its own handle
// so `controller` is not moved and stays usable below.
let runner = Arc::clone(&controller);
let handle = tokio::spawn(async move { runner.run_loop().await });

// Later, shut down
controller.shutdown();

Structs§

AutoscaleController
Controller that connects autoscaling decisions to actual container scaling
VpaEngine
Per-container vertical-pod-autoscaler engine.
VpaRecommendation
A vertical right-sizing recommendation for a single container: the target CPU allotment in millicores and the target memory in MiB.
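As a rough picture of how per-replica usage could become such a recommendation, the sketch below takes the peak observed usage and adds a fixed headroom. Everything in it is hypothetical: `UsageSample`, `Recommendation`, `recommend`, and the peak-plus-headroom policy are assumptions for illustration, and the real VpaEngine may use a different estimator (for example percentiles or decaying histograms).

```rust
/// One observed usage sample for a replica (hypothetical shape).
#[derive(Clone, Copy)]
pub struct UsageSample {
    pub cpu_millis: u64,
    pub memory_bytes: u64,
}

/// Illustrative stand-in for a right-sizing recommendation:
/// CPU in millicores, memory in MiB.
#[derive(Debug, PartialEq)]
pub struct Recommendation {
    pub cpu_millis: u64,
    pub memory_mib: u64,
}

const MIB: u64 = 1024 * 1024;

/// Recommend peak usage plus `headroom_percent`, rounding memory up to
/// whole MiB. Returns `None` when there are no samples to learn from.
pub fn recommend(samples: &[UsageSample], headroom_percent: u64) -> Option<Recommendation> {
    let peak_cpu = samples.iter().map(|s| s.cpu_millis).max()?;
    let peak_mem = samples.iter().map(|s| s.memory_bytes).max()?;

    // Integer arithmetic with saturation so extreme inputs cannot overflow.
    let factor = headroom_percent.saturating_add(100);
    let scale = |v: u64| v.saturating_mul(factor) / 100;

    let mem_bytes = scale(peak_mem);
    Some(Recommendation {
        cpu_millis: scale(peak_cpu),
        memory_mib: mem_bytes.saturating_add(MIB - 1) / MIB,
    })
}
```

In a Recommend-style mode the result would only be logged. In an Auto-style mode it would be handed to whatever applies container resource updates.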

Constants§

DEFAULT_AUTOSCALE_INTERVAL
Default autoscaling evaluation interval

Functions§

has_adaptive_scaling
Check if any service in a deployment has adaptive scaling
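A check of this kind usually reduces to an `any` over the deployment's services. The sketch below uses hypothetical stand-in types (`ScaleMode`, `Service`, `Deployment`) and a hypothetical function name (`any_adaptive`); the real signature and spec types of has_adaptive_scaling are not shown on this page and may differ.

```rust
use std::collections::HashMap;

/// Hypothetical scaling modes; the real spec type may have other variants.
pub enum ScaleMode {
    Fixed { replicas: u32 },
    Adaptive { min: u32, max: u32 },
}

/// Hypothetical service entry holding only what the check needs.
pub struct Service {
    pub scale: ScaleMode,
}

/// Hypothetical deployment: service name mapped to its spec.
pub struct Deployment {
    pub services: HashMap<String, Service>,
}

/// Returns true if at least one service uses adaptive scaling.
pub fn any_adaptive(deployment: &Deployment) -> bool {
    deployment
        .services
        .values()
        .any(|s| matches!(s.scale, ScaleMode::Adaptive { .. }))
}
```

A caller could plausibly use such a check to avoid starting the autoscaling loop when every service has a fixed replica count.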