pub struct EndpointUpdateInput {
pub allowed_cuda_versions: Option<Vec<CudaVersion>>,
pub cpu_flavor_ids: Option<Vec<CpuFlavorId>>,
pub data_center_ids: Option<Vec<DataCenterId>>,
pub execution_timeout_ms: Option<i32>,
pub flashboot: Option<bool>,
pub gpu_count: Option<i32>,
pub gpu_type_ids: Option<Vec<GpuTypeId>>,
pub idle_timeout: Option<i32>,
pub name: Option<String>,
pub network_volume_id: Option<String>,
pub scaler_type: Option<ScalerType>,
pub scaler_value: Option<i32>,
pub template_id: Option<String>,
pub vcpu_count: Option<i32>,
pub workers_max: Option<i32>,
pub workers_min: Option<i32>,
}
Input parameters for updating an existing serverless endpoint.
This struct allows you to modify endpoint configuration and trigger a rolling release that updates all workers with the new settings. All fields are optional, allowing you to update only the properties you want to change.
§Rolling Release Process
When an endpoint is updated:
1. Validation: The new configuration is validated for compatibility
2. Version Increment: The endpoint version number is incremented
3. Rolling Update: Workers are gradually replaced with the new configuration
4. Traffic Migration: Requests are routed to updated workers as they become available
5. Cleanup: Old workers are terminated once traffic migration is complete
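The gradual replacement step can be sketched as a simulation. This is illustrative only: the batch size, the `u32` version representation, and the `rolling_update` helper are assumptions for the sketch, not SDK behavior.

```rust
/// Illustrative sketch of a batched rolling update: workers still on the
/// old version are replaced a few at a time, so capacity is never zero.
fn rolling_update(workers: &mut Vec<u32>, new_version: u32, batch_size: usize) -> usize {
    let mut batches = 0;
    while workers.iter().any(|&v| v != new_version) {
        // Replace at most `batch_size` outdated workers per batch.
        let mut replaced = 0;
        for v in workers.iter_mut() {
            if *v != new_version && replaced < batch_size {
                *v = new_version; // new worker comes up before the old one is terminated
                replaced += 1;
            }
        }
        batches += 1;
    }
    batches
}

fn main() {
    let mut workers = vec![1; 6]; // six workers on version 1
    let batches = rolling_update(&mut workers, 2, 2);
    assert!(workers.iter().all(|&v| v == 2));
    println!("updated in {batches} batches");
}
```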
§Important Notes
- Zero Downtime: Updates are performed without service interruption
- Gradual Rollout: Workers are updated in batches to maintain availability
- Rollback: Previous versions can be restored if issues are detected
- Template Changes: Updating template_id deploys new container images
§Examples
use runpod_sdk::model::{EndpointUpdateInput, ScalerType};

// Scale up for increased traffic
let scale_up = EndpointUpdateInput {
    workers_max: Some(20),  // Double capacity
    scaler_value: Some(2),  // More aggressive scaling
    idle_timeout: Some(10), // Keep workers longer
    ..Default::default()
};

// Enable flash boot for better performance
let performance_upgrade = EndpointUpdateInput {
    flashboot: Some(true),
    execution_timeout_ms: Some(60_000), // 60-second timeout
    ..Default::default()
};

// Switch to cost-optimized scaling
let cost_optimization = EndpointUpdateInput {
    scaler_type: Some(ScalerType::RequestCount),
    scaler_value: Some(10), // 1 worker per 10 requests
    workers_min: Some(0),   // No reserved capacity
    flashboot: Some(false), // Standard startup
    ..Default::default()
};
Fields§
§allowed_cuda_versions: Option<Vec<CudaVersion>>
If the endpoint is a GPU endpoint, acceptable CUDA versions for workers.
Updates the CUDA version constraints for worker allocation. Triggers rolling release to ensure all workers use compatible CUDA versions.
Note: Set to None to keep current setting unchanged.
§cpu_flavor_ids: Option<Vec<CpuFlavorId>>
If the endpoint is a CPU endpoint, list of CPU flavors for workers.
Updates the available CPU configurations for workers. The order determines rental priority for new workers.
CPU endpoints only: Ignored for GPU endpoints
Note: Set to None to keep current setting unchanged.
§data_center_ids: Option<Vec<DataCenterId>>
List of data center IDs where workers can be located.
Updates the geographic distribution of workers. Existing workers in removed data centers will be gradually replaced.
Note: Set to None to keep current setting unchanged.
§execution_timeout_ms: Option<i32>
Maximum execution time in milliseconds for individual requests.
Updates the timeout for request processing. Affects new requests immediately, existing requests continue with previous timeout.
Range: 1,000ms to 3,600,000ms (1 second to 1 hour)
Note: Set to None to keep current setting unchanged.
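Given the documented 1,000 ms to 3,600,000 ms range, a client-side pre-check could look like the sketch below. The `validate_execution_timeout_ms` helper is an assumption for illustration; the SDK's actual validation happens server-side.

```rust
/// Validate an execution timeout against the documented range of
/// 1,000 ms (1 second) to 3,600,000 ms (1 hour).
fn validate_execution_timeout_ms(ms: i32) -> Result<i32, String> {
    const MIN_MS: i32 = 1_000;
    const MAX_MS: i32 = 3_600_000;
    if (MIN_MS..=MAX_MS).contains(&ms) {
        Ok(ms)
    } else {
        Err(format!("execution_timeout_ms must be in {MIN_MS}..={MAX_MS}, got {ms}"))
    }
}

fn main() {
    assert_eq!(validate_execution_timeout_ms(60_000), Ok(60_000)); // one minute: valid
    assert!(validate_execution_timeout_ms(500).is_err());          // below the 1-second floor
}
```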
§flashboot: Option<bool>
Whether to enable flash boot for faster worker startup.
Updates the startup optimization for new workers. Affects cold start performance and per-request costs.
Trade-off: Higher per-request cost for faster startup
Note: Set to None to keep current setting unchanged.
§gpu_count: Option<i32>
If the endpoint is a GPU endpoint, number of GPUs per worker.
Updates GPU allocation for new workers. Triggers rolling release to deploy workers with the new GPU configuration.
GPU endpoints only: Ignored for CPU endpoints
Range: 1-8 depending on GPU type availability
Note: Set to None to keep current setting unchanged.
§gpu_type_ids: Option<Vec<GpuTypeId>>
If the endpoint is a GPU endpoint, list of GPU types for workers.
Updates available GPU hardware types for workers. The order determines rental priority for new workers.
GPU endpoints only: Ignored for CPU endpoints
Note: Set to None to keep current setting unchanged.
§idle_timeout: Option<i32>
Number of seconds workers can be idle before scaling down.
Updates the idle timeout for worker lifecycle management. Affects cost optimization and cold start frequency.
Range: 1-3600 seconds (1 second to 1 hour)
Note: Set to None to keep current setting unchanged.
§name: Option<String>
A user-defined name for the endpoint.
Updates the display name used in dashboards and API responses. This change is applied immediately without triggering a rolling release.
Max length: 191 characters
Note: Set to None to keep current name unchanged.
§network_volume_id: Option<String>
The unique ID of a network volume to attach to workers.
Updates the persistent storage attached to workers. Triggers rolling release to mount/unmount volumes on all workers.
Requirements: Volume must exist in same data centers as workers
Note: Set to None to keep current volume unchanged.
§scaler_type: Option<ScalerType>
The scaling strategy for managing worker count.
Updates the auto-scaling algorithm used for worker management. Change takes effect immediately for new scaling decisions.
Strategies:
- QueueDelay: Scale based on request wait time
- RequestCount: Scale based on queue depth
Note: Set to None to keep current strategy unchanged.
§scaler_value: Option<i32>
The scaling sensitivity parameter.
Updates the scaling behavior sensitivity. Change takes effect immediately for new scaling decisions.
For QueueDelay: Maximum seconds requests can wait
For RequestCount: Target requests per worker
Range: 1-3600
Note: Set to None to keep current value unchanged.
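For the RequestCount strategy, the documented semantics amount to one worker per scaler_value queued requests, clamped to the configured worker bounds. The sketch below only illustrates that arithmetic; the platform's actual scaling algorithm is internal and may differ.

```rust
/// Illustrative RequestCount scaling: one worker per `scaler_value`
/// queued requests, clamped to [workers_min, workers_max].
fn target_workers(queue_depth: i32, scaler_value: i32, workers_min: i32, workers_max: i32) -> i32 {
    // Ceiling division: 25 requests at 10 per worker needs 3 workers.
    let wanted = (queue_depth + scaler_value - 1) / scaler_value;
    wanted.clamp(workers_min, workers_max)
}

fn main() {
    assert_eq!(target_workers(25, 10, 0, 20), 3);
    assert_eq!(target_workers(0, 10, 2, 20), 2);    // workers_min keeps capacity warm
    assert_eq!(target_workers(500, 10, 0, 20), 20); // capped at workers_max
}
```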
§template_id: Option<String>
The unique ID of the template used to create the endpoint.
Updates the container image and environment configuration. Triggers rolling release to deploy all workers with the new template.
Impact: Changes container image, environment, resource allocation
Rolling Release: All workers are gradually replaced
Note: Set to None to keep current template unchanged.
§vcpu_count: Option<i32>
If the endpoint is a CPU endpoint, number of vCPUs per worker.
Updates CPU allocation for new workers. Triggers rolling release to deploy workers with the new CPU configuration.
CPU endpoints only: Ignored for GPU endpoints
Range: 1-32 vCPUs depending on CPU flavor
Note: Set to None to keep current setting unchanged.
§workers_max: Option<i32>
Maximum number of workers that can run simultaneously.
Updates the scaling limit for worker count. Change takes effect immediately for new scaling decisions.
Range: 0-1000+ depending on account limits
Note: Set to None to keep current limit unchanged.
§workers_min: Option<i32>
Minimum number of workers that always remain running.
Updates the reserved capacity for immediate availability. Change triggers immediate scaling to meet the new minimum.
Range: 0-100 depending on account limits
Billing: Reserved workers are always charged (at reduced rate)
Note: Set to None to keep current minimum unchanged.
Trait Implementations§
impl Clone for EndpointUpdateInput
fn clone(&self) -> EndpointUpdateInput
fn clone_from(&mut self, source: &Self)