
Struct EndpointUpdateInput 

pub struct EndpointUpdateInput {
    pub allowed_cuda_versions: Option<Vec<CudaVersion>>,
    pub cpu_flavor_ids: Option<Vec<CpuFlavorId>>,
    pub data_center_ids: Option<Vec<DataCenterId>>,
    pub execution_timeout_ms: Option<i32>,
    pub flashboot: Option<bool>,
    pub gpu_count: Option<i32>,
    pub gpu_type_ids: Option<Vec<GpuTypeId>>,
    pub idle_timeout: Option<i32>,
    pub name: Option<String>,
    pub network_volume_id: Option<String>,
    pub scaler_type: Option<ScalerType>,
    pub scaler_value: Option<i32>,
    pub template_id: Option<String>,
    pub vcpu_count: Option<i32>,
    pub workers_max: Option<i32>,
    pub workers_min: Option<i32>,
}

Input parameters for updating an existing serverless endpoint.

This struct allows you to modify endpoint configuration and trigger a rolling release that updates all workers with the new settings. All fields are optional, so you can update only the properties you want to change.

§Rolling Release Process

When an endpoint is updated:

  1. Validation: New configuration is validated for compatibility
  2. Version Increment: Endpoint version number is incremented
  3. Rolling Update: Workers are gradually replaced with new configuration
  4. Traffic Migration: Requests are routed to updated workers as they become available
  5. Cleanup: Old workers are terminated once traffic migration is complete

§Important Notes

  • Zero Downtime: Updates are performed without service interruption
  • Gradual Rollout: Workers are updated in batches to maintain availability
  • Rollback: Previous versions can be restored if issues are detected
  • Template Changes: Updating template_id deploys new container images

§Examples

use runpod_sdk::model::{EndpointUpdateInput, ScalerType};

// Scale up for increased traffic
let scale_up = EndpointUpdateInput {
    workers_max: Some(20),      // Double capacity
    scaler_value: Some(2),      // More aggressive scaling
    idle_timeout: Some(10),     // Keep workers longer
    ..Default::default()
};

// Enable flash boot for better performance
let performance_upgrade = EndpointUpdateInput {
    flashboot: Some(true),
    execution_timeout_ms: Some(60000), // Reduce timeout
    ..Default::default()
};

// Switch to cost-optimized scaling
let cost_optimization = EndpointUpdateInput {
    scaler_type: Some(ScalerType::RequestCount),
    scaler_value: Some(10),     // 1 worker per 10 requests
    workers_min: Some(0),       // No reserved capacity
    flashboot: Some(false),     // Standard startup
    ..Default::default()
};
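
The struct implements Serialize (see Trait Implementations below), so the exact update payload can be inspected before it is sent. A minimal sketch, assuming the serde_json crate is available; depending on the struct's serde attributes, unset (None) fields may be omitted or serialized as null:

use runpod_sdk::model::EndpointUpdateInput;

// Inspect the serialized update body before submitting it.
let update = EndpointUpdateInput {
    workers_max: Some(20),
    ..Default::default()
};
println!("{}", serde_json::to_string_pretty(&update).unwrap());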

Fields§

§allowed_cuda_versions: Option<Vec<CudaVersion>>

If the endpoint is a GPU endpoint, acceptable CUDA versions for workers.

Updates the CUDA version constraints for worker allocation. Triggers a rolling release to ensure all workers use compatible CUDA versions.

Note: Set to None to keep current setting unchanged.

§cpu_flavor_ids: Option<Vec<CpuFlavorId>>

If the endpoint is a CPU endpoint, list of CPU flavors for workers.

Updates the available CPU configurations for workers. The order determines rental priority for new workers.

CPU endpoints only: Ignored for GPU endpoints
Note: Set to None to keep current setting unchanged.

§data_center_ids: Option<Vec<DataCenterId>>

List of data center IDs where workers can be located.

Updates the geographic distribution of workers. Existing workers in removed data centers will be gradually replaced.

Note: Set to None to keep current setting unchanged.

§execution_timeout_ms: Option<i32>

Maximum execution time in milliseconds for individual requests.

Updates the timeout for request processing. Affects new requests immediately; existing requests continue with the previous timeout.

Range: 1,000ms to 3,600,000ms (1 second to 1 hour)
Note: Set to None to keep current setting unchanged.
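
Because the field is expressed in milliseconds, it is easy to be off by a factor of 1,000. A small sketch (the Duration value is illustrative) that derives the field from a std::time::Duration and clamps it to the documented range:

use std::time::Duration;
use runpod_sdk::model::EndpointUpdateInput;

// Convert a 5-minute limit to milliseconds and clamp it to the
// documented 1,000-3,600,000 ms range.
let timeout = Duration::from_secs(300);
let ms = i32::try_from(timeout.as_millis()).unwrap_or(i32::MAX);
let update = EndpointUpdateInput {
    execution_timeout_ms: Some(ms.clamp(1_000, 3_600_000)),
    ..Default::default()
};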

§flashboot: Option<bool>

Whether to enable flash boot for faster worker startup.

Updates the startup optimization for new workers. Affects cold start performance and per-request costs.

Trade-off: Higher per-request cost for faster startup
Note: Set to None to keep current setting unchanged.

§gpu_count: Option<i32>

If the endpoint is a GPU endpoint, number of GPUs per worker.

Updates GPU allocation for new workers. Triggers a rolling release to deploy workers with the new GPU configuration.

GPU endpoints only: Ignored for CPU endpoints
Range: 1-8 depending on GPU type availability
Note: Set to None to keep current setting unchanged.

§gpu_type_ids: Option<Vec<GpuTypeId>>

If the endpoint is a GPU endpoint, list of GPU types for workers.

Updates available GPU hardware types for workers. The order determines rental priority for new workers.

GPU endpoints only: Ignored for CPU endpoints
Note: Set to None to keep current setting unchanged.

§idle_timeout: Option<i32>

Number of seconds workers can be idle before scaling down.

Updates the idle timeout for worker lifecycle management. Affects cost optimization and cold start frequency.

Range: 1-3600 seconds (1 second to 1 hour)
Note: Set to None to keep current setting unchanged.
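
The choice is a trade-off between paying for idle workers and paying cold-start latency on the next request. The values below are illustrative, not recommendations:

use runpod_sdk::model::EndpointUpdateInput;

// Cost-leaning: release idle workers after 5 seconds, accept more cold starts.
let favor_cost = EndpointUpdateInput {
    idle_timeout: Some(5),
    ..Default::default()
};

// Latency-leaning: keep workers warm for 10 minutes after their last request.
let favor_latency = EndpointUpdateInput {
    idle_timeout: Some(600),
    ..Default::default()
};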

§name: Option<String>

A user-defined name for the endpoint.

Updates the display name used in dashboards and API responses. This change is applied immediately without triggering a rolling release.

Max length: 191 characters
Note: Set to None to keep current name unchanged.

§network_volume_id: Option<String>

The unique ID of a network volume to attach to workers.

Updates the persistent storage attached to workers. Triggers a rolling release to mount or unmount the volume on all workers.

Requirements: Volume must exist in the same data centers as the workers
Note: Set to None to keep current volume unchanged.

§scaler_type: Option<ScalerType>

The scaling strategy for managing worker count.

Updates the auto-scaling algorithm used for worker management. Change takes effect immediately for new scaling decisions.

Strategies:

  • QueueDelay: Scale based on request wait time
  • RequestCount: Scale based on queue depth

Note: Set to None to keep current strategy unchanged.

§scaler_value: Option<i32>

The scaling sensitivity parameter.

Updates the scaling behavior sensitivity. Change takes effect immediately for new scaling decisions.

For QueueDelay: Maximum seconds requests can wait
For RequestCount: Target requests per worker
Range: 1-3600
Note: Set to None to keep current value unchanged.
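
Because the meaning of scaler_value depends on scaler_type, it is clearest to set both fields in the same update. A sketch, assuming the ScalerType variants match the strategy names listed above (RequestCount appears in the Examples section; QueueDelay is assumed to be spelled the same way):

use runpod_sdk::model::{EndpointUpdateInput, ScalerType};

// QueueDelay: scaler_value is the maximum seconds a request may wait in the queue.
let latency_target = EndpointUpdateInput {
    scaler_type: Some(ScalerType::QueueDelay),
    scaler_value: Some(4),
    ..Default::default()
};

// RequestCount: scaler_value is the target number of requests per worker.
let throughput_target = EndpointUpdateInput {
    scaler_type: Some(ScalerType::RequestCount),
    scaler_value: Some(10),
    ..Default::default()
};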

§template_id: Option<String>

The unique ID of the template used to create the endpoint.

Updates the container image and environment configuration. Triggers a rolling release to deploy all workers with the new template.

Impact: Changes container image, environment, and resource allocation
Rolling Release: All workers are gradually replaced
Note: Set to None to keep current template unchanged.
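
Rolling out a new container image therefore amounts to pointing the endpoint at the template that references that image. The template ID below is a placeholder, not a real value:

use runpod_sdk::model::EndpointUpdateInput;

// Deploy the template identified by a placeholder ID; all workers are
// gradually replaced with the new container configuration.
let deploy_new_image = EndpointUpdateInput {
    template_id: Some("template-id-placeholder".to_string()),
    ..Default::default()
};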

§vcpu_count: Option<i32>

If the endpoint is a CPU endpoint, number of vCPUs per worker.

Updates CPU allocation for new workers. Triggers a rolling release to deploy workers with the new CPU configuration.

CPU endpoints only: Ignored for GPU endpoints
Range: 1-32 vCPUs depending on CPU flavor
Note: Set to None to keep current setting unchanged.

§workers_max: Option<i32>

Maximum number of workers that can run simultaneously.

Updates the scaling limit for worker count. Change takes effect immediately for new scaling decisions.

Range: 0-1000+ depending on account limits
Note: Set to None to keep current limit unchanged.

§workers_min: Option<i32>

Minimum number of workers that always remain running.

Updates the reserved capacity for immediate availability. Change triggers immediate scaling to meet the new minimum.

Range: 0-100 depending on account limits
Billing: Reserved workers are always charged (at reduced rate)
Note: Set to None to keep current minimum unchanged.
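
The Examples section above shows scale-to-zero (workers_min: Some(0)); the opposite direction, reserving always-on capacity for latency-sensitive traffic, looks like this (the counts are illustrative and subject to account limits):

use runpod_sdk::model::EndpointUpdateInput;

// Keep two workers warm at all times while capping total capacity at 50.
let reserved_capacity = EndpointUpdateInput {
    workers_min: Some(2),
    workers_max: Some(50),
    ..Default::default()
};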

Trait Implementations§

impl Clone for EndpointUpdateInput

fn clone(&self) -> EndpointUpdateInput

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for EndpointUpdateInput

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for EndpointUpdateInput

fn default() -> EndpointUpdateInput

Returns the “default value” for a type.

impl<'de> Deserialize<'de> for EndpointUpdateInput

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Serialize for EndpointUpdateInput

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized
impl<T> Borrow<T> for T where T: ?Sized
impl<T> BorrowMut<T> for T where T: ?Sized
impl<T> CloneToUninit for T where T: Clone
impl<T> From<T> for T
impl<T> Instrument for T
impl<T, U> Into<U> for T where U: From<T>
impl<T> PolicyExt for T where T: ?Sized
impl<T> ToOwned for T where T: Clone
impl<T, U> TryFrom<U> for T where U: Into<T>
impl<T, U> TryInto<U> for T where U: TryFrom<T>
impl<T> WithSubscriber for T
impl<T> DeserializeOwned for T where T: for<'de> Deserialize<'de>