inference_core::deployment

Struct Deployment

pub struct Deployment {
    pub name: String,
    pub model: String,
    pub runtime: Option<RuntimeKind>,
    pub runtime_config: Option<RuntimeConfig>,
    pub gpus: Option<u32>,
    pub replicas: u32,
    pub serving: Serving,
    pub budget: Option<Budget>,
    pub idempotent: bool,
}

A model deployment. The runtime field selects the backend; every other field has a runtime-agnostic interpretation. Local deployments fill gpus; remote deployments leave it None and use serving’s max_concurrent instead.

Fields§

§name: String

§model: String

§runtime: Option<RuntimeKind>

Optional explicit runtime. When omitted, infer_runtime picks one based on the model name (doc §3.2).

§runtime_config: Option<RuntimeConfig>

Backend-specific configuration. When omitted, defaults are used.

§gpus: Option<u32>

Local-only: number of GPUs per replica.

§replicas: u32

Number of replicas (local: HA + scale-out; remote: independent worker pools, possibly different API keys).

§serving: Serving

§budget: Option<Budget>

§idempotent: bool

True for normal LLM inference; false to disable retries on non-idempotent stateful APIs (doc §12.3).
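One way a caller might honor the idempotent flag is sketched below. This is illustrative only: max_attempts, call_with_retries, and the retries parameter are hypothetical names that do not appear in the documented API, and the real retry policy (doc §12.3) may differ.

```rust
// Hypothetical helper, not part of inference_core: how many times an
// operation may run in total, given the deployment's `idempotent` flag.
fn max_attempts(idempotent: bool, retries: u32) -> u32 {
    if idempotent {
        1 + retries
    } else {
        // Non-idempotent calls run exactly once; a retry could repeat side effects.
        1
    }
}

// Runs `op` until it succeeds or the attempt budget is exhausted.
fn call_with_retries<T, E>(
    idempotent: bool,
    retries: u32,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let attempts = max_attempts(idempotent, retries);
    let mut last = op();
    for _ in 1..attempts {
        if last.is_ok() {
            break;
        }
        last = op();
    }
    last
}

fn main() {
    let mut calls = 0;
    let result: Result<(), &str> = call_with_retries(false, 3, || {
        calls += 1;
        Err("transient failure")
    });
    println!("non-idempotent: {:?} after {} call(s)", result, calls);
}
```

With idempotent set to false the operation runs once regardless of the configured retry count, which is the behavior the field description implies.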

Implementations§

impl Deployment

pub fn effective_runtime(&self) -> RuntimeKind

Effective runtime kind: explicit override wins, otherwise infer from the model name (doc §3.2).
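The override-then-infer rule can be sketched as follows. The RuntimeKind variants, the cut-down Deployment struct, and the "api:" prefix heuristic inside infer_runtime are all assumptions made for illustration; the actual inference rules are described in doc §3.2 and are not reproduced here.

```rust
// Sketch only: variants and the inference heuristic are assumptions,
// not the crate's actual definitions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RuntimeKind {
    Local,
    Remote,
}

// Reduced stand-in for the documented struct (only the fields used here).
struct Deployment {
    model: String,
    runtime: Option<RuntimeKind>,
}

// Hypothetical heuristic: names prefixed with "api:" are treated as remote.
fn infer_runtime(model: &str) -> RuntimeKind {
    if model.starts_with("api:") {
        RuntimeKind::Remote
    } else {
        RuntimeKind::Local
    }
}

impl Deployment {
    fn effective_runtime(&self) -> RuntimeKind {
        // Explicit override wins; otherwise fall back to inference.
        self.runtime.unwrap_or_else(|| infer_runtime(&self.model))
    }
}

fn main() {
    let inferred = Deployment { model: "api:example-model".to_string(), runtime: None };
    let pinned = Deployment {
        model: "api:example-model".to_string(),
        runtime: Some(RuntimeKind::Local),
    };
    println!("{:?} {:?}", inferred.effective_runtime(), pinned.effective_runtime());
}
```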

pub fn validate(&self) -> Result<(), DeploymentValidationError>

Cheap structural validation done at deploy time. Heavier checks (provider tier limits, network egress) live in inference-runtime where we can perform IO.
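As a rough illustration of what "cheap structural validation" could look like, the sketch below checks only in-memory fields and performs no IO. The error variants and the specific rules (non-empty name and model, at least one replica, no zero GPU count) are assumptions; the real DeploymentValidationError and rule set are not shown on this page.

```rust
// Sketch only: the error variants and rules below are assumptions.
#[derive(Debug, PartialEq, Eq)]
enum DeploymentValidationError {
    EmptyName,
    EmptyModel,
    ZeroReplicas,
    ZeroGpus,
}

// Reduced stand-in for the documented struct (only the fields checked here).
struct Deployment {
    name: String,
    model: String,
    gpus: Option<u32>,
    replicas: u32,
}

impl Deployment {
    // Pure, in-memory checks: no network or filesystem access.
    fn validate(&self) -> Result<(), DeploymentValidationError> {
        if self.name.trim().is_empty() {
            return Err(DeploymentValidationError::EmptyName);
        }
        if self.model.trim().is_empty() {
            return Err(DeploymentValidationError::EmptyModel);
        }
        if self.replicas == 0 {
            return Err(DeploymentValidationError::ZeroReplicas);
        }
        // `None` is allowed (remote deployments); an explicit zero is not.
        if self.gpus == Some(0) {
            return Err(DeploymentValidationError::ZeroGpus);
        }
        Ok(())
    }
}

fn main() {
    let d = Deployment {
        name: "chat".to_string(),
        model: "example-model".to_string(),
        gpus: Some(2),
        replicas: 1,
    };
    println!("{:?}", d.validate());
}
```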

Trait Implementations§

impl Clone for Deployment

fn clone(&self) -> Deployment

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for Deployment

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for Deployment

fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Serialize for Deployment

fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where S: Serializer,

Serialize this value into the given Serde serializer.

Auto Trait Implementations§

impl Freeze for Deployment

impl RefUnwindSafe for Deployment

impl Send for Deployment

impl Sync for Deployment

impl Unpin for Deployment

impl UnsafeUnpin for Deployment

impl UnwindSafe for Deployment

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,