

Struct ModelPool 

pub struct ModelPool { /* private fields */ }

The multi-model LRU warm-pool.

Owned by the inference worker thread. The HashMap internals do not need to be Send + Sync because every access happens on that one thread.
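The single-owner arrangement described above can be sketched roughly as follows. This is only an illustration under assumptions: `Cmd`, `spawn_worker`, and the hit-counting map are hypothetical stand-ins, not part of this crate. The point is that state living entirely inside one thread can be a plain `HashMap` reached through a channel, with no locking around it.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Hypothetical command type sent to the worker thread.
enum Cmd {
    /// Record one use of a model id and reply with its running count.
    Touch(String, mpsc::Sender<usize>),
    /// Ask the worker to exit its loop.
    Stop,
}

/// Spawn a worker that exclusively owns its map (a stand-in for the pool).
fn spawn_worker() -> (mpsc::Sender<Cmd>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<Cmd>();
    let handle = thread::spawn(move || {
        // Created and used only on this thread, so no Mutex is needed.
        let mut hits: HashMap<String, usize> = HashMap::new();
        for cmd in rx {
            match cmd {
                Cmd::Touch(id, reply) => {
                    let n = hits.entry(id).or_insert(0);
                    *n += 1;
                    // Ignore the error if the requester has gone away.
                    let _ = reply.send(*n);
                }
                Cmd::Stop => break,
            }
        }
    });
    (tx, handle)
}
```

Other threads talk to the worker only through the `Sender<Cmd>`, which is why the owned state itself never has to cross a thread boundary.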

Implementations


impl ModelPool


pub fn new(capacity: usize, mem_budget_mb: usize) -> Self

Create a new empty pool.

  • capacity: maximum number of models that may be resident at once.
  • mem_budget_mb: memory budget in MiB (0 = unlimited).
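
As a rough illustration of how the two limits might combine, the sketch below treats a budget of 0 as "no memory limit" and otherwise converts MiB to bytes. `budget_bytes` and `fits` are hypothetical helpers written for this example; the pool's actual admission logic is not shown in these docs and may differ.

```rust
/// Hypothetical helper: convert a MiB budget into an optional byte limit.
/// A budget of 0 means "unlimited", matching the `mem_budget_mb` rule.
fn budget_bytes(mem_budget_mb: usize) -> Option<usize> {
    if mem_budget_mb == 0 {
        None
    } else {
        // 1 MiB = 1024 * 1024 bytes; saturate instead of overflowing.
        Some(mem_budget_mb.saturating_mul(1024 * 1024))
    }
}

/// Hypothetical admission check: can a model of `incoming` bytes be added
/// without exceeding either the resident-count or the memory limit?
fn fits(
    current_bytes: usize,
    incoming: usize,
    mem_budget_mb: usize,
    resident: usize,
    capacity: usize,
) -> bool {
    let within_count = resident < capacity;
    let within_mem = match budget_bytes(mem_budget_mb) {
        None => true,
        Some(limit) => current_bytes.saturating_add(incoming) <= limit,
    };
    within_count && within_mem
}
```

When `fits` returns false, a pool of this kind would need to evict before loading.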

pub fn loader_register(&mut self, id: impl Into<String>, spec: ModelSpec)

Register a model spec so it can be loaded on demand via acquire.

Called by the admin POST /admin/models/load route before initiating a background load, and at startup for models listed in [router] preload.


pub fn loader(&self) -> &ModelLoader

Access the embedded loader (read-only).


pub fn acquire( &mut self, model_id: &str, ext_loader: Option<&ModelLoader>, ) -> ServerResult<Arc<RwLock<LoadedModel>>>

Acquire an engine for model_id.

If the model is already loaded it is promoted to the MRU position and its Arc<RwLock<LoadedModel>> is returned immediately.

Otherwise the model is loaded synchronously (blocking the calling thread) after evicting LRU entries as needed.

The optional ext_loader parameter lets callers (such as tests) supply an external loader; None uses the pool’s embedded loader.
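
The promote-or-load behaviour can be pictured with a toy pool like the one below. It is a minimal sketch under assumptions, not this crate's implementation: `ToyPool` and `Model` are hypothetical, "loading" is just constructing a value, only the count limit is enforced, and in-flight users are ignored when choosing an eviction victim.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::{Arc, RwLock};

/// Stand-in for the real `LoadedModel`; the field is hypothetical.
struct Model {
    id: String,
}

/// Toy pool: `order` lists ids from LRU (front) to MRU (back).
struct ToyPool {
    capacity: usize,
    loaded: HashMap<String, Arc<RwLock<Model>>>,
    order: VecDeque<String>,
}

impl ToyPool {
    fn new(capacity: usize) -> Self {
        ToyPool { capacity, loaded: HashMap::new(), order: VecDeque::new() }
    }

    fn acquire(&mut self, id: &str) -> Arc<RwLock<Model>> {
        // Hit: move the id to the MRU end and hand back a shared handle.
        if let Some(m) = self.loaded.get(id) {
            let m = Arc::clone(m);
            self.order.retain(|k| k.as_str() != id);
            self.order.push_back(id.to_string());
            return m;
        }
        // Miss: evict from the LRU end until there is room.
        while self.loaded.len() >= self.capacity {
            match self.order.pop_front() {
                Some(victim) => {
                    self.loaded.remove(&victim);
                }
                None => break,
            }
        }
        // "Load" synchronously (here: just build the value) and insert as MRU.
        let m = Arc::new(RwLock::new(Model { id: id.to_string() }));
        self.loaded.insert(id.to_string(), Arc::clone(&m));
        self.order.push_back(id.to_string());
        m
    }
}
```

With a capacity of 2, acquiring a, b, a, c in that order leaves a and c resident: the second acquire of a promotes it, so b is the LRU entry when c needs a slot.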


pub fn release(&self, model_id: &str)

Decrement inflight count for a model when the caller is done with it.
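
Because release takes `&self`, the in-flight count presumably sits behind some form of interior mutability. The sketch below shows one way that could look on a single thread, using `Cell<usize>`; `Counters`, `track`, and `begin` are hypothetical names for this example, and the real pool may store its counter differently.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Hypothetical per-model in-flight counters, mutable through `&self`.
#[derive(Default)]
struct Counters {
    inflight: HashMap<String, Cell<usize>>,
}

impl Counters {
    /// Start tracking a model id with a count of zero.
    fn track(&mut self, id: &str) {
        self.inflight.insert(id.to_string(), Cell::new(0));
    }

    /// Record that a caller has started using the model.
    fn begin(&self, id: &str) {
        if let Some(c) = self.inflight.get(id) {
            c.set(c.get() + 1);
        }
    }

    /// Record that a caller is done; never goes below zero.
    fn release(&self, id: &str) {
        if let Some(c) = self.inflight.get(id) {
            c.set(c.get().saturating_sub(1));
        }
    }

    /// Current count, or 0 for an unknown id.
    fn count(&self, id: &str) -> usize {
        self.inflight.get(id).map(|c| c.get()).unwrap_or(0)
    }
}
```

Every successful acquire should be paired with exactly one release, otherwise the count drifts and no longer reflects real usage.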


pub fn unload(&mut self, model_id: &str) -> ServerResult<()>

Explicitly unload a model, freeing its memory.

Returns an error if the model ID is not currently loaded.


pub fn list(&self) -> Vec<ModelStatus>

List the status of all known models (loaded + pending).


pub fn mark_loading(&mut self, model_id: impl Into<String>)

Mark a model as being loaded in a background task.


pub fn mark_ready( &mut self, model_id: &str, engine: InferenceEngine, mem_bytes: usize, ) -> ServerResult<()>

Mark a pending model as ready (called after a background load succeeds).

Moves the engine from the temporary pending state into the loaded map.


pub fn mark_failed(&mut self, model_id: &str, reason: String)

Mark a pending model as failed to load.
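
Taken together, mark_loading, mark_ready, and mark_failed describe a small state machine for background loads. The sketch below models only the transitions, under assumptions: `Tracker` and `State` are hypothetical, the engine argument is omitted, and a plain `String` error stands in for the crate's `ServerResult`.

```rust
use std::collections::HashMap;

/// Hypothetical load states for one model id.
#[derive(Debug, Clone, PartialEq)]
enum State {
    Loading,
    Ready { mem_bytes: usize },
    Failed { reason: String },
}

/// Hypothetical tracker holding the state of each known model.
#[derive(Default)]
struct Tracker {
    states: HashMap<String, State>,
}

impl Tracker {
    /// A background load has started.
    fn mark_loading(&mut self, id: impl Into<String>) {
        self.states.insert(id.into(), State::Loading);
    }

    /// The background load succeeded; only valid for a pending model.
    fn mark_ready(&mut self, id: &str, mem_bytes: usize) -> Result<(), String> {
        match self.states.get_mut(id) {
            Some(state) if *state == State::Loading => {
                *state = State::Ready { mem_bytes };
                Ok(())
            }
            _ => Err(format!("model {} is not pending", id)),
        }
    }

    /// The background load failed; ignored unless the model is pending.
    fn mark_failed(&mut self, id: &str, reason: String) {
        if let Some(state) = self.states.get_mut(id) {
            if *state == State::Loading {
                *state = State::Failed { reason };
            }
        }
    }
}
```

In this sketch a model must pass through Loading before it can become Ready, which mirrors the "pending" wording in the method descriptions.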


pub fn current_mem_bytes(&self) -> usize

Total estimated bytes currently consumed by loaded models.
