pub struct ModelPool { /* private fields */ }
The multi-model LRU warm-pool.
Owned by the inference worker thread. The HashMap internals need no
Send + Sync bounds because every access happens on that one thread.
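The single-owner arrangement described above can be sketched as follows. This is a minimal illustration, not the crate's actual wiring: `OwnedPool`, `Request`, and `spawn_worker` are hypothetical names, and the use of an `std::sync::mpsc` channel to reach the worker is an assumption.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for the pool: a plain HashMap with no locking.
struct OwnedPool {
    models: HashMap<String, usize>, // model id -> estimated bytes
}

// Hypothetical request type; other threads talk to the worker only through this.
enum Request {
    Register { id: String, mem_bytes: usize },
    TotalMem { reply: mpsc::Sender<usize> },
}

// Spawn a worker that exclusively owns the pool. Callers hold only a Sender,
// so the map itself is never shared across threads.
fn spawn_worker() -> (mpsc::Sender<Request>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<Request>();
    let handle = thread::spawn(move || {
        let mut pool = OwnedPool { models: HashMap::new() };
        // The loop ends once every Sender has been dropped.
        for req in rx {
            match req {
                Request::Register { id, mem_bytes } => {
                    pool.models.insert(id, mem_bytes);
                }
                Request::TotalMem { reply } => {
                    // Ignore send errors: the requester may have gone away.
                    let _ = reply.send(pool.models.values().sum());
                }
            }
        }
    });
    (tx, handle)
}
```

Because the worker processes requests one at a time, the map needs no lock in this sketch; serialization comes from the channel.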
Implementations
impl ModelPool
pub fn new(capacity: usize, mem_budget_mb: usize) -> Self
Create a new empty pool.
capacity: maximum number of models that may be resident at once.
mem_budget_mb: memory budget in MiB (0 = unlimited).
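One way to read the mem_budget_mb convention (MiB units, with 0 meaning unlimited) is sketched below. The helpers `budget_bytes` and `fits` are hypothetical and are not part of ModelPool; how the pool actually applies the budget is not shown in this page.

```rust
// Hypothetical helper: convert a MiB budget to bytes, treating 0 as "no limit".
fn budget_bytes(mem_budget_mb: usize) -> Option<usize> {
    if mem_budget_mb == 0 {
        None
    } else {
        // saturating_mul avoids overflow on absurdly large budgets.
        Some(mem_budget_mb.saturating_mul(1024 * 1024))
    }
}

// Hypothetical admission check: would adding `incoming` bytes stay within budget?
fn fits(current: usize, incoming: usize, mem_budget_mb: usize) -> bool {
    match budget_bytes(mem_budget_mb) {
        None => true, // unlimited
        Some(limit) => current.saturating_add(incoming) <= limit,
    }
}
```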
pub fn loader_register(&mut self, id: impl Into<String>, spec: ModelSpec)
Register a model spec so it can be loaded on demand via acquire.
Called by the admin POST /admin/models/load route before initiating a
background load, and at startup for models listed in [router] preload.
pub fn loader(&self) -> &ModelLoader
Access the embedded loader (read-only).
pub fn acquire(
    &mut self,
    model_id: &str,
    ext_loader: Option<&ModelLoader>,
) -> ServerResult<Arc<RwLock<LoadedModel>>>
Acquire an engine for model_id.
If the model is already loaded it is promoted to the MRU position and
its Arc<RwLock<LoadedModel>> is returned immediately.
Otherwise the model is loaded synchronously (blocking the calling thread) after evicting LRU entries as needed.
The optional ext_loader parameter allows callers (such as tests) to supply
an external loader; None uses the pool’s embedded loader.
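The hit/miss behaviour described for acquire can be illustrated with a simplified LRU pool. This is a sketch under stated assumptions: `LruSketch` and `ToyModel` are hypothetical, eviction here considers only capacity (the memory budget and in-flight counts are ignored), and "loading" is just constructing a value.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for LoadedModel.
struct ToyModel {
    id: String,
}

// Simplified pool: `order` holds ids from LRU (front) to MRU (back).
struct LruSketch {
    capacity: usize,
    loaded: HashMap<String, Arc<RwLock<ToyModel>>>,
    order: VecDeque<String>,
}

impl LruSketch {
    fn new(capacity: usize) -> Self {
        LruSketch { capacity, loaded: HashMap::new(), order: VecDeque::new() }
    }

    fn acquire(&mut self, model_id: &str) -> Arc<RwLock<ToyModel>> {
        if let Some(model) = self.loaded.get(model_id) {
            let model = Arc::clone(model);
            // Hit: move the id to the MRU end and return the shared handle.
            self.order.retain(|id| id.as_str() != model_id);
            self.order.push_back(model_id.to_string());
            return model;
        }
        // Miss: evict from the LRU end until there is room for one more entry.
        while self.loaded.len() >= self.capacity {
            match self.order.pop_front() {
                Some(victim) => {
                    self.loaded.remove(&victim);
                }
                None => break,
            }
        }
        // "Load" synchronously; a real load would block the caller here.
        let model = Arc::new(RwLock::new(ToyModel { id: model_id.to_string() }));
        self.loaded.insert(model_id.to_string(), Arc::clone(&model));
        self.order.push_back(model_id.to_string());
        model
    }
}
```

With a capacity of 2, acquiring "a", "b", "a", then "c" evicts "b" in this sketch, because the second acquire of "a" promoted it to the MRU position.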
pub fn release(&self, model_id: &str)
Decrement the in-flight count for a model once the caller is done with it.
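Since release takes &self yet changes a counter, some form of interior mutability is implied. The sketch below shows one possible shape using atomics; `InflightSketch`, `begin`, and `inflight_count` are hypothetical names, and the choice of `AtomicUsize` is an assumption rather than the pool's confirmed internals.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical per-model in-flight counters, mutable through a shared reference.
struct InflightSketch {
    inflight: HashMap<String, AtomicUsize>,
}

impl InflightSketch {
    // Hypothetical counterpart to release: record that a caller started using the model.
    fn begin(&self, model_id: &str) {
        if let Some(n) = self.inflight.get(model_id) {
            n.fetch_add(1, Ordering::SeqCst);
        }
    }

    // Decrement the counter, never going below zero. Unknown ids are ignored.
    fn release(&self, model_id: &str) {
        if let Some(n) = self.inflight.get(model_id) {
            // checked_sub yields None at zero, so fetch_update leaves the value unchanged.
            let _ = n.fetch_update(Ordering::SeqCst, Ordering::SeqCst, |v| v.checked_sub(1));
        }
    }

    fn inflight_count(&self, model_id: &str) -> usize {
        self.inflight
            .get(model_id)
            .map(|n| n.load(Ordering::SeqCst))
            .unwrap_or(0)
    }
}
```

Clamping at zero keeps an unmatched release from wrapping the counter around, which matters if the count is later consulted before evicting a model.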
pub fn unload(&mut self, model_id: &str) -> ServerResult<()>
Explicitly unload a model, freeing its memory.
Returns an error if the model ID is not currently loaded.
pub fn list(&self) -> Vec<ModelStatus>
List the status of all known models (loaded + pending).
pub fn mark_loading(&mut self, model_id: impl Into<String>)
Mark a model as being loaded in a background task.
pub fn mark_ready(
    &mut self,
    model_id: &str,
    engine: InferenceEngine,
    mem_bytes: usize,
) -> ServerResult<()>
Mark a pending model as ready (called after a background load succeeds).
Moves the engine from the temporary pending state into the loaded map.
pub fn mark_failed(&mut self, model_id: &str, reason: String)
Mark a pending model as failed to load.
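The background-load lifecycle implied by mark_loading, mark_ready, and mark_failed can be pictured as a small state machine. The following is a rough sketch only: `LifecycleSketch` and `ToyState` are hypothetical, the engine argument is omitted, and the error type is a plain String instead of the crate's ServerResult.

```rust
use std::collections::HashMap;

// Hypothetical per-model state; the real pool's representation is not shown here.
#[derive(Debug, Clone, PartialEq)]
enum ToyState {
    Loading,
    Ready { mem_bytes: usize },
    Failed { reason: String },
}

#[derive(Default)]
struct LifecycleSketch {
    states: HashMap<String, ToyState>,
}

impl LifecycleSketch {
    // A background load has started for this id.
    fn mark_loading(&mut self, model_id: impl Into<String>) {
        self.states.insert(model_id.into(), ToyState::Loading);
    }

    // Only a pending (Loading) model may become Ready; anything else is an error.
    fn mark_ready(&mut self, model_id: &str, mem_bytes: usize) -> Result<(), String> {
        match self.states.get_mut(model_id) {
            Some(state) if *state == ToyState::Loading => {
                *state = ToyState::Ready { mem_bytes };
                Ok(())
            }
            _ => Err(format!("model {} is not pending", model_id)),
        }
    }

    // Record the failure reason for a known model; unknown ids are ignored.
    fn mark_failed(&mut self, model_id: &str, reason: String) {
        if let Some(state) = self.states.get_mut(model_id) {
            *state = ToyState::Failed { reason };
        }
    }
}
```

In this sketch a second mark_ready for the same id fails, since the model is no longer pending.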
pub fn current_mem_bytes(&self) -> usize
Total estimated bytes currently consumed by loaded models.
Auto Trait Implementations
impl !Freeze for ModelPool
impl RefUnwindSafe for ModelPool
impl Send for ModelPool
impl Sync for ModelPool
impl Unpin for ModelPool
impl UnsafeUnpin for ModelPool
impl UnwindSafe for ModelPool
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.