Struct TimeSlicePolicy 

Source
pub struct TimeSlicePolicy { /* private fields */ }

Drain-first scheduling policy with a proactive background scheduler.

This policy minimizes GPU time wasted on model switches by following two principles:

  1. Never preempt a serving model. When a request arrives for a non-active model, the policy defers to the background scheduler rather than switching reactively. The only exception is the staleness bound, which forces a switch if any request has waited longer than max_wait.

  2. Switch when idle. The background scheduler periodically checks the queue depth of every model. Once the active model has fully drained (no pending requests and none in flight), the scheduler switches to the model with the most waiting requests.

This is equivalent to “serve everything from the active model’s queue, then switch to whichever model has the most demand.” Because the scheduler sees every queue depth at once, it avoids the pathological back-and-forth switching that reactive policies cause under interleaved workloads or workloads dominated by one model.

In simulation across 12 workload profiles, at switch costs from 2s to 20s, this policy keeps the GPU serving requests 61-94% of the time, versus 40-81% for CostAware and 33-79% for FIFO, while also delivering 2-6x lower maximum wait times.

Implementations§

Source§

impl TimeSlicePolicy

Source

pub fn new(
    eviction: EvictionPolicy,
    request_timeout: Duration,
    min_active_duration: Duration,
    max_wait: Duration,
    _min_quantum: Duration,
    tick_interval: Duration,
    _model_names: Vec<String>,
) -> Self

Trait Implementations§

Source§

impl SwitchPolicy for TimeSlicePolicy

Source§

fn on_pending_request<'life0, 'life1, 'async_trait>(
    &'life0 self,
    ctx: &'life1 PolicyContext,
) -> Pin<Box<dyn Future<Output = PolicyDecision> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
    'life1: 'async_trait,

Called when a request arrives for an inactive model.
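
The staleness exception described in the overview can be sketched as follows. This is a minimal illustration, not the crate's implementation: the `Decision` enum and `decide_on_arrival` function are hypothetical stand-ins (the real `PolicyDecision` variants and `PolicyContext` fields are not shown on this page), and the sketch assumes the caller already knows how long the oldest request for the inactive model has waited.

```rust
use std::time::Duration;

/// Hypothetical stand-in for `PolicyDecision`; the real variants may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    /// Leave the active model alone; the background scheduler decides later.
    Defer,
    /// Force a switch because a request exceeded the staleness bound.
    SwitchNow,
}

/// Decide what to do when a request arrives for a non-active model.
/// `oldest_wait` is how long the oldest queued request for that model has waited.
pub fn decide_on_arrival(oldest_wait: Duration, max_wait: Duration) -> Decision {
    if oldest_wait > max_wait {
        // Staleness bound exceeded: the only case that preempts the active model.
        Decision::SwitchNow
    } else {
        // Never preempt a serving model reactively.
        Decision::Defer
    }
}
```

A request that has waited exactly `max_wait` is still deferred in this sketch, since the overview says a switch is forced only when a request has waited longer than `max_wait`.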
Source§

fn prepare_switch<'life0, 'life1, 'async_trait>(
    &'life0 self,
    ctx: &'life1 mut SwitchContext,
) -> Pin<Box<dyn Future<Output = ()> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
    'life1: 'async_trait,

Called before switching. Can wait for in-flight drain.
Source§

fn eviction_policy(&self) -> EvictionPolicy

Default eviction policy for models that don’t specify one.
Source§

fn request_timeout(&self) -> Duration

Request timeout.
Source§

fn min_active_duration(&self) -> Duration

Minimum time a model must stay active before it can be put to sleep. Prevents rapid wake/sleep thrashing that can cause GPU page faults.
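
A guard of this kind might look like the sketch below. The `may_sleep` function and its parameters are hypothetical; how the switcher actually records activation time is not documented here.

```rust
use std::time::{Duration, Instant};

/// Returns true once the model has been active for at least `min_active`.
/// `activated_at` is when the model last became active (hypothetical bookkeeping).
pub fn may_sleep(activated_at: Instant, now: Instant, min_active: Duration) -> bool {
    // `saturating_duration_since` yields zero if `now` is earlier than `activated_at`,
    // so a skewed timestamp never makes the model eligible early.
    now.saturating_duration_since(activated_at) >= min_active
}
```

Passing `now` explicitly, rather than calling `Instant::now()` inside the function, keeps the check deterministic and easy to test.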
Source§

fn scheduler_interval(&self) -> Option<Duration>

If Some(interval), the switcher will spawn a background scheduler that calls schedule_tick every interval.
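
The `Option<Duration>` contract can be illustrated with a small ticker. This sketch uses a plain OS thread and a stop flag purely for illustration; the real switcher most likely runs on an async runtime, and `spawn_scheduler` is a hypothetical helper, not part of this crate.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Spawn a ticker thread if `interval` is `Some`; return `None` otherwise.
/// `tick` runs once per interval until `stop` is set.
pub fn spawn_scheduler<F>(
    interval: Option<Duration>,
    stop: Arc<AtomicBool>,
    mut tick: F,
) -> Option<JoinHandle<()>>
where
    F: FnMut() + Send + 'static,
{
    // A policy that returns `None` opts out of background scheduling entirely.
    let interval = interval?;
    Some(thread::spawn(move || loop {
        thread::sleep(interval);
        if stop.load(Ordering::Relaxed) {
            break;
        }
        tick();
    }))
}
```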
Source§

fn schedule_tick(&self, ctx: &ScheduleContext) -> Option<String>

Called periodically by the background scheduler. Returns the model name to switch to, or None to stay on the current model.
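
The "switch when idle" rule from the overview could be expressed roughly as below. `QueueSnapshot` and `pick_next` are hypothetical: the fields of the real `ScheduleContext` are not shown on this page, so the sketch assumes a per-model snapshot of pending and in-flight counts.

```rust
/// Hypothetical per-model snapshot; not the crate's `ScheduleContext`.
pub struct QueueSnapshot {
    pub model: String,
    pub pending: usize,
    pub in_flight: usize,
}

/// Returns the model to switch to, or `None` to stay on the current model.
pub fn pick_next(active: &str, queues: &[QueueSnapshot]) -> Option<String> {
    // Stay put while the active model still has queued or in-flight work.
    let active_busy = queues
        .iter()
        .any(|q| q.model == active && (q.pending > 0 || q.in_flight > 0));
    if active_busy {
        return None;
    }
    // The active model is drained: pick the other model with the deepest queue.
    // On ties, `max_by_key` returns the last such model in the slice.
    queues
        .iter()
        .filter(|q| q.model != active && q.pending > 0)
        .max_by_key(|q| q.pending)
        .map(|q| q.model.clone())
}
```

When no other model has waiting requests the function also returns `None`, so an idle system does not switch at all.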
Source§

fn on_switch_complete(&self, _from: &str, _to: &str, _duration: Duration)

Called after a switch completes successfully with the measured duration. Policies can use this to track empirical switch costs.
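
One way a policy might track empirical switch costs is an exponential moving average over the reported durations. The `SwitchCostTracker` type below is a hypothetical sketch; the signature above ignores all three arguments (`_from`, `_to`, `_duration`), which suggests `TimeSlicePolicy` itself may not record them. The `Mutex` is there because the trait method takes `&self`.

```rust
use std::sync::Mutex;
use std::time::Duration;

/// Tracks a smoothed switch cost using an exponential moving average.
pub struct SwitchCostTracker {
    /// Weight of the newest sample, between 0.0 and 1.0.
    alpha: f64,
    /// Current average in seconds; `None` until the first sample arrives.
    avg_secs: Mutex<Option<f64>>,
}

impl SwitchCostTracker {
    pub fn new(alpha: f64) -> Self {
        Self { alpha, avg_secs: Mutex::new(None) }
    }

    /// Record one measured switch duration.
    pub fn record(&self, duration: Duration) {
        let sample = duration.as_secs_f64();
        let mut avg = self.avg_secs.lock().unwrap();
        *avg = Some(match *avg {
            // The first sample seeds the average.
            None => sample,
            Some(prev) => self.alpha * sample + (1.0 - self.alpha) * prev,
        });
    }

    /// Current estimate, or `None` if no switch has been recorded yet.
    pub fn estimate(&self) -> Option<Duration> {
        let avg = *self.avg_secs.lock().unwrap();
        avg.map(Duration::from_secs_f64)
    }
}
```

With `alpha = 0.5`, recording a 10s switch followed by a 20s switch yields an estimate of 15s.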

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.
Source§

impl<T> PolicyExt for T
where T: ?Sized,

Source§

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.
Source§

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.