pub struct SchedulerConfig {
    pub policy: SchedulingPolicy,
    pub max_waiting_requests: usize,
    pub max_running_requests: usize,
    pub enable_preemption: bool,
    pub enable_load_balancing: bool,
    pub fair_share_weights: HashMap<String, f32>,
    pub enable_sla_enforcement: bool,
    pub prompt_token_estimate: bool,
    pub prefill_first_until_active: Option<usize>,
    pub active_decode_prefill_chunk: Option<usize>,
    pub scheduler_none_prof: bool,
}
Scheduler configuration
Fields
policy: SchedulingPolicy
    Scheduling policy.
max_waiting_requests: usize
    Maximum size of the waiting queue.
max_running_requests: usize
    Maximum number of running requests.
enable_preemption: bool
    Enable request preemption.
enable_load_balancing: bool
    Enable load balancing.
fair_share_weights: HashMap<String, f32>
    Fair-share weights per client.
enable_sla_enforcement: bool
    Enable SLA enforcement.
prompt_token_estimate: bool
    Use prompt-token metadata for initial continuous-batch admission estimates.
prefill_first_until_active: Option<usize>
    Prefer new prefills over early decodes until this many requests are active.
active_decode_prefill_chunk: Option<usize>
    Cap prefill admission chunks only while decode requests are already active.
scheduler_none_prof: bool
    Emit diagnostic scheduler None/Some decisions.
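As a minimal sketch of how these fields might be populated, the block below overrides a few of them with struct update syntax and leaves the rest at their defaults. It is self-contained, so it declares a local mirror of `SchedulerConfig` rather than importing the real type. The `SchedulingPolicy` variants (`Fifo`, `FairShare`), every default value, the tenant names, and the helper `fair_share_config` are assumptions made for illustration; none of them are documented on this page.

```rust
use std::collections::HashMap;

// Hypothetical stand-in: the real `SchedulingPolicy` variants are not shown on this page.
#[derive(Clone, Debug, PartialEq)]
pub enum SchedulingPolicy {
    Fifo,
    FairShare,
}

// Local mirror of the documented fields so the sketch compiles on its own.
#[derive(Clone, Debug)]
pub struct SchedulerConfig {
    pub policy: SchedulingPolicy,
    pub max_waiting_requests: usize,
    pub max_running_requests: usize,
    pub enable_preemption: bool,
    pub enable_load_balancing: bool,
    pub fair_share_weights: HashMap<String, f32>,
    pub enable_sla_enforcement: bool,
    pub prompt_token_estimate: bool,
    pub prefill_first_until_active: Option<usize>,
    pub active_decode_prefill_chunk: Option<usize>,
    pub scheduler_none_prof: bool,
}

impl Default for SchedulerConfig {
    // Assumed defaults for illustration only; the crate's real defaults may differ.
    fn default() -> Self {
        Self {
            policy: SchedulingPolicy::Fifo,
            max_waiting_requests: 1024,
            max_running_requests: 64,
            enable_preemption: false,
            enable_load_balancing: false,
            fair_share_weights: HashMap::new(),
            enable_sla_enforcement: false,
            prompt_token_estimate: false,
            prefill_first_until_active: None,
            active_decode_prefill_chunk: None,
            scheduler_none_prof: false,
        }
    }
}

/// Hypothetical helper: builds a fair-share configuration, overriding only
/// the fields of interest.
pub fn fair_share_config() -> SchedulerConfig {
    let mut weights = HashMap::new();
    weights.insert("tenant-a".to_string(), 2.0);
    weights.insert("tenant-b".to_string(), 1.0);

    SchedulerConfig {
        policy: SchedulingPolicy::FairShare,
        fair_share_weights: weights,
        prefill_first_until_active: Some(4),
        active_decode_prefill_chunk: Some(512),
        // Everything else falls back to the (assumed) defaults.
        ..SchedulerConfig::default()
    }
}
```

Because the real type implements `Default`, the same `..SchedulerConfig::default()` pattern should apply to it as well, with the crate's own defaults in place of the assumed ones.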
Implementations
impl SchedulerConfig
pub fn apply_runtime_config_snapshot(&mut self, snapshot: &RuntimeConfigSnapshot) -> Result<(), String>
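The page documents only the signature of `apply_runtime_config_snapshot`, so the following is a sketch of one plausible shape for such a method, not the crate's implementation. It assumes a snapshot made of optional overrides, and it validates before mutating so that a rejected snapshot leaves the configuration unchanged. The fields of `RuntimeConfigSnapshot`, the reduced `SchedulerConfig` mirror, and the validation rule are all hypothetical.

```rust
// Hypothetical snapshot shape: the real `RuntimeConfigSnapshot` fields are not shown here.
#[derive(Debug, Default)]
pub struct RuntimeConfigSnapshot {
    pub max_waiting_requests: Option<usize>,
    pub max_running_requests: Option<usize>,
    pub enable_preemption: Option<bool>,
}

// Reduced mirror of `SchedulerConfig` with only the fields this sketch touches.
#[derive(Clone, Debug, PartialEq)]
pub struct SchedulerConfig {
    pub max_waiting_requests: usize,
    pub max_running_requests: usize,
    pub enable_preemption: bool,
}

impl SchedulerConfig {
    /// Same signature as documented; the body is an assumption.
    pub fn apply_runtime_config_snapshot(
        &mut self,
        snapshot: &RuntimeConfigSnapshot,
    ) -> Result<(), String> {
        // Validate first so a rejected snapshot leaves `self` untouched.
        if snapshot.max_running_requests == Some(0) {
            return Err("max_running_requests must be greater than zero".to_string());
        }

        // Apply only the overrides that are present in the snapshot.
        if let Some(v) = snapshot.max_waiting_requests {
            self.max_waiting_requests = v;
        }
        if let Some(v) = snapshot.max_running_requests {
            self.max_running_requests = v;
        }
        if let Some(v) = snapshot.enable_preemption {
            self.enable_preemption = v;
        }
        Ok(())
    }
}
```

Since the error type is a plain `String`, callers can log or display it directly, but they cannot match on error kinds; under this sketch, a failed call can simply be retried with a corrected snapshot.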
Trait Implementations
impl Clone for SchedulerConfig
fn clone(&self) -> SchedulerConfig
    Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self) (since 1.0.0; const: unstable)
    Performs copy-assignment from source.
impl Debug for SchedulerConfig
impl Default for SchedulerConfig
impl<'de> Deserialize<'de> for SchedulerConfig
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
    Deserialize this value from the given Serde deserializer.
Auto Trait Implementations
impl Freeze for SchedulerConfig
impl RefUnwindSafe for SchedulerConfig
impl Send for SchedulerConfig
impl Sync for SchedulerConfig
impl Unpin for SchedulerConfig
impl UnsafeUnpin for SchedulerConfig
impl UnwindSafe for SchedulerConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
    Mutably borrows from an owned value.