pub struct SchedulerConfig {
    pub max_num_batched_tokens: u32,
    pub max_num_seqs: u32,
    pub policy: String,
    pub enable_chunked_prefill: bool,
    pub long_prefill_token_threshold: u32,
    pub max_num_partial_prefills: u32,
    pub block_size: u32,
}
Fields§
max_num_batched_tokens: u32
Maximum number of tokens processed in a single iteration.
max_num_seqs: u32
Maximum number of sequences that can run concurrently.
policy: String
Scheduling policy: “fcfs” or “priority”.
enable_chunked_prefill: bool
Enable chunked prefilling.
long_prefill_token_threshold: u32
Maximum tokens to prefill in a single iteration (vLLM’s long_prefill_token_threshold). Defaults to 4% of max_model_len if not specified.
max_num_partial_prefills: u32
Maximum number of sequences that can be partially prefilled concurrently (vLLM default: 1). This limits how many NEW waiting requests can start prefilling per iteration.
block_size: u32
Block size for KV cache (in tokens).
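As a rough illustration of how the limit fields relate to a batch, the sketch below uses a local stand-in for the struct (reduced to three fields) and two hypothetical helpers, `blocks_needed` and `fits_limits`, which are not part of this crate. It assumes the KV cache is allocated in whole blocks of `block_size` tokens and that `block_size` is nonzero.

```rust
/// Local stand-in for the documented struct, reduced to the fields used here.
#[derive(Clone, Debug)]
pub struct SchedulerConfig {
    pub max_num_batched_tokens: u32,
    pub max_num_seqs: u32,
    pub block_size: u32,
}

/// Hypothetical helper: number of KV-cache blocks needed to hold `num_tokens`
/// tokens, rounding up to whole blocks. Panics if `block_size` is 0.
pub fn blocks_needed(cfg: &SchedulerConfig, num_tokens: u32) -> u32 {
    num_tokens.div_ceil(cfg.block_size)
}

/// Hypothetical helper: true if a candidate batch stays within both the
/// per-iteration token limit and the concurrent-sequence limit.
pub fn fits_limits(cfg: &SchedulerConfig, batch_tokens: u32, batch_seqs: u32) -> bool {
    batch_tokens <= cfg.max_num_batched_tokens && batch_seqs <= cfg.max_num_seqs
}
```

With `block_size` set to 16, a 17-token sequence needs two blocks under this assumption, since the second block is only partially filled.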
Implementations§
impl SchedulerConfig
pub fn set_default_prefill_threshold(&mut self, max_model_len: u32)
Sets the default prefill threshold from the maximum model length (vLLM uses 4%). The threshold is only set if max_num_partial_prefills > 1, matching vLLM behavior.
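The behavior described above could look roughly like the following sketch. It is a local stand-in, not this crate's source. It assumes that a threshold of 0 means "not specified" and that the 4% is computed with truncating integer arithmetic; neither detail is confirmed by this page.

```rust
/// Local stand-in for the documented struct, reduced to the fields used here.
#[derive(Clone, Debug)]
pub struct SchedulerConfig {
    pub long_prefill_token_threshold: u32,
    pub max_num_partial_prefills: u32,
}

impl SchedulerConfig {
    /// Sketch only. Assumption: 0 is treated as "threshold not specified".
    pub fn set_default_prefill_threshold(&mut self, max_model_len: u32) {
        // Leave the threshold alone unless more than one partial prefill is allowed.
        if self.max_num_partial_prefills > 1 && self.long_prefill_token_threshold == 0 {
            // 4% of max_model_len, truncated; widened to u64 to avoid overflow.
            self.long_prefill_token_threshold =
                (u64::from(max_model_len) * 4 / 100) as u32;
        }
    }
}
```

Under these assumptions, a config with `max_num_partial_prefills` of 2 and `max_model_len` of 4096 ends up with a threshold of 163 tokens, while a config with `max_num_partial_prefills` of 1 is left unchanged.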
Trait Implementations§
impl Clone for SchedulerConfig
fn clone(&self) -> SchedulerConfig
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for SchedulerConfig
impl<'de> Deserialize<'de> for SchedulerConfig
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
Auto Trait Implementations§
impl Freeze for SchedulerConfig
impl RefUnwindSafe for SchedulerConfig
impl Send for SchedulerConfig
impl Sync for SchedulerConfig
impl Unpin for SchedulerConfig
impl UnwindSafe for SchedulerConfig
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.