Scheduler - Continuous Batching Request Scheduler
Manages multiple inference requests with dynamic batching. Supports adding/removing requests mid-inference.
§Architecture
- FIFO queue for pending requests
- Active batch with configurable max size
- Preemption support for priority requests (future)
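The queue-plus-batch layout above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual implementation: the field names, the `new`, `add_request`, `remove_request`, and `step` methods, and the `remaining_tokens` counter are all assumptions made for the example, and `RequestId` is assumed to be a `u64`.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the crate's `RequestId` alias.
type RequestId = u64;

/// Hypothetical, simplified request: tracks only how many tokens remain.
#[derive(Debug)]
struct Request {
    id: RequestId,
    remaining_tokens: usize,
}

/// Illustrative scheduler: a FIFO pending queue plus a bounded active batch.
struct Scheduler {
    pending: VecDeque<Request>,
    active: Vec<Request>,
    max_batch_size: usize,
}

impl Scheduler {
    fn new(max_batch_size: usize) -> Self {
        Self { pending: VecDeque::new(), active: Vec::new(), max_batch_size }
    }

    /// Enqueue a request; it joins the active batch on a later step.
    fn add_request(&mut self, id: RequestId, tokens: usize) {
        self.pending.push_back(Request { id, remaining_tokens: tokens });
    }

    /// Remove a request from the queue or the active batch.
    /// Returns true if a request with this id was found.
    fn remove_request(&mut self, id: RequestId) -> bool {
        let before = self.pending.len() + self.active.len();
        self.pending.retain(|r| r.id != id);
        self.active.retain(|r| r.id != id);
        before != self.pending.len() + self.active.len()
    }

    /// One decode step: fill free batch slots in FIFO order, produce one
    /// token per active request, then retire the requests that finished.
    /// Returns the ids of the requests that completed during this step.
    fn step(&mut self) -> Vec<RequestId> {
        while self.active.len() < self.max_batch_size {
            match self.pending.pop_front() {
                Some(r) => self.active.push(r),
                None => break,
            }
        }
        for r in self.active.iter_mut() {
            r.remaining_tokens = r.remaining_tokens.saturating_sub(1);
        }
        let mut finished = Vec::new();
        self.active.retain(|r| {
            if r.remaining_tokens == 0 {
                finished.push(r.id);
                false
            } else {
                true
            }
        });
        finished
    }
}
```

In this sketch a slot freed by a finished request is refilled at the start of the next `step`, so new requests enter the batch without waiting for the whole batch to drain.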
Structs§
- Batch
- Batch of requests currently being processed
- Request
- Inference request
- Scheduler
- Continuous batching scheduler
- SchedulerConfig
- Scheduler configuration
- SchedulerStats
- Scheduler statistics
Enums§
- Priority
- Request priority (higher = more urgent)
- RequestState
- Request state
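One way to get the "higher = more urgent" ordering on `Priority` is to derive `Ord` on the enum, which orders variants by declaration order. The variant names below (`Low`, `Normal`, `High`) and the `most_urgent` helper are hypothetical; the real enum may define different variants or implement the ordering by other means.

```rust
/// Hypothetical variants; the crate's real `Priority` enum may differ.
/// Derived `Ord` compares by declaration order, so Low < Normal < High.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    Low,
    Normal,
    High,
}

/// Hypothetical helper: returns the most urgent priority in the slice,
/// or `None` if the slice is empty.
fn most_urgent(priorities: &[Priority]) -> Option<Priority> {
    priorities.iter().copied().max()
}
```

With an ordering like this, a future preemption policy could compare an incoming request's priority against the lowest priority in the active batch.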
Type Aliases§
- RequestId
- Unique request identifier