#[repr(u32)]
pub enum Limit {
    Off = 0,
    ExecCount = 1,
}
Triton server rate limit modes.
Variants
Off = 0
Rate limiting is disabled.
ExecCount = 1
Rate limiting prioritizes inference executions by the number of times each model instance has had a chance to run.
An execution runs only when its resource constraints are satisfied.
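The following is a minimal sketch of how the `#[repr(u32)]` discriminants can be used. It declares a local mirror of `Limit` so that it compiles on its own. The `Debug` derive and the `limit_from_raw` helper are assumptions made for illustration and are not part of the documented API.

```rust
// Local mirror of the enum documented above, so the sketch is self-contained.
// `Debug` is derived here for illustration only (assumption).
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Limit {
    Off = 0,
    ExecCount = 1,
}

/// Hypothetical helper (not part of the documented API): maps a raw mode
/// value back to a `Limit`, returning `None` for unknown discriminants.
pub fn limit_from_raw(raw: u32) -> Option<Limit> {
    match raw {
        0 => Some(Limit::Off),
        1 => Some(Limit::ExecCount),
        _ => None,
    }
}

/// Converts a mode to its underlying integer. Because the enum is
/// `#[repr(u32)]`, the `as u32` cast yields the declared discriminant.
pub fn limit_to_raw(mode: Limit) -> u32 {
    mode as u32
}
```

Since `Limit` is `Copy`, a value can be passed to `limit_to_raw` and still be used afterwards. Returning `Option` from `limit_from_raw` avoids constructing an invalid enum value from an unchecked integer.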
Trait Implementations
impl Copy for Limit
impl Eq for Limit
impl StructuralPartialEq for Limit
Auto Trait Implementations
impl Freeze for Limit
impl RefUnwindSafe for Limit
impl Send for Limit
impl Sync for Limit
impl Unpin for Limit
impl UnwindSafe for Limit
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.