pub struct CascadeConfig {
    pub quality_threshold: f64,
    pub max_escalations: u8,
    pub classifier_mode: CascadeClassifierMode,
    pub window_size: usize,
    pub max_cascade_tokens: Option<u32>,
    pub cost_tiers: Option<Vec<String>>,
}
Configuration for cascade routing (strategy = "cascade").
Cascade routing tries providers in chain order (cheapest first), escalating to the next provider when the response is classified as degenerate (empty, repetitive, incoherent). Chain order determines cost order: first provider = cheapest.
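The escalation behavior described above can be sketched roughly as follows. This is a minimal illustration, not the crate's implementation: the `Scored` type, the `cascade` function, and the `call` closure (standing in for "send the request and classify the response") are all hypothetical names. Returning the best response seen when every attempt scores below the threshold is also an assumption, and network/API errors are not modeled.

```rust
/// Hypothetical result of one provider call: the text plus a quality score in [0.0, 1.0].
#[derive(Debug, Clone, PartialEq)]
pub struct Scored {
    pub provider: String,
    pub text: String,
    pub quality: f64,
}

/// Tries providers in chain order and escalates while the score stays below
/// `quality_threshold`. `call` is a stand-in for the real request + classifier.
pub fn cascade<F>(
    chain: &[&str],
    quality_threshold: f64,
    max_escalations: u8,
    mut call: F,
) -> Option<Scored>
where
    F: FnMut(&str) -> Scored,
{
    let mut best: Option<Scored> = None;
    // One initial attempt plus at most `max_escalations` further attempts.
    let attempts = usize::from(max_escalations) + 1;
    for &provider in chain.iter().take(attempts) {
        let scored = call(provider);
        if scored.quality >= quality_threshold {
            return Some(scored); // good enough: stop escalating
        }
        // Assumption: remember the best response seen so far as a fallback.
        let is_better = best.as_ref().map_or(true, |b| scored.quality > b.quality);
        if is_better {
            best = Some(scored);
        }
    }
    best
}
```

With `max_escalations = 2` and a three-provider chain, the loop makes at most three attempts, which matches the "cheap → mid → expensive" reading of the default.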
§Limitations
The heuristic classifier detects degenerate outputs only, not semantic failures.
Use classifier_mode = "judge" for semantic quality gating (adds LLM call cost).
Fields§
quality_threshold: f64
Minimum quality score [0.0, 1.0] to accept a response without escalating. Responses scoring below this threshold trigger escalation.
max_escalations: u8
Maximum number of quality-based escalations per request. Network/API errors do not count against this budget. Default: 2 (allows up to 3 providers: cheap → mid → expensive).
classifier_mode: CascadeClassifierMode
Quality classifier mode: "heuristic" (default) or "judge".
Heuristic is zero-cost but detects only degenerate outputs.
Judge requires a configured summary_model and adds one LLM call per evaluation.
window_size: usize
Rolling quality history window size per provider. Default: 50.
max_cascade_tokens: Option<u32>
Maximum cumulative input+output tokens across all escalation levels.
When exceeded, returns the best-seen response instead of escalating further.
None disables the budget (unbounded escalation cost).
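One way to read the token budget rule is sketched below. The `Attempt` type and `pick_response` function are hypothetical names introduced for illustration. The sketch assumes the budget is checked after each attempt's tokens are added to the running total and that "best-seen" means the highest quality score so far; the crate's actual bookkeeping may differ.

```rust
/// Hypothetical record of one attempt: quality score and tokens used (input + output).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Attempt {
    pub quality: f64,
    pub tokens: u32,
}

/// Returns the index of the attempt whose response would be returned.
/// `None` for `max_cascade_tokens` means no budget is enforced.
pub fn pick_response(
    attempts: &[Attempt],
    quality_threshold: f64,
    max_cascade_tokens: Option<u32>,
) -> Option<usize> {
    let mut used: u32 = 0;
    let mut best: Option<usize> = None;
    for (i, attempt) in attempts.iter().enumerate() {
        // Tokens accumulate across all escalation levels.
        used = used.saturating_add(attempt.tokens);
        if best.map_or(true, |b| attempt.quality > attempts[b].quality) {
            best = Some(i);
        }
        if attempt.quality >= quality_threshold {
            return Some(i); // accepted without further escalation
        }
        if let Some(limit) = max_cascade_tokens {
            if used > limit {
                return best; // budget exceeded: fall back to best-seen
            }
        }
    }
    best
}
```

Under these assumptions, three attempts of 600 tokens each with a budget of `Some(1000)` stop after the second attempt, even if a third provider would have passed the threshold.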
cost_tiers: Option<Vec<String>>
Explicit cost ordering of provider names (cheapest first). When set, cascade routing sorts providers by their position in this list before trying them. Providers not in the list are appended after listed ones in their original chain order. When unset, chain order is used (default behavior).
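The ordering rule for cost_tiers can be expressed with a stable sort. The sketch below is an illustration under assumptions, not the crate's code: `order_by_cost_tiers` is a hypothetical helper, and providers are represented as plain name strings.

```rust
/// Reorders `chain` by each provider's position in `cost_tiers` (cheapest first).
/// Providers missing from `cost_tiers` go last, keeping their chain order.
pub fn order_by_cost_tiers(chain: &[String], cost_tiers: Option<&[String]>) -> Vec<String> {
    let tiers = match cost_tiers {
        Some(tiers) => tiers,
        // Unset: chain order is used as-is.
        None => return chain.to_vec(),
    };
    let mut ordered = chain.to_vec();
    // `sort_by_key` is a stable sort, so providers that share the fallback key
    // `usize::MAX` (unlisted ones) keep their relative chain order.
    ordered.sort_by_key(|name| {
        tiers
            .iter()
            .position(|tier| tier == name)
            .unwrap_or(usize::MAX)
    });
    ordered
}
```

A stable sort is what makes the "appended in their original chain order" behavior fall out without extra bookkeeping.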
Trait Implementations§
impl Clone for CascadeConfig
fn clone(&self) -> CascadeConfig
fn clone_from(&mut self, source: &Self)
impl Debug for CascadeConfig
impl Default for CascadeConfig
impl<'de> Deserialize<'de> for CascadeConfig
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Auto Trait Implementations§
impl Freeze for CascadeConfig
impl RefUnwindSafe for CascadeConfig
impl Send for CascadeConfig
impl Sync for CascadeConfig
impl Unpin for CascadeConfig
impl UnsafeUnpin for CascadeConfig
impl UnwindSafe for CascadeConfig
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> IntoRequest<T> for T
fn into_request(self) -> Request<T>
Wraps the input message T in a tonic::Request.