pub struct RouterProvider { /* private fields */ }
Multi-provider LLM router implementing LlmProvider.
Construct with RouterProvider::new and configure a routing strategy via the
builder methods. All configuration is immutable after construction except for
runtime state (EMA statistics, Thompson distribution, bandit weights) which is
stored behind Arc<Mutex<_>> and updated on every successful call.
Cloning is cheap: RouterState and all per-strategy state are Arc-wrapped
and shared between the original and all clones, so clone cost is proportional to
the number of Arc fields, not to provider count or strategy complexity.
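The sharing behaviour described above can be illustrated with a minimal, self-contained sketch. SharedRouter and RuntimeState are hypothetical stand-ins, not the crate's actual types; the sketch only shows that a clone copies an Arc pointer and that updates made through one handle are visible through the other.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the router's mutable runtime statistics.
#[derive(Default)]
struct RuntimeState {
    successful_calls: u64,
}

/// Hypothetical router-like handle: cloning copies only the `Arc` pointer.
#[derive(Clone, Default)]
struct SharedRouter {
    state: Arc<Mutex<RuntimeState>>,
}

impl SharedRouter {
    /// Record one successful call in the shared state.
    fn record_success(&self) {
        let mut guard = self.state.lock().expect("state mutex poisoned");
        guard.successful_calls += 1;
    }

    /// Read the counter; every clone observes the same value.
    fn successful_calls(&self) -> u64 {
        self.state
            .lock()
            .expect("state mutex poisoned")
            .successful_calls
    }
}
```

Calling record_success on either the original or a clone increments the same counter, which is why a clone can be handed to another task without losing routing statistics.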
Implementations
impl RouterProvider
pub fn new(providers: Vec<AnyProvider>) -> Self
Create a new router over providers.
Use the builder methods (e.g., with_thompson,
with_cascade) to configure a routing strategy.
The default strategy is RouterStrategy::Ema.
pub fn with_embed_timeout(self, timeout_ms: u64) -> Self
pub fn with_embed_concurrency(self, limit: usize) -> Self
Set the maximum number of concurrent embed_batch calls.
A value of 0 disables the semaphore (unlimited). Default is no semaphore.
pub fn set_memory_confidence(&self, confidence: Option<f32>)
Set the MAR (Memory-Augmented Routing) signal for the current turn.
Must be called before chat / chat_stream to influence bandit provider selection.
Pass None to disable MAR for this turn.
pub fn with_ema(self, alpha: f64, reorder_interval: u64) -> Self
Enable EMA-based adaptive provider ordering.
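As a rough illustration of what alpha and reorder_interval control, the sketch below keeps an exponential moving average per provider and re-sorts the chain every reorder_interval calls. It assumes the tracked signal is latency and that lower is better; EmaStat, update_ema, and maybe_reorder are hypothetical names, and the router's real statistic and ordering rule are not shown on this page.

```rust
/// Hypothetical per-provider statistic: an exponential moving average of latency.
#[derive(Debug, Clone, PartialEq)]
struct EmaStat {
    name: String,
    ema_latency_ms: Option<f64>,
}

/// Fold one observation into the average: ema = alpha * x + (1 - alpha) * ema.
fn update_ema(previous: Option<f64>, sample: f64, alpha: f64) -> f64 {
    match previous {
        Some(ema) => alpha * sample + (1.0 - alpha) * ema,
        None => sample, // the first observation seeds the average
    }
}

/// Re-sort providers by ascending EMA latency when `call_count` is a multiple
/// of `reorder_interval`. Providers without data sort last.
/// Returns true if a reorder ran.
fn maybe_reorder(stats: &mut [EmaStat], call_count: u64, reorder_interval: u64) -> bool {
    if reorder_interval == 0 || call_count % reorder_interval != 0 {
        return false;
    }
    // `sort_by` is stable, so ties keep their existing chain order.
    stats.sort_by(|a, b| {
        let a = a.ema_latency_ms.unwrap_or(f64::INFINITY);
        let b = b.ema_latency_ms.unwrap_or(f64::INFINITY);
        a.total_cmp(&b)
    });
    true
}
```

A larger alpha makes the average react faster to recent samples; a larger reorder_interval makes the provider order more stable between re-sorts.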
pub fn with_coe(
    self,
    config: CoeConfig,
    secondary: AnyProvider,
    embed: AnyProvider,
) -> Self
Enable Collaborative Entropy (CoE) for Ema/Thompson strategies.
CoE detects uncertain responses via intra-entropy and inter-divergence signals,
escalating to secondary when either threshold is exceeded.
No-op (with a warn!) when the active strategy is Cascade or Bandit.
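The "either threshold" rule can be sketched as follows. This is an illustration only: CoeThresholds and its field names are hypothetical (the real CoeConfig fields are not listed here), and Shannon entropy over a probability distribution is assumed as one plausible form of the intra-entropy signal.

```rust
/// Hypothetical thresholds; the real `CoeConfig` fields are not shown on this page.
struct CoeThresholds {
    intra_entropy: f64,
    inter_divergence: f64,
}

/// Shannon entropy (in nats) of a probability distribution.
/// Zero-probability entries contribute nothing.
fn shannon_entropy(probs: &[f64]) -> f64 {
    probs
        .iter()
        .filter(|&&p| p > 0.0)
        .map(|&p| -p * p.ln())
        .sum()
}

/// Escalate to the secondary provider when either signal exceeds its threshold.
fn should_escalate(entropy: f64, divergence: f64, thresholds: &CoeThresholds) -> bool {
    entropy > thresholds.intra_entropy || divergence > thresholds.inter_divergence
}
```

Because the two signals are combined with a logical OR, a response that is internally confident but disagrees strongly with a reference still triggers escalation.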
pub fn coe_metrics(&self) -> Option<(u64, u64, u64, u64)>
Return session-level CoE metrics snapshot, or None if CoE is disabled.
pub fn with_asi(self, config: AsiRouterConfig) -> Self
Enable Agent Stability Index (ASI) coherence tracking.
When enabled, each successful response is embedded in a background task and added to a per-provider sliding window. The coherence score (cosine similarity of the latest embedding vs. window mean) penalizes Thompson/EMA routing priors for providers whose responses drift.
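A minimal sketch of the coherence score, assuming the latest embedding is compared against the mean of the embeddings already in the window and is then pushed in. CoherenceWindow and its methods are hypothetical names; the window size and the exact penalty applied to routing priors are not specified here.

```rust
use std::collections::VecDeque;

/// Hypothetical per-provider sliding window of recent response embeddings.
struct CoherenceWindow {
    capacity: usize,
    embeddings: VecDeque<Vec<f32>>,
}

impl CoherenceWindow {
    fn new(capacity: usize) -> Self {
        let capacity = capacity.max(1); // a window needs at least one slot
        Self { capacity, embeddings: VecDeque::with_capacity(capacity) }
    }

    /// Score `latest` against the current window mean, then push it,
    /// evicting the oldest entry if full. Returns None while the window is empty.
    fn observe(&mut self, latest: Vec<f32>) -> Option<f32> {
        let score = self.mean().map(|mean| cosine_similarity(&latest, &mean));
        if self.embeddings.len() == self.capacity {
            self.embeddings.pop_front();
        }
        self.embeddings.push_back(latest);
        score
    }

    /// Element-wise mean of the stored embeddings.
    fn mean(&self) -> Option<Vec<f32>> {
        let first = self.embeddings.front()?;
        let mut sum = vec![0.0f32; first.len()];
        for embedding in &self.embeddings {
            for (s, x) in sum.iter_mut().zip(embedding) {
                *s += *x;
            }
        }
        let n = self.embeddings.len() as f32;
        Some(sum.into_iter().map(|s| s / n).collect())
    }
}

/// Cosine similarity; returns 0.0 when either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}
```

A score near 1.0 means the new response points in the same direction as recent ones; a score near 0.0 indicates drift.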
pub fn with_quality_gate(self, threshold: f32) -> Self
Enable embedding-based quality gate for Thompson/EMA routing.
After provider selection, computes cosine similarity between the query embedding
and the response embedding. If below threshold, tries the next provider in the
ordered list. On full exhaustion, returns the best response seen (highest similarity).
Fail-open: embedding errors disable the gate for that request.
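The selection rule can be sketched as below. For simplicity the sketch takes precomputed response embeddings in routing order, whereas the router would call providers one at a time; pick_response is a hypothetical helper and the fail-open path for embedding errors is not modelled.

```rust
/// Cosine similarity; returns 0.0 when either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

/// Walk responses in routing order. Return the index and similarity of the
/// first response that meets `threshold`; if none does, return the response
/// with the highest similarity seen. Returns None for an empty list.
fn pick_response(
    query: &[f32],
    responses: &[Vec<f32>],
    threshold: f32,
) -> Option<(usize, f32)> {
    let mut best: Option<(usize, f32)> = None;
    for (idx, embedding) in responses.iter().enumerate() {
        let similarity = cosine_similarity(query, embedding);
        if similarity >= threshold {
            return Some((idx, similarity)); // gate passed, stop here
        }
        if best.map_or(true, |(_, seen)| similarity > seen) {
            best = Some((idx, similarity));
        }
    }
    best
}
```

Returning the best response on exhaustion means the gate can only improve on the first provider's answer, never leave the caller without one.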
pub fn with_thompson(self, state_path: Option<&Path>) -> Self
Enable Thompson Sampling strategy.
Loads existing state from state_path if present; falls back to uniform prior.
Prunes stale entries for providers not in the current chain.
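The load-then-prune step might look like the following sketch. It assumes the state is a map from provider name to Beta(alpha, beta) parameters and that the uniform prior is Beta(1, 1); reconcile_state is a hypothetical helper, and the on-disk format is not described on this page.

```rust
use std::collections::HashMap;

/// Beta(alpha, beta) parameters; (1.0, 1.0) is the uniform prior.
type BetaParams = (f64, f64);

/// Drop entries for providers that are no longer in the chain and give every
/// current provider without saved state the uniform prior.
fn reconcile_state(
    mut loaded: HashMap<String, BetaParams>,
    current_providers: &[&str],
) -> HashMap<String, BetaParams> {
    // Prune stale entries.
    loaded.retain(|name, _| current_providers.contains(&name.as_str()));
    // Fall back to the uniform prior for providers with no saved state.
    for name in current_providers {
        loaded.entry((*name).to_string()).or_insert((1.0, 1.0));
    }
    loaded
}
```

Passing an empty map models the case where no state file exists: every provider starts from the uniform prior.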
pub fn with_bandit(
    self,
    config: BanditRouterConfig,
    state_path: Option<&Path>,
    embedding_provider: Option<AnyProvider>,
) -> Self
Enable PILOT bandit routing strategy (LinUCB contextual bandit).
Loads existing state from state_path (or the default path) using
tokio::task::block_in_place to avoid blocking the async executor.
Applies session-level decay if config.decay_factor < 1.0, and prunes arms for
removed providers.
embedding_provider is used to obtain feature vectors for each query.
When None, the bandit falls back to Thompson/uniform selection whenever an
embedding cannot be obtained within config.embedding_timeout_ms.
The warmup_queries default of 0 in BanditRouterConfig is overridden here to
10 * num_providers to ensure sufficient initial exploration.
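Two of the rules above, the warmup override and the conditional decay, can be sketched in isolation. ArmStats is a deliberately simplified, hypothetical scalar stand-in: a LinUCB arm holds a matrix and a vector, and how the crate decays them is not shown here.

```rust
/// Hypothetical, simplified arm statistics (a real LinUCB arm stores a
/// matrix and a vector rather than two scalars).
struct ArmStats {
    pulls: f64,
    reward_sum: f64,
}

/// A configured warmup of 0 is replaced by 10 queries per provider.
fn effective_warmup(configured: u64, num_providers: usize) -> u64 {
    if configured == 0 {
        10 * num_providers as u64
    } else {
        configured
    }
}

/// Scale accumulated evidence down once per session; factors >= 1.0 are a no-op.
fn apply_decay(arms: &mut [ArmStats], decay_factor: f64) {
    if decay_factor < 1.0 {
        for arm in arms {
            arm.pulls *= decay_factor;
            arm.reward_sum *= decay_factor;
        }
    }
}
```

Scaling pulls and rewards by the same factor leaves each arm's mean reward unchanged while reducing its weight relative to new observations.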
pub async fn save_bandit_state(&self)
Persist current bandit state to disk. No-op if bandit strategy is not active.
Uses tokio::task::spawn_blocking so it is safe to call from any async context.
pub fn bandit_stats(&self) -> Vec<(String, u64, f32)>
Return bandit diagnostic stats: (provider_name, pulls, mean_reward).
Returns an empty vec if bandit strategy is not active.
pub fn with_reputation(
    self,
    decay_factor: f64,
    weight: f64,
    min_observations: u64,
    state_path: Option<&Path>,
) -> Self
Enable Bayesian reputation scoring (RAPS).
Loads existing state from state_path (or the default path) using
tokio::task::block_in_place to avoid blocking the async executor.
Applies session-level decay and prunes stale provider entries.
No-op for Cascade routing (reputation is not used for cost-tier ordering).
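To make the roles of decay_factor, weight, and min_observations concrete, here is a sketch of a Beta-posterior reputation score. Everything in it is an assumption for illustration: the Reputation type, decay toward a Beta(1, 1) prior, and the linear blend with a base routing score are not confirmed by this page.

```rust
/// Hypothetical Beta-posterior reputation for one provider.
struct Reputation {
    alpha: f64,
    beta: f64,
    observations: u64,
}

impl Reputation {
    /// Start from the uniform prior Beta(1, 1).
    fn new() -> Self {
        Self { alpha: 1.0, beta: 1.0, observations: 0 }
    }

    /// Record one quality outcome.
    fn record(&mut self, success: bool) {
        if success {
            self.alpha += 1.0;
        } else {
            self.beta += 1.0;
        }
        self.observations += 1;
    }

    /// Posterior mean success rate.
    fn mean(&self) -> f64 {
        self.alpha / (self.alpha + self.beta)
    }

    /// Shrink accumulated evidence toward the prior (assumed decay rule).
    fn decay(&mut self, factor: f64) {
        self.alpha = 1.0 + (self.alpha - 1.0) * factor;
        self.beta = 1.0 + (self.beta - 1.0) * factor;
    }

    /// Blend a base routing score with the reputation mean, but only once
    /// enough observations exist (assumed blending rule).
    fn adjust(&self, base_score: f64, weight: f64, min_observations: u64) -> f64 {
        if self.observations < min_observations {
            base_score
        } else {
            (1.0 - weight) * base_score + weight * self.mean()
        }
    }
}
```

Gating on min_observations keeps a provider with one unlucky failure from being penalized before there is enough evidence.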
pub fn record_quality_outcome(&self, _provider_name: &str, success: bool)
Record a quality outcome for the last active sub-provider (tool execution result).
Call only for semantic failures (invalid tool args, parse errors). Do NOT call for network errors, rate limits, or transient I/O failures. No-op when reputation scoring is disabled, strategy is Cascade, or no tool call has been made yet in this session.
The _provider_name parameter is ignored — quality is attributed to the sub-provider
that served the most recent chat_with_tools call, tracked via last_active_provider.
pub fn last_selected_provider_kind(&self) -> &'static str
Returns the provider_kind_str of the last provider selected by the router.
Used by crate::any::AnyProvider::provider_kind_str to attribute cost to the
actual child provider rather than returning the generic "local" sentinel for all
router-dispatched calls. Falls back to "local" when no call has been made yet.
pub async fn save_reputation_state(&self)
Persist current reputation state to disk. No-op if reputation is disabled.
Uses tokio::task::spawn_blocking so it is safe to call from any async context.
pub fn reputation_stats(&self) -> Vec<(String, f64, f64, f64, u64)>
Return reputation stats for all tracked providers: (name, alpha, beta, mean, observations).
pub fn with_cascade(self, config: CascadeRouterConfig) -> Self
Enable Cascade routing strategy.
Providers are tried in chain order (cheapest first). Each response is evaluated
by the quality classifier; if it falls below quality_threshold, the next
provider is tried. At most max_escalations quality-based escalations occur.
Network/API errors do not count against the escalation budget. The best response seen so far is returned if all escalations are exhausted.
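The escalation rules in this paragraph can be sketched as a loop over precomputed attempts. Attempt and run_cascade are hypothetical; in the router each provider would be called lazily and scored by the quality classifier, which is replaced here by a quality value stored alongside each response.

```rust
/// Outcome of one hypothetical provider call: Ok((text, quality)) or Err(message).
type Attempt = Result<(String, f64), String>;

/// Try attempts in chain order. Stop at the first response that meets
/// `quality_threshold`, or once `max_escalations` quality-based escalations
/// have been spent. Errors skip to the next provider without using budget.
/// Returns the best response seen, or None if every attempt failed.
fn run_cascade(
    attempts: &[Attempt],
    quality_threshold: f64,
    max_escalations: usize,
) -> Option<(String, f64)> {
    let mut best: Option<(String, f64)> = None;
    let mut escalations = 0;
    for attempt in attempts {
        let (text, quality) = match attempt {
            Ok(pair) => pair,
            Err(_) => continue, // network/API error: budget untouched
        };
        if best.as_ref().map_or(true, |(_, seen)| quality > seen) {
            best = Some((text.clone(), *quality));
        }
        if *quality >= quality_threshold || escalations == max_escalations {
            break;
        }
        escalations += 1;
    }
    best
}
```

With max_escalations set to 0 the cascade degenerates to "first provider that responds", which can serve as a cost ceiling.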
When config.cost_tiers is set, providers are reordered once at construction
time (no per-request cost). Providers absent from cost_tiers are appended
after listed ones in original chain order. Unknown names in cost_tiers emit
a warning and are otherwise ignored.
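The reordering rule for cost_tiers can be sketched as a pure function over provider names. reorder_by_cost_tiers is a hypothetical helper; it returns the unknown names instead of logging them so the behaviour is easy to inspect.

```rust
/// Reorder `chain` so providers listed in `cost_tiers` come first, in tier
/// order, followed by unlisted providers in their original chain order.
/// Returns (ordered chain, tier names that matched no provider).
fn reorder_by_cost_tiers(chain: &[&str], cost_tiers: &[&str]) -> (Vec<String>, Vec<String>) {
    let mut ordered: Vec<String> = Vec::with_capacity(chain.len());
    let mut unknown: Vec<String> = Vec::new();
    for tier in cost_tiers {
        if chain.contains(tier) {
            // Ignore a name repeated in cost_tiers.
            if !ordered.iter().any(|name| name.as_str() == *tier) {
                ordered.push((*tier).to_string());
            }
        } else {
            unknown.push((*tier).to_string()); // the router would warn here
        }
    }
    for name in chain {
        if !cost_tiers.contains(name) {
            ordered.push((*name).to_string());
        }
    }
    (ordered, unknown)
}
```

Because the reorder happens once at construction, the per-request path only iterates an already sorted list.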
pub async fn save_thompson_state(&self)
Persist current Thompson state to disk.
No-op if Thompson strategy is not active.
Uses tokio::task::spawn_blocking so it is safe to call from any async context,
including mid-request paths.
pub fn thompson_stats(&self) -> Vec<(String, f64, f64)>
Return a snapshot of Thompson distribution parameters for all tracked providers.
Returns an empty vec if Thompson strategy is not active.
pub fn set_status_tx(&mut self, tx: StatusTx)
pub async fn list_models_remote(&self) -> Result<Vec<RemoteModelInfo>, LlmError>
Aggregate model lists from all sub-providers, deduplicating by id.
Individual sub-provider errors are logged as warnings and skipped.
Errors
Always succeeds (per-provider errors are swallowed).
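The aggregation behaviour (merge, deduplicate by id, skip failing sub-providers) can be sketched as follows. ModelInfo is a hypothetical stand-in for RemoteModelInfo, whose fields are not listed on this page, and the sketch assumes the first occurrence of an id wins.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for `RemoteModelInfo`.
#[derive(Debug, Clone, PartialEq)]
struct ModelInfo {
    id: String,
    provider: String,
}

/// Merge per-provider results, keeping the first model seen for each id.
/// Failed providers are collected as warnings instead of aborting.
fn aggregate_models(
    results: Vec<Result<Vec<ModelInfo>, String>>,
) -> (Vec<ModelInfo>, Vec<String>) {
    let mut seen = HashSet::new();
    let mut models = Vec::new();
    let mut warnings = Vec::new();
    for result in results {
        match result {
            Ok(list) => {
                for model in list {
                    // `insert` returns false when the id was already present.
                    if seen.insert(model.id.clone()) {
                        models.push(model);
                    }
                }
            }
            Err(message) => warnings.push(message), // the router would log this
        }
    }
    (models, warnings)
}
```

A consequence of swallowing errors is that an empty result can mean either "no models" or "every sub-provider failed"; callers that care need to check logs.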
impl RouterProvider
pub fn embed_cache_metrics(&self) -> (u64, u64)
Return session-level embedding cache metrics: (total_calls, cache_hits).
Trait Implementations
impl Clone for RouterProvider
fn clone(&self) -> RouterProvider
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for RouterProvider
impl LlmProvider for RouterProvider
fn context_window(&self) -> Option<usize>
fn chat(
    &self,
    messages: &[Message],
) -> impl Future<Output = Result<String, LlmError>> + Send
fn chat_stream(
    &self,
    messages: &[Message],
) -> impl Future<Output = Result<ChatStream, LlmError>> + Send
fn supports_streaming(&self) -> bool
fn embed(
    &self,
    text: &str,
) -> impl Future<Output = Result<Vec<f32>, LlmError>> + Send
fn embed_batch(
    &self,
    texts: &[&str],
) -> impl Future<Output = Result<Vec<Vec<f32>>, LlmError>> + Send
fn supports_embeddings(&self) -> bool
fn model_identifier(&self) -> &str
Model identifier (e.g. gpt-4o-mini, claude-sonnet-4-6).
Used by cost-estimation heuristics. Returns "" when not applicable.
fn supports_tool_use(&self) -> bool
Whether the provider supports tool_use / function calling.
fn list_models(&self) -> Vec<String>
fn chat_with_tools(
    &self,
    messages: &[Message],
    tools: &[ToolDefinition],
) -> impl Future<Output = Result<ChatResponse, LlmError>> + Send
fn debug_request_json(
    &self,
    messages: &[Message],
    tools: &[ToolDefinition],
    stream: bool,
) -> Value
fn last_cache_usage(&self) -> Option<(u64, u64)>
Returns (cache_creation_tokens, cache_read_tokens).
fn supports_vision(&self) -> bool
fn last_usage(&self) -> Option<(u64, u64)>
Returns (input_tokens, output_tokens).
fn last_reasoning_tokens(&self) -> Option<u64>
fn take_compaction_summary(&self) -> Option<String>
fn chat_with_extras(
    &self,
    messages: &[Message],
) -> impl Future<Output = Result<(String, ChatExtras), LlmError>> + Send
fn supports_structured_output(&self) -> bool
impl RouterAware for RouterProvider
Auto Trait Implementations
impl !RefUnwindSafe for RouterProvider
impl !UnwindSafe for RouterProvider
impl Freeze for RouterProvider
impl Send for RouterProvider
impl Sync for RouterProvider
impl Unpin for RouterProvider
impl UnsafeUnpin for RouterProvider
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> LlmProviderDyn for T
fn context_window(&self) -> Option<usize>
Returns None if unknown.
fn chat<'a>(
    &'a self,
    messages: &'a [Message],
) -> Pin<Box<dyn Future<Output = Result<String, LlmError>> + Send + 'a>>
fn chat_stream<'a>(
    &'a self,
    messages: &'a [Message],
) -> Pin<Box<dyn Future<Output = Result<Pin<Box<dyn Stream<Item = Result<StreamChunk, LlmError>> + Send>>, LlmError>> + Send + 'a>>
fn supports_streaming(&self) -> bool
fn embed<'a>(
    &'a self,
    text: &'a str,
) -> Pin<Box<dyn Future<Output = Result<Vec<f32>, LlmError>> + Send + 'a>>
fn embed_batch<'a>(
    &'a self,
    texts: &'a [&'a str],
) -> Pin<Box<dyn Future<Output = Result<Vec<Vec<f32>>, LlmError>> + Send + 'a>>
fn supports_embeddings(&self) -> bool
fn model_identifier(&self) -> &str
Model identifier (e.g. gpt-4o-mini, claude-sonnet-4-6).
fn supports_vision(&self) -> bool
fn supports_tool_use(&self) -> bool
Whether the provider supports tool_use / function calling.
fn chat_with_tools<'a>(
    &'a self,
    messages: &'a [Message],
    tools: &'a [ToolDefinition],
) -> Pin<Box<dyn Future<Output = Result<ChatResponse, LlmError>> + Send + 'a>>
fn last_cache_usage(&self) -> Option<(u64, u64)>
Returns (cache_creation_tokens, cache_read_tokens).
fn last_usage(&self) -> Option<(u64, u64)>
Returns (input_tokens, output_tokens).