pub struct ModelRouter<C, S, A> { /* private fields */ }
Routes each request to a model tier chosen by an LLM classifier.
A ModelRouter wraps a classifier provider plus three tier providers and,
for every request, makes one extra classifier call to decide whether the work
is Simple, Moderate,
or Complex, then dispatches to the fast,
capable, or advanced provider respectively. If the classifier call fails
(error or rate limit) the router conservatively treats the request as
Complex rather than risk under-serving it (see classify).
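The mapping and the conservative fallback described above can be sketched as follows. This is a minimal, synchronous illustration and not the crate's implementation: the enums are local stand-ins, the `ModelTier` variant names (`Fast`, `Capable`, `Advanced`) are assumed from the provider names, and `tier_for` and `complexity_or_fallback` are hypothetical helpers.

```rust
// Stand-in types for illustration only. The crate's real `TaskComplexity`
// and `ModelTier` may differ; the `ModelTier` variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskComplexity {
    Simple,
    Moderate,
    Complex,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelTier {
    Fast,
    Capable,
    Advanced,
}

/// Hypothetical helper: maps a classification to the tier that serves it.
pub fn tier_for(complexity: TaskComplexity) -> ModelTier {
    match complexity {
        TaskComplexity::Simple => ModelTier::Fast,
        TaskComplexity::Moderate => ModelTier::Capable,
        TaskComplexity::Complex => ModelTier::Advanced,
    }
}

/// Hypothetical helper: any classifier failure (error or rate limit) is
/// treated as `Complex`, so a failed classification never selects a weaker tier.
pub fn complexity_or_fallback<E>(result: Result<TaskComplexity, E>) -> TaskComplexity {
    result.unwrap_or(TaskComplexity::Complex)
}
```

Under these assumptions, a failed classification always lands on the advanced tier, trading cost for answer quality.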
ModelRouter itself implements LlmProvider, so it can be passed anywhere
a &dyn LlmProvider is expected (run_structured, RefreshingProvider,
etc.). chat classifies the request, then routes it; the streaming
chat_stream also classifies first, then streams from the
chosen tier.
Note: the fast and capable tiers currently share one provider type S
(only advanced has its own type A), so mixing e.g. a Gemini fast tier with
an OpenAI capable tier requires both behind the same concrete type. Use
Arc<dyn LlmProvider> for all three tiers to mix providers freely.
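To illustrate why type erasure lifts the shared-type restriction, here is a self-contained sketch. Everything in it is a simplified stand-in: `Provider` plays the role of `LlmProvider`, `Router` mirrors only the shape of `ModelRouter`, and `Gemini`, `OpenAi`, and `mixed_router` are hypothetical names. It also assumes a forwarding impl for the `Arc<dyn ...>` type, which the real crate would need to provide for `Arc<dyn LlmProvider>`.

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-in for the crate's `LlmProvider` trait.
pub trait Provider {
    fn name(&self) -> &'static str;
}

// Forwarding impl so that `Arc<dyn Provider>` is itself a `Provider`.
// Assumption: the real crate has an equivalent impl for `Arc<dyn LlmProvider>`.
impl Provider for Arc<dyn Provider> {
    fn name(&self) -> &'static str {
        (**self).name()
    }
}

// Two distinct concrete provider types (illustrative only).
pub struct Gemini;
pub struct OpenAi;

impl Provider for Gemini {
    fn name(&self) -> &'static str {
        "gemini"
    }
}

impl Provider for OpenAi {
    fn name(&self) -> &'static str {
        "openai"
    }
}

// Mirrors the router's shape: `fast` and `capable` share the type `S`.
#[allow(dead_code)]
pub struct Router<C, S, A> {
    classifier: C,
    fast: S,
    capable: S,
    advanced: A,
}

impl<C, S, A> Router<C, S, A> {
    pub const fn new(classifier: C, fast: S, capable: S, advanced: A) -> Self {
        Router { classifier, fast, capable, advanced }
    }

    pub const fn fast_provider(&self) -> &S {
        &self.fast
    }

    pub const fn capable_provider(&self) -> &S {
        &self.capable
    }
}

// With every slot erased to `Arc<dyn Provider>`, the fast and capable
// tiers can hold different concrete providers.
pub fn mixed_router() -> Router<Arc<dyn Provider>, Arc<dyn Provider>, Arc<dyn Provider>> {
    let gemini: Arc<dyn Provider> = Arc::new(Gemini);
    let openai: Arc<dyn Provider> = Arc::new(OpenAi);
    Router::new(gemini.clone(), gemini, openai.clone(), openai)
}
```

The cost of this design is dynamic dispatch on every call. When both lower tiers already use the same concrete provider type, keeping `S` concrete avoids that indirection.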
Implementations
impl<C, S, A> ModelRouter<C, S, A>
pub const fn new(classifier: C, fast: S, capable: S, advanced: A) -> ModelRouter<C, S, A>
pub async fn classify(&self, request: &ChatRequest) -> Result<TaskComplexity, Error>
Errors
Returns an error if the LLM provider fails.
pub async fn route(&self, request: ChatRequest) -> Result<ChatOutcome, Error>
Errors
Returns an error if the LLM provider fails.
pub async fn route_with_tier(&self, request: ChatRequest, tier: ModelTier) -> Result<ChatOutcome, Error>
Errors
Returns an error if the LLM provider fails.
pub const fn fast_provider(&self) -> &S
pub const fn capable_provider(&self) -> &S
pub const fn advanced_provider(&self) -> &A
Trait Implementations
impl<C, S, A> LlmProvider for ModelRouter<C, S, A>
fn model(&self) -> &str
Reports the capable (mid) tier’s model as the router’s representative
model identifier.
fn provider(&self) -> &'static str
Reports the capable (mid) tier’s provider as the router’s representative
provider identifier.
fn chat<'life0, 'async_trait>(
    &'life0 self,
    request: ChatRequest,
) -> Pin<Box<dyn Future<Output = Result<ChatOutcome, Error>> + Send + 'async_trait>>
where
    'life0: 'async_trait,
    ModelRouter<C, S, A>: 'async_trait,
fn chat_stream(&self, request: ChatRequest) -> Pin<Box<dyn Stream<Item = Result<StreamDelta, Error>> + Send + '_>>
fn configured_thinking(&self) -> Option<&ThinkingConfig>
fn capabilities(&self) -> Option<&'static ModelCapabilities>
fn validate_thinking_config(&self, thinking: Option<&ThinkingConfig>) -> Result<(), Error>
fn resolve_thinking_config(&self, request_thinking: Option<&ThinkingConfig>) -> Result<Option<ThinkingConfig>, Error>
fn default_max_tokens(&self) -> u32
… AgentConfig.max_tokens.
fn structured_output_support(&self) -> StructuredOutputSupport
… ResponseFormat) request.