pub struct LlmBasedRouter { /* private fields */ }
LLM-powered router that uses a model to make routing decisions
Uses the configured tier to analyze requests and choose the optimal target. Provides an intelligent fallback when rule-based routing is ambiguous.
§Construction-Time Validation
Uses TierSelector to validate that the specified tier has available endpoints.
The tier is chosen via config.routing.router_tier at construction time.
Implementations§
impl LlmBasedRouter
pub fn new(
    selector: Arc<ModelSelector>,
    tier: TargetModel,
    router_timeout_secs: u64,
    metrics: Arc<Metrics>,
) -> AppResult<Self>
Create a new LLM-based router using the specified tier
Returns an error if no endpoints are configured for the specified tier.
§Arguments
- selector - The underlying ModelSelector
- tier - Which tier (Fast, Balanced, Deep) to use for routing decisions
- router_timeout_secs - Timeout for router queries in seconds
- metrics - Metrics collector for observability
§Tier Selection
- Fast: Lowest latency (~50-200ms) but may misroute complex requests
- Balanced: Recommended default (~100-500ms) with good accuracy
- Deep: Highest accuracy (~2-5s) but rarely worth the latency overhead
§Construction-Time Validation
The TierSelector validates tier availability at construction, ensuring
at least one endpoint exists for the specified tier.
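The fail-fast pattern described above can be sketched as follows. This is a minimal, self-contained illustration, not the crate's actual code: `ModelSelector`, `TargetModel`, and the `endpoints_for` helper are simplified stand-ins for the real types, and a plain `Result<_, String>` stands in for `AppResult`.

```rust
use std::sync::Arc;

// Hypothetical stand-ins for the real TargetModel / ModelSelector types,
// reduced to just enough structure to show construction-time validation.
#[derive(Clone, Copy, Debug, PartialEq)]
enum TargetModel { Fast, Balanced, Deep }

struct ModelSelector {
    fast_endpoints: Vec<String>,
    balanced_endpoints: Vec<String>,
    deep_endpoints: Vec<String>,
}

impl ModelSelector {
    fn endpoints_for(&self, tier: TargetModel) -> &[String] {
        match tier {
            TargetModel::Fast => &self.fast_endpoints,
            TargetModel::Balanced => &self.balanced_endpoints,
            TargetModel::Deep => &self.deep_endpoints,
        }
    }
}

struct LlmBasedRouter {
    selector: Arc<ModelSelector>,
    tier: TargetModel,
}

impl LlmBasedRouter {
    // Fail fast: refuse construction when the chosen tier has no endpoints,
    // so misconfiguration surfaces at startup rather than on the first request.
    fn new(selector: Arc<ModelSelector>, tier: TargetModel) -> Result<Self, String> {
        if selector.endpoints_for(tier).is_empty() {
            return Err(format!("no endpoints configured for tier {:?}", tier));
        }
        Ok(Self { selector, tier })
    }

    fn tier(&self) -> TargetModel { self.tier }
}
```

Validating at construction time means a misconfigured `router_tier` is caught once, at startup, instead of failing every routing call.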
pub fn tier(&self) -> TargetModel
Returns the configured router tier
pub async fn route(
    &self,
    user_prompt: &str,
    meta: &RouteMetadata,
) -> AppResult<RoutingDecision>
Route request using LLM analysis
§Async Behavior
This method is async because it:
- Waits for LLM inference: ~100-500ms for 30B model routing decision (dominant latency)
- Makes HTTP requests to LLM endpoints (network I/O, ~10-100ms connection overhead)
- Awaits endpoint selection from ModelSelector (async lock acquisition, <1ms)
- Performs health tracking mark_success/mark_failure (async lock, <1ms)
Total typical latency: ~110-600ms (dominated by LLM inference)
§Retry Logic & Failure Tracking (Dual-Level)
Implements retry with two failure-tracking mechanisms:
- Request-Scoped Exclusion (failed_endpoints): Prevents retrying the same endpoint within THIS request. Clears when the function returns.
- Global Health Tracking: Marks endpoints unhealthy after 3 consecutive failures across ALL requests. Persists via ModelSelector's health_checker.
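The interaction of the two tracking levels can be sketched as below. This is an illustrative sketch, not the actual implementation: `HealthTracker`, `next_candidate`, and the threshold constant are hypothetical names; the real logic lives behind ModelSelector's health_checker.

```rust
use std::collections::{HashMap, HashSet};

// Assumed threshold matching the documented "3 consecutive failures".
const UNHEALTHY_THRESHOLD: u32 = 3;

// Global level: consecutive-failure counts that persist across requests.
#[derive(Default)]
struct HealthTracker {
    consecutive_failures: HashMap<String, u32>,
}

impl HealthTracker {
    fn mark_failure(&mut self, endpoint: &str) {
        *self.consecutive_failures.entry(endpoint.to_string()).or_insert(0) += 1;
    }
    fn mark_success(&mut self, endpoint: &str) {
        // Any success resets the failure streak for that endpoint.
        self.consecutive_failures.remove(endpoint);
    }
    fn is_unhealthy(&self, endpoint: &str) -> bool {
        self.consecutive_failures.get(endpoint).copied().unwrap_or(0) >= UNHEALTHY_THRESHOLD
    }
}

// Pick the next endpoint to try, skipping both request-scoped failures
// (the HashSet dropped when this request ends) and globally unhealthy endpoints.
fn next_candidate<'a>(
    endpoints: &'a [String],
    failed_endpoints: &HashSet<String>,
    health: &HealthTracker,
) -> Option<&'a String> {
    endpoints
        .iter()
        .find(|e| !failed_endpoints.contains(*e) && !health.is_unhealthy(e.as_str()))
}
```

The request-scoped set guarantees forward progress within one request (no endpoint is retried twice), while the global tracker protects all future requests from a persistently failing endpoint.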
§Cancellation Safety
If the returned Future is dropped (cancelled), any in-flight LLM query is aborted, but endpoint health state remains consistent: mark_success/mark_failure are only called after a query completes.
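The ordering guarantee behind this can be illustrated with a synchronous sketch (the real code is async; `query_then_track` and its parameters are hypothetical names introduced only for this illustration). Because the health bookkeeping runs strictly after the query returns, a future dropped mid-query never records a spurious outcome.

```rust
// Sketch of the "track only after completion" ordering: if execution is
// abandoned during `query()`, neither mark closure ever runs, so no health
// state is mutated for a query whose outcome was never observed.
fn query_then_track<T, E>(
    query: impl FnOnce() -> Result<T, E>,
    mark_success: impl FnOnce(),
    mark_failure: impl FnOnce(),
) -> Result<T, E> {
    let outcome = query(); // cancellation before this point touches no health state
    match &outcome {
        Ok(_) => mark_success(),
        Err(_) => mark_failure(),
    }
    outcome
}
```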
Trait Implementations§
impl LlmRouter for LlmBasedRouter
Implementation of the LlmRouter trait for LlmBasedRouter
This allows LlmBasedRouter to be used as a trait object for dependency injection in tests.
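The dependency-injection pattern this enables can be sketched as follows. The trait body, `StubRouter`, and `dispatch` are simplified, hypothetical stand-ins (the real `route` is async and takes metadata); the point is only that callers depend on the trait object, so tests can substitute a stub.

```rust
// Simplified, synchronous stand-in for the real LlmRouter trait.
trait LlmRouter {
    fn route(&self, user_prompt: &str) -> Result<String, String>;
}

// Test double: always returns a fixed routing decision, no LLM involved.
struct StubRouter {
    decision: String,
}

impl LlmRouter for StubRouter {
    fn route(&self, _user_prompt: &str) -> Result<String, String> {
        Ok(self.decision.clone())
    }
}

// Caller code is written against the trait object, not the concrete router,
// so production can pass an LlmBasedRouter and tests can pass a StubRouter.
fn dispatch(router: &dyn LlmRouter, prompt: &str) -> Result<String, String> {
    router.route(prompt)
}
```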