Skip to main content

Module serve

Module serve 

Source
Expand description

Model Serving Ecosystem

Unified interface for local and remote model serving across the ML ecosystem.

§Components

  • ChatTemplateEngine - Unified prompt templating (Llama2, Mistral, ChatML)
  • BackendSelector - Intelligent backend selection with privacy tiers
  • CostCircuitBreaker - Daily budget limits to prevent runaway costs
  • ContextManager - Automatic token counting and truncation
  • StatefulFailover - Streaming failover with context preservation
  • SpilloverRouter - Hybrid cloud spillover routing
  • LambdaDeployer - AWS Lambda inference deployment

§Toyota Way Principles

  • Standardized Work: Chat templates ensure consistent model interaction
  • Poka-Yoke: Privacy gates prevent accidental data leakage
  • Jidoka: Stateful failover maintains context on errors
  • Muda Elimination: Cost circuit breakers prevent waste

Re-exports§

pub use backends::BackendSelector;
pub use backends::LatencyTier;
pub use backends::PrivacyTier;
pub use backends::ServingBackend;
pub use circuit_breaker::CircuitBreakerConfig;
pub use circuit_breaker::CostCircuitBreaker;
pub use circuit_breaker::TokenPricing;
pub use context::ContextManager;
pub use context::ContextWindow;
pub use context::TokenEstimator;
pub use context::TruncationStrategy;
pub use failover::FailoverConfig;
pub use failover::FailoverManager;
pub use failover::StreamingContext;
pub use lambda::LambdaConfig;
pub use lambda::LambdaDeployer;
pub use lambda::LambdaRuntime;
pub use router::RejectReason;
pub use router::RouterConfig;
pub use router::RoutingDecision;
pub use router::SpilloverRouter;
pub use templates::ChatMessage;
pub use templates::ChatTemplateEngine;
pub use templates::Role;
pub use templates::TemplateFormat;

Modules§

backends
Backend Selection and Privacy Tiers
banco
Banco: Local-first AI Workbench HTTP API
circuit_breaker
Cost Circuit Breaker
context
Context Window Management
failover
Stateful Failover Protocol
lambda
AWS Lambda Inference Deployment
router
Spillover Router for Hybrid Cloud
templates
Chat Template Engine