Model Serving Ecosystem
Unified interface for local and remote model serving across the ML ecosystem.
§Components
- ChatTemplateEngine - Unified prompt templating (Llama2, Mistral, ChatML)
- BackendSelector - Intelligent backend selection with privacy tiers
- CostCircuitBreaker - Daily budget limits to prevent runaway costs
- ContextManager - Automatic token counting and truncation
- StatefulFailover - Streaming failover with context preservation
- SpilloverRouter - Hybrid cloud spillover routing
- LambdaDeployer - AWS Lambda inference deployment
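To illustrate the idea behind the cost circuit breaker, here is a minimal, self-contained sketch of a daily-budget breaker. The type and method names below are illustrative only and do not mirror this crate's actual `CostCircuitBreaker` API:

```rust
// Illustrative sketch of a daily-budget cost circuit breaker.
// Names, fields, and signatures are assumptions, not the crate's API.

struct CostCircuitBreaker {
    daily_budget_usd: f64,
    spent_today_usd: f64,
}

impl CostCircuitBreaker {
    fn new(daily_budget_usd: f64) -> Self {
        Self { daily_budget_usd, spent_today_usd: 0.0 }
    }

    /// Records the cost and returns true if the request fits in the
    /// remaining daily budget; refuses (returns false) otherwise.
    fn try_spend(&mut self, cost_usd: f64) -> bool {
        if self.spent_today_usd + cost_usd > self.daily_budget_usd {
            return false; // trip: reject rather than overspend
        }
        self.spent_today_usd += cost_usd;
        true
    }
}

fn main() {
    let mut breaker = CostCircuitBreaker::new(1.0);
    assert!(breaker.try_spend(0.6));  // within the $1.00 budget
    assert!(!breaker.try_spend(0.6)); // would exceed the budget: rejected
    assert!(breaker.try_spend(0.4));  // remaining budget still covers this
    println!("spent: {:.2}", breaker.spent_today_usd);
}
```

The key design point is that the breaker rejects a request *before* incurring its cost, so a runaway loop of expensive calls fails fast instead of accumulating charges.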
§Toyota Way Principles
- Standardized Work: Chat templates ensure consistent model interaction
- Poka-Yoke: Privacy gates prevent accidental data leakage
- Jidoka: Stateful failover maintains context on errors
- Muda Elimination: Cost circuit breakers prevent waste
§Re-exports
pub use backends::BackendSelector;
pub use backends::LatencyTier;
pub use backends::PrivacyTier;
pub use backends::ServingBackend;
pub use circuit_breaker::CircuitBreakerConfig;
pub use circuit_breaker::CostCircuitBreaker;
pub use circuit_breaker::TokenPricing;
pub use context::ContextManager;
pub use context::ContextWindow;
pub use context::TokenEstimator;
pub use context::TruncationStrategy;
pub use failover::FailoverConfig;
pub use failover::FailoverManager;
pub use failover::StreamingContext;
pub use lambda::LambdaConfig;
pub use lambda::LambdaDeployer;
pub use lambda::LambdaRuntime;
pub use router::RejectReason;
pub use router::RouterConfig;
pub use router::RoutingDecision;
pub use router::SpilloverRouter;
pub use templates::ChatMessage;
pub use templates::ChatTemplateEngine;
pub use templates::Role;
pub use templates::TemplateFormat;