Resilient, cascading LLM inference across multiple providers.
llm-cascade provides automatic failover, circuit breaking, and retry cooldowns
when calling LLM APIs. Define ordered provider/model lists in a TOML config;
the library tries each entry in sequence, skipping those on cooldown, and returns
the first successful response.
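The try-in-order behaviour described above can be illustrated with a minimal, synchronous sketch. It is not the crate's implementation: `Entry`, `Cooldowns`, and `cascade` are hypothetical names invented for illustration, the cooldown table is held in memory rather than in SQLite, and the provider call is abstracted as a closure.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for one provider/model entry in a cascade.
struct Entry {
    name: &'static str,
}

/// Hypothetical in-memory cooldown table, keyed by entry name.
struct Cooldowns {
    until: HashMap<String, Instant>,
}

impl Cooldowns {
    /// True if `name` has a cooldown that has not yet expired at `now`.
    fn is_cooling(&self, name: &str, now: Instant) -> bool {
        self.until.get(name).map_or(false, |t| now < *t)
    }
}

/// Try each entry in order, skip entries on cooldown, and return the
/// first successful response. A failed entry is put on cooldown.
fn cascade<F>(
    entries: &[Entry],
    cooldowns: &mut Cooldowns,
    cooldown: Duration,
    mut call: F,
) -> Result<String, String>
where
    F: FnMut(&Entry) -> Result<String, String>,
{
    let now = Instant::now();
    for entry in entries {
        if cooldowns.is_cooling(entry.name, now) {
            continue; // still cooling down: do not call this provider
        }
        match call(entry) {
            Ok(text) => return Ok(text),
            Err(_) => {
                // Record the failure so later runs skip this entry for a while.
                cooldowns.until.insert(entry.name.to_string(), now + cooldown);
            }
        }
    }
    Err("all providers failed or are on cooldown".to_string())
}
```

In this sketch, a failing first entry is placed on cooldown and the second entry's response is returned; a later call within the cooldown window skips the first entry without invoking it.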
§Quick start
use llm_cascade::{run_cascade, load_config, db, Conversation, Message};

#[tokio::main]
async fn main() {
    let config = load_config(&"config.toml".into()).expect("config");
    let conn = db::init_db(&config.database.path).expect("db");
    let conversation = Conversation::single_user_prompt("What is 2 + 2?");
    match run_cascade("my_cascade", &conversation, &config, &conn).await {
        Ok(response) => println!("{}", response.text_only()),
        Err(e) => eprintln!("All providers failed: {}", e),
    }
}
Re-exports§
pub use cascade::run_cascade;
pub use config::AppConfig;
pub use config::CascadeConfig;
pub use config::CascadeEntry;
pub use config::DatabaseConfig;
pub use config::FailureConfig;
pub use config::ProviderConfig;
pub use config::ProviderType;
pub use config::load_config;
pub use error::CascadeError;
pub use error::ProviderError;
pub use models::ContentBlock;
pub use models::Conversation;
pub use models::LlmResponse;
pub use models::Message;
pub use models::MessageRole;
pub use models::ToolDefinition;
Modules§
- cascade
- Cascade execution engine: iterates provider entries with failover and cooldown.
- config
- Configuration loading and types for config.toml.
- db
- SQLite-backed persistence for attempt logs and cooldown state.
- error
- Error types for provider and cascade operations.
- models
- Data models for conversations, messages, responses, and tool definitions.
- persistence
- Persistence of failed conversations to JSON files.
- providers
- LLM provider implementations.
- secrets
- API key resolution and management via the OS keyring or environment variables.