Crate llm_cascade

Resilient, cascading LLM inference across multiple providers.

llm-cascade provides automatic failover, circuit breaking, and retry cooldowns when calling LLM APIs. Define ordered provider/model lists in a TOML config; the library tries each entry in sequence, skipping those on cooldown, and returns the first successful response.
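The failover behaviour described above can be illustrated with a small standalone sketch. It does not use llm-cascade and is not the crate's implementation: `Entry`, `Cooldowns`, `run_cascade_sketch`, and the two simulated providers are hypothetical names invented for illustration, and the fixed cooldown period is an assumption.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for one provider/model entry in a cascade.
struct Entry {
    name: &'static str,
    /// Simulated provider call: returns response text or an error message.
    call: fn(&str) -> Result<String, String>,
}

/// Tracks the instant until which each failed entry should be skipped.
struct Cooldowns {
    until: HashMap<&'static str, Instant>,
    period: Duration,
}

impl Cooldowns {
    fn new(period: Duration) -> Self {
        Cooldowns { until: HashMap::new(), period }
    }

    /// True if `name` failed recently and its cooldown has not yet expired.
    fn is_cooling(&self, name: &str, now: Instant) -> bool {
        self.until.get(name).map_or(false, |t| now < *t)
    }

    /// Put `name` on cooldown for one full period starting at `now`.
    fn mark_failed(&mut self, name: &'static str, now: Instant) {
        self.until.insert(name, now + self.period);
    }
}

/// Try entries in order, skip those on cooldown, return the first success.
/// If nothing succeeds, return one message per entry explaining why.
fn run_cascade_sketch(
    entries: &[Entry],
    prompt: &str,
    cooldowns: &mut Cooldowns,
    now: Instant,
) -> Result<(&'static str, String), Vec<String>> {
    let mut errors = Vec::new();
    for entry in entries {
        if cooldowns.is_cooling(entry.name, now) {
            errors.push(format!("{}: skipped (on cooldown)", entry.name));
            continue;
        }
        match (entry.call)(prompt) {
            Ok(text) => return Ok((entry.name, text)),
            Err(e) => {
                // A failure starts the cooldown, then the next entry is tried.
                cooldowns.mark_failed(entry.name, now);
                errors.push(format!("{}: {}", entry.name, e));
            }
        }
    }
    Err(errors)
}

/// Simulated provider that always fails.
fn always_fails(_prompt: &str) -> Result<String, String> {
    Err("simulated timeout".to_string())
}

/// Simulated provider that always succeeds.
fn echoes(prompt: &str) -> Result<String, String> {
    Ok(format!("echo: {prompt}"))
}
```

Passing `now` explicitly, rather than reading the clock inside the loop, keeps the sketch deterministic. The real crate persists cooldown state in SQLite (see the db module) rather than in an in-memory map.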

Quick start

use llm_cascade::{db, load_config, run_cascade, Conversation};

#[tokio::main]
async fn main() {
    // Load the configuration and open the SQLite database that stores
    // attempt logs and cooldown state.
    let config = load_config(&"config.toml".into()).expect("config");
    let conn = db::init_db(&config.database.path).expect("db");

    // Run the cascade named "my_cascade" and print the first successful response.
    let conversation = Conversation::single_user_prompt("What is 2 + 2?");
    match run_cascade("my_cascade", &conversation, &config, &conn).await {
        Ok(response) => println!("{}", response.text_only()),
        Err(e) => eprintln!("All providers failed: {}", e),
    }
}

Re-exports

pub use cascade::run_cascade;
pub use config::AppConfig;
pub use config::CascadeConfig;
pub use config::CascadeEntry;
pub use config::DatabaseConfig;
pub use config::FailureConfig;
pub use config::ProviderConfig;
pub use config::ProviderType;
pub use config::load_config;
pub use error::CascadeError;
pub use error::ProviderError;
pub use models::ContentBlock;
pub use models::Conversation;
pub use models::LlmResponse;
pub use models::Message;
pub use models::MessageRole;
pub use models::ToolDefinition;

Modules

cascade
Cascade execution engine: iterates provider entries with failover and cooldown.
config
Configuration loading and types for config.toml.
db
SQLite-backed persistence for attempt logs and cooldown state.
error
Error types for provider and cascade operations.
models
Data models for conversations, messages, responses, and tool definitions.
persistence
Persistence of failed conversations to JSON files.
providers
LLM provider implementations.
secrets
API key resolution and management via the OS keyring or environment variables.