<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
<meta charset="utf-8">
<meta name="generator" content="quarto-1.8.27">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<title>readme</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
width: 0.8em;
margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */
vertical-align: middle;
}
/* CSS for syntax highlighting */
html { -webkit-text-size-adjust: 100%; }
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
}
pre.numberSource { margin-left: 3em; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
</style>
<script src="README_files/libs/clipboard/clipboard.min.js"></script>
<script src="README_files/libs/quarto-html/quarto.js" type="module"></script>
<script src="README_files/libs/quarto-html/tabsets/tabsets.js" type="module"></script>
<script src="README_files/libs/quarto-html/axe/axe-check.js" type="module"></script>
<script src="README_files/libs/quarto-html/popper.min.js"></script>
<script src="README_files/libs/quarto-html/tippy.umd.min.js"></script>
<script src="README_files/libs/quarto-html/anchor.min.js"></script>
<link href="README_files/libs/quarto-html/tippy.css" rel="stylesheet">
<link href="README_files/libs/quarto-html/quarto-syntax-highlighting-ed96de9b727972fe78a7b5d16c58bf87.css" rel="stylesheet" id="quarto-text-highlighting-styles">
<script src="README_files/libs/bootstrap/bootstrap.min.js"></script>
<link href="README_files/libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
<link href="README_files/libs/bootstrap/bootstrap-13c11614e496eb4a12075fa83ae65022.min.css" rel="stylesheet" append-hash="true" id="quarto-bootstrap" data-mode="light">
</head>
<body class="fullcontent quarto-light">
<div id="quarto-content" class="page-columns page-rows-contents page-layout-article">
<main class="content" id="quarto-document-content">
<blockquote class="blockquote">
<p><strong>⚠️ Early Stage / Alpha Software</strong></p>
<p>This project is in active early development. The API surface may change, and not all edge cases are handled yet. Contributions are welcome — see <a href="#contributing">Contributing</a> below. Use at your own risk.</p>
</blockquote>
<hr>
<section id="llm-cascade" class="level1">
<h1>llm-cascade</h1>
<p><strong>Resilient, cascading LLM inference across multiple providers — failover, circuit breaking, and retry cooldowns built in.</strong></p>
<p><code>llm-cascade</code> is a Rust library and CLI that sends prompts to an ordered list of LLM providers (OpenAI, Anthropic, Google Gemini, Ollama, and any OpenAI-compatible endpoint such as Groq or Together AI). If a provider is rate-limited or down, the request automatically falls through to the next entry. Cooldowns are tracked per entry in SQLite, and a prompt that fails on every entry is saved as a JSON file.</p>
<hr>
<section id="features" class="level2">
<h2 class="anchored" data-anchor-id="features">Features</h2>
<ul>
<li><strong>Cascading failover</strong> — define ordered provider/model lists; the first successful response wins</li>
<li><strong>OpenAI-compatible providers</strong> — point any <code>openai</code>-type provider at a custom <code>base_url</code> (Groq, Together, Z.AI, vLLM, etc.)</li>
<li><strong>Per-entry circuit breaker</strong> — cooldowns tracked per <code>provider/model</code> pair in SQLite</li>
<li><strong>429-aware backoff</strong> — parses <code>retry-after</code> headers; falls back to exponential backoff (30 s base, 1 h cap)</li>
<li><strong>Cross-process state</strong> — cooldown state persists across CLI invocations via SQLite</li>
<li><strong>Secret management</strong> — OS keyring (via <code>keyring</code>) with environment variable fallback</li>
<li><strong>Failure persistence</strong> — total cascade failures saved as timestamped <code>.json</code> files</li>
<li><strong>Full audit log</strong> — every attempt logged with timestamp, status, latency, and token counts</li>
<li><strong>Dual interface</strong> — use as a CLI tool or as an async library in your own Rust projects</li>
</ul>
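<p>The fallback rule in the backoff bullet can be sketched as a small pure function. This is a minimal illustration, not the library's code: the helper name <code>cooldown</code>, the doubling factor, and the handling of only the numeric-seconds form of <code>retry-after</code> are assumptions. Only the 30 s base and the 1 h cap come from the list above.</p>
<div class="sourceCode"><pre class="sourceCode rust"><code class="sourceCode rust">use std::time::Duration;

// Base delay and cap as stated in the feature list.
const BASE: Duration = Duration::from_secs(30);
const CAP: Duration = Duration::from_secs(3600);

/// Hypothetical helper: choose a cooldown for a failed entry.
/// `retry_after` is the raw header value, if the provider sent one.
/// `failures` counts consecutive failures for this entry (1 = first failure).
fn cooldown(retry_after: Option&lt;&amp;str&gt;, failures: u32) -&gt; Duration {
    // Prefer the server's hint when it is a plain number of seconds.
    if let Some(secs) = retry_after.and_then(|v| v.trim().parse::&lt;u64&gt;().ok()) {
        return Duration::from_secs(secs);
    }
    // Otherwise double the base delay per failure (assumed factor), capped at one hour.
    let exponent = failures.saturating_sub(1).min(16);
    let secs = BASE.as_secs().saturating_mul(1u64 &lt;&lt; exponent);
    Duration::from_secs(secs).min(CAP)
}

fn main() {
    for n in 1..=8 {
        println!("failure {n}: {:?}", cooldown(None, n));
    }
    println!("with header: {:?}", cooldown(Some("120"), 3));
}</code></pre></div>
<p>Under these assumptions the fallback delays run 30 s, 60 s, 120 s and so on, and reach the one-hour cap at the eighth consecutive failure. A numeric <code>retry-after</code> value is used as given.</p>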
<hr>
</section>
<section id="how-it-works" class="level2">
<h2 class="anchored" data-anchor-id="how-it-works">How It Works</h2>
<pre><code>┌──────────┐       ┌─────────────────────────────────────────────┐
│  CLI /   │       │               Cascade Engine                │
│  Library │──────▶│                                             │
│  Caller  │       │  ┌───────────┐  ┌───────────┐  ┌─────────┐  │
└──────────┘       │  │ openai/   │─▶│ anthropic/│─▶│ ollama/ │  │
                   │  │ gpt-4o    │  │ claude…   │  │ llama3  │  │
                   │  └────┬──────┘  └────┬──────┘  └────┬────┘  │
                   │       │              │              │       │
                   │  ┌────▼──────────────▼──────────────▼────┐  │
                   │  │            SQLite Database            │  │
                   │  │  • attempt_log (audit trail)          │  │
                   │  │  • cooldown (circuit breaker state)   │  │
                   │  └───────────────────────────────────────┘  │
                   │                                             │
                   │  On total failure:                          │
                   │  ┌───────────────────────────────────────┐  │
                   │  │ failed_prompts/cascade_20260414.json  │  │
                   │  └───────────────────────────────────────┘  │
                   └─────────────────────────────────────────────┘</code></pre>
<ol type="1">
<li><strong>Load config</strong> from <code>~/.config/llm-cascade/config.toml</code> (or a custom path).</li>
<li><strong>Initialize SQLite</strong> — creates <code>attempt_log</code> and <code>cooldown</code> tables if missing.</li>
<li><strong>Iterate cascade entries</strong> — for each <code>provider/model</code> in the named cascade:
<ul>
<li>Check if the entry is on cooldown in the DB → skip if so.</li>
<li>Resolve the API key (keyring → env var).</li>
<li>Send the <code>Conversation</code> to the provider’s API.</li>
<li>Log the attempt (status, latency, tokens).</li>
<li>On success → return <code>LlmResponse</code> immediately.</li>
<li>On failure → set cooldown (from <code>retry-after</code> header or exponential backoff) and continue.</li>
</ul></li>
<li><strong>Total failure</strong> → persist the <code>Conversation</code> as a <code>.json</code> file, return <code>CascadeError</code> with the file path.</li>
</ol>
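<p>The steps above can be sketched as a loop. This is a simplified, synchronous illustration under stated assumptions, not the library's implementation: <code>Entry</code>, <code>AttemptError</code>, <code>Cooldowns</code>, and <code>run_cascade</code> are hypothetical names, an in-memory map stands in for the SQLite <code>cooldown</code> table, the fallback delay is a fixed 30 s, and key resolution, audit logging, and failure persistence are left out.</p>
<div class="sourceCode"><pre class="sourceCode rust"><code class="sourceCode rust">use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for one cascade entry.
struct Entry {
    provider: &amp;'static str,
    model: &amp;'static str,
}

/// Hypothetical error: carries the server's retry hint, if any.
struct AttemptError {
    retry_after: Option&lt;Duration&gt;,
}

/// In-memory replacement for the SQLite cooldown table (assumption).
#[derive(Default)]
struct Cooldowns(HashMap&lt;String, Instant&gt;);

impl Cooldowns {
    fn is_active(&amp;self, key: &amp;str, now: Instant) -&gt; bool {
        self.0.get(key).is_some_and(|until| *until &gt; now)
    }
}

/// Try each entry in order. Returns the first successful response,
/// or the keys of the entries that were attempted if all of them fail.
fn run_cascade&lt;F&gt;(
    entries: &amp;[Entry],
    cooldowns: &amp;mut Cooldowns,
    mut send: F,
) -&gt; Result&lt;String, Vec&lt;String&gt;&gt;
where
    F: FnMut(&amp;Entry) -&gt; Result&lt;String, AttemptError&gt;,
{
    let mut failed = Vec::new();
    for entry in entries {
        let key = format!("{}/{}", entry.provider, entry.model);
        let now = Instant::now();
        if cooldowns.is_active(&amp;key, now) {
            continue; // on cooldown: skip without calling the provider
        }
        match send(entry) {
            Ok(text) =&gt; return Ok(text), // first success wins
            Err(err) =&gt; {
                // Use the server hint if present, else a fixed 30 s.
                let wait = err.retry_after.unwrap_or(Duration::from_secs(30));
                cooldowns.0.insert(key.clone(), now + wait);
                failed.push(key);
            }
        }
    }
    Err(failed)
}

fn main() {
    let entries = [
        Entry { provider: "openai", model: "gpt-4o" },
        Entry { provider: "ollama", model: "llama3" },
    ];
    let mut cooldowns = Cooldowns::default();
    // Fake transport: the first provider is rate-limited, the second answers.
    let result = run_cascade(&amp;entries, &amp;mut cooldowns, |e| match e.provider {
        "openai" =&gt; Err(AttemptError { retry_after: Some(Duration::from_secs(60)) }),
        _ =&gt; Ok(format!("reply from {}/{}", e.provider, e.model)),
    });
    println!("{result:?}");
}</code></pre></div>
<p>In this run the call returns the reply from <code>ollama/llama3</code> and leaves <code>openai/gpt-4o</code> on a 60 s cooldown, so a second call within that window would skip it without contacting the provider.</p>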
<hr>
</section>
<section id="installation" class="level2">
<h2 class="anchored" data-anchor-id="installation">Installation</h2>
<section id="from-source" class="level3">
<h3 class="anchored" data-anchor-id="from-source">From source</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="fu">git</span> clone https://github.com/paluigi/llm-cascade.git</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="bu">cd</span> llm-cascade</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="ex">cargo</span> install <span class="at">--path</span> .</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
</section>
<section id="as-a-library-dependency" class="level3">
<h3 class="anchored" data-anchor-id="as-a-library-dependency">As a library dependency</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3"><pre class="sourceCode toml code-with-copy"><code class="sourceCode toml"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Cargo.toml</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="kw">[dependencies]</span></span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a><span class="dt">llm-cascade</span> <span class="op">=</span> <span class="st">"0.1"</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<blockquote class="blockquote">
<p>Requires Rust <strong>1.85+</strong> (edition 2024).</p>
</blockquote>
<hr>
</section>
</section>
<section id="configuration" class="level2">
<h2 class="anchored" data-anchor-id="configuration">Configuration</h2>
<p>Run the setup command to scaffold the default configuration:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="ex">llm-cascade</span> setup</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Or use the interactive wizard to configure providers and cascades step-by-step:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb5"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="ex">llm-cascade</span> setup <span class="at">--interactive</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Setup writes <code>~/.config/llm-cascade/config.toml</code> with sensible defaults. Edit the file to customize providers and cascades:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb6"><pre class="sourceCode toml code-with-copy"><code class="sourceCode toml"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="co"># ── Provider Definitions ────────────────────────────────────</span></span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a><span class="co"># Each block defines an endpoint (type, base_url, auth).</span></span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a><span class="co"># Providers are referenced by name in cascades and can be</span></span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a><span class="co"># reused with different models — no need to duplicate config.</span></span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a><span class="kw">[providers.openai]</span></span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a><span class="dt">type</span> <span class="op">=</span> <span class="st">"openai"</span></span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a><span class="dt">api_key_service</span> <span class="op">=</span> <span class="st">"openai"</span> <span class="co"># keyring entry name</span></span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a><span class="dt">api_key_env</span> <span class="op">=</span> <span class="st">"OPENAI_API_KEY"</span> <span class="co"># env var fallback</span></span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a><span class="co"># base_url defaults to https://api.openai.com/v1</span></span>
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-12"><a href="#cb6-12" aria-hidden="true" tabindex="-1"></a><span class="kw">[providers.anthropic]</span></span>
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a><span class="dt">type</span> <span class="op">=</span> <span class="st">"anthropic"</span></span>
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a><span class="dt">api_key_service</span> <span class="op">=</span> <span class="st">"anthropic"</span></span>
<span id="cb6-15"><a href="#cb6-15" aria-hidden="true" tabindex="-1"></a><span class="dt">api_key_env</span> <span class="op">=</span> <span class="st">"ANTHROPIC_API_KEY"</span></span>
<span id="cb6-16"><a href="#cb6-16" aria-hidden="true" tabindex="-1"></a><span class="co"># base_url defaults to https://api.anthropic.com</span></span>
<span id="cb6-17"><a href="#cb6-17" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-18"><a href="#cb6-18" aria-hidden="true" tabindex="-1"></a><span class="kw">[providers.gemini]</span></span>
<span id="cb6-19"><a href="#cb6-19" aria-hidden="true" tabindex="-1"></a><span class="dt">type</span> <span class="op">=</span> <span class="st">"gemini"</span></span>
<span id="cb6-20"><a href="#cb6-20" aria-hidden="true" tabindex="-1"></a><span class="dt">api_key_service</span> <span class="op">=</span> <span class="st">"gemini"</span></span>
<span id="cb6-21"><a href="#cb6-21" aria-hidden="true" tabindex="-1"></a><span class="dt">api_key_env</span> <span class="op">=</span> <span class="st">"GOOGLE_API_KEY"</span></span>
<span id="cb6-22"><a href="#cb6-22" aria-hidden="true" tabindex="-1"></a><span class="co"># base_url defaults to https://generativelanguage.googleapis.com</span></span>
<span id="cb6-23"><a href="#cb6-23" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-24"><a href="#cb6-24" aria-hidden="true" tabindex="-1"></a><span class="kw">[providers.groq]</span></span>
<span id="cb6-25"><a href="#cb6-25" aria-hidden="true" tabindex="-1"></a><span class="dt">type</span> <span class="op">=</span> <span class="st">"openai"</span> <span class="co"># reuse OpenAI-compatible protocol</span></span>
<span id="cb6-26"><a href="#cb6-26" aria-hidden="true" tabindex="-1"></a><span class="dt">base_url</span> <span class="op">=</span> <span class="st">"https://api.groq.com/openai/v1"</span></span>
<span id="cb6-27"><a href="#cb6-27" aria-hidden="true" tabindex="-1"></a><span class="dt">api_key_service</span> <span class="op">=</span> <span class="st">"groq"</span></span>
<span id="cb6-28"><a href="#cb6-28" aria-hidden="true" tabindex="-1"></a><span class="dt">api_key_env</span> <span class="op">=</span> <span class="st">"GROQ_API_KEY"</span></span>
<span id="cb6-29"><a href="#cb6-29" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-30"><a href="#cb6-30" aria-hidden="true" tabindex="-1"></a><span class="kw">[providers.ollama]</span></span>
<span id="cb6-31"><a href="#cb6-31" aria-hidden="true" tabindex="-1"></a><span class="dt">type</span> <span class="op">=</span> <span class="st">"ollama"</span></span>
<span id="cb6-32"><a href="#cb6-32" aria-hidden="true" tabindex="-1"></a><span class="dt">base_url</span> <span class="op">=</span> <span class="st">"http://localhost:11434"</span></span>
<span id="cb6-33"><a href="#cb6-33" aria-hidden="true" tabindex="-1"></a><span class="co"># No API key needed</span></span>
<span id="cb6-34"><a href="#cb6-34" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-35"><a href="#cb6-35" aria-hidden="true" tabindex="-1"></a><span class="co"># ── Cascades ───────────────────────────────────────────────</span></span>
<span id="cb6-36"><a href="#cb6-36" aria-hidden="true" tabindex="-1"></a><span class="co"># Each entry references a provider by name and specifies a model.</span></span>
<span id="cb6-37"><a href="#cb6-37" aria-hidden="true" tabindex="-1"></a><span class="co"># The same provider can appear multiple times with different models.</span></span>
<span id="cb6-38"><a href="#cb6-38" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-39"><a href="#cb6-39" aria-hidden="true" tabindex="-1"></a><span class="kw">[cascades.creative_task]</span></span>
<span id="cb6-40"><a href="#cb6-40" aria-hidden="true" tabindex="-1"></a><span class="dt">entries</span> <span class="op">=</span> <span class="op">[</span></span>
<span id="cb6-41"><a href="#cb6-41" aria-hidden="true" tabindex="-1"></a>    <span class="op">{ </span><span class="dt">provider</span><span class="op"> =</span> <span class="st">"openai"</span><span class="op">, </span><span class="dt">model</span><span class="op"> =</span> <span class="st">"gpt-4o"</span><span class="op"> },</span></span>
<span id="cb6-42"><a href="#cb6-42" aria-hidden="true" tabindex="-1"></a>    <span class="op">{ </span><span class="dt">provider</span><span class="op"> =</span> <span class="st">"anthropic"</span><span class="op">, </span><span class="dt">model</span><span class="op"> =</span> <span class="st">"claude-sonnet-4-20250514"</span><span class="op"> },</span></span>
<span id="cb6-43"><a href="#cb6-43" aria-hidden="true" tabindex="-1"></a>    <span class="op">{ </span><span class="dt">provider</span><span class="op"> =</span> <span class="st">"gemini"</span><span class="op">, </span><span class="dt">model</span><span class="op"> =</span> <span class="st">"gemini-2.0-flash"</span><span class="op"> },</span></span>
<span id="cb6-44"><a href="#cb6-44" aria-hidden="true" tabindex="-1"></a><span class="op">]</span></span>
<span id="cb6-45"><a href="#cb6-45" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-46"><a href="#cb6-46" aria-hidden="true" tabindex="-1"></a><span class="kw">[cascades.fast_task]</span></span>
<span id="cb6-47"><a href="#cb6-47" aria-hidden="true" tabindex="-1"></a><span class="dt">entries</span> <span class="op">=</span> <span class="op">[</span></span>
<span id="cb6-48"><a href="#cb6-48" aria-hidden="true" tabindex="-1"></a>    <span class="op">{ </span><span class="dt">provider</span><span class="op"> =</span> <span class="st">"ollama"</span><span class="op">, </span><span class="dt">model</span><span class="op"> =</span> <span class="st">"llama3"</span><span class="op"> },</span></span>
<span id="cb6-49"><a href="#cb6-49" aria-hidden="true" tabindex="-1"></a>    <span class="op">{ </span><span class="dt">provider</span><span class="op"> =</span> <span class="st">"groq"</span><span class="op">, </span><span class="dt">model</span><span class="op"> =</span> <span class="st">"llama-3.3-70b-versatile"</span><span class="op"> },</span></span>
<span id="cb6-50"><a href="#cb6-50" aria-hidden="true" tabindex="-1"></a>    <span class="op">{ </span><span class="dt">provider</span><span class="op"> =</span> <span class="st">"openai"</span><span class="op">, </span><span class="dt">model</span><span class="op"> =</span> <span class="st">"gpt-4o-mini"</span><span class="op"> },</span></span>
<span id="cb6-51"><a href="#cb6-51" aria-hidden="true" tabindex="-1"></a><span class="op">]</span></span>
<span id="cb6-52"><a href="#cb6-52" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-53"><a href="#cb6-53" aria-hidden="true" tabindex="-1"></a><span class="kw">[cascades.resilient_task]</span></span>
<span id="cb6-54"><a href="#cb6-54" aria-hidden="true" tabindex="-1"></a><span class="dt">entries</span> <span class="op">=</span> <span class="op">[</span></span>
<span id="cb6-55"><a href="#cb6-55" aria-hidden="true" tabindex="-1"></a>    <span class="op">{ </span><span class="dt">provider</span><span class="op"> =</span> <span class="st">"openai"</span><span class="op">, </span><span class="dt">model</span><span class="op"> =</span> <span class="st">"gpt-4o"</span><span class="op"> },</span></span>
<span id="cb6-56"><a href="#cb6-56" aria-hidden="true" tabindex="-1"></a>    <span class="op">{ </span><span class="dt">provider</span><span class="op"> =</span> <span class="st">"openai"</span><span class="op">, </span><span class="dt">model</span><span class="op"> =</span> <span class="st">"gpt-4o-mini"</span><span class="op"> },</span> <span class="co"># same provider, different model</span></span>
<span id="cb6-57"><a href="#cb6-57" aria-hidden="true" tabindex="-1"></a>    <span class="op">{ </span><span class="dt">provider</span><span class="op"> =</span> <span class="st">"groq"</span><span class="op">, </span><span class="dt">model</span><span class="op"> =</span> <span class="st">"llama-3.3-70b-versatile"</span><span class="op"> },</span></span>
<span id="cb6-58"><a href="#cb6-58" aria-hidden="true" tabindex="-1"></a>    <span class="op">{ </span><span class="dt">provider</span><span class="op"> =</span> <span class="st">"anthropic"</span><span class="op">, </span><span class="dt">model</span><span class="op"> =</span> <span class="st">"claude-sonnet-4-20250514"</span><span class="op"> },</span></span>
<span id="cb6-59"><a href="#cb6-59" aria-hidden="true" tabindex="-1"></a>    <span class="op">{ </span><span class="dt">provider</span><span class="op"> =</span> <span class="st">"ollama"</span><span class="op">, </span><span class="dt">model</span><span class="op"> =</span> <span class="st">"llama3"</span><span class="op"> },</span></span>
<span id="cb6-60"><a href="#cb6-60" aria-hidden="true" tabindex="-1"></a><span class="op">]</span></span>
<span id="cb6-61"><a href="#cb6-61" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-62"><a href="#cb6-62" aria-hidden="true" tabindex="-1"></a><span class="co"># ── Persistence ────────────────────────────────────────────</span></span>
<span id="cb6-63"><a href="#cb6-63" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-64"><a href="#cb6-64" aria-hidden="true" tabindex="-1"></a><span class="kw">[database]</span></span>
<span id="cb6-65"><a href="#cb6-65" aria-hidden="true" tabindex="-1"></a><span class="dt">path</span> <span class="op">=</span> <span class="st">"~/.local/share/llm-cascade/db.sqlite"</span></span>
<span id="cb6-66"><a href="#cb6-66" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-67"><a href="#cb6-67" aria-hidden="true" tabindex="-1"></a><span class="kw">[failure_persistence]</span></span>
<span id="cb6-68"><a href="#cb6-68" aria-hidden="true" tabindex="-1"></a><span class="dt">dir</span> <span class="op">=</span> <span class="st">"~/.local/share/llm-cascade/failed_prompts"</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<section id="provider-types" class="level3">
<h3 class="anchored" data-anchor-id="provider-types">Provider types</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 15%">
<col style="width: 34%">
<col style="width: 50%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Description</th>
<th>Default <code>base_url</code></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><code>openai</code></td>
<td>OpenAI Chat Completions API</td>
<td><code>https://api.openai.com/v1</code></td>
</tr>
<tr class="even">
<td><code>anthropic</code></td>
<td>Anthropic Messages API</td>
<td><code>https://api.anthropic.com</code></td>
</tr>
<tr class="odd">
<td><code>gemini</code></td>
<td>Google Gemini generateContent API</td>
<td><code>https://generativelanguage.googleapis.com</code></td>
</tr>
<tr class="even">
<td><code>ollama</code></td>
<td>Ollama local inference</td>
<td><code>http://localhost:11434</code></td>
</tr>
</tbody>
</table>
<p>Any provider with <code>type = "openai"</code> can be pointed at a custom <code>base_url</code> to use OpenAI-compatible services such as <strong>Groq</strong>, <strong>Together AI</strong>, <strong>Z.AI</strong>, <strong>vLLM</strong>, <strong>LiteLLM</strong>, etc.</p>
</section>
<section id="api-keys" class="level3">
<h3 class="anchored" data-anchor-id="api-keys">API Keys</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 38%">
<col style="width: 61%">
</colgroup>
<thead>
<tr class="header">
<th>Method</th>
<th>How it works</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>OS Keyring</strong> (preferred)</td>
<td>Set via <code>llm-cascade key set &lt;provider&gt;</code>. The <code>api_key_service</code> field is the keyring entry name.</td>
</tr>
<tr class="even">
<td><strong>Environment variable</strong></td>
<td>Export the variable named in <code>api_key_env</code> (e.g., <code>export OPENAI_API_KEY=sk-...</code>).</td>
</tr>
<tr class="odd">
<td><strong>Ollama</strong></td>
<td>No API key needed for local models.</td>
</tr>
</tbody>
</table>
<p>The library tries the keyring first and falls back to the environment variable automatically. Use <code>llm-cascade key list</code> to check the status of all providers.</p>
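<p>The lookup order can be sketched as follows. This is an illustration under stated assumptions, not the library's API: <code>resolve_api_key</code> is a hypothetical name, and the <code>keyring_lookup</code> closure stands in for a real OS keyring call so that the example needs only the standard library.</p>
<div class="sourceCode"><pre class="sourceCode rust"><code class="sourceCode rust">use std::env;

/// Hypothetical resolver: keyring first, then the environment variable.
/// `service` mirrors `api_key_service`, `env_var` mirrors `api_key_env`.
fn resolve_api_key&lt;F&gt;(service: &amp;str, env_var: &amp;str, keyring_lookup: F) -&gt; Option&lt;String&gt;
where
    F: Fn(&amp;str) -&gt; Option&lt;String&gt;,
{
    // `or_else` only reads the environment when the keyring has no entry.
    keyring_lookup(service).or_else(|| env::var(env_var).ok())
}

fn main() {
    // Pretend the keyring has no entry, so the environment variable decides.
    let empty_keyring = |_service: &amp;str| -&gt; Option&lt;String&gt; { None };
    match resolve_api_key("openai", "OPENAI_API_KEY", empty_keyring) {
        Some(_) =&gt; println!("key found"),
        None =&gt; println!("no key configured"),
    }
}</code></pre></div>
<p>Passing the keyring lookup as a closure keeps the sketch testable without touching a real credential store. A provider such as Ollama, which needs no key, would skip this step entirely.</p>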
<hr>
</section>
</section>
<section id="cli-usage" class="level2">
<h2 class="anchored" data-anchor-id="cli-usage">CLI Usage</h2>
<section id="subcommands" class="level3">
<h3 class="anchored" data-anchor-id="subcommands">Subcommands</h3>
<p><code>llm-cascade</code> uses subcommands for all operations:</p>
<pre><code>llm-cascade run -C &lt;cascade&gt; -p &lt;prompt&gt;        Run a cascade
llm-cascade setup [--interactive]               Initialize configuration
llm-cascade key set &lt;provider&gt;                  Store an API key
llm-cascade key get &lt;provider&gt; [--show-full]    Retrieve an API key
llm-cascade key list                            Show key status for all providers
llm-cascade key delete &lt;provider&gt;               Remove an API key</code></pre>
</section>
<section id="running-a-cascade" class="level3">
<h3 class="anchored" data-anchor-id="running-a-cascade">Running a cascade</h3>
<p><strong>Basic prompt:</strong></p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb8"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="ex">llm-cascade</span> run <span class="at">-C</span> creative_task <span class="at">-p</span> <span class="st">"Write a haiku about Rust"</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p><strong>From a JSON conversation file:</strong></p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb9"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="ex">llm-cascade</span> run <span class="at">-C</span> creative_task <span class="at">-f</span> conversation.json</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>The JSON file must match the <code>Conversation</code> schema:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb10"><pre class="sourceCode json code-with-copy"><code class="sourceCode json"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="fu">{</span></span>
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a>  <span class="dt">"messages"</span><span class="fu">:</span> <span class="ot">[</span></span>
<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a>    <span class="fu">{</span> <span class="dt">"role"</span><span class="fu">:</span> <span class="st">"system"</span><span class="fu">,</span> <span class="dt">"content"</span><span class="fu">:</span> <span class="st">"You are a helpful assistant."</span> <span class="fu">}</span><span class="ot">,</span></span>
<span id="cb10-4"><a href="#cb10-4" aria-hidden="true" tabindex="-1"></a>    <span class="fu">{</span> <span class="dt">"role"</span><span class="fu">:</span> <span class="st">"user"</span><span class="fu">,</span> <span class="dt">"content"</span><span class="fu">:</span> <span class="st">"What is 2 + 2?"</span> <span class="fu">}</span></span>
<span id="cb10-5"><a href="#cb10-5" aria-hidden="true" tabindex="-1"></a>  <span class="ot">]</span><span class="fu">,</span></span>
<span id="cb10-6"><a href="#cb10-6" aria-hidden="true" tabindex="-1"></a>  <span class="dt">"tools"</span><span class="fu">:</span> <span class="ot">[</span></span>
<span id="cb10-7"><a href="#cb10-7" aria-hidden="true" tabindex="-1"></a>    <span class="fu">{</span></span>
<span id="cb10-8"><a href="#cb10-8" aria-hidden="true" tabindex="-1"></a>      <span class="dt">"name"</span><span class="fu">:</span> <span class="st">"get_weather"</span><span class="fu">,</span></span>
<span id="cb10-9"><a href="#cb10-9" aria-hidden="true" tabindex="-1"></a>      <span class="dt">"description"</span><span class="fu">:</span> <span class="st">"Get the current weather"</span><span class="fu">,</span></span>
<span id="cb10-10"><a href="#cb10-10" aria-hidden="true" tabindex="-1"></a>      <span class="dt">"parameters"</span><span class="fu">:</span> <span class="fu">{</span></span>
<span id="cb10-11"><a href="#cb10-11" aria-hidden="true" tabindex="-1"></a>        <span class="dt">"type"</span><span class="fu">:</span> <span class="st">"object"</span><span class="fu">,</span></span>
<span id="cb10-12"><a href="#cb10-12" aria-hidden="true" tabindex="-1"></a>        <span class="dt">"properties"</span><span class="fu">:</span> <span class="fu">{</span></span>
<span id="cb10-13"><a href="#cb10-13" aria-hidden="true" tabindex="-1"></a>          <span class="dt">"location"</span><span class="fu">:</span> <span class="fu">{</span> <span class="dt">"type"</span><span class="fu">:</span> <span class="st">"string"</span> <span class="fu">}</span></span>
<span id="cb10-14"><a href="#cb10-14" aria-hidden="true" tabindex="-1"></a>        <span class="fu">},</span></span>
<span id="cb10-15"><a href="#cb10-15" aria-hidden="true" tabindex="-1"></a>        <span class="dt">"required"</span><span class="fu">:</span> <span class="ot">[</span><span class="st">"location"</span><span class="ot">]</span></span>
<span id="cb10-16"><a href="#cb10-16" aria-hidden="true" tabindex="-1"></a>      <span class="fu">}</span></span>
<span id="cb10-17"><a href="#cb10-17" aria-hidden="true" tabindex="-1"></a>    <span class="fu">}</span></span>
<span id="cb10-18"><a href="#cb10-18" aria-hidden="true" tabindex="-1"></a>  <span class="ot">]</span></span>
<span id="cb10-19"><a href="#cb10-19" aria-hidden="true" tabindex="-1"></a><span class="fu">}</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p><strong>Custom config path:</strong></p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb11"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="ex">llm-cascade</span> run <span class="at">-c</span> /path/to/my/config.toml <span class="at">-C</span> fast_task <span class="at">-p</span> <span class="st">"Hello"</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
</section>
<section id="setup" class="level3">
<h3 class="anchored" data-anchor-id="setup">Setup</h3>
<p><strong>Default setup</strong> — scaffolds the example config, creates directories, and initializes the database:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb12"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a><span class="ex">llm-cascade</span> setup</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p><strong>Interactive setup</strong> — wizard for selecting providers, defining cascades, and setting API keys:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb13"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a><span class="ex">llm-cascade</span> setup <span class="at">--interactive</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
</section>
<section id="key-management" class="level3">
<h3 class="anchored" data-anchor-id="key-management">Key management</h3>
<p><strong>Store an API key</strong> (prompts with hidden input):</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb14"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a><span class="ex">llm-cascade</span> key set openai</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p><strong>Retrieve an API key</strong> (masked by default; use <code>--show-full</code> to reveal):</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb15"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a><span class="ex">llm-cascade</span> key get openai</span>
<span id="cb15-2"><a href="#cb15-2" aria-hidden="true" tabindex="-1"></a><span class="ex">llm-cascade</span> key get openai <span class="at">--show-full</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p><strong>List key status</strong> for all providers (checks both keyring and env vars):</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb16"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a><span class="ex">llm-cascade</span> key list</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p><strong>Delete an API key</strong> from the keyring:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb17"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a><span class="ex">llm-cascade</span> key delete openai</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
</section>
<section id="output" class="level3">
<h3 class="anchored" data-anchor-id="output">Output</h3>
<ul>
<li><strong>Text responses</strong> are printed to stdout.</li>
<li><strong>Tool call responses</strong> are written to stdout as pretty-printed JSON.</li>
<li><strong>Errors</strong> (including <code>CascadeError</code>, with the path to its persisted <code>.json</code> file) are printed to stderr, and the process exits with code 1.</li>
</ul>
</section>
<section id="verbosity" class="level3">
<h3 class="anchored" data-anchor-id="verbosity">Verbosity</h3>
<p>Control log output via the <code>RUST_LOG</code> environment variable:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb18"><pre class="sourceCode sh code-with-copy"><code class="sourceCode bash"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a><span class="va">RUST_LOG</span><span class="op">=</span>debug <span class="ex">llm-cascade</span> run <span class="at">-C</span> creative_task <span class="at">-p</span> <span class="st">"Hello"</span></span>
<span id="cb18-2"><a href="#cb18-2" aria-hidden="true" tabindex="-1"></a><span class="va">RUST_LOG</span><span class="op">=</span>llm_cascade=trace <span class="ex">llm-cascade</span> run <span class="at">-C</span> creative_task <span class="at">-p</span> <span class="st">"Hello"</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<hr>
</section>
</section>
<section id="library-usage" class="level2">
<h2 class="anchored" data-anchor-id="library-usage">Library Usage</h2>
<p>Use <code>llm-cascade</code> as an async library in any Rust project:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb19"><pre class="sourceCode rust code-with-copy"><code class="sourceCode rust"><span id="cb19-1"><a href="#cb19-1" aria-hidden="true" tabindex="-1"></a><span class="kw">use</span> <span class="pp">llm_cascade::</span><span class="op">{</span>run_cascade<span class="op">,</span> load_config<span class="op">,</span> db<span class="op">,</span> Conversation<span class="op">,</span> Message<span class="op">,</span> MessageRole<span class="op">};</span></span>
<span id="cb19-2"><a href="#cb19-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb19-3"><a href="#cb19-3" aria-hidden="true" tabindex="-1"></a><span class="at">#[</span><span class="pp">tokio::</span>main<span class="at">]</span></span>
<span id="cb19-4"><a href="#cb19-4" aria-hidden="true" tabindex="-1"></a><span class="kw">async</span> <span class="kw">fn</span> main() <span class="op">{</span></span>
<span id="cb19-5"><a href="#cb19-5" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> config <span class="op">=</span> load_config(<span class="op">&</span><span class="pp">std::path::PathBuf::</span>from(<span class="st">"config.toml"</span>))<span class="op">.</span>expect(<span class="st">"config"</span>)<span class="op">;</span></span>
<span id="cb19-6"><a href="#cb19-6" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> conn <span class="op">=</span> <span class="pp">db::</span>init_db(<span class="op">&</span>config<span class="op">.</span>database<span class="op">.</span>path)<span class="op">.</span>expect(<span class="st">"db"</span>)<span class="op">;</span></span>
<span id="cb19-7"><a href="#cb19-7" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb19-8"><a href="#cb19-8" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> conversation <span class="op">=</span> <span class="pp">Conversation::</span>new(<span class="pp">vec!</span>[</span>
<span id="cb19-9"><a href="#cb19-9" aria-hidden="true" tabindex="-1"></a> <span class="pp">Message::</span>system(<span class="st">"You are a concise assistant."</span>)<span class="op">,</span></span>
<span id="cb19-10"><a href="#cb19-10" aria-hidden="true" tabindex="-1"></a> <span class="pp">Message::</span>user(<span class="st">"What is the capital of France?"</span>)<span class="op">,</span></span>
<span id="cb19-11"><a href="#cb19-11" aria-hidden="true" tabindex="-1"></a> ])<span class="op">;</span></span>
<span id="cb19-12"><a href="#cb19-12" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb19-13"><a href="#cb19-13" aria-hidden="true" tabindex="-1"></a> <span class="cf">match</span> run_cascade(<span class="st">"creative_task"</span><span class="op">,</span> <span class="op">&</span>conversation<span class="op">,</span> <span class="op">&</span>config<span class="op">,</span> <span class="op">&</span>conn)<span class="op">.</span><span class="kw">await</span> <span class="op">{</span></span>
<span id="cb19-14"><a href="#cb19-14" aria-hidden="true" tabindex="-1"></a> <span class="cn">Ok</span>(response) <span class="op">=></span> <span class="op">{</span></span>
<span id="cb19-15"><a href="#cb19-15" aria-hidden="true" tabindex="-1"></a> <span class="pp">println!</span>(<span class="st">"Model: {}"</span><span class="op">,</span> response<span class="op">.</span>model)<span class="op">;</span></span>
<span id="cb19-16"><a href="#cb19-16" aria-hidden="true" tabindex="-1"></a> <span class="pp">println!</span>(<span class="st">"Response: {}"</span><span class="op">,</span> response<span class="op">.</span>text_only())<span class="op">;</span></span>
<span id="cb19-17"><a href="#cb19-17" aria-hidden="true" tabindex="-1"></a> <span class="cf">if</span> <span class="kw">let</span> (<span class="cn">Some</span>(input)<span class="op">,</span> <span class="cn">Some</span>(output)) <span class="op">=</span> (response<span class="op">.</span>input_tokens<span class="op">,</span> response<span class="op">.</span>output_tokens) <span class="op">{</span></span>
<span id="cb19-18"><a href="#cb19-18" aria-hidden="true" tabindex="-1"></a> <span class="pp">println!</span>(<span class="st">"Tokens: {} in / {} out"</span><span class="op">,</span> input<span class="op">,</span> output)<span class="op">;</span></span>
<span id="cb19-19"><a href="#cb19-19" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb19-20"><a href="#cb19-20" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb19-21"><a href="#cb19-21" aria-hidden="true" tabindex="-1"></a> <span class="cn">Err</span>(e) <span class="op">=></span> <span class="op">{</span></span>
<span id="cb19-22"><a href="#cb19-22" aria-hidden="true" tabindex="-1"></a> <span class="pp">eprintln!</span>(<span class="st">"Cascade failed: {}"</span><span class="op">,</span> e)<span class="op">;</span></span>
<span id="cb19-23"><a href="#cb19-23" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb19-24"><a href="#cb19-24" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
<span id="cb19-25"><a href="#cb19-25" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<section id="with-tool-definitions" class="level3">
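<h3 class="anchored" data-anchor-id="with-tool-definitions">With tool definitions</h3>
<p>Attach tool definitions to a conversation with <code>with_tools</code>. The fragment below reuses <code>config</code> and <code>conn</code> from the previous example and is meant to run inside an async function that returns a <code>Result</code>, since it uses <code>?</code>:</p>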
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb20"><pre class="sourceCode rust code-with-copy"><code class="sourceCode rust"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a><span class="kw">use</span> <span class="pp">llm_cascade::</span><span class="op">{</span>run_cascade<span class="op">,</span> load_config<span class="op">,</span> db<span class="op">,</span> Conversation<span class="op">,</span> Message<span class="op">,</span> ToolDefinition<span class="op">};</span></span>
<span id="cb20-2"><a href="#cb20-2" aria-hidden="true" tabindex="-1"></a><span class="kw">use</span> <span class="pp">serde_json::</span>json<span class="op">;</span></span>
<span id="cb20-3"><a href="#cb20-3" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb20-4"><a href="#cb20-4" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> conversation <span class="op">=</span> <span class="pp">Conversation::</span>new(<span class="pp">vec!</span>[</span>
<span id="cb20-5"><a href="#cb20-5" aria-hidden="true" tabindex="-1"></a> <span class="pp">Message::</span>user(<span class="st">"What's the weather in Tokyo?"</span>)<span class="op">,</span></span>
<span id="cb20-6"><a href="#cb20-6" aria-hidden="true" tabindex="-1"></a>])<span class="op">.</span>with_tools(<span class="pp">vec!</span>[</span>
<span id="cb20-7"><a href="#cb20-7" aria-hidden="true" tabindex="-1"></a> ToolDefinition <span class="op">{</span></span>
<span id="cb20-8"><a href="#cb20-8" aria-hidden="true" tabindex="-1"></a> name<span class="op">:</span> <span class="st">"get_weather"</span><span class="op">.</span>into()<span class="op">,</span></span>
<span id="cb20-9"><a href="#cb20-9" aria-hidden="true" tabindex="-1"></a> description<span class="op">:</span> <span class="st">"Get current weather for a location"</span><span class="op">.</span>into()<span class="op">,</span></span>
<span id="cb20-10"><a href="#cb20-10" aria-hidden="true" tabindex="-1"></a> parameters<span class="op">:</span> <span class="pp">json!</span>(<span class="op">{</span></span>
<span id="cb20-11"><a href="#cb20-11" aria-hidden="true" tabindex="-1"></a> <span class="st">"type"</span><span class="op">:</span> <span class="st">"object"</span><span class="op">,</span></span>
<span id="cb20-12"><a href="#cb20-12" aria-hidden="true" tabindex="-1"></a> <span class="st">"properties"</span><span class="op">:</span> <span class="op">{</span></span>
<span id="cb20-13"><a href="#cb20-13" aria-hidden="true" tabindex="-1"></a> <span class="st">"location"</span><span class="op">:</span> <span class="op">{</span> <span class="st">"type"</span><span class="op">:</span> <span class="st">"string"</span> <span class="op">}</span></span>
<span id="cb20-14"><a href="#cb20-14" aria-hidden="true" tabindex="-1"></a> <span class="op">},</span></span>
<span id="cb20-15"><a href="#cb20-15" aria-hidden="true" tabindex="-1"></a> <span class="st">"required"</span><span class="op">:</span> [<span class="st">"location"</span>]</span>
<span id="cb20-16"><a href="#cb20-16" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span>)<span class="op">,</span></span>
<span id="cb20-17"><a href="#cb20-17" aria-hidden="true" tabindex="-1"></a> <span class="op">},</span></span>
<span id="cb20-18"><a href="#cb20-18" aria-hidden="true" tabindex="-1"></a>])<span class="op">;</span></span>
<span id="cb20-19"><a href="#cb20-19" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb20-20"><a href="#cb20-20" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> response <span class="op">=</span> run_cascade(<span class="st">"creative_task"</span><span class="op">,</span> <span class="op">&</span>conversation<span class="op">,</span> <span class="op">&</span>config<span class="op">,</span> <span class="op">&</span>conn)<span class="op">.</span><span class="kw">await</span><span class="op">?;</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
</section>
<section id="key-types" class="level3">
<h3 class="anchored" data-anchor-id="key-types">Key types</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 31%">
<col style="width: 68%">
</colgroup>
<thead>
<tr class="header">
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><code>Conversation</code></td>
<td>Holds <code>messages: Vec<Message></code> and optional <code>tools: Vec<ToolDefinition></code></td>
</tr>
<tr class="even">
<td><code>Message</code></td>
<td>A single message with <code>role</code> (<code>System</code>/<code>User</code>/<code>Assistant</code>/<code>Tool</code>), <code>content</code>, and optional <code>tool_call_id</code></td>
</tr>
<tr class="odd">
<td><code>ToolDefinition</code></td>
<td>Tool name, description, and JSON Schema parameters</td>
</tr>
<tr class="even">
<td><code>LlmResponse</code></td>
<td>Response with <code>content: Vec<ContentBlock></code>, token counts, and model name</td>
</tr>
<tr class="odd">
<td><code>ContentBlock</code></td>
<td>Either <code>Text { text }</code> or <code>ToolCall { id, name, arguments }</code></td>
</tr>
<tr class="even">
<td><code>CascadeError</code></td>
<td>Contains cascade name, error message, and absolute path to the persisted <code>.json</code> file</td>
</tr>
<tr class="odd">
<td><code>ProviderError</code></td>
<td>HTTP status, body, optional <code>retry_after</code> seconds</td>
</tr>
</tbody>
</table>
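<p>As a rough illustration of how the two <code>ContentBlock</code> variants might be consumed, the sketch below defines a local stand-in enum that mirrors the table. It is not the crate's own definition, and <code>arguments</code> is assumed to be a plain string here. The helpers <code>concat_text</code> and <code>describe</code> are hypothetical and only mimic what a caller could do with a response's <code>content</code>:</p>
<div class="sourceCode"><pre class="sourceCode rust"><code class="sourceCode rust">/// Local stand-in for the crate's `ContentBlock` (assumption: `arguments` is a string).
enum ContentBlock {
    Text { text: String },
    ToolCall { id: String, name: String, arguments: String },
}

/// Concatenates the text blocks and ignores tool calls.
fn concat_text(blocks: &[ContentBlock]) -> String {
    let mut out = String::new();
    for block in blocks {
        if let ContentBlock::Text { text } = block {
            out.push_str(text);
        }
    }
    out
}

/// Renders one block as a single human-readable line.
fn describe(block: &ContentBlock) -> String {
    match block {
        ContentBlock::Text { text } => format!("text: {text}"),
        ContentBlock::ToolCall { id, name, arguments } => {
            format!("tool call {id}: {name}({arguments})")
        }
    }
}</code></pre></div>
<p>Matching on both variants keeps the handling exhaustive, so the compiler flags any variant that is added later and left unhandled.</p>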
<hr>
</section>
</section>
<section id="api-reference" class="level2">
<h2 class="anchored" data-anchor-id="api-reference">API Reference</h2>
<section id="run_cascade" class="level3">
<h3 class="anchored" data-anchor-id="run_cascade"><code>run_cascade</code></h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb21"><pre class="sourceCode rust code-with-copy"><code class="sourceCode rust"><span id="cb21-1"><a href="#cb21-1" aria-hidden="true" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">async</span> <span class="kw">fn</span> run_cascade(</span>
<span id="cb21-2"><a href="#cb21-2" aria-hidden="true" tabindex="-1"></a> cascade_name<span class="op">:</span> <span class="op">&</span><span class="dt">str</span><span class="op">,</span></span>
<span id="cb21-3"><a href="#cb21-3" aria-hidden="true" tabindex="-1"></a> conversation<span class="op">:</span> <span class="op">&</span>Conversation<span class="op">,</span></span>
<span id="cb21-4"><a href="#cb21-4" aria-hidden="true" tabindex="-1"></a> config<span class="op">:</span> <span class="op">&</span>AppConfig<span class="op">,</span></span>
<span id="cb21-5"><a href="#cb21-5" aria-hidden="true" tabindex="-1"></a> conn<span class="op">:</span> <span class="op">&</span>Connection<span class="op">,</span></span>
<span id="cb21-6"><a href="#cb21-6" aria-hidden="true" tabindex="-1"></a>) <span class="op">-></span> <span class="dt">Result</span><span class="op"><</span>LlmResponse<span class="op">,</span> CascadeError<span class="op">></span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>The core entry point. It walks the named cascade’s provider entries in order, skips any that are on cooldown, and returns the first successful <code>LlmResponse</code>. If no entry succeeds, it returns a <code>CascadeError</code>.</p>
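<p>The control flow described above can be sketched with a small std-only model. Everything here (<code>Entry</code>, <code>Attempt</code>, <code>Outcome</code>, <code>walk_cascade</code>) is a hypothetical stand-in rather than the crate's code: provider calls are simulated with canned results, whereas the real function is async and presumably also records attempts and cooldowns through the database connection.</p>
<div class="sourceCode"><pre class="sourceCode rust"><code class="sourceCode rust">/// Simulated result of calling one provider entry (stand-in for a real HTTP request).
enum Attempt {
    Success(&'static str),
    HttpError(u16),
}

/// One entry of a cascade, with its cooldown state looked up in advance.
struct Entry {
    name: &'static str,
    on_cooldown: bool,
    attempt: Attempt,
}

/// What the walk over the cascade produced.
#[derive(Debug, PartialEq)]
enum Outcome {
    Answered { entry: &'static str, text: &'static str },
    Exhausted { tried: usize, skipped: usize },
}

/// Tries entries in order, skipping those on cooldown, and stops at the first success.
fn walk_cascade(entries: &[Entry]) -> Outcome {
    let (mut tried, mut skipped) = (0, 0);
    for entry in entries {
        if entry.on_cooldown {
            skipped += 1;
            continue;
        }
        tried += 1;
        match entry.attempt {
            Attempt::Success(text) => {
                return Outcome::Answered { entry: entry.name, text };
            }
            Attempt::HttpError(_status) => {
                // Assumption: a real implementation would log the attempt
                // and put this entry on cooldown before moving on.
            }
        }
    }
    Outcome::Exhausted { tried, skipped }
}</code></pre></div>
<p>Entries on cooldown are never called in this model, so they are counted as skipped rather than as failed attempts.</p>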
</section>
<section id="dbinit_db" class="level3">
<h3 class="anchored" data-anchor-id="dbinit_db"><code>db::init_db</code></h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb22"><pre class="sourceCode rust code-with-copy"><code class="sourceCode rust"><span id="cb22-1"><a href="#cb22-1" aria-hidden="true" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> init_db(path<span class="op">:</span> <span class="op">&</span><span class="dt">str</span>) <span class="op">-></span> <span class="dt">Result</span><span class="op"><</span>Connection<span class="op">,</span> <span class="dt">String</span><span class="op">></span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Opens (or creates) the SQLite database and ensures the schema exists. Expands <code>~</code> in the path.</p>
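<p>For readers unfamiliar with tilde expansion, a minimal sketch of the idea follows. It illustrates the general technique and is not the crate's implementation; <code>expand_tilde</code> is a hypothetical helper, and the home directory is passed in explicitly so the example stays deterministic:</p>
<div class="sourceCode"><pre class="sourceCode rust"><code class="sourceCode rust">/// Replaces a leading `~` or `~/` with `home`; any other path is returned unchanged.
fn expand_tilde(path: &str, home: &str) -> String {
    if path == "~" {
        home.to_string()
    } else if let Some(rest) = path.strip_prefix("~/") {
        // Avoid a double slash when `home` already ends with one.
        format!("{}/{}", home.trim_end_matches('/'), rest)
    } else {
        // Absolute paths, relative paths, and `~user/...` forms are left alone.
        path.to_string()
    }
}</code></pre></div>
<p>A caller would typically pass the value of the <code>HOME</code> environment variable, for example the result of <code>std::env::var("HOME")</code>.</p>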
</section>
<section id="dblog_attempt" class="level3">
<h3 class="anchored" data-anchor-id="dblog_attempt"><code>db::log_attempt</code></h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb23"><pre class="sourceCode rust code-with-copy"><code class="sourceCode rust"><span id="cb23-1"><a href="#cb23-1" aria-hidden="true" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> log_attempt(</span>
<span id="cb23-2"><a href="#cb23-2" aria-hidden="true" tabindex="-1"></a> conn<span class="op">:</span> <span class="op">&</span>Connection<span class="op">,</span></span>
<span id="cb23-3"><a href="#cb23-3" aria-hidden="true" tabindex="-1"></a> cascade_name<span class="op">:</span> <span class="op">&</span><span class="dt">str</span><span class="op">,</span></span>
<span id="cb23-4"><a href="#cb23-4" aria-hidden="true" tabindex="-1"></a> provider_model<span class="op">:</span> <span class="op">&</span><span class="dt">str</span><span class="op">,</span></span>
<span id="cb23-5"><a href="#cb23-5" aria-hidden="true" tabindex="-1"></a> http_status<span class="op">:</span> <span class="dt">Option</span><span class="op"><</span><span class="dt">u16</span><span class="op">>,</span></span>
<span id="cb23-6"><a href="#cb23-6" aria-hidden="true" tabindex="-1"></a> latency_ms<span class="op">:</span> <span class="dt">u64</span><span class="op">,</span></span>
<span id="cb23-7"><a href="#cb23-7" aria-hidden="true" tabindex="-1"></a> input_tokens<span class="op">:</span> <span class="dt">Option</span><span class="op"><</span><span class="dt">u32</span><span class="op">>,</span></span>
<span id="cb23-8"><a href="#cb23-8" aria-hidden="true" tabindex="-1"></a> output_tokens<span class="op">:</span> <span class="dt">Option</span><span class="op"><</span><span class="dt">u32</span><span class="op">>,</span></span>
<span id="cb23-9"><a href="#cb23-9" aria-hidden="true" tabindex="-1"></a>)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Inserts a row into the <code>attempt_log</code> table.</p>
</section>
<section id="dbis_on_cooldown-dbset_cooldown" class="level3">
<h3 class="anchored" data-anchor-id="dbis_on_cooldown-dbset_cooldown"><code>db::is_on_cooldown</code> / <code>db::set_cooldown</code></h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb24"><pre class="sourceCode rust code-with-copy"><code class="sourceCode rust"><span id="cb24-1"><a href="#cb24-1" aria-hidden="true" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> is_on_cooldown(conn<span class="op">:</span> <span class="op">&</span>Connection<span class="op">,</span> provider_model<span class="op">:</span> <span class="op">&</span><span class="dt">str</span>) <span class="op">-></span> <span class="dt">bool</span></span>
<span id="cb24-2"><a href="#cb24-2" aria-hidden="true" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> set_cooldown(conn<span class="op">:</span> <span class="op">&</span>Connection<span class="op">,</span> provider_model<span class="op">:</span> <span class="op">&</span><span class="dt">str</span><span class="op">,</span> cooldown_until<span class="op">:</span> <span class="op">&</span><span class="dt">str</span>)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Query and update the <code>cooldown</code> table. Timestamps are RFC 3339 strings.</p>
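<p>To show what such a timestamp looks like, here is a std-only sketch that renders seconds since the Unix epoch as an RFC 3339 string in UTC. Both <code>rfc3339_utc</code> and <code>cooldown_until</code> are hypothetical helpers written for this example; production code would more likely rely on a date-time crate, and whether the library itself emits the <code>Z</code> suffix or a numeric offset is not stated here.</p>
<div class="sourceCode"><pre class="sourceCode rust"><code class="sourceCode rust">/// Formats seconds since the Unix epoch as an RFC 3339 timestamp in UTC.
fn rfc3339_utc(unix_secs: u64) -> String {
    let days = unix_secs / 86_400;
    let rem = unix_secs % 86_400;
    let (hour, minute, second) = (rem / 3_600, rem % 3_600 / 60, rem % 60);

    // Days-to-civil-date conversion for the proleptic Gregorian calendar.
    // The day count is shifted so that each 400-year era starts on March 1.
    let z = days + 719_468;
    let era = z / 146_097;
    let day_of_era = z % 146_097;
    let year_of_era =
        (day_of_era - day_of_era / 1_460 + day_of_era / 36_524 - day_of_era / 146_096) / 365;
    let day_of_year = day_of_era - (365 * year_of_era + year_of_era / 4 - year_of_era / 100);
    let mp = (5 * day_of_year + 2) / 153; // month index, where 0 is March
    let day = day_of_year - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    // January and February belong to the following civil year.
    let year = year_of_era + era * 400 + u64::from(month <= 2);

    format!("{year:04}-{month:02}-{day:02}T{hour:02}:{minute:02}:{second:02}Z")
}

/// Timestamp at which a cooldown starting at `now_unix_secs` would expire.
fn cooldown_until(now_unix_secs: u64, cooldown_secs: u64) -> String {
    rfc3339_utc(now_unix_secs + cooldown_secs)
}</code></pre></div>
<p>A string produced this way is the kind of value a caller could pass as the <code>cooldown_until</code> argument of <code>db::set_cooldown</code>.</p>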
</section>
<section id="load_config" class="level3">
<h3 class="anchored" data-anchor-id="load_config"><code>load_config</code></h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb25"><pre class="sourceCode rust code-with-copy"><code class="sourceCode rust"><span id="cb25-1"><a href="#cb25-1" aria-hidden="true" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> load_config(path<span class="op">:</span> <span class="op">&</span><span class="dt">Path</span>) <span class="op">-></span> <span class="dt">Result</span><span class="op"><</span>AppConfig<span class="op">,</span> <span class="dt">String</span><span class="op">></span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Reads and parses the TOML configuration file.</p>
<hr>
</section>
</section>
<section id="cooldown-backoff-behavior" class="level2">
<h2 class="anchored" data-anchor-id="cooldown-backoff-behavior">Cooldown & Backoff Behavior</h2>
<table class="caption-top table">
<colgroup>
<col style="width: 35%">
<col style="width: 64%">
</colgroup>
<thead>
<tr class="header">
<th>Scenario</th>
<th>Cooldown Duration</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>HTTP 429 with <code>retry-after</code> header</td>
<td>Value from header (seconds)</td>
</tr>
<tr class="even">
<td>HTTP 429 without header</td>
<td>30 s (doubles per consecutive failure, max 1 h)</td>
</tr>
<tr class="odd">
<td>Other HTTP error (4xx/5xx)</td>
<td>30 s base, exponential doubling</td>
</tr>
<tr class="even">
<td>Successful response</td>
<td>No cooldown set</td>
</tr>
</tbody>
</table>
<p>Cooldowns are <strong>per entry</strong> (e.g., <code>openai/gpt-4o</code> can be on cooldown while <code>openai/gpt-3.5-turbo</code> stays active) and <strong>persisted in SQLite</strong> so separate CLI invocations share the same state.</p>
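<p>The table can be read as a small function. The sketch below is one plausible interpretation, not the crate's code: it assumes the first failure counts as 1, that doubling starts with the second consecutive failure, and that the 1 h cap applies to every error that arrives without a <code>retry-after</code> header. <code>Failure</code> and <code>cooldown</code> are hypothetical names.</p>
<div class="sourceCode"><pre class="sourceCode rust"><code class="sourceCode rust">use std::time::Duration;

const BASE_SECS: u64 = 30;
const MAX_SECS: u64 = 3_600;

/// Why a provider entry is being put on cooldown.
enum Failure {
    /// HTTP 429 that carried a `retry-after` header (value in seconds).
    RetryAfter(u64),
    /// HTTP 429 without the header, or any other 4xx/5xx error.
    NoRetryAfter,
}

/// Cooldown for the given failure after `consecutive_failures` failures in a row.
fn cooldown(failure: &Failure, consecutive_failures: u32) -> Duration {
    match failure {
        // The provider said how long to wait, so use that value as is.
        Failure::RetryAfter(secs) => Duration::from_secs(*secs),
        Failure::NoRetryAfter => {
            // 30 s on the first failure, doubled for each further one.
            // The shift is clamped so it cannot overflow before the cap applies.
            let doublings = consecutive_failures.saturating_sub(1).min(16);
            Duration::from_secs((BASE_SECS << doublings).min(MAX_SECS))
        }
    }
}</code></pre></div>
<p>Under these assumptions, repeated failures yield 30 s, 60 s, 120 s and so on, reaching the 1 h cap at the eighth consecutive failure.</p>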
<hr>
</section>
<section id="roadmap" class="level2">
<h2 class="anchored" data-anchor-id="roadmap">Roadmap</h2>
<ul class="task-list">
<li><label><input type="checkbox">Streaming response support</label></li>
<li><label><input type="checkbox">Configurable per-provider timeouts</label></li>
<li><label><input type="checkbox">Token budget limits per cascade</label></li>
<li><label><input type="checkbox">Retry with modified parameters (e.g., lower temperature)</label></li>
<li><label><input type="checkbox">Prometheus metrics export</label></li>
<li><label><input type="checkbox">Web dashboard for cooldown/attempt monitoring</label></li>
<li><label><input type="checkbox">Additional native providers (Mistral, Cohere, AWS Bedrock, Azure OpenAI)</label></li>
<li><label><input type="checkbox" checked="">Published crate on crates.io</label></li>
</ul>
<hr>
</section>
<section id="contributing" class="level2">
<h2 class="anchored" data-anchor-id="contributing">Contributing</h2>
<p>Contributions are welcome! This is an open-source project under the MIT license.</p>
<section id="getting-started" class="level3">
<h3 class="anchored" data-anchor-id="getting-started">Getting started</h3>
<ol type="1">
<li>Fork the repository</li>
<li>Clone your fork: <code>git clone https://github.com/YOUR_USERNAME/llm-cascade.git</code></li>
<li>Create a branch: <code>git checkout -b feature/your-feature</code></li>
<li>Build and lint: <code>cargo build && cargo clippy -- -D warnings</code></li>
</ol>
</section>
<section id="making-changes" class="level3">
<h3 class="anchored" data-anchor-id="making-changes">Making changes</h3>
<ul>
<li>Follow the existing code style (no comments unless necessary, concise naming)</li>
<li>Ensure <code>cargo clippy -- -D warnings</code> passes with zero warnings</li>
<li>Update this README if you change public API or configuration</li>
</ul>
</section>
<section id="submitting" class="level3">
<h3 class="anchored" data-anchor-id="submitting">Submitting</h3>
<ol type="1">
<li>Push to your fork: <code>git push origin feature/your-feature</code></li>
<li>Open a Pull Request against the <code>main</code> branch</li>
<li>Describe your changes and the motivation behind them</li>
</ol>
</section>
<section id="reporting-issues" class="level3">
<h3 class="anchored" data-anchor-id="reporting-issues">Reporting issues</h3>
<p>Use the <a href="https://github.com/paluigi/llm-cascade/issues">GitHub issue tracker</a> to report bugs, request features, or ask questions. Please include:</p>
<ul>
<li>Rust version (<code>rustc --version</code>)</li>
<li>OS and version</li>
<li>Minimal reproduction steps</li>
<li>Relevant log output (with <code>RUST_LOG=debug</code>)</li>
</ul>
<hr>
</section>
</section>
<section id="license" class="level2">
<h2 class="anchored" data-anchor-id="license">License</h2>
<p><a href="LICENSE">MIT</a> — Copyright (c) 2026 Luigi Palumbo</p>
</section>
</section>
</main>
<!-- /main column -->
<script id="quarto-html-after-body" type="application/javascript">
window.document.addEventListener("DOMContentLoaded", function (event) {
const icon = "";
const anchorJS = new window.AnchorJS();
anchorJS.options = {
placement: 'right',
icon: icon
};
anchorJS.add('.anchored');
const isCodeAnnotation = (el) => {
for (const clz of el.classList) {
if (clz.startsWith('code-annotation-')) {
return true;
}
}
return false;
}
const onCopySuccess = function(e) {
// button target
const button = e.trigger;
// don't keep focus
button.blur();
// flash "checked"
button.classList.add('code-copy-button-checked');
var currentTitle = button.getAttribute("title");
button.setAttribute("title", "Copied!");
let tooltip;
if (window.bootstrap) {
button.setAttribute("data-bs-toggle", "tooltip");
button.setAttribute("data-bs-placement", "left");
button.setAttribute("data-bs-title", "Copied!");
tooltip = new bootstrap.Tooltip(button,
{ trigger: "manual",
customClass: "code-copy-button-tooltip",
offset: [0, -8]});
tooltip.show();
}
setTimeout(function() {
if (tooltip) {
tooltip.hide();
button.removeAttribute("data-bs-title");
button.removeAttribute("data-bs-toggle");
button.removeAttribute("data-bs-placement");
}
button.setAttribute("title", currentTitle);
button.classList.remove('code-copy-button-checked');
}, 1000);
// clear code selection
e.clearSelection();
}
const getTextToCopy = function(trigger) {
const outerScaffold = trigger.parentElement.cloneNode(true);
const codeEl = outerScaffold.querySelector('code');
for (const childEl of codeEl.children) {
if (isCodeAnnotation(childEl)) {
childEl.remove();
}
}
return codeEl.innerText;
}
const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
text: getTextToCopy
});
clipboard.on('success', onCopySuccess);
if (window.document.getElementById('quarto-embedded-source-code-modal')) {
const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
text: getTextToCopy,
container: window.document.getElementById('quarto-embedded-source-code-modal')
});
clipboardModal.on('success', onCopySuccess);
}
var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
var mailtoRegex = new RegExp(/^mailto:/);
var filterRegex = new RegExp('/' + window.location.host + '/');
var isInternal = (href) => {
return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
}
// Inspect non-navigation links and adorn them if external
var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
for (var i=0; i<links.length; i++) {
const link = links[i];
if (!isInternal(link.href)) {
// undo the damage that might have been done by quarto-nav.js in the case of
// links that we want to consider external
if (link.dataset.originalHref !== undefined) {
link.href = link.dataset.originalHref;
}
}
}
function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
const config = {
allowHTML: true,
maxWidth: 500,
delay: 100,
arrow: false,
appendTo: function(el) {
return el.parentElement;
},
interactive: true,
interactiveBorder: 10,
theme: 'quarto',
placement: 'bottom-start',
};
if (contentFn) {
config.content = contentFn;
}
if (onTriggerFn) {
config.onTrigger = onTriggerFn;
}
if (onUntriggerFn) {
config.onUntrigger = onUntriggerFn;
}
window.tippy(el, config);
}
const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
for (var i=0; i<noterefs.length; i++) {
const ref = noterefs[i];
tippyHover(ref, function() {
// use id or data attribute instead here
let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
try { href = new URL(href).hash; } catch {}
const id = href.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
if (note) {
return note.innerHTML;
} else {
return "";
}
});
}
const xrefs = window.document.querySelectorAll('a.quarto-xref');
const processXRef = (id, note) => {
// Strip column container classes
const stripColumnClz = (el) => {
el.classList.remove("page-full", "page-columns");
if (el.children) {
for (const child of el.children) {
stripColumnClz(child);
}
}
}
stripColumnClz(note)
if (id === null || id.startsWith('sec-')) {
// Special case sections, only their first couple elements
const container = document.createElement("div");
if (note.children && note.children.length > 2) {
container.appendChild(note.children[0].cloneNode(true));
for (let i = 1; i < note.children.length; i++) {
const child = note.children[i];
if (child.tagName === "P" && child.innerText === "") {
continue;
} else {
container.appendChild(child.cloneNode(true));
break;
}
}
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(container);
}
return container.innerHTML
} else {
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(note);
}
return note.innerHTML;
}
} else {
// Remove any anchor links if they are present
const anchorLink = note.querySelector('a.anchorjs-link');
if (anchorLink) {
anchorLink.remove();
}
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(note);
}
if (note.classList.contains("callout")) {
return note.outerHTML;
} else {
return note.innerHTML;
}
}
}
for (var i=0; i<xrefs.length; i++) {
const xref = xrefs[i];
tippyHover(xref, undefined, function(instance) {
instance.disable();
let url = xref.getAttribute('href');
let hash = undefined;
if (url.startsWith('#')) {
hash = url;
} else {
try { hash = new URL(url).hash; } catch {}
}
if (hash) {
const id = hash.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
if (note !== null) {
try {
const html = processXRef(id, note.cloneNode(true));
instance.setContent(html);
} finally {
instance.enable();
instance.show();
}
} else {
// See if we can fetch this
fetch(url.split('#')[0])
.then(res => res.text())
.then(html => {
const parser = new DOMParser();
const htmlDoc = parser.parseFromString(html, "text/html");
const note = htmlDoc.getElementById(id);
if (note !== null) {
const html = processXRef(id, note);
instance.setContent(html);
}
}).finally(() => {
instance.enable();
instance.show();
});
}
} else {
// See if we can fetch a full url (with no hash to target)
// This is a special case and we should probably do some content thinning / targeting
fetch(url)
.then(res => res.text())
.then(html => {
const parser = new DOMParser();
const htmlDoc = parser.parseFromString(html, "text/html");
const note = htmlDoc.querySelector('main.content');
if (note !== null) {
// This should only happen for chapter cross references
// (since there is no id in the URL)
// remove the first header
if (note.children.length > 0 && note.children[0].tagName === "HEADER") {
note.children[0].remove();
}
const html = processXRef(null, note);
instance.setContent(html);
}
}).finally(() => {
instance.enable();
instance.show();
});
}
}, function(instance) {
});
}
let selectedAnnoteEl;
const selectorForAnnotation = ( cell, annotation) => {
let cellAttr = 'data-code-cell="' + cell + '"';
let lineAttr = 'data-code-annotation="' + annotation + '"';
const selector = 'span[' + cellAttr + '][' + lineAttr + ']';
return selector;
}
const selectCodeLines = (annoteEl) => {
const doc = window.document;
const targetCell = annoteEl.getAttribute("data-target-cell");
const targetAnnotation = annoteEl.getAttribute("data-target-annotation");
const annoteSpan = window.document.querySelector(selectorForAnnotation(targetCell, targetAnnotation));
// Nothing to highlight if the annotated span cannot be found
if (annoteSpan === null) {
return;
}
const lines = annoteSpan.getAttribute("data-code-lines").split(",");
const lineIds = lines.map((line) => {
return targetCell + "-" + line;
})
let top = null;
let height = null;
let parent = null;
if (lineIds.length > 0) {
// Compute the position (top and height) of the first annotated line; extended below if the annotation spans several lines
const el = window.document.getElementById(lineIds[0]);
top = el.offsetTop;
height = el.offsetHeight;
parent = el.parentElement.parentElement;
if (lineIds.length > 1) {
const lastEl = window.document.getElementById(lineIds[lineIds.length - 1]);
const bottom = lastEl.offsetTop + lastEl.offsetHeight;
height = bottom - top;
}
if (top !== null && height !== null && parent !== null) {
// cook up a div (if necessary) and position it
let div = window.document.getElementById("code-annotation-line-highlight");
if (div === null) {
div = window.document.createElement("div");
div.setAttribute("id", "code-annotation-line-highlight");
div.style.position = 'absolute';
parent.appendChild(div);
}
div.style.top = top - 2 + "px";
div.style.height = height + 4 + "px";
div.style.left = 0;
let gutterDiv = window.document.getElementById("code-annotation-line-highlight-gutter");
if (gutterDiv === null) {
gutterDiv = window.document.createElement("div");
gutterDiv.setAttribute("id", "code-annotation-line-highlight-gutter");
gutterDiv.style.position = 'absolute';
const codeCell = window.document.getElementById(targetCell);
const gutter = codeCell.querySelector('.code-annotation-gutter');
gutter.appendChild(gutterDiv);
}
gutterDiv.style.top = top - 2 + "px";
gutterDiv.style.height = height + 4 + "px";
}
selectedAnnoteEl = annoteEl;
}
};
const unselectCodeLines = () => {
const elementsIds = ["code-annotation-line-highlight", "code-annotation-line-highlight-gutter"];
elementsIds.forEach((elId) => {
const div = window.document.getElementById(elId);
if (div) {
div.remove();
}
});
selectedAnnoteEl = undefined;
};
// Reposition the code annotation highlight when the window is resized
window.addEventListener(
"resize",
throttle(() => {
elRect = undefined;
if (selectedAnnoteEl) {
selectCodeLines(selectedAnnoteEl);
}
}, 10)
);
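// Throttle fn: the first call runs immediately; later calls are
// collapsed into a single trailing call made ms after the last one
function throttle(fn, ms) {
let throttled = false;
let timer;
return (...args) => {
if (!throttled) { // first call gets through
fn.apply(this, args);
throttled = true;
} else { // later calls reschedule the trailing call
if (timer) clearTimeout(timer); // cancel the pending trailing call
timer = setTimeout(() => {
fn.apply(this, args);
timer = throttled = false;
}, ms);
}
};
}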
// Attach click handler to the DT
const annoteDls = window.document.querySelectorAll('dt[data-target-cell]');
for (const annoteDlNode of annoteDls) {
annoteDlNode.addEventListener('click', (event) => {
// Use currentTarget so clicks on nested content still resolve to the DT itself
const clickedEl = event.currentTarget;
if (clickedEl !== selectedAnnoteEl) {
unselectCodeLines();
const activeEl = window.document.querySelector('dt[data-target-cell].code-annotation-active');
if (activeEl) {
activeEl.classList.remove('code-annotation-active');
}
selectCodeLines(clickedEl);
clickedEl.classList.add('code-annotation-active');
} else {
// Unselect the line
unselectCodeLines();
clickedEl.classList.remove('code-annotation-active');
}
});
}
const findCites = (el) => {
const parentEl = el.parentElement;
if (parentEl) {
const cites = parentEl.dataset.cites;
if (cites) {
return {
el,
cites: cites.split(' ')
};
} else {
return findCites(el.parentElement)
}
} else {
return undefined;
}
};
var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
for (var i=0; i<bibliorefs.length; i++) {
const ref = bibliorefs[i];
const citeInfo = findCites(ref);
if (citeInfo) {
tippyHover(citeInfo.el, function() {
var popup = window.document.createElement('div');
citeInfo.cites.forEach(function(cite) {
var citeDiv = window.document.createElement('div');
citeDiv.classList.add('hanging-indent');
citeDiv.classList.add('csl-entry');
var biblioDiv = window.document.getElementById('ref-' + cite);
if (biblioDiv) {
citeDiv.innerHTML = biblioDiv.innerHTML;
}
popup.appendChild(citeDiv);
});
return popup.innerHTML;
});
}
}
});
</script>
</div> <!-- /content -->
</body></html>