Crate llm_budget_window

§llm-budget-window

Time-windowed token + USD budget for LLM calls.

token-budget-pool caps total spend across concurrent tasks. This crate adds a time axis: cap spend per minute, per hour, per day, or any combination. Each recorded call is timestamped; older entries fall out of the window automatically.

§Quick example

use std::time::Duration;
use llm_budget_window::{BudgetWindows, Window, WindowBreached};

let bw = BudgetWindows::new(vec![
    Window::new("per_minute", Duration::from_secs(60))
        .with_token_cap(50_000)
        .with_usd_cap(1.0),
    Window::new("per_hour", Duration::from_secs(3600))
        .with_usd_cap(10.0),
]);

// record consumption; returns an error if ANY window would breach
bw.record(tokens(1000), usd(0.05)).unwrap();

// for very cheap calls, both windows have plenty of room
for _ in 0..50 {
    let _ = bw.record(tokens(100), usd(0.001));
}
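
The "any window" rule in the example above can be illustrated with a small standalone sketch. It does not use this crate: `Cap`, `Breach`, and `try_record` are hypothetical names invented for illustration, and the check-everything-then-commit behaviour is an assumption about how such a check might work, not a description of the crate's internals. Time-based ageing is left out to keep the focus on the multi-window check.

```rust
/// Hypothetical per-window state; illustrative only, not the crate's API.
struct Cap {
    name: &'static str,
    token_cap: Option<u64>,
    usd_cap: Option<f64>,
    tokens_used: u64,
    usd_used: f64,
}

/// Hypothetical error naming the first window that would be breached.
#[derive(Debug, PartialEq)]
struct Breach {
    window: &'static str,
}

/// Assumed semantics: reject the call if ANY window would exceed a cap,
/// and only update the totals when every window has room.
fn try_record(caps: &mut [Cap], tokens: u64, usd: f64) -> Result<(), Breach> {
    // Pass 1: check every window without mutating anything.
    for c in caps.iter() {
        let over_tokens = c.token_cap.map_or(false, |cap| c.tokens_used + tokens > cap);
        let over_usd = c.usd_cap.map_or(false, |cap| c.usd_used + usd > cap);
        if over_tokens || over_usd {
            return Err(Breach { window: c.name });
        }
    }
    // Pass 2: all windows have room, so commit to each of them.
    for c in caps.iter_mut() {
        c.tokens_used += tokens;
        c.usd_used += usd;
    }
    Ok(())
}

fn main() {
    let mut caps = vec![
        Cap { name: "per_minute", token_cap: Some(50_000), usd_cap: Some(1.0), tokens_used: 0, usd_used: 0.0 },
        Cap { name: "per_hour", token_cap: None, usd_cap: Some(10.0), tokens_used: 0, usd_used: 0.0 },
    ];
    assert!(try_record(&mut caps, 1_000, 0.05).is_ok());
    // 60_000 more tokens would exceed the per_minute token cap.
    let err = try_record(&mut caps, 60_000, 0.10).unwrap_err();
    println!("rejected by {}", err.window);
}
```

Checking all windows before mutating any of them means a rejected call leaves every running total unchanged, so a caller can retry later without double counting.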

§Memory

Each window keeps a VecDeque of (timestamp, tokens, usd) records. Records that have aged out are dropped on every record() and snapshot() call. At very high call rates, configure only the windows you actually need: a 1-day window holds every call from the last 24 hours.
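
To make the ageing behaviour concrete, here is a minimal standalone sketch of one rolling window backed by a VecDeque. `RollingWindow` and its methods are hypothetical names, and passing `now` explicitly is a choice made here so the example is deterministic; the crate's real types and signatures may differ.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical single rolling window; not the crate's actual type.
struct RollingWindow {
    span: Duration,
    records: VecDeque<(Instant, u64, f64)>, // (timestamp, tokens, usd)
}

impl RollingWindow {
    fn new(span: Duration) -> Self {
        Self { span, records: VecDeque::new() }
    }

    /// Drop records from the front while they are at least `span` old.
    /// Records are pushed in time order, so the oldest is always at the front.
    fn prune(&mut self, now: Instant) {
        while let Some(&(ts, _, _)) = self.records.front() {
            if now.duration_since(ts) >= self.span {
                self.records.pop_front();
            } else {
                break;
            }
        }
    }

    /// Age out old records, then append the new one.
    fn record(&mut self, now: Instant, tokens: u64, usd: f64) {
        self.prune(now);
        self.records.push_back((now, tokens, usd));
    }

    /// Current (tokens, usd) totals after ageing out old records.
    fn totals(&mut self, now: Instant) -> (u64, f64) {
        self.prune(now);
        self.records
            .iter()
            .fold((0, 0.0), |(t, u), &(_, tok, usd)| (t + tok, u + usd))
    }

    /// Number of records currently held in memory.
    fn len(&self) -> usize {
        self.records.len()
    }
}

fn main() {
    let start = Instant::now();
    let mut w = RollingWindow::new(Duration::from_secs(60));
    w.record(start, 1_000, 0.05);
    w.record(start + Duration::from_secs(30), 500, 0.02);
    // At t = 70s the first record is older than the 60s span and is dropped.
    let (tokens, usd) = w.totals(start + Duration::from_secs(70));
    println!("tokens={tokens} usd={usd:.2} live_records={}", w.len());
}
```

Under this layout, memory grows with call rate multiplied by window length: at a sustained 100 calls per second, a 1-day window would hold 100 × 86,400 = 8,640,000 records, while a 1-minute window would hold 6,000.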

Structs§

BudgetWindows
Thread-safe time-windowed budget across N windows.
Window
Configuration for one rolling window.
WindowBreached
Error returned by BudgetWindows::record when a record would push any window’s running total past its cap.
WindowSnapshot
Immutable snapshot of one window’s current totals.