Cost kernel — the API-equivalent pricing math, ported 1:1 from
usage_signal.py (the FALLBACK_PRICING table, model matcher, _tiered,
turn_cost, turn_cache_savings). This is the first slice of folding the
Python aggregator into Rust (ROADMAP E1). It is PURE — no I/O, no clock — so
it is pinned by a golden fixture generated from the Python (tests/),
guaranteeing byte-for-byte cost parity (see docs/ai/COST_MODEL.md).
Rates are USD per token. The bundled table mirrors the LiteLLM dataset
ccusage uses; the live LiteLLM fetch + 24h cache (the I/O half of
load_pricing) is a later slice — until then this is the offline table,
which is exactly what CONTEXTBAR_PRICING_OFFLINE=1 selects in the Python.
Structs§
- Rate
- One model’s per-token rates. None = the category isn’t billed / has no tier (e.g. OpenAI models have no cache-write cw; flat models have no *_200k). Mirrors the Python short-rate dict keys exactly.
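A minimal sketch of what such a rate record could look like, assuming `Option<f64>` fields named after the short keys. Only `cw` and the `_200k` suffix appear in the description above; the other field names (`inp`, `out`, `cr`), the type name `RateSketch`, and the numbers are assumptions made for illustration, not the crate's actual definition.

```rust
/// Hypothetical per-token rate record (USD per token).
/// `None` means the category is not billed or has no long-context tier.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
pub struct RateSketch {
    pub inp: Option<f64>,      // input tokens
    pub out: Option<f64>,      // output tokens
    pub cw: Option<f64>,       // cache write
    pub cr: Option<f64>,       // cache read
    pub inp_200k: Option<f64>, // long-context variants, absent on flat models
    pub out_200k: Option<f64>,
    pub cw_200k: Option<f64>,
    pub cr_200k: Option<f64>,
}

/// A flat model with no cache-write billing. The numbers are invented
/// for illustration and are not real pricing.
pub fn example_flat_rate() -> RateSketch {
    RateSketch {
        inp: Some(1.0e-6),
        out: Some(2.0e-6),
        cr: Some(0.25e-6),
        // Every other category stays `None`.
        ..RateSketch::default()
    }
}
```

Using `Option` rather than a zero rate keeps "not billed" distinct from "billed at 0", which is the distinction the `None` convention above describes.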
Constants§
- TIER_THRESHOLD
- Anthropic’s long-context tier threshold: tokens strictly above this in a category bill at the *_200k rate (when the model carries one).
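The tiering rule can be sketched as follows, under the assumption that the threshold is 200,000 tokens (taken from the ">200K" wording) and that tokens up to the threshold still bill at the base rate. The names `TIER_THRESHOLD_SKETCH` and `tiered_sketch` are hypothetical; the real function's signature is not shown on this page.

```rust
/// Assumed value of the tier boundary, in tokens.
pub const TIER_THRESHOLD_SKETCH: u64 = 200_000;

/// Cost of one token category (USD).
/// `base` is the normal per-token rate, `above` the optional `*_200k` rate.
pub fn tiered_sketch(tokens: u64, base: Option<f64>, above: Option<f64>) -> f64 {
    // An unbilled category contributes nothing.
    let base = match base {
        Some(b) => b,
        None => return 0.0,
    };
    match above {
        // Only tokens strictly above the threshold use the higher rate.
        Some(hi) if tokens > TIER_THRESHOLD_SKETCH => {
            let over = tokens - TIER_THRESHOLD_SKETCH;
            TIER_THRESHOLD_SKETCH as f64 * base + over as f64 * hi
        }
        // Flat model, or not enough tokens to cross the threshold.
        _ => tokens as f64 * base,
    }
}
```

With a base rate of 1e-6 and a tier rate of 2e-6, 300,000 tokens would cost 200,000 × 1e-6 + 100,000 × 2e-6 = 0.4 USD in this sketch, while the same count on a flat model would cost 0.3 USD.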
Statics§
- FALLBACK_PRICING
- Bundled offline rate table (USD/token), captured from LiteLLM, verbatim from FALLBACK_PRICING in usage_signal.py. Order is preserved so the longest-prefix match in match_pricing is deterministic.
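To illustrate why an ordered table makes longest-prefix matching deterministic, here is a small sketch. The table entries, the single-rate value type, and the function name `match_prefix_sketch` are all hypothetical; the real table holds full rate records and the real matcher may differ in detail.

```rust
/// Hypothetical ordered table: (model-id prefix, input rate in USD per token).
/// Entries and numbers are invented for illustration.
pub static TABLE_SKETCH: &[(&str, f64)] = &[
    ("model-a", 1.0e-6),
    ("model-a-large", 3.0e-6),
    ("model-b", 2.0e-6),
];

/// Return the rate of the longest table key that is a prefix of `model`.
pub fn match_prefix_sketch(model: &str, table: &[(&str, f64)]) -> Option<f64> {
    let mut best: Option<(&str, f64)> = None;
    for &(prefix, rate) in table {
        if model.starts_with(prefix) {
            // A strictly longer prefix wins; on equal length the earlier
            // entry is kept, so table order settles any tie.
            let better = match best {
                Some((current, _)) => prefix.len() > current.len(),
                None => true,
            };
            if better {
                best = Some((prefix, rate));
            }
        }
    }
    best.map(|(_, rate)| rate)
}
```

In this sketch an id such as `model-a-large-2025` resolves to the `model-a-large` entry rather than `model-a`, and an id with no matching prefix yields `None`.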
Functions§
- fallback_table
- The bundled offline table as a Table: the deterministic baseline that CONTEXTBAR_PRICING_OFFLINE=1 selects in the Python, and what the golden tests pin against.
- load_pricing
- Live + cached pricing resolution (native only; needs HTTP + filesystem). Mirrors the Python load_pricing: fresh 24h cache → live LiteLLM fetch (then cache) → stale cache → bundled fallback. CONTEXTBAR_PRICING_OFFLINE forces the offline (fallback) path.
- match_pricing
- Resolve a transcript model id onto a rate using table. None when unpriceable (cost 0: an honest undercount, never a crash). Mirrors match_pricing(model, table).
- normalize_model
- Normalize a transcript model id: lowercase, strip provider prefixes, drop the 1M-context tag (pricing is identical to the base model).
- tiered
- Anthropic >200K tiering for one token category (ccusage-compatible).
- turn_cache_savings
- NET USD that prompt caching saved this turn (can be slightly negative on a write-heavy turn). Mirrors turn_cache_savings.
- turn_cost
- Estimated USD for one turn given its rate + token buckets. Arg order mirrors the Python turn_cost(rate, inp, cache_create, cache_read, outp).
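The two per-turn functions above can be sketched in a simplified, flat-rate form (tiering omitted for brevity). The savings formula here is an assumption: it compares what cached tokens would have cost at the plain input rate with what they cost at the cache rates. The type `FlatRate` and the `_sketch` function names are hypothetical, and the real functions may compute this differently.

```rust
/// Hypothetical flat rate record (USD per token), without long-context tiers.
#[derive(Debug, Clone, Copy)]
pub struct FlatRate {
    pub inp: f64,
    pub out: f64,
    pub cw: Option<f64>, // cache write, `None` when not billed
    pub cr: Option<f64>, // cache read, `None` when not billed
}

/// Estimated USD for one turn. The argument order follows the documented
/// Python signature: rate, inp, cache_create, cache_read, outp.
pub fn turn_cost_sketch(
    rate: &FlatRate,
    inp: u64,
    cache_create: u64,
    cache_read: u64,
    outp: u64,
) -> f64 {
    inp as f64 * rate.inp
        + cache_create as f64 * rate.cw.unwrap_or(0.0)
        + cache_read as f64 * rate.cr.unwrap_or(0.0)
        + outp as f64 * rate.out
}

/// Difference between the plain input price and the cache price for one
/// category; an unbilled category contributes nothing.
fn saved(tokens: u64, input_rate: f64, cache_rate: Option<f64>) -> f64 {
    match cache_rate {
        Some(r) => tokens as f64 * (input_rate - r),
        None => 0.0,
    }
}

/// Net USD saved by caching (assumed formula). Reads are cheaper than plain
/// input, writes are dearer, so a write-heavy turn can come out negative.
pub fn turn_cache_savings_sketch(rate: &FlatRate, cache_create: u64, cache_read: u64) -> f64 {
    saved(cache_read, rate.inp, rate.cr) + saved(cache_create, rate.inp, rate.cw)
}
```

Under these assumptions, a turn that only writes to the cache pays the write premium with no read discount to offset it, which is one way the net figure could dip below zero.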
Type Aliases§
- Table
- A resolved rate table (the bundled fallback merged with any live/cached LiteLLM rates). Keyed by normalized model id.
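The resolution order described for load_pricing, together with the merge described for Table, can be sketched as a pure function. The cache and network lookups are passed in as already-computed options and a closure, so the sketch itself performs no I/O. The names `TableSketch`, `Source`, and `resolve_sketch`, the single-rate value type, and the choice that live or cached entries override bundled ones are all assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical table: normalized model id -> input rate (USD per token).
pub type TableSketch = HashMap<String, f64>;

/// Which step of the chain supplied the overlay.
#[derive(Debug, PartialEq)]
pub enum Source {
    FreshCache,
    Live,
    StaleCache,
    Fallback,
}

/// Walk the chain: fresh cache, live fetch, stale cache, bundled fallback.
/// `offline` stands in for the CONTEXTBAR_PRICING_OFFLINE switch.
pub fn resolve_sketch(
    offline: bool,
    fresh_cache: Option<TableSketch>,
    fetch_live: impl FnOnce() -> Option<TableSketch>,
    stale_cache: Option<TableSketch>,
    fallback: &TableSketch,
) -> (TableSketch, Source) {
    let (overlay, source) = if offline {
        (None, Source::Fallback)
    } else if let Some(t) = fresh_cache {
        (Some(t), Source::FreshCache)
    } else if let Some(t) = fetch_live() {
        // The live fetch only runs when no fresh cache exists.
        (Some(t), Source::Live)
    } else if let Some(t) = stale_cache {
        (Some(t), Source::StaleCache)
    } else {
        (None, Source::Fallback)
    };
    // Start from the bundled table; overlay entries replace bundled ones
    // (assumed precedence).
    let mut table = fallback.clone();
    if let Some(extra) = overlay {
        table.extend(extra);
    }
    (table, source)
}
```

Keeping the decision logic separate from the actual HTTP and filesystem calls would let the ordering be tested without a network, in the same spirit as the pure cost kernel.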