Expand description
Canonical per-million-token pricing for known LLM models.
This is the single source of truth for model pricing. Every subsystem (TUI cost badge, cost guardrail, benchmark runner) must delegate here rather than maintaining its own table.
Prices are best-effort retail USD per 1M tokens (input, output).
Unknown models fall back to a conservative default.
Functions§
- cache_
read_ multiplier - Multiplier applied to
input_pricefor cache-read tokens. Varies by provider family: - cache_
write_ multiplier - Multiplier applied to
input_pricefor cache-write tokens. Only Anthropic/Bedrock bill a surcharge for cache writes (1.25×). For providers that cache implicitly (OpenAI, Gemini, GLM) this is0.0because the write is bundled into the regular input price. - pricing_
for_ model - Return
(input_price_per_million, output_price_per_million)in USD for a given model identifier. Matching is case-insensitive substring. - session_
cost_ usd - Estimate the running cost (USD) of the current session, using the
global
crate::telemetry::TOKEN_USAGEcounters andpricing_for_model. Applies provider-specific prompt-cache multipliers viacache_read_multiplierandcache_write_multiplier.