pub struct ModelPricing {
    pub input_per_million: f64,
    pub output_per_million: f64,
    pub cached_input_per_million: Option<f64>,
    pub cache_write_per_million: Option<f64>,
    pub batch_input_per_million: Option<f64>,
    pub batch_output_per_million: Option<f64>,
    pub flex_input_per_million: Option<f64>,
    pub flex_output_per_million: Option<f64>,
    pub prompt_cache_min_tokens: Option<u32>,
    pub effective_at: DateTime<Utc>,
}
Fields
input_per_million: f64
USD per 1M input tokens.
output_per_million: f64
USD per 1M output tokens.
cached_input_per_million: Option<f64>
USD per 1M cached input tokens (10% of the standard input rate for
Anthropic, OpenAI, and Gemini).
cache_write_per_million: Option<f64>
USD per 1M cache-creation (cache-write) input tokens. Anthropic charges
~1.25× the base input rate for tokens written to the prompt cache.
None for providers with no documented write premium (cost path unchanged).
batch_input_per_million: Option<f64>
USD per 1M batch (async) input tokens. Providers with a batch tier
(OpenAI / Anthropic / Gemini) bill async requests at ~50% of standard
input. None for providers with no batch tier.
batch_output_per_million: Option<f64>
USD per 1M batch (async) output tokens (~50% of standard output).
None for providers with no batch tier.
flex_input_per_million: Option<f64>
USD per 1M input tokens under OpenAI’s Flex service tier
(service_tier: "flex"), a synchronous but slower tier billed at Batch
API rates (~50% of standard). None for models and providers with no Flex
tier; presence is the eligibility gate (only models that carry a Flex
rate may be opted into service_tier=flex). See
developers.openai.com/api/docs/guides/flex-processing.
flex_output_per_million: Option<f64>
USD per 1M output tokens under the Flex service tier (~50% of standard
output). None when the model has no Flex tier.
prompt_cache_min_tokens: Option<u32>
Provider minimum prefix length, in tokens, before a cache_control
breakpoint actually caches (shorter prefixes silently don’t cache).
Anthropic varies this by model (2048–4096); None when not documented.
effective_at: DateTime<Utc>
When this pricing took effect (for historical replay).
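As an illustration of how per-million rates translate into a request cost, here is a minimal sketch. `RateSketch` and `estimate_cost_usd` are hypothetical names that mirror only three of the fields above; they are not part of this crate, and the fallback to the standard input rate when no cached rate is listed is an assumption, not documented behavior.

```rust
/// Hypothetical stand-in for three of the `ModelPricing` rate fields.
/// Not the crate's type; it exists only to keep the example self-contained.
#[derive(Debug, Clone, Copy)]
pub struct RateSketch {
    pub input_per_million: f64,
    pub output_per_million: f64,
    pub cached_input_per_million: Option<f64>,
}

/// Estimate the USD cost of one request from token counts.
///
/// Assumption: when no cached-input rate is listed, cached tokens are
/// priced at the standard input rate.
pub fn estimate_cost_usd(
    rates: &RateSketch,
    uncached_input_tokens: u64,
    cached_input_tokens: u64,
    output_tokens: u64,
) -> f64 {
    // Convert a per-million rate and a token count into dollars.
    let usd = |rate_per_million: f64, tokens: u64| rate_per_million * tokens as f64 / 1_000_000.0;

    let cached_rate = rates
        .cached_input_per_million
        .unwrap_or(rates.input_per_million);

    usd(rates.input_per_million, uncached_input_tokens)
        + usd(cached_rate, cached_input_tokens)
        + usd(rates.output_per_million, output_tokens)
}
```

With illustrative rates of 3.0 (input), 15.0 (output), and 0.3 (cached input), one million tokens of each kind would come to roughly 18.30 USD.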
Implementations
impl ModelPricing
pub fn cache_write_rate_per_million(&self, tier: CacheWriteTier) -> Option<f64>
USD per 1M cache-write (creation) tokens for the given TTL tier.
FiveMin → the catalog’s cache_write_per_million (the 5-minute, 1.25× rate Anthropic applies to bare ephemeral writes).
OneHour → the documented 2× base-input rate, but only when a 5-minute write premium is documented (i.e. the provider tiers cache writes at all).
Providers with no write premium return None for both tiers, so the caller falls back to the plain input rate, unchanged.
Returns None when no write premium applies, so callers price the
remaining tokens at input_per_million.
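The tier rules described above can be sketched as follows. This is a reading of the documentation, not the crate's source: `WriteRates` and `cache_write_rate` are hypothetical names, and the two `CacheWriteTier` variants are redefined locally (using the names the text mentions) so the example stands alone.

```rust
/// Local stand-in for the TTL tiers named in the documentation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CacheWriteTier {
    FiveMin,
    OneHour,
}

/// Hypothetical subset of `ModelPricing` needed for cache-write pricing.
#[derive(Debug, Clone, Copy)]
pub struct WriteRates {
    pub input_per_million: f64,
    pub cache_write_per_million: Option<f64>,
}

/// Sketch of the documented behavior: no 5-minute premium in the catalog
/// means `None` for every tier; otherwise FiveMin uses the catalog rate and
/// OneHour uses 2x the base input rate.
pub fn cache_write_rate(rates: &WriteRates, tier: CacheWriteTier) -> Option<f64> {
    // `?` returns None early when the provider documents no write premium.
    let five_min_rate = rates.cache_write_per_million?;
    Some(match tier {
        CacheWriteTier::FiveMin => five_min_rate,
        CacheWriteTier::OneHour => 2.0 * rates.input_per_million,
    })
}
```

A caller would typically write `cache_write_rate(&rates, tier).unwrap_or(rates.input_per_million)` to apply the plain input rate when no premium exists.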
pub fn flex_eligible(&self) -> bool
Whether this model is eligible for OpenAI’s Flex service tier
(service_tier: "flex"). Eligibility is catalog-driven: a model is
flex-eligible iff it carries a Flex input rate. OpenAI lists Flex prices
only for supported models (gpt-5.x family); o3 / o4-mini are batch-only
“specialized models” and therefore carry no Flex rate.
pub fn flex_rates_per_million(&self) -> Option<(f64, f64)>
The Flex (input, output) per-million rates when this model is
flex-eligible, else None. Both are present together for an eligible
row (the catalog carries the pair); a missing output rate falls back to
the standard output rate so a partially-populated row stays priceable.
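The eligibility gate and the output-rate fallback described for the two Flex methods might look like the sketch below. `FlexSketch` is a hypothetical struct carrying only the relevant fields; the method bodies are inferred from the prose and may differ from the actual implementation.

```rust
/// Hypothetical subset of `ModelPricing` relevant to Flex pricing.
#[derive(Debug, Clone, Copy)]
pub struct FlexSketch {
    pub output_per_million: f64,
    pub flex_input_per_million: Option<f64>,
    pub flex_output_per_million: Option<f64>,
}

impl FlexSketch {
    /// Eligible iff the row carries a Flex input rate.
    pub fn flex_eligible(&self) -> bool {
        self.flex_input_per_million.is_some()
    }

    /// (input, output) Flex rates for an eligible row, else None.
    /// A missing Flex output rate falls back to the standard output rate.
    pub fn flex_rates_per_million(&self) -> Option<(f64, f64)> {
        let input = self.flex_input_per_million?;
        let output = self
            .flex_output_per_million
            .unwrap_or(self.output_per_million);
        Some((input, output))
    }
}
```

Gating on the input rate alone keeps a partially populated row priceable, at the cost of charging standard output rates for that row until the catalog is completed.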
Trait Implementations
impl Clone for ModelPricing
fn clone(&self) -> ModelPricing
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.