Available on crate features async and pricing only.
CostPreview – estimate the USD cost of a request before sending it.
The input side is exact: it hits /v1/messages/count_tokens to get the
tokenizer’s actual count. The output side is bounded: we use
request.max_tokens as the upper bound, since the actual number of
output tokens is unknown until generation finishes. Use
CostPreview::cost_for for a point estimate at any specific output
count.
Obtain via Messages::cost_preview.
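The arithmetic behind the exact-input, bounded-output estimate can be sketched as follows. This is a minimal illustration, not the crate's implementation: the PreviewSketch type, its field names, the max_cost helper, and the per-million-token prices are all assumptions made for the example. Only the idea of a cost_for point estimate and a max_tokens upper bound comes from the description above.

```rust
/// Hypothetical stand-in for a cost preview; every name here is an
/// assumption for illustration, not the crate's actual API.
#[derive(Debug, Clone, Copy)]
struct PreviewSketch {
    /// Exact input token count, as a count_tokens call would report it.
    input_tokens: u64,
    /// The request's max_tokens, used as the upper bound on output tokens.
    max_tokens: u64,
    /// Illustrative prices in USD per million tokens (assumed values).
    input_usd_per_mtok: f64,
    output_usd_per_mtok: f64,
}

impl PreviewSketch {
    /// Cost of the input side alone; exact once the token count is known.
    fn input_cost(&self) -> f64 {
        self.input_tokens as f64 * self.input_usd_per_mtok / 1_000_000.0
    }

    /// Point estimate: input cost plus the cost of `output_tokens` tokens.
    fn cost_for(&self, output_tokens: u64) -> f64 {
        self.input_cost() + output_tokens as f64 * self.output_usd_per_mtok / 1_000_000.0
    }

    /// Upper bound: assumes generation uses every token max_tokens allows.
    fn max_cost(&self) -> f64 {
        self.cost_for(self.max_tokens)
    }
}
```

With 2,000 input tokens, max_tokens of 1,000, and assumed prices of $3 and $15 per million tokens, the input side costs $0.006, the upper bound is about $0.021, and a 200-token reply would cost about $0.009.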
Structs
- CostPreview - Pre-flight cost estimate for a request.
- CountTokensCache - Bounded cache for count_tokens results, keyed by a stable hash of the request body. Use it to skip the network round-trip on repeated previews against unchanged inputs (long-running agent sessions, IDE integrations, etc.).
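One way to picture a bounded cache of token counts keyed by a request hash is sketched below. The BoundedCountCache type, its methods, and the first-in-first-out eviction policy are assumptions chosen for illustration. The description above only says the cache is bounded, so the real CountTokensCache may evict differently and expose a different interface.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical bounded cache: maps a request hash to a token count.
/// FIFO eviction is an assumption; the crate's policy is not documented here.
struct BoundedCountCache {
    capacity: usize,
    counts: HashMap<u64, u64>,
    /// Insertion order of keys, oldest at the front.
    order: VecDeque<u64>,
}

impl BoundedCountCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, counts: HashMap::new(), order: VecDeque::new() }
    }

    /// Look up a cached count without touching the network.
    fn get(&self, key: u64) -> Option<u64> {
        self.counts.get(&key).copied()
    }

    /// Store a count, evicting the oldest entry once capacity is exceeded.
    fn insert(&mut self, key: u64, count: u64) {
        if self.capacity == 0 {
            return; // a zero-capacity cache stores nothing
        }
        if self.counts.insert(key, count).is_none() {
            self.order.push_back(key);
            if self.order.len() > self.capacity {
                if let Some(oldest) = self.order.pop_front() {
                    self.counts.remove(&oldest);
                }
            }
        }
    }

    /// Return the cached count, or compute it with `count` (standing in
    /// for the count_tokens round-trip) and remember the result.
    fn get_or_count(&mut self, key: u64, count: impl FnOnce() -> u64) -> u64 {
        if let Some(hit) = self.get(key) {
            return hit;
        }
        let fresh = count();
        self.insert(key, fresh);
        fresh
    }
}
```

In this sketch, a repeated preview against an unchanged request body hits the map and never runs the closure, which is where the saved network round-trip would come from. In an async crate the closure would presumably be a future.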
Functions
- hash_request - Hash a value's serde-JSON serialization to a stable u64. Suitable as a cache key for CountTokensCache. On serialization failure, returns the hash of the empty string, which effectively groups malformed inputs under one key; this should not happen for crate-owned types.
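The fallback behaviour described for hash_request can be illustrated with a dependency-free sketch. Two things here are assumptions rather than the crate's code. First, the hash function is FNV-1a (64-bit), swapped in because it is simple and gives the same result across runs; the algorithm the crate actually uses is not stated. Second, the hypothetical hash_json helper takes an already-attempted serialization as a Result instead of calling serde_json itself.

```rust
/// 64-bit FNV-1a over raw bytes. Chosen for this sketch only; the
/// crate's real hash algorithm is not documented in this page.
fn fnv1a64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Hypothetical helper: hash the JSON text if serialization succeeded,
/// otherwise hash the empty string so every failure maps to one key.
fn hash_json<E>(serialized: Result<String, E>) -> u64 {
    let text = serialized.unwrap_or_default(); // Err becomes ""
    fnv1a64(text.as_bytes())
}
```

Identical JSON text always yields the same key, so an unchanged request body maps to the same cache entry. Every failed serialization collapses onto the key of the empty string.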