Module cost_preview

Available on crate features async and pricing only.

CostPreview – estimate the USD cost of a request before sending it.

The input side is exact: it calls /v1/messages/count_tokens to get the tokenizer’s actual count. The output side is only bounded: request.max_tokens serves as the upper bound, because the actual number of output tokens is unknown until generation finishes. Use CostPreview::cost_for for a point estimate at a specific output count.

Obtain via Messages::cost_preview.
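
The exact-input, bounded-output arithmetic described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: `Pricing`, `PreviewSketch`, and its fields are hypothetical names, the per-million-token prices are placeholders, and the real `CostPreview::cost_for` signature may differ.

```rust
/// Hypothetical per-model prices, in USD per million tokens.
#[derive(Debug, Clone, Copy)]
pub struct Pricing {
    pub input_usd_per_mtok: f64,
    pub output_usd_per_mtok: f64,
}

/// Illustrative stand-in for a pre-flight estimate (not the crate's `CostPreview`).
#[derive(Debug, Clone, Copy)]
pub struct PreviewSketch {
    /// Exact input count, as a count_tokens call would report it.
    pub input_tokens: u64,
    /// The request's max_tokens, used as the output upper bound.
    pub max_tokens: u64,
    pub pricing: Pricing,
}

impl PreviewSketch {
    /// Point estimate for a specific number of output tokens.
    pub fn cost_for(&self, output_tokens: u64) -> f64 {
        let input = self.input_tokens as f64 * self.pricing.input_usd_per_mtok;
        let output = output_tokens as f64 * self.pricing.output_usd_per_mtok;
        (input + output) / 1_000_000.0
    }

    /// Lower bound: the model produces no output tokens.
    pub fn min_cost(&self) -> f64 {
        self.cost_for(0)
    }

    /// Upper bound: the model uses the full max_tokens budget.
    pub fn max_cost(&self) -> f64 {
        self.cost_for(self.max_tokens)
    }
}
```

With 1,000 input tokens, max_tokens of 500, and assumed prices of 3.0 and 15.0 USD per million tokens, this sketch gives a lower bound of 0.003 USD and an upper bound of 0.0105 USD. Any real request lands somewhere in that range.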

Structs

CostPreview
Pre-flight cost estimate for a request.
CountTokensCache
Bounded cache for count_tokens results, keyed by a stable hash of the request body. Use to skip the network round-trip on repeated previews against unchanged inputs (long-running agent sessions, IDE integrations, etc.).
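
To illustrate the idea behind a bounded cache keyed by a request hash, here is a minimal std-only sketch. It is not the crate's `CountTokensCache`: the type name `BoundedCountCache`, its methods, and the FIFO eviction policy are all assumptions made for illustration, and the real type's API and eviction strategy are not documented here. The closure passed to `get_or_count` stands in for the count_tokens network call.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical bounded cache: request hash -> input token count.
/// Evicts the oldest inserted key once `capacity` is exceeded (FIFO, an assumption).
pub struct BoundedCountCache {
    capacity: usize,
    counts: HashMap<u64, u64>,
    order: VecDeque<u64>,
}

impl BoundedCountCache {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, counts: HashMap::new(), order: VecDeque::new() }
    }

    /// Look up a cached token count without touching the network.
    pub fn get(&self, key: u64) -> Option<u64> {
        self.counts.get(&key).copied()
    }

    /// Store a count, evicting the oldest entry if the cache is full.
    pub fn insert(&mut self, key: u64, input_tokens: u64) {
        if self.capacity == 0 {
            return; // a zero-capacity cache stores nothing
        }
        if self.counts.insert(key, input_tokens).is_some() {
            return; // existing key updated; insertion order unchanged
        }
        self.order.push_back(key);
        if self.order.len() > self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.counts.remove(&oldest);
            }
        }
    }

    /// Return the cached count, or run `count` (the stand-in for the
    /// network round-trip) and cache its result.
    pub fn get_or_count(&mut self, key: u64, count: impl FnOnce() -> u64) -> u64 {
        if let Some(hit) = self.get(key) {
            return hit;
        }
        let fresh = count();
        self.insert(key, fresh);
        fresh
    }

    pub fn len(&self) -> usize {
        self.counts.len()
    }
}
```

In a long-running session, repeated previews of an unchanged request body hit `get_or_count` with the same key, so the closure runs only on the first call.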

Functions

hash_request
Hash a value’s serde-JSON serialization to a stable u64, suitable as a cache key for CountTokensCache. On serialization failure, returns the hash of the empty string, so all malformed inputs share a single key; this should not happen for crate-owned types.
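
The fallback behavior can be sketched with std only. The hash algorithm that `hash_request` actually uses is not stated here, so this example substitutes FNV-1a (64-bit) purely as an assumption. `fnv1a_64` and `hash_serialized` are hypothetical names, and the `Result<String, E>` argument stands in for the outcome of serializing the value to JSON.

```rust
/// FNV-1a 64-bit parameters (standard published constants).
const FNV_OFFSET: u64 = 0xcbf29ce484222325;
const FNV_PRIME: u64 = 0x100000001b3;

/// Deterministic 64-bit hash of a byte slice. Unlike a randomly seeded
/// hasher, the result is the same across processes and runs.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Hash the JSON text of a value. `serialized` is the result of the
/// serialization step; on failure we hash the empty string, so every
/// malformed input maps to the same key.
pub fn hash_serialized<E>(serialized: Result<String, E>) -> u64 {
    let json = serialized.unwrap_or_default();
    fnv1a_64(json.as_bytes())
}
```

One consequence of the empty-string fallback is that two different values that both fail to serialize would collide on the same cache key. The description above notes this case is not expected for crate-owned types.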