Constant EMBEDDING_REQUEST_MAX_TOKENS 

pub const EMBEDDING_REQUEST_MAX_TOKENS: usize = 30_000;

Maximum token count for a SINGLE embedding request input (GAP-SG-02).

The OpenRouter backend uses the qwen/qwen3-embedding-8b model, which accepts roughly 32K tokens of context. This ceiling sits a safe margin below that limit, so an oversized input is rejected BEFORE the HTTP request is sent. Tokens are counted with the conservative cl100k_base proxy in crate::tokenizer::count_tokens, which emits at least as many tokens as Qwen for the same text. This constant is distinct from EMBEDDING_MAX_TOKENS (512), the per-chunk ceiling that drives chunking.
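
The pre-flight check described above might look roughly like the following sketch. It is illustrative only: `InputTooLarge` and `check_request_input` are hypothetical names, and the local `count_tokens` is a whitespace-splitting stand-in for `crate::tokenizer::count_tokens`, not the real cl100k_base tokenizer. The sketch also assumes that an input of exactly `EMBEDDING_REQUEST_MAX_TOKENS` tokens is accepted and only larger inputs are rejected.

```rust
/// Mirrors the documented constant.
pub const EMBEDDING_REQUEST_MAX_TOKENS: usize = 30_000;

/// Hypothetical error type for this sketch; the crate's real error may differ.
#[derive(Debug, PartialEq, Eq)]
pub struct InputTooLarge {
    pub tokens: usize,
    pub limit: usize,
}

/// Hypothetical stand-in for `crate::tokenizer::count_tokens`.
/// It counts whitespace-separated words, which is NOT a real tokenizer.
fn count_tokens(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Checks an input against the request ceiling before any HTTP request is built.
/// Returns the token count on success so the caller can reuse it.
pub fn check_request_input(text: &str) -> Result<usize, InputTooLarge> {
    let tokens = count_tokens(text);
    // Assumption: the limit itself is allowed; only counts above it are rejected.
    if tokens > EMBEDDING_REQUEST_MAX_TOKENS {
        return Err(InputTooLarge {
            tokens,
            limit: EMBEDDING_REQUEST_MAX_TOKENS,
        });
    }
    Ok(tokens)
}
```

A caller would run `check_request_input` on each input and only proceed to the network call on `Ok`. Because the proxy count is described as an upper bound on the Qwen count, an input that passes this check should also fit within the model's context.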