Model context window limits registry.
There is intentionally no built-in per-model table. Real per-model values come from two sources, in priority order:
- provider runtime metadata (e.g. Copilot reports real context/output),
- user overrides persisted in model_limits.json.
Anything without a match falls back to a single global default
(DEFAULT_MAX_CONTEXT_TOKENS / DEFAULT_MAX_OUTPUT_TOKENS). This keeps the
registry from going stale as models churn — see token_budget.rs.
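The lookup order described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: the Limit struct, the resolve_limit function, and the SKETCH_* constants are hypothetical names, and the numeric defaults are assumed from the "1M context / 128K output" wording used elsewhere on this page.

```rust
/// Hypothetical limit pair; the real `ModelLimit` fields are not shown on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Limit {
    pub max_context_tokens: u32,
    pub max_output_tokens: u32,
}

// Assumed values: "1M" and "128K" are read here as decimal thousands.
pub const SKETCH_DEFAULT_CONTEXT: u32 = 1_000_000;
pub const SKETCH_DEFAULT_OUTPUT: u32 = 128_000;

/// Resolve a limit in priority order:
/// provider runtime metadata, then user override, then the global default.
pub fn resolve_limit(provider: Option<Limit>, user_override: Option<Limit>) -> Limit {
    provider.or(user_override).unwrap_or(Limit {
        max_context_tokens: SKETCH_DEFAULT_CONTEXT,
        max_output_tokens: SKETCH_DEFAULT_OUTPUT,
    })
}
```

Because no per-model table sits between these sources, a model that neither the provider nor the user describes always receives the same global default.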
Structs§
- ModelLimit - Model limit configuration (user-overridable).
- ModelLimitsRegistry - Registry for model limits with built-in defaults and user overrides.
Constants§
- DEFAULT_MAX_CONTEXT_TOKENS - Global default context window applied to any model without a provider metadata value or a user override. 1M reflects the current mainstream range across frontier models (Claude 3.5, GPT-4o, Gemini 1.5, etc.).
- DEFAULT_MAX_OUTPUT_TOKENS - Global default maximum output tokens.
- DEFAULT_MODEL_PATTERN - Sentinel pattern used for the single global fallback limit.
- DEFAULT_SAFETY_MARGIN - Default safety margin for token counting errors (floor; scales with context window via ModelLimit::get_safety_margin).
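One way to read "floor; scales with context window" is a margin that grows in proportion to the context size but never drops below a fixed minimum. The sketch below shows that shape only. The function name, the 1,000-token floor, and the 1% ratio are all assumptions chosen for illustration, and the actual formula in ModelLimit::get_safety_margin may differ.

```rust
// Assumed floor, standing in for DEFAULT_SAFETY_MARGIN (real value not shown here).
pub const SKETCH_SAFETY_MARGIN_FLOOR: u32 = 1_000;
// Assumed scaling: reserve 1/100 of the context window.
pub const SKETCH_MARGIN_DIVISOR: u32 = 100;

/// Return a safety margin that scales with the context window
/// but is never smaller than the floor.
pub fn safety_margin(max_context_tokens: u32) -> u32 {
    (max_context_tokens / SKETCH_MARGIN_DIVISOR).max(SKETCH_SAFETY_MARGIN_FLOOR)
}
```

Under these assumed numbers, small windows keep the fixed floor while a 1M-token window reserves a proportionally larger margin.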
Functions§
- create_budget_for_model - Create a token budget for a specific model.
- default_model_limit - Build the single global default limit (1M context / 128K output).
- get_default_config_path - Get the default configuration file path.
- is_default_limit - Whether a user override is a no-op: identical to the global default, so it carries no information and need not be persisted (diff-only storage).
- load_model_limits_from_unified_config - Load user model limits from the unified config.json model_limits value.
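The diff-only storage idea behind is_default_limit can be sketched as a filter applied before saving: overrides equal to the global default are dropped, so only entries that actually change something are written. The Limit struct, SKETCH_DEFAULT, is_noop_override, and overrides_to_persist are hypothetical stand-ins, and the default values are assumed from the "1M context / 128K output" description.

```rust
use std::collections::BTreeMap;

/// Hypothetical limit pair; the real `ModelLimit` fields are not shown on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Limit {
    pub max_context_tokens: u32,
    pub max_output_tokens: u32,
}

// Assumed global default (1M context / 128K output, read as decimal thousands).
pub const SKETCH_DEFAULT: Limit = Limit {
    max_context_tokens: 1_000_000,
    max_output_tokens: 128_000,
};

/// An override identical to the global default carries no information.
pub fn is_noop_override(limit: &Limit) -> bool {
    *limit == SKETCH_DEFAULT
}

/// Keep only the overrides that differ from the default (diff-only storage).
pub fn overrides_to_persist(all: &BTreeMap<String, Limit>) -> BTreeMap<String, Limit> {
    all.iter()
        .filter(|(_, limit)| !is_noop_override(limit))
        .map(|(pattern, limit)| (pattern.clone(), *limit))
        .collect()
}
```

Storing only the differences means a later change to the global default automatically applies to every model the user never customized.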