Module limits

Model context window limits registry.

There is intentionally no built-in per-model table. Real per-model values come from two sources, in priority order:

  1. provider runtime metadata (e.g. Copilot reports real context/output),
  2. user overrides persisted in model_limits.json.

Anything without a match falls back to a single global default (DEFAULT_MAX_CONTEXT_TOKENS / DEFAULT_MAX_OUTPUT_TOKENS). This keeps the registry from going stale as models churn — see token_budget.rs.
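The priority order described above can be sketched as follows. This is a minimal illustration, not the module's actual code: `Limit` and `resolve_limit` are hypothetical names, the constant types are assumed to be `usize`, and the exact values are assumptions (the docs only say "1M context / 128K output").

```rust
/// Assumed values and types; the source only states "1M context / 128K output".
pub const DEFAULT_MAX_CONTEXT_TOKENS: usize = 1_000_000;
pub const DEFAULT_MAX_OUTPUT_TOKENS: usize = 128_000;

/// Hypothetical, simplified stand-in for the module's `ModelLimit`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Limit {
    pub max_context_tokens: usize,
    pub max_output_tokens: usize,
}

/// Resolve a limit in priority order:
/// provider runtime metadata, then user override, then the global default.
pub fn resolve_limit(provider: Option<Limit>, user_override: Option<Limit>) -> Limit {
    provider.or(user_override).unwrap_or(Limit {
        max_context_tokens: DEFAULT_MAX_CONTEXT_TOKENS,
        max_output_tokens: DEFAULT_MAX_OUTPUT_TOKENS,
    })
}
```

Because the fallback is a single value instead of a per-model table, adding a new model requires no change to this logic.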

Structs§

ModelLimit
Model limit configuration (user-overridable).
ModelLimitsRegistry
Registry for model limits, combining the single global default with user overrides.

Constants§

DEFAULT_MAX_CONTEXT_TOKENS
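Global default context window applied to any model without a provider metadata value or a user override. 1M matches the upper end of current frontier-model context windows (e.g. Gemini 1.5); models with smaller windows (e.g. Claude 3.5, GPT-4o) should get their real limits from provider metadata or a user override.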
DEFAULT_MAX_OUTPUT_TOKENS
Global default maximum output tokens.
DEFAULT_MODEL_PATTERN
Sentinel pattern used for the single global fallback limit.
DEFAULT_SAFETY_MARGIN
Default safety margin that absorbs token-counting errors. It acts as a floor: the effective margin scales with the context window via ModelLimit::get_safety_margin.
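One way a floor-plus-scaling margin could work is sketched below. The floor value, the 1% ratio, and the free function `safety_margin` are all assumptions made for illustration; the real rule lives in ModelLimit::get_safety_margin and may differ.

```rust
/// Assumed floor; the real value of DEFAULT_SAFETY_MARGIN is not shown in these docs.
pub const DEFAULT_SAFETY_MARGIN: usize = 1_000;

/// Hypothetical scaling rule: 1% of the context window, never below the floor.
pub fn safety_margin(max_context_tokens: usize) -> usize {
    (max_context_tokens / 100).max(DEFAULT_SAFETY_MARGIN)
}
```

Under these assumed numbers, a small window keeps the floor while a large window gets a proportionally larger margin.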

Functions§

create_budget_for_model
Create a token budget for a specific model.
default_model_limit
Build the single global default limit (1M context / 128K output).
get_default_config_path
Get the default configuration file path.
is_default_limit
Whether a user override is a no-op — identical to the global default, so it carries no information and need not be persisted (diff-only storage).
load_model_limits_from_unified_config
Load user model limits from the model_limits value in the unified config.json.
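The diff-only storage idea behind is_default_limit can be sketched like this. `StoredLimit`, `default_limit`, `is_default`, and `overrides_to_persist` are hypothetical names, and the default values are assumptions based on the "1M context / 128K output" description; the module's real types and signatures are not shown here.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for the module's `ModelLimit`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoredLimit {
    pub max_context_tokens: usize,
    pub max_output_tokens: usize,
}

/// Assumed global default ("1M context / 128K output").
pub fn default_limit() -> StoredLimit {
    StoredLimit {
        max_context_tokens: 1_000_000,
        max_output_tokens: 128_000,
    }
}

/// An override equal to the global default carries no information.
pub fn is_default(limit: &StoredLimit) -> bool {
    *limit == default_limit()
}

/// Keep only the overrides that differ from the default before persisting.
pub fn overrides_to_persist(
    all: &BTreeMap<String, StoredLimit>,
) -> BTreeMap<String, StoredLimit> {
    all.iter()
        .filter(|(_, limit)| !is_default(limit))
        .map(|(pattern, limit)| (pattern.clone(), limit.clone()))
        .collect()
}
```

Filtering at write time keeps the persisted file small, and entries that match the default follow any later change to that default instead of pinning the old value.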