Deferred MCP tool loading — stubs and activated-tool tracking.
When mcp.deferred_loading is enabled, MCP tool schemas are NOT eagerly
included in the LLM context window. Instead, only lightweight stubs (name +
description) are exposed in the system prompt. The LLM must call the built-in
tool_search tool to fetch full schemas, which moves them into the
ActivatedToolSet for the current conversation.
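The activation flow described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual implementation: `StubSketch`, `ActivatedSketch`, `specs_for_request`, and the tool names used with them are hypothetical stand-ins, and the real `ActivatedToolSet` may store resolved tool wrappers rather than bare names.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a deferred tool stub (name + description only).
#[derive(Debug, Clone)]
pub struct StubSketch {
    pub name: String,
    pub description: String,
}

/// Hypothetical stand-in for the per-conversation activation state.
#[derive(Debug, Default)]
pub struct ActivatedSketch {
    names: HashSet<String>,
}

impl ActivatedSketch {
    /// Record that a tool's full schema was fetched via the search tool.
    /// Returns true if the tool was not already activated.
    pub fn activate(&mut self, name: &str) -> bool {
        self.names.insert(name.to_string())
    }

    /// Whether the tool has been activated in this conversation.
    pub fn is_activated(&self, name: &str) -> bool {
        self.names.contains(name)
    }
}

/// Names of the tools whose full specs would be sent in the next request.
/// Stubs that were never activated stay out of the request entirely.
pub fn specs_for_request<'a>(
    stubs: &'a [StubSketch],
    activated: &ActivatedSketch,
) -> Vec<&'a str> {
    stubs
        .iter()
        .filter(|stub| activated.is_activated(&stub.name))
        .map(|stub| stub.name.as_str())
        .collect()
}
```

In this sketch, a loop iteration would call `specs_for_request` before each LLM request, so a tool activated in one turn shows up with its full schema from the next request onward.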
Structs§
- ActivatedToolSet - Per-conversation mutable state tracking which deferred tools have been activated (i.e. their full schemas have been fetched via tool_search). The agent loop consults this each iteration to decide which tool_specs to include in the LLM request.
- DeferredMcpToolSet - Collection of all deferred MCP tool stubs discovered at startup. Provides keyword search for tool_search.
- DeferredMcpToolStub - A lightweight stub representing a known-but-not-yet-loaded MCP tool. Contains only the prefixed name, a human-readable description, and enough information to construct the full McpToolWrapper on activation.
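One plausible shape for the keyword search that DeferredMcpToolSet provides is sketched below. The matching and ranking rules here (case-insensitive substring match, ranked by number of matching query terms) are assumptions chosen for illustration; the crate's actual scoring is not documented on this page. `SearchableStub` and `keyword_search` are hypothetical names.

```rust
/// Hypothetical stub shape: only the fields relevant to searching.
pub struct SearchableStub {
    pub prefixed_name: String,
    pub description: String,
}

/// Return up to `limit` stubs that match at least one whitespace-separated
/// query term, best matches first. Matching is a case-insensitive substring
/// test against the name and description (an assumed rule, not the real one).
pub fn keyword_search<'a>(
    stubs: &'a [SearchableStub],
    query: &str,
    limit: usize,
) -> Vec<&'a SearchableStub> {
    let terms: Vec<String> = query
        .split_whitespace()
        .map(|term| term.to_lowercase())
        .collect();

    let mut scored: Vec<(usize, &'a SearchableStub)> = stubs
        .iter()
        .filter_map(|stub| {
            let haystack =
                format!("{} {}", stub.prefixed_name, stub.description).to_lowercase();
            // Score = number of query terms found in the name or description.
            let hits = terms
                .iter()
                .filter(|term| haystack.contains(term.as_str()))
                .count();
            (hits > 0).then_some((hits, stub))
        })
        .collect();

    // Stable sort: stubs with equal scores keep their discovery order.
    scored.sort_by(|a, b| b.0.cmp(&a.0));
    scored.into_iter().take(limit).map(|(_, stub)| stub).collect()
}
```

An empty query matches nothing in this sketch, which keeps a stray tool_search call from pulling every schema into the context window.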
Constants§
- LOCAL_MODEL_EAGER_SUFFIXES - Minimal set of operator tool suffixes loaded eagerly for local models (e.g. Ollama). Everything else is deferred behind tool_search.
- OPERATOR_MEMORY_REFLEX_TOOLS - Kumiho memory reflexes, kept eager for every provider hosting the Operator seat so the agent can engage/reflect without a tool_search indirection on every turn.
Functions§
- build_deferred_tools_section - Build the <available-deferred-tools> section for the system prompt. Lists only tool names so the LLM knows what is available without consuming context window on full schemas. Includes an instruction block that tells the LLM to call tool_search to activate them.
- is_local_model_eager_tool - Returns true if this MCP tool name should be eagerly loaded for a local model, based on whether its unprefixed suffix matches LOCAL_MODEL_EAGER_SUFFIXES.
- is_operator_seat_eager_tool - Returns true if this MCP tool name should be eagerly loaded for any provider hosting the Operator seat (cloud or local). Combines the curated operator essentials with the Kumiho memory reflexes. Cloud providers previously loaded all operator tools (100+), blowing out per-turn input tokens; the rest are discoverable via tool_search.
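The suffix check and the prompt section can be illustrated with the sketch below. Several details are assumptions: the `__` separator between server prefix and tool name, the contents of `EAGER_SUFFIXES_SKETCH`, and the wording of the instruction line are all made up for the example, and none of these function names belong to the crate. Only the <available-deferred-tools> tag comes from the documentation above.

```rust
/// Hypothetical suffix list; the real LOCAL_MODEL_EAGER_SUFFIXES entries
/// are not shown on this page.
const EAGER_SUFFIXES_SKETCH: &[&str] = &["status", "ping"];

/// Strip the server prefix, assuming names look like `<server>__<tool>`.
/// The real separator may differ. Names without a prefix are returned as-is.
fn unprefixed_suffix(prefixed_name: &str) -> &str {
    prefixed_name
        .split_once("__")
        .map_or(prefixed_name, |(_, tool)| tool)
}

/// True if the tool's unprefixed suffix is in the eager list.
pub fn is_eager_sketch(prefixed_name: &str) -> bool {
    EAGER_SUFFIXES_SKETCH.contains(&unprefixed_suffix(prefixed_name))
}

/// Build a names-only prompt section followed by a one-line instruction.
/// Returns an empty string when there is nothing deferred to advertise.
pub fn build_section_sketch(tool_names: &[&str]) -> String {
    if tool_names.is_empty() {
        return String::new();
    }
    let mut out = String::from("<available-deferred-tools>\n");
    for name in tool_names {
        out.push_str(name);
        out.push('\n');
    }
    out.push_str("</available-deferred-tools>\n");
    // Assumed wording; the real instruction block is not documented here.
    out.push_str("Call tool_search with keywords to load a tool's full schema before using it.\n");
    out
}
```

Under these assumptions, tools for which `is_eager_sketch` returns true would be sent with full schemas on every turn, and only the remaining names would be passed to `build_section_sketch`.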