History compaction.
Manages conversation history size by summarizing older messages as the context window limit approaches. Implements three compaction strategies:
- Auto-compact: triggered when estimated tokens exceed the auto-compact threshold
- Reactive compact: triggered by API `prompt_too_long` errors
- Microcompact: clears stale tool results to free tokens
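A hypothetical sketch of how these three strategies might be dispatched. The names (`Compaction`, `choose`, the parameters) are illustrative only, not this module's actual API; the assumed policy is that an API rejection forces reactive compaction, and that microcompact is preferred whenever clearing stale tool results alone would drop usage back under the threshold:

```rust
/// Illustrative strategy enum; not part of this module's real API.
enum Compaction {
    /// Summarize older messages with the LLM (auto-compact).
    Auto,
    /// Same summarization, but forced by an API `prompt_too_long` error.
    Reactive { token_gap: u64 },
    /// Clear stale tool results without summarizing.
    Micro,
    /// Nothing to do yet.
    None,
}

/// Pick a strategy from the current token estimate, the auto-compact
/// threshold, any token gap reported by the API, and an estimate of
/// how many tokens stale tool results are holding.
fn choose(
    estimated_tokens: u64,
    threshold: u64,
    api_gap: Option<u64>,
    stale_tool_tokens: u64,
) -> Compaction {
    match api_gap {
        // The API already rejected the prompt: compact reactively.
        Some(gap) => Compaction::Reactive { token_gap: gap },
        None if estimated_tokens >= threshold => {
            // Prefer the cheap path when clearing stale tool results
            // alone would bring us back under the threshold.
            if estimated_tokens - stale_tool_tokens < threshold {
                Compaction::Micro
            } else {
                Compaction::Auto
            }
        }
        None => Compaction::None,
    }
}

fn main() {
    assert!(matches!(choose(150_000, 167_000, None, 0), Compaction::None));
    assert!(matches!(choose(170_000, 167_000, None, 10_000), Compaction::Micro));
    assert!(matches!(choose(170_000, 167_000, None, 1_000), Compaction::Auto));
    assert!(matches!(
        choose(100_000, 167_000, Some(5_000), 0),
        Compaction::Reactive { .. }
    ));
    println!("dispatch ok");
}
```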
§Thresholds
|<--- context window (e.g., 200K) -------------------------------->|
|<--- effective window (context - 20K reserved) ------------------>|
|<--- auto-compact threshold (effective - 13K buffer) ------------>|
|                                       ↑ compact fires here

Structs§
- CompactTracking - Tracking state for auto-compact across turns.
- TokenWarningState - Token warning state for the UI.
Constants§
- MAX_OUTPUT_TOKENS_RECOVERY_LIMIT - Maximum recovery attempts for max-output-tokens errors.
Functions§
- auto_compact_threshold - Calculate the auto-compact threshold.
- build_compact_summary_prompt - Build a compact summary request: asks the LLM to summarize the conversation up to a certain point.
- compact_boundary_message - Create a compact boundary marker message.
- compact_with_llm - Perform full LLM-based compaction of the conversation history.
- effective_context_window - Calculate the effective context window (total minus output reservation).
- max_output_recovery_message - Build the recovery message injected when max-output-tokens is hit.
- microcompact - Perform microcompact: clear stale tool results to free tokens.
- parse_prompt_too_long_gap - Parse a “prompt too long” error to extract the token gap.
- should_auto_compact - Check whether auto-compact should fire for this conversation.
- token_warning_state - Calculate token warning state for the current conversation.