
Module compact


History compaction.

Manages the size of the conversation history by summarizing older messages as the conversation approaches the context window limit. Implements three compaction strategies:

  • Auto-compact: triggered when estimated tokens exceed threshold
  • Reactive compact: triggered by API prompt_too_long errors
  • Microcompact: clears stale tool results to free tokens
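Conceptually, the three strategies form a priority chain. A minimal sketch of that ordering, assuming hypothetical names (`Strategy` and `choose_strategy` are illustrative only; the module instead exposes separate functions such as `should_auto_compact` and `microcompact`):

```rust
/// Which compaction path to take. Illustrative enum, not a type this
/// module actually exports.
#[derive(Debug, PartialEq)]
enum Strategy {
    Auto,     // estimated tokens crossed the auto-compact threshold
    Reactive, // the API already rejected the prompt as too long
    Micro,    // stale tool results worth clearing in place
}

/// Hypothetical dispatcher: the reactive path takes priority because the
/// API has already failed the request; otherwise compare the local token
/// estimate against the auto-compact threshold, and fall back to
/// microcompact if any stale tool-result tokens can be reclaimed.
fn choose_strategy(
    estimated_tokens: u64,
    threshold: u64,
    prompt_too_long: bool,
    stale_tool_tokens: u64,
) -> Option<Strategy> {
    if prompt_too_long {
        Some(Strategy::Reactive)
    } else if estimated_tokens > threshold {
        Some(Strategy::Auto)
    } else if stale_tool_tokens > 0 {
        Some(Strategy::Micro)
    } else {
        None
    }
}

fn main() {
    assert_eq!(choose_strategy(170_000, 167_000, false, 0), Some(Strategy::Auto));
    assert_eq!(choose_strategy(100_000, 167_000, true, 0), Some(Strategy::Reactive));
    assert_eq!(choose_strategy(100_000, 167_000, false, 5_000), Some(Strategy::Micro));
    assert_eq!(choose_strategy(100_000, 167_000, false, 0), None);
}
```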

Thresholds

|<--- context window (e.g., 200K) -------------------------------->|
|<--- effective window (context - 20K reserved) ------------------>|
|<--- auto-compact threshold (effective - 13K buffer) ------------>|
|                                                    ↑ compact fires here
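The arithmetic in the diagram can be sketched directly. The constants (20K output reservation, 13K buffer) are taken from the figure above; the function names mirror `effective_context_window` and `auto_compact_threshold` below, but the signatures and constant names are assumptions:

```rust
/// Tokens held back for model output (value from the diagram; the real
/// constant name is an assumption).
const OUTPUT_RESERVATION: u64 = 20_000;
/// Safety buffer before the effective window (value from the diagram).
const COMPACT_BUFFER: u64 = 13_000;

/// Effective window: total context minus the output reservation.
fn effective_context_window(context_window: u64) -> u64 {
    context_window.saturating_sub(OUTPUT_RESERVATION)
}

/// Auto-compact fires this many tokens before the effective window ends.
fn auto_compact_threshold(context_window: u64) -> u64 {
    effective_context_window(context_window).saturating_sub(COMPACT_BUFFER)
}

fn main() {
    // With a 200K context window: 180K effective, 167K threshold.
    assert_eq!(effective_context_window(200_000), 180_000);
    assert_eq!(auto_compact_threshold(200_000), 167_000);
}
```

`saturating_sub` keeps the math from underflowing on unrealistically small context windows rather than panicking in debug builds.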

Structs

CompactTracking
Tracking state for auto-compact across turns.
TokenWarningState
Token warning state for the UI.

Constants

MAX_OUTPUT_TOKENS_RECOVERY_LIMIT
Maximum recovery attempts for max-output-tokens errors.

Functions

auto_compact_threshold
Calculate the auto-compact threshold.
build_compact_summary_prompt
Build a compact summary request: asks the LLM to summarize the conversation up to a certain point.
compact_boundary_message
Create a compact boundary marker message.
compact_with_llm
Perform full LLM-based compaction of the conversation history.
effective_context_window
Calculate the effective context window (total minus output reservation).
max_output_recovery_message
Build the recovery message injected when max-output-tokens is hit.
microcompact
Perform microcompact: clear stale tool results to free tokens.
parse_prompt_too_long_gap
Parse a “prompt too long” error to extract the token gap.
should_auto_compact
Check whether auto-compact should fire for this conversation.
token_warning_state
Calculate token warning state for the current conversation.
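As an illustration of the reactive path, `parse_prompt_too_long_gap` extracts how far over the limit a request was. A minimal sketch, assuming an error message of the form `prompt is too long: 215000 tokens > 200000 maximum` (the exact error text, and therefore this parsing strategy, is an assumption):

```rust
/// Pull the first two numbers out of a "prompt too long" message and
/// return their difference (actual - maximum), i.e. the token gap that
/// compaction must close. Returns None if the message doesn't match.
fn parse_prompt_too_long_gap(msg: &str) -> Option<u64> {
    // Split on every non-digit character, then parse the digit runs.
    let nums: Vec<u64> = msg
        .split(|c: char| !c.is_ascii_digit())
        .filter_map(|s| s.parse().ok())
        .collect();
    match nums.as_slice() {
        // First number is the actual prompt size, second the maximum.
        &[actual, max, ..] if actual > max => Some(actual - max),
        _ => None,
    }
}

fn main() {
    let msg = "prompt is too long: 215000 tokens > 200000 maximum";
    assert_eq!(parse_prompt_too_long_gap(msg), Some(15_000));
    assert_eq!(parse_prompt_too_long_gap("unrelated error"), None);
}
```

A caller would add this gap (plus a margin) to the amount of history the next compaction pass must remove.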