Session history compression via the RLM router.
This module contains the context-window enforcement logic that keeps
the prompt under the model’s token budget. It is invoked automatically
at the start of every agent step by Session::run_loop.
§Strategy
- Estimate the current request token cost (system + messages + tools).
- If it exceeds 90% of the model’s usable budget, compress the prefix of the conversation via RlmRouter::auto_process, keeping the most recent keep_last messages verbatim.
- Progressively shrink keep_last (16 → 12 → 8 → 6) until the budget is met or nothing more can be compressed.
The compressed prefix is replaced by a single synthetic assistant
message tagged [AUTO CONTEXT COMPRESSION] so the model sees a
coherent summary rather than a truncated tail.
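
The strategy above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the Message and Role types, the estimate_tokens heuristic (about 4 characters per token), the summarize stand-in for RlmRouter::auto_process, and the enforce_budget function name are all assumptions made for the example. Only the 90% threshold, the keep_last schedule, and the [AUTO CONTEXT COMPRESSION] tag come from the description.

```rust
// Hypothetical types; the real session and message types are not shown here.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    content: String,
}

const COMPRESSION_TAG: &str = "[AUTO CONTEXT COMPRESSION]";
const KEEP_LAST_SCHEDULE: [usize; 4] = [16, 12, 8, 6];

/// Rough token estimate over system prompt, messages, and tool definitions.
/// Assumption: about 4 characters per token; a real tokenizer would differ.
fn estimate_tokens(system: &str, messages: &[Message], tools: &str) -> usize {
    let chars = system.len()
        + tools.len()
        + messages.iter().map(|m| m.content.len()).sum::<usize>();
    (chars + 3) / 4
}

/// Stand-in for RlmRouter::auto_process (assumption): returns a placeholder
/// summary instead of calling a model.
fn summarize(prefix: &[Message]) -> String {
    format!("{} summary of {} earlier messages", COMPRESSION_TAG, prefix.len())
}

/// Compresses the conversation prefix until the estimate is at or below 90%
/// of `usable_budget`. Returns true if the budget was met.
fn enforce_budget(
    system: &str,
    messages: &mut Vec<Message>,
    tools: &str,
    usable_budget: usize,
) -> bool {
    let threshold = usable_budget * 9 / 10;
    for &keep_last in KEEP_LAST_SCHEDULE.iter() {
        if estimate_tokens(system, messages, tools) <= threshold {
            return true;
        }
        if messages.len() <= keep_last {
            continue; // nothing older than the kept tail at this level
        }
        let split = messages.len() - keep_last;
        let summary = summarize(&messages[..split]);
        // Replace the prefix with one synthetic assistant message.
        messages.drain(..split);
        messages.insert(
            0,
            Message {
                role: Role::Assistant,
                content: summary,
            },
        );
    }
    estimate_tokens(system, messages, tools) <= threshold
}
```

In this sketch, a conversation that is already under the threshold is returned untouched, and each later pass folds the previous synthetic summary into the new one, so at most one tagged message sits at the head of the history.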