ReasoningBank: distilled reasoning strategy memory (#3342).
After each completed agent turn, a three-stage async pipeline runs off the hot path:
- Self-judge (run_self_judge): a fast LLM evaluates success/failure and extracts the key reasoning steps.
- Distillation (distill_strategy): a strategy summary (≤ 3 sentences) is generated from the reasoning chain, capturing the transferable principle.
- Storage (ReasoningMemory::insert): the summary is written to SQLite and, when Qdrant is available, embedded and indexed for vector retrieval.
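The three stages above can be sketched as plain functions. This is a minimal, synchronous illustration, not the crate's implementation: the real pipeline is async and calls an LLM, whereas here the judge is a keyword stub and the store is a Vec. All names (JudgeResult, Strategy, judge, distill, store, process) and the local Outcome enum are hypothetical stand-ins and do not reflect the actual signatures of run_self_judge, distill_strategy, or ReasoningMemory::insert.

```rust
/// Hypothetical stand-in for the crate's `Outcome` enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Success,
    Failure,
}

/// Hypothetical result of the self-judge stage.
#[derive(Debug, Clone, PartialEq)]
struct JudgeResult {
    outcome: Outcome,
    reasoning_steps: Vec<String>,
}

/// Hypothetical distilled strategy ready for storage.
#[derive(Debug, Clone, PartialEq)]
struct Strategy {
    outcome: Outcome,
    summary: String,
}

/// Stage 1 (stub): the real step asks a fast LLM. Here a turn counts as a
/// failure if any message mentions "error", and every message is kept as a
/// reasoning step.
fn judge(messages: &[&str]) -> JudgeResult {
    let failed = messages.iter().any(|m| m.to_lowercase().contains("error"));
    JudgeResult {
        outcome: if failed { Outcome::Failure } else { Outcome::Success },
        reasoning_steps: messages.iter().map(|m| m.to_string()).collect(),
    }
}

/// Stage 2 (stub): cap the summary at three sentences by keeping the first
/// three reasoning steps. The real step generates the summary with an LLM.
fn distill(judged: &JudgeResult) -> Strategy {
    let summary = judged
        .reasoning_steps
        .iter()
        .take(3)
        .cloned()
        .collect::<Vec<_>>()
        .join(" ");
    Strategy { outcome: judged.outcome, summary }
}

/// Stage 3 (stub): append to an in-memory table instead of SQLite/Qdrant.
fn store(table: &mut Vec<Strategy>, strategy: Strategy) {
    table.push(strategy);
}

/// Runs the three stages in order for a single turn.
fn process(table: &mut Vec<Strategy>, messages: &[&str]) {
    let judged = judge(messages);
    let strategy = distill(&judged);
    store(table, strategy);
}
```

Calling process with a four-message turn stores exactly one strategy whose summary contains only the first three messages. In the real pipeline each stage would presumably be awaited on a background task so that the agent turn itself is not delayed.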
At context-build time ReasoningMemory::retrieve_by_embedding fetches top-k
strategies by embedding similarity. The caller (in zeph-context) calls
ReasoningMemory::mark_used only for strategies actually injected into the prompt,
after budget truncation (C4 split from architect plan).
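The split between retrieval and usage tracking can be illustrated with an in-memory sketch. It assumes cosine similarity as the ranking metric and a character budget for truncation; the source specifies neither, and the real budget is likely measured differently. Row, cosine, top_k, and inject_within_budget are hypothetical names, not the API of ReasoningMemory.

```rust
/// Hypothetical in-memory row; the real rows live in SQLite and Qdrant.
#[derive(Debug, Clone)]
struct Row {
    id: u32,
    summary: String,
    embedding: Vec<f32>,
    use_count: u32,
}

/// Cosine similarity (assumed metric); returns 0.0 for zero-length vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Retrieval: ids of the `k` rows most similar to `query`, best first.
/// Nothing is marked as used at this point.
fn top_k(rows: &[Row], query: &[f32], k: usize) -> Vec<u32> {
    let mut scored: Vec<(f32, u32)> = rows
        .iter()
        .map(|r| (cosine(&r.embedding, query), r.id))
        .collect();
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored.into_iter().take(k).map(|(_, id)| id).collect()
}

/// Caller side: inject candidates in rank order until the budget (in bytes
/// of summary text, an assumption) is exhausted. Only injected rows get
/// their `use_count` bumped, mirroring the role of `mark_used`.
fn inject_within_budget(rows: &mut [Row], candidates: &[u32], budget: usize) -> Vec<u32> {
    let mut spent = 0;
    let mut injected = Vec::new();
    for id in candidates {
        if let Some(row) = rows.iter_mut().find(|r| r.id == *id) {
            let cost = row.summary.len();
            if spent + cost > budget {
                break; // truncated: remaining candidates stay unmarked
            }
            spent += cost;
            row.use_count += 1;
            injected.push(row.id);
        }
    }
    injected
}
```

Marking usage after truncation keeps use_count honest: a strategy that was retrieved but cut for lack of budget never influenced the prompt, so it should not gain protection from eviction.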
§LRU eviction
ReasoningMemory::evict_lru protects rows with use_count > HOT_STRATEGY_USE_COUNT
(default 10) from normal eviction. When all rows are hot and the table exceeds
2 × store_limit, a forced eviction pass deletes the oldest rows unconditionally
and emits a warn! so operators can tune store_limit upward.
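The eviction policy described above can be sketched over an in-memory table. This is an illustration under stated assumptions, not the SQL the crate runs: Row and its last_used field are hypothetical, the normal pass is assumed to trigger once the table exceeds store_limit, the forced pass is assumed to trim back down to store_limit (the source does not say how far it trims), and eprintln! stands in for warn!.

```rust
/// Threshold above which a row counts as hot (default taken from the text).
const HOT_STRATEGY_USE_COUNT: u32 = 10;

/// Hypothetical in-memory row; `last_used` is a logical timestamp.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    id: u32,
    use_count: u32,
    last_used: u64,
}

/// Evicts least-recently-used rows. Returns `true` when the forced pass ran.
fn evict_lru(rows: &mut Vec<Row>, store_limit: usize) -> bool {
    if rows.len() <= store_limit {
        return false;
    }
    // Oldest rows first, so both passes remove in LRU order.
    rows.sort_by_key(|r| r.last_used);

    // Normal pass: remove cold rows until the excess is gone; hot rows are
    // skipped even when they are the oldest.
    let mut excess = rows.len() - store_limit;
    rows.retain(|r| {
        if excess > 0 && r.use_count <= HOT_STRATEGY_USE_COUNT {
            excess -= 1;
            false
        } else {
            true
        }
    });

    // Forced pass: only when every remaining row is hot and the table has
    // grown past twice the limit. Trimming to `store_limit` is an assumption.
    let all_hot = rows.iter().all(|r| r.use_count > HOT_STRATEGY_USE_COUNT);
    if all_hot && rows.len() > 2 * store_limit {
        let overflow = rows.len() - store_limit;
        rows.drain(..overflow);
        eprintln!("warn: forced eviction of {overflow} hot rows; consider raising store_limit");
        return true;
    }
    false
}
```

With store_limit = 2, a table of four all-hot rows is left untouched (4 is not greater than 2 × 2), while a table of five all-hot rows triggers the forced pass. The 2 × store_limit ceiling is what keeps the table bounded even when hot-row protection would otherwise block every eviction.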
§LRU eviction race note
Two concurrent turns may race on the count check in evict_lru. In that case either both
evict (over-eviction by at most top_k rows) or neither does. This is acceptable for the
MVP because the table remains bounded.
Structs§
- OutcomeParseError: Error returned when parsing an Outcome from a string fails.
- ProcessTurnConfig: Configuration for the process_turn extraction pipeline.
- ReasoningMemory: SQLite-backed store for distilled reasoning strategies.
- ReasoningStrategy: A distilled reasoning strategy row from the reasoning_strategies table.
- SelfJudgeOutcome: Parsed response from the self-judge LLM call.
Enums§
- Outcome: Outcome of a reasoning strategy: whether the agent succeeded or failed.
Constants§
- REASONING_COLLECTION: Qdrant collection name used for reasoning-strategy embeddings.
Functions§
- distill_strategy: Run the distillation step.
- process_turn: Run the full extraction pipeline for a single turn.
- run_self_judge: Run the self-judge step against a turn's message tail.