Module curator

v0.7.0 WT-1-B — atomisation curator.

The curator is the LLM-facing half of the atomisation engine. It consumes one long memory body, asks Gemma 4 (E2B at the smart tier, E4B at the autonomous tier) to decompose it into atomic propositions, parses the structured JSON response, validates per-atom token budgets with tiktoken_rs::cl100k_base, and returns a Vec<Atom> ready for the substrate writer in super::Atomiser::atomise.

The curator is intentionally factored as a trait (Curator) so the substrate test suite can inject a deterministic mock (see tests/atomisation/core). The production implementation (LlmCurator) wraps an OllamaClient and sits on the hot path only when the daemon’s tier resolves to smart or higher.
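
The trait factoring described above can be sketched as follows. This is a minimal illustration, not the crate's actual definitions: the method name `curate`, the `text` field on `Atom`, the synchronous signature, and the helper `atomise_with` are all assumptions, since this page does not show the real signatures.

```rust
// Hypothetical shapes: the real `Atom`, `CuratorError`, and `Curator`
// signatures are not shown on this page.
#[derive(Debug, Clone, PartialEq)]
pub struct Atom {
    pub text: String,
}

#[derive(Debug, PartialEq)]
pub enum CuratorError {
    MalformedResponse(String),
}

/// Trait surface an atomiser could consume (method name is an assumption).
pub trait Curator {
    fn curate(&self, body: &str) -> Result<Vec<Atom>, CuratorError>;
}

/// Deterministic mock: splits on periods instead of calling an LLM.
pub struct MockCurator;

impl Curator for MockCurator {
    fn curate(&self, body: &str) -> Result<Vec<Atom>, CuratorError> {
        let atoms = body
            .split('.')
            .map(str::trim)
            .filter(|s| !s.is_empty())
            .map(|s| Atom { text: s.to_string() })
            .collect();
        Ok(atoms)
    }
}

/// Code written against the trait does not care which curator it receives.
pub fn atomise_with(curator: &dyn Curator, body: &str) -> Result<usize, CuratorError> {
    Ok(curator.curate(body)?.len())
}
```

Because the caller only sees `&dyn Curator`, a test can pass `MockCurator` where production code would pass an LLM-backed implementation, and the output stays reproducible.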

§Retry contract

Malformed JSON responses are retried up to curator_max_retries times (default 3) with exponential backoff (100 ms → 500 ms → 2500 ms). Each retry re-sends the original prompt verbatim, because the LLM call is stateless on our side. After the final attempt fails, the curator surfaces CuratorError::MalformedResponse carrying the last parser diagnostic; super::Atomiser::atomise maps that to super::AtomiseError::CuratorFailed.
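
The retry contract might look roughly like the sketch below. It assumes that "up to 3 retries" means one initial attempt plus three retries, and it injects the send, parse, and sleep steps as closures so the loop can be exercised without an LLM or real delays. The names `call_with_retries` and `delay_for` are hypothetical, and the local `CuratorError` is a stand-in for the real enum.

```rust
use std::time::Duration;

// Stand-in for the real error enum; only the variant named on this page.
#[derive(Debug, PartialEq)]
pub enum CuratorError {
    /// Carries the last parser diagnostic.
    MalformedResponse(String),
}

/// Delay before zero-based retry `retry`: 100 ms, 500 ms, 2500 ms (capped).
fn delay_for(retry: u32) -> Duration {
    Duration::from_millis(100 * 5u64.pow(retry.min(2)))
}

/// Hypothetical retry loop: `send` re-issues the same prompt each time,
/// `parse` returns a diagnostic string on malformed output.
pub fn call_with_retries<T>(
    max_retries: u32,
    mut send: impl FnMut() -> String,
    parse: impl Fn(&str) -> Result<T, String>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, CuratorError> {
    let mut last_diag = String::new();
    // Attempt 0 is the initial call; attempts 1..=max_retries are retries.
    for attempt in 0..=max_retries {
        if attempt > 0 {
            sleep(delay_for(attempt - 1));
        }
        match parse(&send()) {
            Ok(value) => return Ok(value),
            Err(diag) => last_diag = diag,
        }
    }
    Err(CuratorError::MalformedResponse(last_diag))
}
```

Passing `std::thread::sleep` as the `sleep` argument gives blocking behaviour; a test can pass a closure that records the requested delays instead.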

§Token-budget contract

Atoms up to 25% over budget are accepted as-is: the curator emits a warn-level log line and proceeds. Atoms more than 25% over budget are dropped by enforce_token_budget so that a pathological response cannot pollute the memory store. The rationale for the soft limit is documented in the WT-1-B brief (“fail-soft: accept atoms slightly over budget rather than retry-loop”). The substrate writer (governed by validate::validate_content), not the curator, is the authoritative gate on memory size.
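
A minimal sketch of the fail-soft policy, using the 25% threshold from the enforce_token_budget description. The whitespace-based `count_tokens` is a stand-in for the real cl100k_base tokenizer, and `BudgetVerdict`, `classify`, and `enforce` are hypothetical names chosen for illustration.

```rust
#[derive(Debug, PartialEq)]
pub enum BudgetVerdict {
    Within,
    AcceptedOverBudget, // the real curator would warn-log this case
    Dropped,
}

/// Stand-in token counter (whitespace split), NOT a cl100k_base tokenizer.
fn count_tokens(text: &str) -> usize {
    text.split_whitespace().count()
}

pub fn classify(text: &str, max_atom_tokens: usize) -> BudgetVerdict {
    let tokens = count_tokens(text);
    // Integer form of `tokens > 1.25 * budget`, avoiding float rounding.
    if tokens * 4 > max_atom_tokens * 5 {
        BudgetVerdict::Dropped
    } else if tokens > max_atom_tokens {
        BudgetVerdict::AcceptedOverBudget
    } else {
        BudgetVerdict::Within
    }
}

/// Keep atoms that are within budget or at most 25% over; drop the rest.
pub fn enforce(atoms: Vec<String>, max_atom_tokens: usize) -> Vec<String> {
    atoms
        .into_iter()
        .filter(|atom| match classify(atom, max_atom_tokens) {
            BudgetVerdict::Dropped => false,
            BudgetVerdict::AcceptedOverBudget => {
                eprintln!("warn: atom exceeds budget of {max_atom_tokens} tokens");
                true
            }
            BudgetVerdict::Within => true,
        })
        .collect()
}
```

With a budget of 4 tokens, a 5-token atom sits exactly 25% over and is kept with a warning, while a 6-token atom is dropped.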

Structs§

Atom: One proposed atom returned by the curator.
CuratorResponse: Top-level wire shape returned by the LLM.
LlmCurator: Production curator. Wraps an OllamaClient (or any crate::autonomy::AutonomyLlm-like surface). The existing generate shape is re-used via a free function rather than by coupling to the autonomy trait, because the autonomy trait does not expose generate(prompt, system).

Enums§

CuratorError: Curator-side error surface.

Constants§

CURATOR_SYSTEM_PROMPT: Verbatim system prompt sent to the LLM. The {max_atom_tokens} token is substituted at call time. The shape of the JSON response is pinned here — the parser depends on exactly this { atoms: [...] } envelope.
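
The call-time substitution could be sketched as below. The `TEMPLATE` text is a made-up placeholder, not the real CURATOR_SYSTEM_PROMPT, and `render` is a hypothetical helper standing in for render_system_prompt.

```rust
/// Hypothetical template; the real CURATOR_SYSTEM_PROMPT text is not shown here.
const TEMPLATE: &str = "Split the input into atomic propositions of at most \
{max_atom_tokens} tokens each. Reply with JSON only: {\"atoms\": [...]}";

/// Substitute the token budget into the template at call time.
pub fn render(template: &str, max_atom_tokens: usize) -> String {
    template.replace("{max_atom_tokens}", &max_atom_tokens.to_string())
}
```

A plain `str::replace` is used here rather than `format!` because `format!` requires a literal format string and would treat the braces of the JSON envelope as placeholders.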

Traits§

Curator: Trait surface the super::Atomiser consumes.
LlmGenerate: Minimal generate surface the curator needs. Implemented for crate::llm::OllamaClient in the same module; the trait stays here (not in src/llm.rs) so external callers don’t accidentally pull it into their wire path.

Functions§

backoff_for_attempt: Exponential backoff schedule for the curator retry loop: 100 ms, 500 ms, 2500 ms. Indexed by zero-based retry attempt; out of range collapses to the last entry so a misconfigured retry cap does not surface a panic!.
enforce_token_budget: Token-budget guardrail. Accepts atoms up to 25% over budget (warn-logging any overshoot) and drops atoms more than 25% over budget so that a pathological response cannot pollute the memory store.
parse_response: Try to parse one candidate response body into a CuratorResponse.
render_system_prompt: Render the system prompt with the supplied token budget substituted.
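
The out-of-range behaviour described for backoff_for_attempt can be illustrated with a clamped table lookup. The signature and the `BACKOFF_MS` constant are assumptions; only the schedule and the "collapse to the last entry" rule come from the description above.

```rust
use std::time::Duration;

/// Schedule from the function description: 100 ms, 500 ms, 2500 ms.
const BACKOFF_MS: [u64; 3] = [100, 500, 2500];

/// Zero-based retry index; anything past the table clamps to the last entry,
/// so an oversized retry cap cannot cause an out-of-bounds panic.
pub fn backoff_for_attempt(attempt: usize) -> Duration {
    let idx = attempt.min(BACKOFF_MS.len() - 1);
    Duration::from_millis(BACKOFF_MS[idx])
}
```

Clamping the index rather than indexing directly is what keeps a misconfigured curator_max_retries from turning into a panic.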
