v0.4.0 (P0-3) — Code-mode recall.
Cloudflare’s Code Mode MCP (announced 2026-04-24) showed that
agents calling tools through generated host-side code instead of
JSON tool envelopes can drop per-turn token cost by ~99.9%. The
same shape applies to recall: instead of the LLM paying for
tool_call(recall, {query: "..."}) plus tool_result([memory, memory, ...]) JSON each turn, the host hands the LLM a
sandboxed wasm host whose imports it can call as plain functions.
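To make the size difference concrete, here is a rough sketch comparing the text an LLM would pay for in each shape. Everything in it is an assumption for illustration: `rough_token_estimate` is a hypothetical four-characters-per-token heuristic (the crate's own `estimate_tokens` may use a different rule), and the JSON envelope layout is invented, not taken from any MCP specification.

```rust
/// Hypothetical estimator: roughly four characters per token, rounded up.
/// This is an illustrative heuristic, not the crate's `estimate_tokens`.
fn rough_token_estimate(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Builds an illustrative JSON tool_call + tool_result pair for one turn.
/// The envelope layout is an assumption; the query and memories are not
/// JSON-escaped, which is acceptable only for this size comparison.
fn json_envelope(query: &str, memories: &[&str]) -> String {
    let results: Vec<String> = memories
        .iter()
        .map(|m| format!("{{\"memory\":\"{}\"}}", m))
        .collect();
    format!(
        "{{\"tool_call\":{{\"name\":\"recall\",\"arguments\":{{\"query\":\"{}\"}}}}}}{{\"tool_result\":[{}]}}",
        query,
        results.join(",")
    )
}

/// Builds the plain function call a guest program would contain instead.
fn code_mode_call(query: &str) -> String {
    format!("recall(\"{}\")", query)
}
```

Under these assumptions the plain call costs only the length of the query plus a few characters, while the envelope grows with every memory returned. The actual saving depends on the tokenizer and on how many results the host would otherwise serialize into the conversation.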
This crate ships:
- The host-side data shapes CodeModeRecall, RecallBundle, ResourceBudget.
- A pure host-side runner, run_code_mode_host, that accepts a pre-built guest “program” (a list of recall calls) and produces a RecallBundle. It is used by the token-budget tests and the mnemo recall --code-mode CLI in the binary crate.
- The WIT world definition under wit/mnemo-memory.wit describing the import/export surface a real wasm guest must implement.
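The runner described in the list above can be pictured with a minimal sketch. It does not use the crate's real types: `SketchStep`, `SketchBudget`, `SketchBundle`, `SketchError`, and `run_sketch` are hypothetical stand-ins, the in-memory `store` slice and substring matching are assumptions, and the actual fields and signature of `run_code_mode_host` may differ.

```rust
/// Hypothetical stand-in for one recall call in a guest program.
#[derive(Debug, Clone)]
pub struct SketchStep {
    pub query: String,
    pub limit: usize,
}

/// Hypothetical budget: caps on steps accepted and tokens returned.
#[derive(Debug, Clone, Copy)]
pub struct SketchBudget {
    pub max_steps: usize,
    pub max_tokens: usize,
}

/// Hypothetical result bundle handed back to the caller.
#[derive(Debug, Default, PartialEq)]
pub struct SketchBundle {
    pub memories: Vec<String>,
    pub tokens_used: usize,
}

/// Hypothetical error type for programs the host refuses to run.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    TooManySteps { allowed: usize, got: usize },
}

/// Illustrative heuristic: about four characters per token, rounded up.
fn rough_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Runs each step against an in-memory store (case-insensitive substring
/// match) and stops early once the token budget would be exceeded.
pub fn run_sketch(
    store: &[&str],
    program: &[SketchStep],
    budget: SketchBudget,
) -> Result<SketchBundle, SketchError> {
    if program.len() > budget.max_steps {
        return Err(SketchError::TooManySteps {
            allowed: budget.max_steps,
            got: program.len(),
        });
    }
    let mut bundle = SketchBundle::default();
    for step in program {
        let needle = step.query.to_lowercase();
        let hits = store
            .iter()
            .filter(|m| m.to_lowercase().contains(&needle))
            .take(step.limit);
        for hit in hits {
            let cost = rough_tokens(hit);
            if bundle.tokens_used + cost > budget.max_tokens {
                // Budget exhausted: return only what fits.
                return Ok(bundle);
            }
            bundle.memories.push((*hit).to_string());
            bundle.tokens_used += cost;
        }
    }
    Ok(bundle)
}
```

Rejecting an oversized program before running it, and truncating results rather than failing when the token cap is hit, are design choices made for this sketch only. The real runner may enforce its ResourceBudget differently.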
The wasmtime+wasi-stripping path lives behind the wasm feature
and is wired in a follow-up; the host-side contract is fully
tested today and is what mnemo-cli’s recall --code-mode
currently dispatches to.
Re-exports
pub use runner::CodeModeError;
pub use runner::CodeModeRecall;
pub use runner::GuestProgram;
pub use runner::RecallBundle;
pub use runner::RecallStep;
pub use runner::ResourceBudget;
pub use runner::run_code_mode_host;
pub use token::estimate_tokens;