spool-memory 0.2.3

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Repo snapshot

`spool` is a local-first developer memory tool with four active runtime surfaces built on one shared Rust core, plus a desktop-facing facade layer for the next desktop shell:

1. `spool` CLI — retrieval, wakeup, and lifecycle read/write flows
2. `spool-mcp` — MCP stdio server exposing retrieval and lifecycle tools/prompts/resources
3. `spool-daemon` — narrow read-only lifecycle daemon over stdio
4. `spool-gui` — optional GUI workbench behind the `gui` feature
5. `src/desktop/` — pure Rust desktop facade exposing stable DTOs, error envelopes, and workflow-oriented service calls for future Tauri/desktop command layers

Feature flags:
- `gui` — enables the egui-based GUI workbench (`spool-gui` binary)
- `bm25` — enables tantivy-based BM25 full-text search for lifecycle records (hybrid retrieval via RRF fusion)
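
The `bm25` flag above mentions RRF (Reciprocal Rank Fusion). As a rough illustration of the technique only, the sketch below fuses two ranked lists by summing `1 / (k + rank)` per record. The function name `rrf_fuse` and the constant `RRF_K = 60` are assumptions for this example, not names or values taken from the `spool` source.

```rust
use std::collections::HashMap;

/// Hypothetical smoothing constant. 60 is a commonly used RRF default,
/// not necessarily the value `spool` uses.
const RRF_K: f64 = 60.0;

/// Fuse several ranked lists of record ids (best first) into one ranking.
/// Each id scores the sum of 1 / (RRF_K + rank) over the lists it appears in,
/// with rank starting at 1.
fn rrf_fuse(rankings: &[Vec<&str>]) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (index, id) in ranking.iter().enumerate() {
            let rank = (index + 1) as f64;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (RRF_K + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then_with(|| a.0.cmp(&b.0)));
    fused
}
```

A record that ranks well in both the structured ranking and the BM25 ranking ends up ahead of one that ranks first in only one of them, which is the point of fusing by rank rather than by raw score.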

The current product direction is documented in `docs/PRODUCT.md`, `docs/ARCHITECTURE.md`, and `docs/SESSION_HANDOFF.md`.

## Common commands

### Build and check

```bash
cargo build
cargo check
cargo check -q
cargo check --bin spool
cargo check --bin spool-mcp
cargo check -q --bin spool-mcp
cargo check --bin spool-daemon
cargo check -q --bin spool-daemon
cargo check --features gui --bin spool-gui
cargo check -q --features gui --bin spool-gui
cargo check --features bm25
cargo check -q --features bm25
```

### Test

```bash
cargo test
cargo test -- --nocapture
cargo test -q
cargo test -q memory_lifecycle --lib
cargo test -q lifecycle_store --lib
cargo test -q lifecycle_service --lib
cargo test -q daemon_client --lib
cargo test -q mcp --lib
cargo test -q --features gui --lib
cargo test -q --features bm25 --lib
cargo test -q cli_smoke
cargo test -q mcp_smoke
cargo test -q daemon_smoke
cargo test -q retrieval_eval
cargo test -q e2e_lifecycle
cargo test -q mcp -- --nocapture
cargo test -q daemon -- --nocapture
```

### Useful targeted tests

```bash
cargo test get_should_rank_title_heading_and_wikilink_matches_above_body_only_matches
cargo test memory_read_commands_should_use_daemon_when_configured -- --nocapture
cargo test mcp_server_should_initialize_list_tools_and_run_lifecycle_calls -- --nocapture
cargo test mcp_lifecycle_reads_should_use_daemon_when_configured --test mcp_smoke
cargo test read_helpers_should_rebuild_shared_daemon_session_after_exit
cargo test lifecycle_read_tools_should_rebuild_daemon_session_after_exit
cargo test daemon_should_return_structured_error_for_invalid_json
cargo test daemon_should_return_structured_error_for_malformed_json_and_continue -- --nocapture
cargo test mcp_should_return_jsonrpc_parse_error_and_continue_serving -- --nocapture
cargo test daemon_client_should_rebuild_session_after_child_exit --lib -- --nocapture
```

### Lifecycle-focused regression entrypoints

```bash
cargo test -q cli_smoke
cargo test -q mcp_smoke
cargo test -q daemon_smoke
cargo test -q daemon_client --lib
```

### Source-of-truth files for documentation sync

- `Cargo.toml` — binary targets, feature flags, benchmark target, dependency shape
- `src/cli/args.rs` — CLI command surface and flags
- `src/lib.rs` — public module map
- `src/lifecycle_format.rs` — shared lifecycle labels plus list/detail/history markdown rendering
- `src/lifecycle_summary.rs` — shared lifecycle read/write payload and text helpers for queue/record/history reads plus create/action summaries; extend this shared layer before adding new CLI/GUI/MCP lifecycle response assembly
- `src/mcp/` — MCP tools, prompts/resources, JSON-RPC behavior
- `src/daemon.rs` — daemon read boundary and fallback path
- `src/desktop/` — desktop/Tauri-facing DTOs, error envelope, workflow service boundary, and parse/validation helpers
- `src/desktop/service.rs` — desktop facade service methods (context, wakeup, lifecycle, wiki_lint, read_wiki_index)
- `src/wiki_index.rs` — wiki INDEX generation and loading
- `src/wiki_lint.rs` — wiki lint checks (staleness, orphans, quality)
- `src/knowledge/` — auto-compile knowledge synthesis pages
- `tests/cli_smoke.rs`, `tests/mcp_smoke.rs`, `tests/daemon_smoke.rs` — stable external behavior and regression entrypoints
- `docs/SESSION_HANDOFF.md` — current product direction and latest implementation status

### Benchmarks

```bash
cargo bench --bench retrieval
```

### Rust ↔ TypeScript contract codegen

Frontend-facing DTOs (desktop facade, domain, lifecycle, and so on) are exported to
`frontend/src/lib/types/generated/` through `ts-rs`'s `#[derive(TS)]`. After changing a Rust-side DTO, run:

```bash
cargo test --lib export_bindings
```

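Each `#[derive(TS)]` type has a corresponding `export_bindings_<type>` test, and the generated `.ts` files are the
source of truth for the frontend. `frontend/src/lib/types/desktop.ts` and `lifecycle.ts` are barrel re-exports
that keep a small number of hand-written frontend aliases (such as `DesktopLifecycleAction`, `ImportCandidateDto`,
and the `desktopCommands` constant). Do not hand-edit files under `generated/`.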

### Run binaries

#### Retrieval / explain / wakeup

```bash
cargo run -- \
  get \
  --config spool.example.toml \
  --task "implement repo_path route planning" \
  --cwd /abs/path/to/repo \
  --files src/engine/project_matcher.rs \
  --target codex \
  --format prompt

cargo run -- \
  explain \
  --config spool.example.toml \
  --task "planning route" \
  --cwd /abs/path/to/repo

cargo run -- \
  wakeup \
  --config spool.example.toml \
  --task "resume spool lifecycle work" \
  --cwd /abs/path/to/repo \
  --profile project \
  --format json
```

#### Lifecycle CLI

```bash
cargo run -- \
  memory list \
  --config spool.example.toml \
  --view pending-review \
  --format markdown

cargo run -- \
  memory show \
  --config spool.example.toml \
  --record-id <record-id>

cargo run -- \
  memory history \
  --config spool.example.toml \
  --record-id <record-id>

cargo run -- \
  memory record-manual \
  --config spool.example.toml \
  --title "Concise output" \
  --summary "Prefers concise output" \
  --memory-type preference \
  --scope user \
  --source-ref manual:cli \
  --actor long

cargo run -- \
  memory propose \
  --config spool.example.toml \
  --title "Testing preference" \
  --summary "Smoke test first, then wrap up" \
  --memory-type workflow \
  --scope user \
  --source-ref session:1 \
  --actor codex \
  --reason "captured during review" \
  --evidence-refs session:1,obsidian://workflow

cargo run -- \
  memory accept \
  --config spool.example.toml \
  --record-id <record-id> \
  --actor long \
  --reason "approved after review"
```

Use `--daemon-bin target/debug/spool-daemon` on `memory list`, `memory show`, and `memory history` when you want daemon-backed lifecycle reads during CLI verification.

#### MCP / daemon / GUI

```bash
cargo run --bin spool-mcp -- --config spool.example.toml
cargo run --bin spool-mcp -- --config spool.example.toml --daemon-bin target/debug/spool-daemon
cargo run --bin spool-daemon -- --config spool.example.toml
cargo run --features gui --bin spool-gui
```

## CLI surface map

The top-level CLI commands are defined in `src/cli/args.rs`:

- `get` — routed retrieval output
- `explain` — route trace and match explanation
- `wakeup` — wakeup packet rendering
- `memory list` — pending review / wakeup-ready queue
- `memory show` — single record latest state
- `memory history` — append-only ledger history for one record
- `memory record-manual` — create an accepted manual memory
- `memory propose` — create a candidate AI proposal
- `memory accept` / `promote` / `archive` — lifecycle transitions
- `memory sync-vault` — scan the ledger and write all accepted/canonical records back to vault canonical notes; supports `--dry-run` for a preview; `--enrich` heuristically backfills records that lack structured fields
- `memory import` — import session transcripts as memory candidates
- `memory import-git` — import recent git commits as memory candidates
- `memory dedup` — detect and report duplicate records
- `memory lint` — run wiki lint checks (staleness, orphans, quality)
- `memory sync-index` — rebuild wiki INDEX.md grouped by scope/project

Lifecycle read commands optionally accept `--daemon-bin ...` and must keep direct fallback behavior.

## Architecture overview

`spool` currently has four active runtime surfaces sharing the same Rust core, plus one desktop-facing facade layer:

1. `spool` CLI for retrieval, wakeup, and lifecycle operations
2. `spool-mcp` for MCP-compatible tool/prompt/resource access
3. `spool-daemon` as a narrow lifecycle-read stdio boundary
4. `spool-gui` as an optional review workbench behind the `gui` feature
5. `src/desktop/` as a pure Rust facade for future Tauri or other desktop command hosts

The retrieval pipeline is:

1. route project from `cwd`
2. restrict scanning to configured vault subtrees
3. score notes by task / files / modules / scenes / structured frontmatter
4. score lifecycle records by scope / memory_type / entities / tags / triggers / related_files / applies_to
5. apply the cross-project penalty (`CROSS_PROJECT_PENALTY = 0.6`) to user/agent/team-scoped candidates when a project is matched, so that project-scoped records rank above cross-project ones
6. expand relations by one hop via `related_records` and wikilinks (`HOP_PENALTY = 0.7`)
7. optionally run BM25+RRF hybrid retrieval when the `bm25` feature is enabled
8. render compact output as `prompt`, `markdown`, or `json`
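
The two penalties in the pipeline can be pictured as score multipliers. The sketch below is a minimal illustration that assumes both penalties are applied multiplicatively to a base score; the `Scope` enum and `adjusted_score` function are hypothetical names for this example, and the real scorer may combine them differently.

```rust
/// Values quoted in the pipeline description.
const CROSS_PROJECT_PENALTY: f64 = 0.6;
const HOP_PENALTY: f64 = 0.7;

/// Hypothetical scope enum; the real domain type may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Scope {
    Project,
    User,
    Agent,
    Team,
}

/// Scale a candidate's base score (assumed multiplicative).
/// `project_matched`: routing resolved a project from `cwd`.
/// `via_relation`: the candidate was reached through 1-hop expansion.
fn adjusted_score(base: f64, scope: Scope, project_matched: bool, via_relation: bool) -> f64 {
    let mut score = base;
    // Only non-project scopes are penalized, and only when a project matched.
    if project_matched && scope != Scope::Project {
        score *= CROSS_PROJECT_PENALTY;
    }
    if via_relation {
        score *= HOP_PENALTY;
    }
    score
}
```

Under these assumptions a user-scoped record with base score 10 drops to 6 once a project is matched, while a project-scoped record with the same base score keeps its 10.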

The lifecycle pipeline is separate from vault retrieval:

- lifecycle events are append-only ledger records
- latest-state reads are projection-based
- in-process and disk-backed projection caches are rebuildable derivatives, not sources of truth
- manual memories start accepted
- AI-proposed memories start candidate
- only accepted / canonical memories are wakeup-eligible
- GUI, CLI, and MCP lifecycle flows are expected to converge on the shared service layer
- read-side queue/record/history payload shaping should prefer `src/lifecycle_summary.rs` helpers before adding surface-specific structs or ad-hoc JSON assembly
- markdown/text rendering should prefer `src/lifecycle_format.rs` and `src/lifecycle_summary.rs` helpers before adding wrapper-local formatters
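
To make the ledger/projection split concrete, here is a minimal sketch assuming a simplified event shape. `LedgerEvent`, `project_latest`, and `wakeup_ready` are hypothetical names for this example; the real storage lives in `src/lifecycle_store.rs` and is richer than this.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Candidate,
    Accepted,
    Canonical,
    Archived,
}

/// Hypothetical ledger event: one appended entry per state change.
#[derive(Debug, Clone)]
struct LedgerEvent {
    record_id: String,
    state: State,
}

/// Fold the append-only ledger into a latest-state projection.
/// Later events win; the ledger itself is never mutated, so the
/// projection can always be rebuilt from scratch.
fn project_latest(ledger: &[LedgerEvent]) -> BTreeMap<String, State> {
    let mut latest = BTreeMap::new();
    for event in ledger {
        latest.insert(event.record_id.clone(), event.state);
    }
    latest
}

/// Only accepted or canonical memories are wakeup-eligible.
fn wakeup_ready(projection: &BTreeMap<String, State>) -> Vec<&str> {
    projection
        .iter()
        .filter(|(_, state)| matches!(**state, State::Accepted | State::Canonical))
        .map(|(id, _)| id.as_str())
        .collect()
}
```

Because the projection is derived purely from the event list, dropping a cache and calling `project_latest` again yields the same latest-state view.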

The daemon boundary stays intentionally narrow:

- daemon commands are `ping`, `workbench`, `record`, and `history`
- daemon only serves lifecycle reads
- lifecycle writes still go through `LifecycleService`
- CLI and MCP may use daemon-backed reads, but both must keep direct fallback paths
- daemon client sessions are shared per `(daemon_bin, config_path)` and rebuilt once after transport failure
- malformed JSON on daemon and MCP stdio boundaries must return structured errors and keep serving later requests
- downstream lifecycle surfaces should treat `src/lifecycle_summary.rs` as the first stop for read-side queue / record / history payloads and wrapper text helpers before adding surface-local response shaping
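
The "rebuild once, then fall back" rule for daemon-backed reads can be sketched as follows. This is an illustration under assumptions: the `DaemonSession` trait, `read_with_fallback`, and the `FlakySession` test double are hypothetical and do not mirror the actual API of `src/daemon_client.rs`.

```rust
/// Hypothetical transport abstraction for one daemon session.
trait DaemonSession {
    fn request(&mut self, command: &str) -> Result<String, String>;
}

/// Try the daemon, rebuild the session once on transport failure,
/// and fall back to a direct read if the rebuilt session also fails.
fn read_with_fallback<S, B, D>(session: &mut S, mut rebuild: B, direct: D, command: &str) -> String
where
    S: DaemonSession,
    B: FnMut() -> Option<S>,
    D: FnOnce(&str) -> String,
{
    if let Ok(response) = session.request(command) {
        return response;
    }
    // One-shot rebuild: replace the broken session and retry exactly once.
    if let Some(fresh) = rebuild() {
        *session = fresh;
        if let Ok(response) = session.request(command) {
            return response;
        }
    }
    // The daemon is optional, so the direct path must always remain available.
    direct(command)
}

/// Test double: fails `failures_left` times, then answers.
struct FlakySession {
    failures_left: u32,
}

impl DaemonSession for FlakySession {
    fn request(&mut self, command: &str) -> Result<String, String> {
        if self.failures_left > 0 {
            self.failures_left -= 1;
            Err("child exited".to_string())
        } else {
            Ok(format!("daemon:{command}"))
        }
    }
}
```

The caller never sees a transport error in this shape: it gets either a daemon answer or the direct-read answer, which matches the requirement that CLI and MCP keep their direct fallback paths.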

## Key modules

### Retrieval and routing

- `src/app.rs` — top-level orchestration for retrieval commands
- `src/memory_gateway.rs` — shared retrieval/wakeup gateway used by CLI and MCP
- `src/config/` — config loading and path normalization
- `src/engine/` — project matching, scoring, candidate selection
- `src/engine/bm25.rs` — BM25 full-text index over lifecycle records (behind `bm25` feature, uses tantivy)
- `src/engine/scorer.rs` — structured field scoring (entities +6, tags +4, triggers +8, related_files +10, applies_to +8)
- `src/engine/selector.rs` — candidate selection with 1-hop relation expansion and optional BM25+RRF fusion
- `src/engine/selector/` — directory module (`mod.rs` + `tests.rs`) after test split refactor
- `src/vault/` — markdown scanning, frontmatter, wikilinks, sections
- `src/output/` — prompt / markdown / json renderers
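
The structured field bonuses listed for `src/engine/scorer.rs` can be read as a weighted sum. The sketch below assumes each bonus is awarded once per matching value and that the bonuses simply add up; `FieldMatches` and `structured_score` are hypothetical names, and the real scorer may cap or combine matches differently.

```rust
/// Per-match bonuses quoted for `src/engine/scorer.rs`.
const ENTITY_BONUS: u32 = 6;
const TAG_BONUS: u32 = 4;
const TRIGGER_BONUS: u32 = 8;
const RELATED_FILE_BONUS: u32 = 10;
const APPLIES_TO_BONUS: u32 = 8;

/// Hypothetical count of matching values per structured field for one record.
#[derive(Debug, Default)]
struct FieldMatches {
    entities: u32,
    tags: u32,
    triggers: u32,
    related_files: u32,
    applies_to: u32,
}

/// Sum the bonuses, assuming one bonus per matching value.
fn structured_score(matches: &FieldMatches) -> u32 {
    matches.entities * ENTITY_BONUS
        + matches.tags * TAG_BONUS
        + matches.triggers * TRIGGER_BONUS
        + matches.related_files * RELATED_FILE_BONUS
        + matches.applies_to * APPLIES_TO_BONUS
}
```

With these weights a single `related_files` hit outweighs a tag hit and an entity hit combined only barely (10 versus 10), which shows how strongly file overlap is favored relative to looser signals.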

### Lifecycle and memory operations

- `src/domain/` — lifecycle enums and core domain types (MemoryRecord includes entities, tags, triggers, related_files, related_records, supersedes, applies_to, valid_until)
- `src/lifecycle_store.rs` — append-only ledger storage plus latest-state projection/cache logic
- `src/lifecycle_service.rs` — service layer for record/propose/accept/promote/archive and query snapshots
- `src/lifecycle_format.rs` — shared lifecycle presentation helpers for state/action labels and list/detail/history rendering
- `src/lifecycle_summary.rs` — shared lifecycle payload and text helpers for queue/record/history reads plus create/action summaries; prefer extending this shared layer before adding surface-specific lifecycle response assembly
- `src/enrich.rs` — heuristic enrichment for backfilling structured fields (entities/tags/triggers) on older records; integrated as pre-step in auto-compile
- `src/knowledge/` — unified knowledge processing layer:
  - `cluster.rs` — clustering detection, consolidation, prune candidates (formerly consolidation.rs)
  - `compile.rs` — template + LLM synthesis, auto-compile with enrich pre-step
  - `mod.rs` — re-exports
- `src/sampling.rs` — shared SamplingClient trait used by both distill pipeline and knowledge compile
- `src/wiki_index.rs` — generate and load wiki INDEX.md grouped by scope/project
- `src/wiki_lint.rs` — staleness, orphan, and quality lint checks on lifecycle records
- `src/cli/commands.rs` — CLI lifecycle command surface
- `src/gui/app_shell.rs` — experimental GUI workbench for lifecycle review/history/create flows

### MCP and daemon boundaries

- `src/mcp/` — MCP stdio server directory module:
  - `mod.rs` — dispatch loop, tool handlers, prompts/resources, daemon-backed lifecycle reads; auto-compile fires on a dedicated thread after lifecycle writes (fire-and-forget, never blocks the dispatch loop)
  - `protocol.rs` — JSON-RPC envelope helpers (result/error/tool_success/failure)
  - `mcp_sampling.rs` — McpSamplingClient + StdSamplingChannel for `sampling/createMessage` reverse-calls
  - `schemas.rs` — tool/prompt/resource schema definitions
- `src/daemon.rs` — minimal stdio lifecycle read server and read helpers
- `src/daemon_client.rs` — shared daemon session pool keyed by `(daemon_bin, config_path)` with one-shot rebuild on transport failure
- `src/bin/spool_mcp.rs` — MCP entrypoint with `--config` and optional `--daemon-bin`
- `src/bin/spool_daemon.rs` — daemon entrypoint with `--config`
- `src/bin/spool_gui.rs` — GUI entrypoint behind the `gui` feature
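
The fire-and-forget auto-compile behavior described for the MCP dispatch loop follows a common pattern: spawn a detached thread and return the response immediately. The sketch below illustrates that pattern only; `handle_write` and `auto_compile` are hypothetical stand-ins, and the channel exists purely so the example can observe that the background work ran.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical stand-in for the auto-compile step; it only reports completion.
fn auto_compile(record_id: String, done: Sender<String>) {
    // Send errors are ignored: nobody is required to listen for the result.
    let _ = done.send(format!("compiled:{record_id}"));
}

/// Handle a lifecycle write, then start auto-compile without waiting for it.
fn handle_write(record_id: &str, done: Sender<String>) -> String {
    let response = format!("accepted:{record_id}");
    let id = record_id.to_string();
    // The JoinHandle is dropped, so the thread is detached and never joined.
    thread::spawn(move || auto_compile(id, done));
    response
}
```

The response is built before the thread starts and returned without joining, so slow synthesis work cannot stall the next request on the dispatch loop.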

Current MCP tool surface includes:

- retrieval: `memory_search`, `memory_explain`, `memory_wakeup`
- lifecycle reads: `memory_review_queue`, `memory_wakeup_ready`, `memory_get`, `memory_history`
- lifecycle writes: `memory_record_manual`, `memory_propose`, `memory_accept`, `memory_promote`, `memory_archive`
- analysis: `memory_check_contradictions`, `memory_lint`
- ingestion: `memory_import_session`, `memory_sync_vault`, `memory_distill_pending`
- resources/prompts: session handoff, current round plan, restart guide, queue review, wakeup generation, project context retrieval

## Important current behavior

### Desktop facade

`src/desktop/` is the desktop-facing Rust contract for future Tauri or other desktop hosts.

It should remain a thin workflow boundary that:

- validates desktop-facing inputs before crossing into core services
- exposes stable DTOs and `DesktopErrorEnvelope` instead of surface-local ad-hoc JSON
- reuses `app::run_with_overrides` for context rendering
- reuses `memory_gateway::execute` for wakeup generation
- reuses `src/daemon.rs` read helpers for workbench / record / history reads with optional daemon fallback semantics
- reuses `LifecycleService` for manual record / AI propose / lifecycle action writes
- prefers `src/lifecycle_summary.rs` shared payload helpers for read/write response shaping instead of inventing a second desktop-only payload contract
- stays pure Rust and independent from Tauri framework types

Current desktop workflows covered by this facade are:

- context rendering
- wakeup generation
- lifecycle workbench loading
- single-record lookup
- record history lookup
- manual memory creation
- AI proposal creation
- lifecycle action application
- wiki lint report (`wiki_lint`)
- wiki INDEX reading (`read_wiki_index`)

Desktop-facing failures should stay normalized into serializable envelopes:

- kinds: `input`, `config`, `routing`, `scan`, `runtime`
- fields: `kind`, `message`, `hint`, `explain`
- taxonomy should stay aligned with the current GUI failure model in `src/gui/app_shell.rs`
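
As an illustration of the envelope shape, the sketch below models the five kinds and four fields listed above and uses them for a simple input check. The type names `ErrorKind` and `ErrorEnvelopeSketch`, the `Option<String>` field types, and the `validate_task` helper are assumptions for this example; the actual `DesktopErrorEnvelope` definition in `src/desktop/` may differ.

```rust
use std::fmt;

/// Failure kinds from the list above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ErrorKind {
    Input,
    Config,
    Routing,
    Scan,
    Runtime,
}

impl fmt::Display for ErrorKind {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let label = match self {
            ErrorKind::Input => "input",
            ErrorKind::Config => "config",
            ErrorKind::Routing => "routing",
            ErrorKind::Scan => "scan",
            ErrorKind::Runtime => "runtime",
        };
        f.write_str(label)
    }
}

/// Hypothetical shape: field names come from the list above, field types are guesses.
#[derive(Debug, Clone, PartialEq)]
struct ErrorEnvelopeSketch {
    kind: ErrorKind,
    message: String,
    hint: Option<String>,
    explain: Option<String>,
}

/// Validate a desktop-facing input before it reaches core services.
fn validate_task(task: &str) -> Result<&str, ErrorEnvelopeSketch> {
    let trimmed = task.trim();
    if trimmed.is_empty() {
        return Err(ErrorEnvelopeSketch {
            kind: ErrorKind::Input,
            message: "task must not be empty".to_string(),
            hint: Some("describe what you are working on".to_string()),
            explain: None,
        });
    }
    Ok(trimmed)
}
```

Returning a typed envelope from validation keeps the host layer free of ad-hoc error strings: it only has to serialize whatever the facade hands back.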

Keep desktop as a maintenance/inspection surface over the shared Rust core, not a parallel business core.

When adding future desktop host code, extend `src/desktop/` first rather than rebuilding lifecycle/retrieval orchestration in transport glue.

- retrieval/lifecycle truth stays in shared Rust services
- daemon use remains optional and read-only
- desktop DTOs should track shared Rust contracts rather than drift into a second business API
- parse helpers such as file/evidence list splitting should stay centralized in the facade layer
- if behavior diverges from CLI/MCP, fix the shared layer instead of adding a desktop-only workaround

Recent desktop-specific validation includes `cargo test -q desktop --lib` and `cargo check -q --lib`.

If desktop docs drift, resync from:

- `src/desktop/`
- `src/app.rs`
- `src/memory_gateway.rs`
- `src/daemon.rs`
- `src/lifecycle_service.rs`
- `src/lifecycle_summary.rs`
- `src/gui/app_shell.rs`
- `docs/SESSION_HANDOFF.md`

Keep desktop product direction consistent with the current plan: desktop is a local workbench on top of the shared Rust core, while Tauri/frontend shells remain presentation and interaction layers rather than new sources of business logic.

- current direction: Tauri + frontend shell
- current implemented backend-facing boundary: `src/desktop/`
- current non-goal: moving lifecycle/retrieval logic into frontend code
- current migration stance: keep `egui` transitional while the shared desktop facade stabilizes

### Daemon integration

Daemon support is intentionally narrow:

- commands: `ping`, `workbench`, `record`, `history`
- daemon only covers lifecycle reads; writes still go through `LifecycleService` directly
- CLI and MCP still keep direct read fallback if daemon transport fails
- shared daemon sessions are reused inside one process and rebuilt once if the child exits
- malformed JSON on MCP and daemon stdio boundaries should return structured errors instead of terminating immediately
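
The "return a structured error and keep serving" rule can be sketched as a request loop that never exits on bad input. Everything here is a simplified assumption: `parse_command` is a stand-in that recognizes only exact request strings instead of using a real JSON parser, and the response shapes are invented for the example rather than taken from the daemon or MCP wire format.

```rust
/// Hypothetical parser stand-in. A real server would use a JSON library;
/// this one only recognizes the four daemon commands in one exact spelling.
fn parse_command(line: &str) -> Option<&'static str> {
    match line.trim() {
        r#"{"command":"ping"}"# => Some("ping"),
        r#"{"command":"workbench"}"# => Some("workbench"),
        r#"{"command":"record"}"# => Some("record"),
        r#"{"command":"history"}"# => Some("history"),
        _ => None,
    }
}

/// Answer every input line. A malformed line produces an error response
/// instead of ending the loop, so later requests are still served.
fn serve<'a>(lines: impl IntoIterator<Item = &'a str>) -> Vec<String> {
    let mut responses = Vec::new();
    for line in lines {
        let response = match parse_command(line) {
            Some(command) => format!(r#"{{"ok":true,"command":"{command}"}}"#),
            None => r#"{"ok":false,"error":{"kind":"parse","message":"malformed request"}}"#
                .to_string(),
        };
        responses.push(response);
    }
    responses
}
```

Feeding the loop a valid request, a broken one, and another valid request yields three responses, with only the middle one marked as an error.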

### Lifecycle surface

Lifecycle work is split this way:

- query/read path: `src/daemon.rs`, `src/daemon_client.rs`, `src/lifecycle_service.rs`
- persistence/projection path: `src/lifecycle_store.rs`
- write/action path: `src/lifecycle_service.rs`
- **vault writeback path: `src/vault_writer.rs`** — writes lifecycle accepted/canonical records back to the vault; `memory_type == "knowledge"` routes to `50-Memory-Ledger/Compiled/`, everything else goes to `50-Memory-Ledger/Extracted/<record_id>.md`
- **knowledge loop: `src/knowledge/`** — auto-compile knowledge synthesis pages from related records (template synthesis)
- **wiki index: `src/wiki_index.rs`** — generate/load INDEX.md grouped by scope/project
- **wiki lint: `src/wiki_lint.rs`** — staleness, orphan, quality checks
- CLI surface: `src/cli/commands.rs`
- MCP surface: `src/mcp/`
- GUI surface: `src/gui/app_shell.rs`

When extending lifecycle behavior, preserve these invariants:

- append-only ledger
- stable `record_id`
- invalid transitions do not append
- successful writes refresh projection-backed reads
- daemon remains optional rather than mandatory
- vault writeback is a **side effect** of lifecycle writes: a writeback failure (config, bad path, permissions) degrades to a stderr warning and does **not block** the main ledger path. The ledger is always the truth; the vault canonical note is a projection
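
The "invalid transitions do not append" and "stable `record_id`" invariants can be illustrated with a small in-memory ledger. The transition table below is an assumption made for the example (accept moves candidate to accepted, promote moves accepted to canonical, archive applies to anything not yet archived); the authoritative rules live in `src/lifecycle_service.rs`, and `Ledger`, `Action`, and `next_state` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Candidate,
    Accepted,
    Canonical,
    Archived,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Action {
    Accept,
    Promote,
    Archive,
}

/// Hypothetical transition table, assumed for this sketch.
fn next_state(current: State, action: Action) -> Option<State> {
    match (current, action) {
        (State::Candidate, Action::Accept) => Some(State::Accepted),
        (State::Accepted, Action::Promote) => Some(State::Canonical),
        (State::Archived, Action::Archive) => None,
        (_, Action::Archive) => Some(State::Archived),
        _ => None,
    }
}

/// Append-only list of (record_id, state) events.
struct Ledger {
    events: Vec<(String, State)>,
}

impl Ledger {
    /// Latest state for a record, found by scanning from the newest event.
    fn latest(&self, record_id: &str) -> Option<State> {
        self.events
            .iter()
            .rev()
            .find(|(id, _)| id.as_str() == record_id)
            .map(|(_, state)| *state)
    }

    /// Validate first, append second: a rejected action leaves the ledger untouched.
    fn apply(&mut self, record_id: &str, action: Action) -> Result<State, String> {
        let current = self
            .latest(record_id)
            .ok_or_else(|| format!("unknown record {record_id}"))?;
        let next = next_state(current, action)
            .ok_or_else(|| format!("invalid transition: {current:?} + {action:?}"))?;
        self.events.push((record_id.to_string(), next));
        Ok(next)
    }
}
```

Every appended event reuses the caller's `record_id`, so the id stays stable across the whole history while the state changes.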

### Vault writeback

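`src/vault_writer.rs` renders lifecycle memories into vault canonical notes, following the frontmatter contract in `docs/OBSIDIAN_SCHEMA.md`:

- target path: `memory_type == "knowledge"` → `<vault_root>/50-Memory-Ledger/Compiled/<record_id>.md`; everything else → `<vault_root>/50-Memory-Ledger/Extracted/<record_id>.md`
- state dispatch: Accepted / Canonical → write the note; Archived → mark the note as archived; Draft / Candidate → no writeback
- idempotency + body protection: the frontmatter stores `spool_body_hash`; on the next writeback, if the actual hash of the on-disk body differs from the stored value, the body is treated as hand-edited by the user → the user's body is kept and only the frontmatter is rewritten
- entry points: `writeback_from_config(config_path, entry)` (a one-line call for callers) / `apply_writeback_for_entry(vault_root, entry)` (when `vault_root` is already known)
- chaining: after `writeback_from_config` succeeds, it automatically chains into a wiki INDEX refresh + auto-compile
- hook points: record-manual/propose/accept/promote/archive in `cli/commands.rs`, the same-named tools in `mcp.rs`, and the corresponding methods in `desktop/service.rs`; every write path calls writeback after it succeeds
- scorer dedup: `selector::excluded_record_ids_from_scored` extracts the covered set from the `record_id` frontmatter field of scored notes, and `select_lifecycle_candidates` skips records already covered by a canonical note to avoid double counting
- backfill/drift: CLI `memory sync-vault [--dry-run]` and MCP `memory_sync_vault` scan the whole ledger to reconcile it with the vault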
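
The target-path routing and the body-protection rule can be sketched in a few lines. This is a minimal illustration under assumptions: `note_path`, `body_hash`, and `body_to_write` are hypothetical helpers, and `DefaultHasher` is only a stand-in because the real `spool_body_hash` algorithm is not described here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Route a record to its canonical note path based on `memory_type`.
fn note_path(vault_root: &Path, memory_type: &str, record_id: &str) -> PathBuf {
    let folder = if memory_type == "knowledge" { "Compiled" } else { "Extracted" };
    vault_root
        .join("50-Memory-Ledger")
        .join(folder)
        .join(format!("{record_id}.md"))
}

/// Stand-in hash for the sketch. `DefaultHasher` is not guaranteed to be
/// stable across Rust releases, so a persisted hash would need a fixed algorithm.
fn body_hash(body: &str) -> String {
    let mut hasher = DefaultHasher::new();
    body.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Pick the body to write: if the on-disk body no longer matches the stored
/// hash, the user edited it, so it is kept and only the frontmatter changes.
fn body_to_write<'a>(stored_hash: &str, disk_body: &'a str, rendered_body: &'a str) -> &'a str {
    if body_hash(disk_body) != stored_hash {
        disk_body
    } else {
        rendered_body
    }
}
```

Comparing against the stored hash rather than against the freshly rendered body is what lets a writeback distinguish "the record changed" from "the user changed the note".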

### GUI

GUI is experimental and behind the `gui` feature. Do not assume GUI code is part of the default build.

### Caches and build artifacts

`target/` is normal Cargo output. Do not treat it as project content. It may contain debug/release binaries, incremental caches, criterion output, IDE check artifacts, and temporary files.

## Test map

- `tests/cli_smoke.rs` — stable CLI behavior and lifecycle CLI coverage
- `tests/mcp_smoke.rs` — MCP initialization, tools/prompts/resources, daemon-backed lifecycle reads
- `tests/daemon_smoke.rs` — daemon stdio behavior and malformed JSON handling
- `tests/retrieval_eval.rs` — retrieval quality evaluation harness (precision/recall/MRR metrics)
- `tests/e2e_lifecycle.rs` — end-to-end lifecycle integration tests (create/propose/accept/promote/archive flows)
- `src/mcp/` tests — MCP parameter validation, prompt/resource handlers, daemon-backed read behavior
- `src/daemon.rs` tests — read helper fallback/reuse/recovery behavior
- `src/daemon_client.rs` tests — shared session reuse, missing-record behavior, child rebuild behavior
- `src/engine/bm25.rs` tests — BM25 index build/search/empty-query behavior (requires `bm25` feature)

When changing daemon or MCP behavior, run both targeted tests and the full suite.

## Docs to read before larger changes

Start here for product/context decisions:

1. `docs/PRODUCT.md`
2. `docs/ARCHITECTURE.md`
3. `docs/ROADMAP.md`
4. `docs/PLANNING_ALIGNMENT.md`
5. `docs/SESSION_HANDOFF.md`

Use round-plan docs in `docs/*ROUND*_PLAN.md` to understand recent implementation intent and what was intentionally deferred.

Priority round docs for the current state:

- `docs/MCP_PROMPTS_ROUND_8_PLAN.md`
- `docs/DAEMON_ROUND_11_PLAN.md`
- `docs/DAEMON_INTEGRATION_ROUND_12_PLAN.md`
- `docs/LIFECYCLE_ROUND_3_PLAN.md`
- `docs/PERSISTENT_CACHE_ROUND_10_PLAN.md`