recall-echo 3.3.0

# recall-echo — Spec v2.0

## What It Is

recall-echo is a persistent three-layer memory system for pulse-null entities. It gives AI agents long-term recall across sessions — curated facts, recent session context, and searchable conversation archives. Designed as a native pulse-null plugin (Memory role) with standalone CLI support for administration.

Inspired by MemGPT (arxiv:2310.08560) — event-driven memory management for LLMs.

## The Problem

LLM coding agents start every session from zero. Built-in memory systems are typically a single flat file with no concept of short-term vs long-term memory, no session summaries, and no searchable history. Memory management that depends on the agent choosing to save things is circular — it forgets to remember.

recall-echo makes the entire memory lifecycle mechanical. When integrated with pulse-null, archival and checkpointing happen automatically via the plugin system. The agent writes to MEMORY.md during sessions. Everything else is enforced by the system.

## Architecture

```
┌──────────────────────────────────────────────────────┐
│              MEMORY ARCHITECTURE                      │
│                                                       │
│  Layer 1: CURATED (always in context)                 │
│  ┌───────────┐                                        │
│  │ MEMORY.md │  Facts, preferences, patterns          │
│  └───────────┘  Distilled & maintained by the agent   │
│                                                       │
│  Layer 2: SHORT-TERM (FIFO rolling window)            │
│  ┌───────────────┐                                    │
│  │ EPHEMERAL.md  │  Last N session summaries          │
│  └───────────────┘  Appended on archive, trimmed      │
│                                                       │
│  Layer 3: LONG-TERM (searched on demand)              │
│  ┌─────────────┐    ┌────────────────────────────┐    │
│  │ ARCHIVE.md  │───→│ conversations/             │    │
│  └─────────────┘    │  conversation-001.md       │    │
│                     │  conversation-002.md       │    │
│                     │  ...                       │    │
│                     └────────────────────────────┘    │
│                     YAML frontmatter + markdown       │
│                     LLM-summarized or algorithmic     │
└──────────────────────────────────────────────────────┘
```

All paths are relative to an entity root directory:

```
{entity_root}/memory/
├── MEMORY.md                    # Layer 1 — curated facts
├── EPHEMERAL.md                 # Layer 2 — rolling session window
├── ARCHIVE.md                   # Layer 3 — conversation index
├── conversations/               # Layer 3 — full archives
│   ├── conversation-001.md
│   ├── conversation-002.md
│   └── ...
└── .recall-echo.toml            # Optional configuration
```

## Dual-Mode Operation

recall-echo operates in two modes:

### 1. pulse-null Plugin (Primary)

As a native pulse-null plugin implementing the `Plugin` trait from echo-system-types:

- **Role**: `PluginRole::Memory` (required — exactly one per entity)
- **Factory**: `create(config, ctx) -> Box<dyn Plugin>`
- **Health**: Reports memory directory state (Healthy / Degraded / Down)
- **Setup**: Prompts for entity_root during pulse-null init wizard
- **Lifecycle**: pulse-null calls `archive::archive_session()` and `checkpoint::create_checkpoint()` automatically

The plugin does not contribute tools, scheduled tasks, or HTTP routes. It is a data layer — pulse-null orchestrates when archival and checkpointing happen.

### 2. Standalone CLI

For administration and use outside pulse-null:

```
recall-echo init [entity_root]       # Create memory directory structure
recall-echo status [entity_root]     # Health check with dashboard
recall-echo search <query>           # Line-level archive search
recall-echo search <query> --ranked  # File-ranked search with relevance scoring
recall-echo distill [entity_root]    # Analyze MEMORY.md, suggest cleanup
recall-echo consume [entity_root]    # Output EPHEMERAL.md content
```

## Layer Details

### Layer 1 — Curated Memory (MEMORY.md)

The source of truth. Distilled facts, user preferences, patterns, key decisions. Always loaded into agent context at session start.

- Lives at: `{entity_root}/memory/MEMORY.md`
- Size discipline: Keep under 200 lines
- Updated: During conversations when stable facts are confirmed
- Distillation: `recall-echo distill` analyzes heavy sections and suggests extracting them to topic files (e.g., `memory/debugging.md`)

### Layer 2 — Short-Term Memory (EPHEMERAL.md)

A FIFO rolling window of recent session summaries. Gives the agent immediate context about recent work.

- Lives at: `{entity_root}/memory/EPHEMERAL.md`
- Max entries: Configurable (default 5, range 1–50)
- Format: Separator-delimited entries with session ID, date, duration, message count, summary, and archive pointer
- Updated: Automatically when a session is archived

### Layer 3 — Long-Term Memory (conversations/)

Full conversation archives with structured metadata. Not loaded into context — searched on demand via Grep.

- Index: `{entity_root}/memory/ARCHIVE.md` (markdown table)
- Archives: `{entity_root}/memory/conversations/conversation-NNN.md`
- Format: YAML frontmatter + markdown sections

## Archive Format

Each conversation archive includes structured metadata and content:

```yaml
---
log: 5
date: "2026-03-06T10:30:00Z"
session_id: "abc123"
message_count: 34
duration: "30m"
source: "session"
topics: ["auth", "jwt", "middleware"]
---

## Summary
LLM-generated summary of the conversation.

**Decisions**: Key decisions made during the session.
**Action Items**: Follow-up tasks identified.

### User
(message content)

### Assistant
(message content)

## Tags
**Decisions**: decided to use JWT for auth
**Files**: src/auth.rs, src/middleware.rs
**Tools**: Read, Edit, Bash
```

Archives are created with two source types:
- `session` — end-of-session archive (appends to EPHEMERAL.md)
- `checkpoint` — pre-compaction checkpoint (does not append to EPHEMERAL.md)

## Summarization

recall-echo supports two summarization strategies:

1. **LLM-enhanced** (when an `LmProvider` is available via pulse-null): Structured JSON output with summary, topics, decisions, and action items. Conversations are condensed to ~4000 characters before sending to the model.

2. **Algorithmic fallback** (standalone or when LLM is unavailable): Keyword extraction with stop-word filtering and tool-target boosting. First user message as summary. No decisions or action items extracted.

Fallback is silent — if the LLM call fails, algorithmic extraction runs automatically.

## Tag Extraction

Archives include heuristic-based structured tags:

- **Decisions**: Detected via linguistic markers ("decided to", "let's use", "chose to")
- **Action items**: Detected via markers ("todo:", "need to", "follow up")
- **Files touched**: Extracted from tool inputs (paths containing `/` or `.`)
- **Tools used**: Collected from tool use entries in the conversation

Max: 5 decisions, 5 action items, 10 files per archive.

## Plugin Integration

### echo-system-types Dependency

recall-echo depends on echo-system-types v0.4.0 for:
- `Plugin` trait — lifecycle, health, meta, role, setup prompts, `as_any()`
- `PluginContext` — entity_root, entity_name, `Arc<dyn LmProvider>`
- `PluginRole::Memory`
- `HealthStatus` — Healthy, Degraded(reason), Down(reason)
- `Message`, `ContentBlock` — conversation types for archival

### Factory Pattern

```rust
pub async fn create(
    config: &serde_json::Value,
    ctx: &PluginContext,
) -> Result<Box<dyn Plugin>, Box<dyn Error + Send + Sync>>
```

Config accepts `entity_root` (string). Falls back to `ctx.entity_root` if not specified. Returns a fully initialized `RecallEcho` instance — no two-phase init.

### Health Checks

- **Healthy**: memory/, conversations/, and MEMORY.md all exist
- **Degraded**: memory/ exists but MEMORY.md or conversations/ missing
- **Down**: memory/ directory not found

## Configuration

Optional `.recall-echo.toml` in the memory directory:

```toml
[ephemeral]
max_entries = 5
```

All settings have sensible defaults. Missing file or invalid values fall back silently.

## Technology

- **Language**: Rust
- **License**: AGPL-3.0
- **Dependencies**: echo-system-types, clap, serde, serde_json, dirs
- **Zero external deps** for YAML, TOML, and date handling (hand-rolled parsers)
- **Dev dependencies**: tempfile, tokio

## Testing

80 tests covering:
- Archive operations (creation, indexing, numbering)
- Checkpoint functionality
- Conversation processing (flattening, markdown rendering, topic extraction)
- Ephemeral FIFO window (append, trim, parse)
- Dashboard rendering and health assessment
- Search (line-level and ranked)
- Summarization (algorithmic)
- Status reporting
- Tags and frontmatter parsing

All tests run in isolated temporary directories — no production memory touched.

## Resolved Decisions

1. **Entity-root model**: All paths relative to entity_root, not hardcoded home dir. Supports multi-entity scenarios.
2. **Markdown archives**: Replaced JSONL with markdown + YAML frontmatter. Human-readable, Grep-searchable.
3. **FIFO ephemeral**: Rolling window of N entries (default 5) instead of single session summary.
4. **LLM summaries with fallback**: Graceful degradation when no provider available.
5. **No hooks or rules**: pulse-null manages lifecycle. recall-echo is purely a data layer + CLI admin tool.
6. **Zero external date/YAML/TOML deps**: Hand-rolled parsers for minimal bloat.
7. **Plugin role**: Memory (required, exactly one). No tools, no tasks, no routes contributed.
8. **Async throughout**: Factory and archival functions are async to match pulse-null's architecture.