mini_claw 0.1.7

Lightweight Rust agent platform for Raspberry Pi
# AGENTS.md

Goal: build a tiny Rust daemon that can run on a Pi, talks via channels such as Discord, calls LLMs, runs tools, and supports heartbeat + cron proactive triggers.

## Rules
- Ship the smallest thing that works. Cut scope, don't add features "just in case".
- **Always run `cargo fmt` and `cargo clippy` before committing or pushing to main.** Fix all warnings before pushing. No exceptions.

## MVP (in this order)
1) Load `config.yaml` (agents, discord token, model providers)
2) Connect Discord bot, receive messages, send replies (generic pattern to allow for future channels)
3) Agent runtime: build prompt from workspace markdown + short session history
4) LLM provider trait + OpenAI implementation (API key first; OAuth later)
5) Tool runner: only `read`, `write`, `exec` in workspace
6) Heartbeat: `tokio::time::interval` + `HEARTBEAT.md` + `HEARTBEAT_OK`
7) Cron: persisted jobs + `cron` scheduler + dispatch to agent

## Workspace contract
Per agent workspace contains optional:
- `SOUL.md`, `TOOLS.md`, `HEARTBEAT.md`
- `skills/` (later)
- `sessions/*.jsonl`

Never read/write outside the agent workspace unless explicitly configured.

## Code shape
Crates/modules:
- `config` (serde load + validate)
- `discord` (connector)
- `agent` (prompt build, session store)
- `models` (provider trait + openai + azure_openai)
- `tools` (builtin tools + sandbox rules)
- `tools/memory_tool` (save/recall persistent memory)
- `tools/skill_author_tool` (create_skill/list_skills)
- `scheduler` (heartbeat + cron)
- `main` (wires everything)

## "Done" criteria for a milestone
- Works end-to-end on a Pi with `cargo run --release`
- Logs are readable; failures don't crash the whole process
- No hidden background magic: all actions are explicit and observable

## Stretch-goals (keep in mind)
- Web UI, websockets, pairing flows, multi-channel routing, mobile nodes

## Inspiration
Pull from and take inspiration from the project "https://github.com/openclaw/openclaw" if needed via search tools.

Agents now use a canonical per-agent workspace at agents/<id>/workspace. Each workspace contains agent-specific files (`SOUL.md`, `TOOLS.md`, `HEARTBEAT.md`, `sessions/`) and a `BOOTSTRAP.md` template to document onboarding. See CONSOLIDATION.md and agents/default/BOOTSTRAP.md for the consolidation plan and bootstrap template.

---

## Phase 2 — Feature Plan

### F1: Persistent Memory (cross-session knowledge store)

**Why:** Sessions rotate — when one ends, the agent forgets everything.
Memory lets the agent accumulate knowledge that persists across sessions
(user preferences, project facts, recurring tasks).

**Design:**
- Storage: one JSONL file per agent at `agents/<id>/workspace/memory.jsonl`
- Each entry: `{ "key": "...", "value": "...", "tags": [...], "timestamp": <epoch_ms> }`
- Key is a short slug (e.g. `"user_timezone"`, `"project_stack"`); value is free-text
- Duplicate keys overwrite (last-write-wins on the slug)

**Two new tools registered in `tools/mod.rs`:**
- `save_memory { key, value, tags? }` — append/upsert to memory.jsonl
- `recall_memory { query?, tag?, limit? }` — search entries by substring or tag; returns top N
  - If `query` is provided and an embedding model is configured, use cosine similarity
  - Otherwise fall back to case-insensitive substring match on key+value

**Embedding support (optional, Azure path):**
- New trait method on `ModelProvider`: `async fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>>`
  - Default impl returns `Err("not supported")`
  - `AzureOpenAIProvider` overrides with a call to the Azure embeddings endpoint
- `recall_memory` checks if an embedding provider is configured; if yes, embed the query
  and rank stored memories by cosine similarity; if no, substring search
- Embeddings are cached in `memory_embeddings.bin` (simple flat `Vec<f32>` per entry)
  to avoid re-embedding on every recall

**Prompt injection:**
- At prompt-build time (`agent/mod.rs`), load all memory entries and inject them
  as a `<memory>` block in the system prompt (after SOUL.md, before TOOLS.md)
- Cap at ~2K tokens worth of entries; prioritise by recency + relevance

**Files touched:**
- New: `src/tools/memory_tool.rs`
- Edit: `src/tools/mod.rs` (register + dispatch)
- Edit: `src/agent/mod.rs` (inject memory into prompt)
- Edit: `src/models/mod.rs` (add `embed()` default method to trait)

---

### F2: Azure OpenAI / Azure AI Foundry Provider

**Why:** User wants to use Azure-hosted models including embedding models.
Azure OpenAI uses a different endpoint pattern and auth header than vanilla OpenAI.

**Design:**
- New file: `src/models/azure_openai.rs`
- Struct `AzureOpenAIProvider`:
  - `endpoint: String` — e.g. `https://<resource>.openai.azure.com`
  - `api_key: String` — Azure API key
  - `deployment: String` — deployment name (replaces model in URL)
  - `api_version: String` — e.g. `"2024-10-21"`
  - `embedding_deployment: Option<String>` — separate deployment for embeddings
- Chat completions URL: `{endpoint}/openai/deployments/{deployment}/chat/completions?api-version={api_version}`
- Embeddings URL: `{endpoint}/openai/deployments/{embedding_deployment}/embeddings?api-version={api_version}`
- Auth header: `api-key: {api_key}` (NOT `Authorization: Bearer`)
- Implements `ModelProvider` trait including `send_chat`, `send_chat_with_functions`, and `embed()`

**Config extension:**
```yaml
models:
  - id: azure-gpt4
    provider: azure-openai
    model: gpt-4o                    # deployment name
    api_key: $AZURE_OPENAI_API_KEY
    endpoint: https://myresource.openai.azure.com
    api_version: "2024-10-21"        # optional, has default
  - id: azure-embeddings
    provider: azure-openai
    model: text-embedding-3-small
    api_key: $AZURE_OPENAI_API_KEY
    endpoint: https://myresource.openai.azure.com
```

**Config struct changes:**
- Add optional fields to `ModelConfig`: `endpoint`, `api_version`
- Remove `deny_unknown_fields` from `ModelConfig` (already blocks new fields)
- Wire into `build_provider()` match arm: `"azure-openai" | "azure_openai" | "azure"`

**Files touched:**
- New: `src/models/azure_openai.rs`
- Edit: `src/models/mod.rs` (add `embed()` to trait, `build_provider` arm, re-export)
- Edit: `src/config/mod.rs` (add `endpoint`, `api_version` to `ModelConfig`)

---

### F3: Webhook Ingest Endpoint

**Why:** External systems (GitHub, Sentry, Stripe, IFTTT, Home Assistant)
can POST events that trigger agent actions — the agent becomes reactive
to the real world, not just chat messages.

**Design:**
- New route: `POST /api/webhook/:agent_id`
- Accepts arbitrary JSON body
- Wraps it in an `IncomingMessage` with `source: "webhook"` and dispatches
  to the agent's message bus
- Agent receives it as a system message: `"Webhook received:\n```json\n{body}\n```"`
- Optional: `?secret=<token>` query param validated against a per-agent
  `webhook_secret` in config (simple shared secret, not HMAC for now)
- Returns `202 Accepted` with `{ "status": "dispatched", "agent": "<id>" }`

**Config extension:**
```yaml
agents:
  - id: default
    webhook_secret: $WEBHOOK_SECRET   # optional
```

**Files touched:**
- Edit: `src/gateway/mod.rs` (add route + handler)
- Edit: `src/config/mod.rs` (add `webhook_secret` to `AgentConfig`)
- Edit: `src/comm/mod.rs` (ensure IncomingMessage supports `source` field)

---

### F4: Skill Self-Authoring Tool

**Why:** The agent should be able to create new skills for itself at runtime,
just like OpenClaw agents do. "Build a skill that checks the weather" →
agent writes SKILL.md + skill.yaml → hot-reloads the registry.

**Design:**
- New tool: `create_skill { name, description, instructions, scope? }`
  - `scope` defaults to `"agent"` (per-agent skill)
  - Creates `agents/<id>/workspace/skills/<name>/SKILL.md` with proper frontmatter
  - Creates `agents/<id>/workspace/skills/<name>/skill.yaml`
  - Calls `SkillRegistry::reload()` to hot-reload
- New tool: `list_skills {}` — returns the agent's current skill list

**SkillRegistry changes:**
- Add `pub fn reload(&mut self)` that re-scans both global and agent skill dirs
- Expose a global `reload_skills()` function that acquires the registry lock and reloads
- These already have the scanning logic in `discover()` — reload just calls it again

**Files touched:**
- New: `src/tools/skill_author_tool.rs`
- Edit: `src/tools/mod.rs` (register + dispatch)
- Edit: `src/skills/mod.rs` (add `reload()` method)

---

### F5: Enhance `pinchy init` Wizard

**Why:** The existing wizard (`app_onboard`) handles config creation and
copilot login but doesn't guide through API key entry, Azure setup, or
webhook configuration. It should be the friction-free entry point.

**Enhancements:**
1. **Provider-aware key prompts:**
   - If user picks `openai` → prompt for `OPENAI_API_KEY`, save to secrets
   - If user picks `azure-openai` → prompt for endpoint, API key, deployment names
   - If user picks `copilot` → trigger device flow (already implemented)

2. **Discord setup step:**
   - Ask "Connect Discord?" → if yes, prompt for bot token, save to secrets

3. **Embedding model step:**
   - Ask "Configure an embedding model for memory search?" → if yes,
     prompt for provider + deployment, add to config.models

4. **Summary with next-steps:**
   - Print URLs: dashboard at `http://localhost:8080`
   - Print: "Send your first message: `pinchy chat hello`"

**Files touched:**
- Edit: `src/cli/mod.rs` (extend `app_onboard` + `interactive_onboard_tui`)
- Edit: `src/config/mod.rs` (if new fields needed)

---

## Implementation Order

```
 F2: Azure OpenAI Provider       ← unlocks embedding support
  │
  ▼
 F1: Persistent Memory           ← uses embeddings from F2 for recall
  │
  ▼
 F3: Webhook Ingest              ← independent, small surface area
  │
  ▼
 F4: Skill Self-Authoring        ← independent, builds on existing skill infra
  │
  ▼
 F5: Enhanced Init Wizard        ← ties all new features into onboarding
```

Each feature is designed to be independently shippable and testable.
F2 → F1 have a dependency (embeddings power semantic memory search).
F3, F4, F5 are independent of each other and can be reordered.

**Workspace additions:**
- `memory.jsonl` — persistent knowledge store
- `memory_embeddings.bin` — cached embedding vectors

## Remember
- Ensure strict type safety and error handling; avoid panics
- Ensure TypeScript / Rust boundary is well-defined and types are kept in sync