sparrow-cli 0.4.0

# Changelog


All notable changes to Sparrow will be documented in this file.

## [0.3.6] — 2026-06-05

Public distribution pass for one-click installation and simplified launch.

### Added
- `sparrow launch`: runs first-launch setup when needed, then opens the
  WebView cockpit on port 9339. `sparrow launch --tui` keeps the same
  onboarding path but starts the terminal cockpit.
- Windows one-click installer (`install.ps1`) that installs to the user-local
  Sparrow bin directory, adds it to the user PATH, and launches Sparrow.
- Linux/macOS one-click installer (`install.sh`) that downloads the latest
  release artifact from `ucav/Sparrow`, falls back to a source build when an
  artifact is unavailable, and launches Sparrow.
- Release/distribution regression tests that pin installer repository targets,
  expected artifact names, and docs coverage for `sparrow launch`.

### Fixed
- The public installer no longer points at the old placeholder
  `sparrow-dev/sparrow` repository.
- The Release workflow now publishes `sparrow-windows-x86_64.exe` without
  accidentally appending a second `.exe`.
- The Release workflow explicitly grants `contents: write` for GitHub release
  publishing.

### Validation
- `cargo fmt --all -- --check`
- `cargo check --all-targets`
- `cargo clippy --all-targets -- -D warnings`
- `cargo build --release`
- `cargo test`
- `bash -n install.sh`
- PowerShell parser validation for `install.ps1`

## [0.3.5] — 2026-06-04

Public polish pass for the WebView cockpit and repository surface.

### Added
- Prompt caching request controls for Anthropic Messages and
  OpenAI-compatible/Responses providers. Anthropic stable system prompts receive
  `cache_control`; OpenAI-compatible requests receive `prompt_cache_key` and
  `prompt_cache_retention`.
- WebView plan mode now has explicit accept, edit, and reject actions.
- Diff cards and the side diff panel now render patches as per-hunk review
  blocks with accept/reject states.
- Local syntax highlighting for streamed code cards, covering common languages
  without a CDN dependency.
- Regression tests that lock code-card styling, metric-spam behavior, and
  WebView typography, plan actions, provider cache payloads, and hunked diffs.

### Fixed
- `CostUpdate` events no longer print one transcript line per streamed metric
  tick; they update the live meters only.
- Empty prose fragments between adjacent fenced code blocks are removed, so
  collapsed cards no longer create large blank gaps.
- Code cards are denser, easier to scan, and preserve copy/line-count behavior.

### Polished
- WebView typography now uses a system UI stack with a modern monospace stack
  for code.
- Root-level scratch logs, local handoffs, and presentation-only artifacts were
  removed from the public tree.
- README status and release metadata now reflect the v0.3.5 build.

## [0.3.3] — 2026-06-03

Public-release hardening pass. Fixes every issue from hands-on testing and
stabilizes the WebView cockpit for local use.

### Fixed (critical)

- **Context truly persists now.** The engine emits the assistant response as
  `ThinkingDelta`, not `Message`, so the previous capture never fired. The
  WebView command loop now accumulates streamed deltas and flushes one
  assistant turn into a persistent history on `RunFinished`. Context survives
  model switches AND separate prompts.
- **Sessions persist across restarts.** The conversation is saved to the
  SQLite `SessionStore` under id `webview` and re-hydrated on launch — it
  appears in the `/sessions` panel and reloads next time.
- **Counters animate live and never stick at zero** (direct `textContent`
  write + easing layer; the rAF-only path failed in background tabs).
- **All models are visible.** `/models` merges live-discovered models with the
  curated registry — NVIDIA jumps from 6 to 82 models in the picker/config.

### Added

- **Stop button** next to Run aborts the active task (`POST /stop`).
- **Mid-run injection**: typing while a task runs injects the text into the
  live run's context instead of starting a new run.
- **Collapsible code cards**: assistant code output is split out of the stream
  into `<details>` blocks with language label, line count, and copy button.
- **Auto-collapsing run summary** that re-expands on click.

### Polished UI

- Fixed vertical char-by-char text wrapping (`overflow-wrap` instead of the
  deprecated `word-break:break-word`).
- Drawer shows only the clicked panel, not all metrics stacked.
- Context bar fills with a green → amber → coral → red gradient.
- Animated pulsing green dot on the active session.
- Removed the duplicated dollar figure; budget shows `unlimited` when unset.
- Removed the misaligned fixed "live" tag (the chrome bar already has one).

## [0.3.2] — 2026-06-03


### Fixed (critical bugs from user report)

- **Conversation context was dropped on every new run/model switch.**
  `main.rs` now keeps a persistent `Arc<Mutex<Vec<Msg>>>` of the dialog
  shared across runs; every command pulls prior turns into `Task.context`
  (previously hardcoded to `vec![]`) and pushes user+assistant messages
  back, capped at the last 40 turns.
- **Cockpit counters never updated** because the rAF-based `countUp`
  didn't fire in background tabs / headless contexts. `refreshTokens()`
  now writes `textContent` directly first, then runs `countUp` as an
  optional easing layer. Tokens, cost, and budget all animate live.
- **No way to see which providers / models are configured.** The full
  35-provider · 60+-model registry is now visible in the config panel
  as expandable cards (was a single dropdown showing one provider at a
  time). Search, default-route picker, and per-provider key input added.

### Added

- **`POST /conversation/reset`** endpoint + UI button (`⟲ new conversation`)
  that clears retained context when the user wants a fresh start.
- **Live "context retained · turn #N"** indicator on every `RunStarted` from
  turn 2 onward; **"● context preserved"** marker on every `ModelSwitched`.
- **BUDGET pill** in cockpit row: live `<pct>% / $<daily> day` with color
  gradient (green → gold → coral → red) as the cap is approached.
- **`/conversation/reset`**, **`/status`**, **`/file`**, **`/models`**
  endpoints fully wired.
- **Config panel** rebuilt with 3 tabs (providers&models / routing&autonomy /
  permissions). Each provider card shows LED health · adapter · notes ·
  per-model row with ★ recommended, ctx window, $/Mtok cost, "set default"
  button, and API key input.

### Polished

- Welcome panel: shows `<N> providers · <M> models · <K> with key` and
  the boot timestamp.
- Route bar: 3-hop fallback chain with animated arrows.
- Tool cards: summary like `2 files · 6 KB` for fs_read, `<N> hits` for
  search, matching the validated mockup.

## [0.3.1] — 2026-06-03


### Added — Rust

- `GET /models` endpoint: returns the full provider registry (35 providers,
  60+ models) with `context_window`, `cost_in`, `cost_out`, and `recommended`
  fields — used by the WebView model picker and provider health dots.
- `RunRequest.model_override`: optional field; when set the backend injects
  `__model:X__` into the task string.
- Engine parses and strips `__model:X__` prefix; filters the routing chain to
  the requested brain ID (falls back to full chain if not found).
- `list_agents` scans three directories (user config, `agents/`, `.sparrow/agents/`)
  and deduplicates by name — the 5 repo soul files appear in the Crew panel
  without any install step.
- `RouteSelected` event now carries `context_window: u64` from the primary
  brain's `caps()`, fixing the hardcoded 32k display in the WebView.

### Added — WebView (`console.html`)

- **Model picker dropdown**: the `⬡ auto` pill expands to a full popover
  grouped by provider; each model shows context window and cost; provider rows
  show a green/dim LED based on `has_credential` from `/config`; selection
  stored in `localStorage` and injected into `POST /run`.
- **Side diff viewer**: `.diff-panel` slides in from the right on every
  `DiffApplied` event; shows file path, `+N −M` stats, and colour-coded diff
  lines; `D` key and `✕` button toggle it; `Esc` closes.
- **Run summary card**: replaces the bare `✓ done` log line with a structured
  card showing cost, token count, files edited, checkpoint count, and duration.
- **Dynamic context bar**: `contextLimit` updates from `RouteSelected.context_window`
  and shows the real model limit (200k for Claude, 131k for NVIDIA/Llama4, etc.).
- **Provider health dots**: model picker dropdown renders a green LED next to
  providers that have a credential configured; dim dot for unconfigured.
- **Crew panel live status**: `AgentSpawned` and `AgentStatus` events update
  each agent row's status pill and note in real time via `updateCrewLiveStatus()`.
- **Checkpoint scrubber**: `addCheckpointNode()` stores checkpoint IDs; each
  node is clickable — confirms and issues `/rewind <id>` via the composer.
- **Dynamic home screen**: `bootIntro()` is now async; fetches `/models` and
  `/config` at startup to show real provider counts and active-provider count.
- **Hero stats**: `hero-active` shows count of providers with credentials;
  `hydrateHero()` caches configured provider IDs for the model picker dots.
- **Keyboard shortcuts**: `D` toggles diff panel; `P` opens model picker;
  both disabled when focus is inside an input/textarea.
- **Escape**: closes diff panel and model picker dropdown in addition to
  existing modals (approval, config, palette, history draft restore).

### Added — Agents / Docs

- Agent SOUL files enriched: `description`, `color`, `max_turns` added to all
  five repo agents (planner, coder, verifier, debugger, researcher); prompts
  tightened with step-by-step rules and better model assignments.
- `docs/keyboard.md`: updated with new shortcuts (`D`, `P`, `Esc` additions)
  and a WebView panels reference table with rail icons and descriptions.

## [0.3.0] — 2026-06-02


WebView Cockpit v0.3.0 is shipped. The console now matches the presentation
direction as a real local cockpit rather than a static mockup: every panel is
fed by Sparrow endpoints, the composer is usable as a primary command surface,
and both Captain/Paper themes are covered by tests.

### Added

- Three-column WebView cockpit: icon rail, persistent drawer, main event
  stream, live cockpit metrics, boot animation, route chain, token/cost/context
  counters, and dynamic swarm row from `GET /agents`.
- Drawer panels wired to real endpoints: `/sessions`, `/memory`, `/plugins`,
  `/tools`, `/permissions`, `/security`, and `/artifacts`.
- Typed event renderers for tool cards, diff cards, compaction banners,
  checkpoint timeline nodes, streaming text caret, skill notifications, route
  updates, autonomy changes, token/cost updates, and model fallback labels.
- Command composer upgrades: `Cmd/Ctrl+K` slash palette, inline `@<agent>`
  picker, server-backed `/history`, multiline `Shift+Enter`, persisted draft,
  paste attachment handling, real `POST /upload`, drag-and-drop overlay, and
  live context meter.
- Approval modal wired to `POST /approval`, replacing inline-only approval
  controls for new approval events.
- Paper theme coverage, theme query override for screenshots/tests, persisted
  theme toggle, and reduced-motion fallback that disables animations, embers,
  transitions, and the boot overlay.
- WebView keyboard shortcuts documented in `docs/keyboard.md`.

### Validation

- `cargo fmt --all -- --check`
- `cargo clippy --all-targets -- -D warnings`
- `cargo test --all-targets`
- `cargo build --release`
- HTTP smoke: `GET /`, `/history`, `/agents`, `POST /upload`, and
  `GET /artifacts`.

## [0.2.0] — 2026-06-02


Autonomy mega-prompt phases 1–13 are landed and the deferred items have been
closed. Every WebView/CLI surface advertised in the README is wired and tested.

### Added — phases 1–11

- Read-only plan mode + slash command loader (`sparrow plan`, `/plan`).
- Permission modes (`read-only` / `plan` / `supervised` / `trusted` /
  `autonomous` / `emergency-stop`) and lifecycle hooks (`PreRun`, `PreToolUse`,
  `PostToolUse`, `PreCheckpoint`, `PostCheckpoint`, …).
- Declarative agents with `.agent.md` frontmatter and `agent mention`.
- Bounded `MEMORY.md` / `USER.md`, anti-injection memory guards, SQLite FTS
  session search, `memory` tool.
- Progressive skills (references + templates + scripts + assets loaded on
  invoke), plugin manifests, scanner, namespaced commands.
- Tool metadata (toolset / risk / auth / mutation / network / exec) with
  surface-aware filtering.
- Scoped gateway sessions, `gateway health`, `gateway abort`, sessions
  list/export/cleanup.
- `sparrow security audit [--json]` and WebView `/security`.
- Sandbox policy: protected paths, env allowlist, Docker/SSH/Worktree backends,
  honest errors when vendor CLIs (Modal/Daytona/Vercel/Singularity) are missing.
- `vision`, `image_generate`, `text_to_speech`, `transcribe` tools and WebView
  `POST /upload` (10 MB cap, classified) + `GET /artifacts`.
- `sparrow github review|status|logs` CLI plus composite `action.yml` and a
  sample PR-review workflow.

### Added — phase 12 (context & compaction)

- `ContextMeter` over prompt/memory/tools/attachments/transcript.
- `HookEvent::PreCompact` and `PostCompact`, `Event::Compacted`.
- `sparrow compact` writes a durable `HandoffDoc` Markdown.
- Engine auto-compaction: when the transcript exceeds 120k chars the loop
  collapses earlier messages, writes the handoff to `.sparrow/handoff/`, and
  emits `Event::Compacted` so the UI can render the pass.

### Added — phase 13 (UI / TUI)

- WebView `GET /sessions` endpoint.
- TUI `@<name>` inline agent picker with autocomplete.
- Theme variants `captain`, `midnight`, `paper` selected via `$SPARROW_THEME`.
- `docs/keyboard.md` listing every shortcut.

### Added — v0.2.0 stability pass

- `gateway::RunRegistry`: gateway daemon now registers spawned runs and
  honours `sparrow gateway abort <run>` by actually cancelling the matching
  task (previously only a signal file was written).
- Engine reflects auto-compaction in the event stream; the threshold and the
  `keep_last` window are constants so the behaviour is reproducible.
- Docs updated: `docs/keyboard.md`, `docs/security.md`, `docs/sandboxing.md`,
  `docs/media.md`, `docs/github-action.md`, `docs/compaction.md`,
  `docs/cli-reference.md`.
- README status table promoted from Alpha → Stable for everything covered by
  unit + integration tests.
- New regression suite `tests/v0_2_stability.rs` pinning the run-registry
  abort path, the TUI agent picker, and `CARGO_PKG_VERSION = 0.2.0`.

### Validation

All of the following pass on `master` at the v0.2.0 cut:
- `cargo fmt --all -- --check`
- `cargo clippy --all-targets -- -D warnings`
- `cargo test --all-targets` (26 test binaries, 188 assertions)
- `cargo build --release`

## [0.1.0] — 2026-05-31


Initial kernel and feature surface.

### Added

- Complete Rust kernel: config, auth, provider, tools, sandbox, router, engine, autonomy.
- 35-provider registry with model tags and recommended defaults.
- Agentic engine loop with budget-aware routing and fallback chains.
- Multi-agent swarm orchestrator (Planner → Coder → Verifier).
- 4-tier persistent memory (SQLite): repo, identity, task, shared.
- Self-improving skill system with Curator.
- Cron scheduler with job persistence.
- Run recorder/replayer with transcript format (`inputs.json` + `events.jsonl`).
- Terminal TUI (ratatui) with cockpit, scroll, ASCII mascot.
- WebView console (HTTP + WebSocket + JS event client).
- Gateway transports: Telegram, Discord, Slack, WebSocket API, plus experimental extra adapters.
- CLI grammar (clap) with 20+ subcommands.
- `--json` NDJSON output for CI/hooks.
- Install script (`install.sh`).
- 84 local tests across unit/integration/bench harnesses.
- Branding assets: SVG mascot, cockpit mark, ASCII variant, presentation HTML.
- Honest module audit at `docs/AUDIT.md`.
- README status model based on Stable/Alpha/Partial/Experimental/Planned evidence.