galdr 0.16.0

Record & Replay for agent skills — capture a session's tool calls and distill them into a reproducible skill. Local-first.
//! galdr's own skill: the `SKILL.md` that teaches an agent how to drive galdr.
//!
//! This is galdr eating its own dogfood. The skill is **embedded in the binary** and
//! stamped with the crate version, so it can never drift from the CLI it documents —
//! upgrading galdr regenerates it. `galdr setup skill` installs it into the
//! open-standard skills root and links it into every harness, so an agent in Claude
//! Code, Codex, or Cursor knows how to record → distill → replay without being told.
//!
//! A hand-maintained skill that lags the CLI would violate galdr's whole thesis
//! ("the recorded run is the source of truth, not a stale doc"). Generating this one
//! from the binary is how galdr keeps faith with that thesis for its own skill.

use anyhow::Result;

use crate::link::{self, LinkResult};
use crate::paths;

/// The skill's name and install directory stem.
pub const SKILL_NAME: &str = "galdr";

/// The marker `galdr doctor` greps for to detect a skill that lags the binary.
const VERSION_MARKER: &str = "galdr-skill-version:";

/// Renders galdr's own `SKILL.md`, stamped with the current version.
pub fn render() -> String {
    BODY.replace("{{VERSION}}", env!("CARGO_PKG_VERSION"))
}

/// Extracts the version stamp from a skill's markdown, wherever the marker line sits.
fn parse_version(md: &str) -> Option<String> {
    md.lines().find_map(|line| {
        let idx = line.find(VERSION_MARKER)?;
        let value: String = line[idx + VERSION_MARKER.len()..]
            .trim()
            .chars()
            .take_while(|c| c.is_ascii_digit() || *c == '.')
            .collect();
        (!value.is_empty()).then_some(value)
    })
}

/// Writes galdr's skill into the open-standard root and links it into every
/// installed harness. Returns the link results. `galdr` is the only writer of the
/// skills directory, including for its own skill.
pub fn install() -> Result<Vec<LinkResult>> {
    let content = render();
    // galdr holds its own skill to the same content gate every distilled skill must
    // pass. The embedded skill ships clean, so this is a cheap belt-and-suspenders that
    // catches a regression (a secret, a personal path) before it can be installed.
    let ctx = crate::validate::ValidationCtx::new(false, false);
    crate::distill::gate_or_bail(&content, &ctx)?;
    let dir = paths::skill_dir(SKILL_NAME)?;
    paths::ensure_not_symlinked(&dir)?;
    std::fs::create_dir_all(&dir)?;
    std::fs::write(dir.join("SKILL.md"), content)?;
    link::link_skill(SKILL_NAME)
}

/// The installed galdr skill's version, if it is installed. Used by `doctor` to
/// flag drift against the running binary.
pub fn installed_version() -> Option<String> {
    let path = paths::skill_dir(SKILL_NAME).ok()?.join("SKILL.md");
    let md = std::fs::read_to_string(path).ok()?;
    parse_version(&md)
}

/// True when the running binary's version matches the installed skill's stamp.
pub fn is_current() -> bool {
    installed_version().as_deref() == Some(env!("CARGO_PKG_VERSION"))
}

const BODY: &str = r#"---
name: galdr
description: "Record a task the agent did well once and turn it into a reusable skill. Use galdr when you have completed a repeatable, multi-step task and want to crystallize it into a SKILL.md the harness can replay. Covers recording, distilling, linking, and outcome capture."
---

# galdr — Record & Replay for agent skills

galdr records the tool calls you (the agent) already emit and distills them into a
reusable skill. Record a task once; replay it as a skill that adapts to new inputs.
This skill teaches you how to drive galdr. It ships with galdr and is regenerated by
`galdr setup skill`, so it never drifts from the CLI.

## When to use

- You just completed a repeatable, multi-step task and want to make it reusable.
- The user says "record this", "make a skill from this", "save this workflow", or asks
  to capture a recipe for next time.
- A task is likely to recur: start a recording before you begin, so the run itself
  becomes the skill.
- The task involves driving a GUI through Computer Use — those clicks, types, and
  screenshots are tool calls too, so galdr records them (keeping the action, dropping
  the screenshot) and distills them into a semantic GUI skill.

Do not use for one-off throwaway work, or for secret-heavy sessions unless asked.

## The loop: record → distill → replay

1. Check nothing is already recording: `galdr rec status`. Never start a nested recording.
2. Start: `galdr rec start <short-slug>`.
3. Do the task normally. Your tool calls are captured automatically by the PostToolUse
   hook — no extra steps, nothing to narrate.
4. Stop before writing your final report: `galdr rec stop`. It closes the span and prints
   the step count — there is no id to copy.
5. Distill into a draft, then **author it**. `galdr distill --name <name>` renders a
   faithful draft from the span (real steps, secrets redacted) and prints an authoring
   brief. A replay of your tool calls is not yet a skill: galdr captures *what* ran, you
   supply *why*. Read the span and write the real skill — the problem it solves and when
   to reach for it, the values that vary as named inputs, each step's intent, how to
   verify success, the gotchas — then install your version: `galdr distill --from <file>`.
   It must clear the content gate (secrets, personal paths, broken skills are refused).
   **You choose the name** — galdr does not guess one; pick something descriptive and
   memorable (e.g. `cargo-preflight`, `rust-greenlight`), not a mechanical label. Shortcut:
   `galdr distill --fast` installs the mechanical render as-is — a floor, not a finished
   skill; reach for it only when authoring adds nothing.
6. Replay: once authored, the skill is discoverable by name in this harness and every
   other one galdr detected. Invoke it later with new inputs; interpret it, don't replay
   verbatim.

## Inputs

- `<short-slug>` — a short name for the recording (becomes the skill name `galdr-<slug>`).
- `[reference]` — which recording a command acts on: omit it for the most recent, or pass
  a recording name or a short id prefix. `distill`, `show`, `export`, and `outcome --rec`
  all resolve a recording this way, so you never copy a 26-character id.

## Commands (the CLI is AI-first: add `--json` to any read command for structured output)

- `galdr rec status` — is a recording active, and how many steps so far.
- `galdr rec start <slug>` / `galdr rec stop` — open and close a recording.
- `galdr list [--json]` — closed recordings.
- `galdr show [reference] [--json]` — a recording's steps (the most recent if omitted).
- `galdr distill [reference] [--name <name>]` — render a faithful **draft** and print an
  authoring brief; with no reference it uses the recording you just made. Read the span,
  write the real skill, and install it with `galdr distill --from <file>`. galdr captures
  the steps; you supply the judgment that makes it a skill. `--name` sets the skill name
  (galdr does not guess; without it the name is `galdr-<recording-slug>`). `--fast`
  installs the mechanical render as-is (a floor); `--auto` lets a local model author it.
- `galdr skills [--json]` — installed skills, each marked `galdr` (distilled) or `external`.
- `galdr link` — re-link galdr skills into every installed harness's skills directory.
- `galdr harnesses [--json]` — which harnesses are installed and whether galdr is wired in.
- `galdr outcome usage --skill <name> [--rec <reference>] --outcome success|partial|failed` —
  after you later USE a distilled skill, record how it went (`--rec` defaults to the most
  recent recording). This is the training signal that tells galdr which skills are worth
  keeping; record it honestly.
- `galdr suggest [--min-count <n>] [--top <n>] [--json]` — skill opportunities: repeated
  tasks (same step shape across recordings) not yet distilled, deduped against installed
  skills and ranked by repeatability. Turns "worth a skill?" into a queryable signal.
- `galdr bench [--skill <name>] [--json]` — replay reliability: aggregates the outcomes
  you recorded into a per-skill clean-replay hit-rate and effort cost (retries,
  interventions). Measures the production hit-rate, not just the skill's shape.
- `galdr validate [<skill> | --all | --file <path>] [--strict] [--json]` — run the
  install-time content gate over a skill (or a file): security (secrets, personal/PII
  paths, dangerous commands), practicality (a real, complete skill), and optimization
  (a precise description, no recording noise). Exits non-zero if anything blocks. Add
  `--strict` to also treat warnings (documented dangers, a weak description) as blocking.
- `galdr doctor` — diagnose config, catalog, sensor wiring, and skill discoverability.
  It also flags any installed skill that would fail the content gate.

## Steps (the recipe, generalized)

1. `galdr rec status` → confirm no active recording.
2. `galdr rec start <slug>`.
3. Perform the task using your normal tools.
4. `galdr rec stop` → closes the recording (no id to copy).
5. `galdr distill --name <name>` → a faithful draft + an authoring brief. Read the span,
   write the real skill, and install it: `galdr distill --from <file>`.
6. (Later, on reuse) `galdr outcome usage --skill galdr-<slug> --outcome <result>` (`--rec` defaults to the latest recording).

## Verification

- During work, `galdr rec status` shows the step count climbing.
- The default distill leaves the skill as a `draft` (readiness docked) until you author
  it; after `galdr distill --from`, `galdr skills` lists it as `final`, readiness high,
  origin `galdr`.
- The skill is reachable in each harness's skills directory; `galdr doctor` reports
  "galdr skill(s) discoverable across N harness(es)".

## Rules and robustness

- One recording at a time. Always `galdr rec status` before `rec start`.
- Stop the recording before your final summary, so the report's own tool calls are not recorded.
- The sensor never breaks your session: if galdr fails internally it exits cleanly and records nothing.
- Everything is local. The raw recording lives only under `~/.galdr`; nothing leaves the machine.
- Every install passes a content gate: a leaked secret, a personal path, or a skill that
  is not really a skill is refused, so what you distill is safe to share. Check any skill
  with `galdr validate`.
- If a distilled skill is not showing up in a harness, run `galdr link`, then `galdr doctor`.

<!-- galdr-skill-version: {{VERSION}} -->
"#;

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn render_is_a_valid_complete_skill() {
        let md = render();
        // It must satisfy galdr's own skill validator (open-standard anatomy).
        assert!(crate::distill::validate_skill_md(&md).is_ok(), "{md}");
        // No leftover placeholder.
        assert!(!md.contains("{{VERSION}}"));
    }

    #[test]
    fn render_documents_the_core_commands() {
        let md = render();
        for cmd in [
            "galdr rec start",
            "galdr rec stop",
            "galdr distill",
            "galdr link",
            "galdr outcome usage",
            "galdr doctor",
        ] {
            assert!(md.contains(cmd), "skill should document `{cmd}`");
        }
    }

    #[test]
    fn self_skill_passes_gate() {
        // galdr's own skill must clear the same content gate it imposes on every
        // distilled skill — and clear it impeccably (no warnings, strict-clean).
        let md = render();
        let ctx = crate::validate::ValidationCtx::new(false, false);
        let report = crate::validate::validate_skill(&md, &ctx);
        assert!(!report.has_blocking(false), "{report}");
        assert!(
            !report.has_blocking(true),
            "self-skill should be strict-clean:\n{report}"
        );
    }

    #[test]
    fn version_stamp_round_trips() {
        // The stamp render writes is the one `installed_version` parses back.
        let md = render();
        assert_eq!(
            parse_version(&md).as_deref(),
            Some(env!("CARGO_PKG_VERSION"))
        );
    }
}