galdr 0.12.0

Record & Replay for agent skills — capture a session's tool calls and distill them into a reproducible skill. Local-first.
//! galdr's own skill: the `SKILL.md` that teaches an agent how to drive galdr.
//!
//! This is galdr eating its own dogfood. The skill is **embedded in the binary** and
//! stamped with the crate version, so it can never drift from the CLI it documents —
//! upgrading galdr regenerates it. `galdr setup skill` installs it into the
//! open-standard skills root and links it into every harness, so an agent in Claude
//! Code, Codex, or Cursor knows how to record → distill → replay without being told.
//!
//! A hand-maintained skill that lags the CLI would violate galdr's whole thesis
//! ("the recorded run is the source of truth, not a stale doc"). Generating this one
//! from the binary is how galdr keeps faith with that thesis for its own skill.

use anyhow::Result;

use crate::link::{self, LinkResult};
use crate::paths;

/// The skill's name and install directory stem.
pub const SKILL_NAME: &str = "galdr";

/// The marker `galdr doctor` greps for to detect a skill that lags the binary.
const VERSION_MARKER: &str = "galdr-skill-version:";

/// Renders galdr's own `SKILL.md`, stamped with the current version.
pub fn render() -> String {
    BODY.replace("{{VERSION}}", env!("CARGO_PKG_VERSION"))
}

/// Extracts the version stamp from a skill's markdown, wherever the marker line sits.
fn parse_version(md: &str) -> Option<String> {
    md.lines().find_map(|line| {
        let idx = line.find(VERSION_MARKER)?;
        let value: String = line[idx + VERSION_MARKER.len()..]
            .trim()
            .chars()
            .take_while(|c| c.is_ascii_digit() || *c == '.')
            .collect();
        (!value.is_empty()).then_some(value)
    })
}

/// Writes galdr's skill into the open-standard root and links it into every
/// installed harness. Returns the link results. `galdr` is the only writer of the
/// skills directory, including for its own skill.
pub fn install() -> Result<Vec<LinkResult>> {
    let content = render();
    // galdr holds its own skill to the same content gate every distilled skill must
    // pass. The embedded skill ships clean, so this is a cheap belt-and-suspenders that
    // catches a regression (a secret, a personal path) before it can be installed.
    let ctx = crate::validate::ValidationCtx::new(false, false);
    crate::distill::gate_or_bail(&content, &ctx)?;
    let dir = paths::skill_dir(SKILL_NAME)?;
    paths::ensure_not_symlinked(&dir)?;
    std::fs::create_dir_all(&dir)?;
    std::fs::write(dir.join("SKILL.md"), content)?;
    link::link_skill(SKILL_NAME)
}

/// The installed galdr skill's version, if it is installed. Used by `doctor` to
/// flag drift against the running binary.
pub fn installed_version() -> Option<String> {
    let path = paths::skill_dir(SKILL_NAME).ok()?.join("SKILL.md");
    let md = std::fs::read_to_string(path).ok()?;
    parse_version(&md)
}

/// True when the running binary's version matches the installed skill's stamp.
pub fn is_current() -> bool {
    installed_version().as_deref() == Some(env!("CARGO_PKG_VERSION"))
}

const BODY: &str = r#"---
name: galdr
description: "Record a task the agent did well once and turn it into a reusable skill. Use galdr when you have completed a repeatable, multi-step task and want to crystallize it into a SKILL.md the harness can replay. Covers recording, distilling, linking, and outcome capture."
---

# galdr — Record & Replay for agent skills

galdr records the tool calls you (the agent) already emit and distills them into a
reusable skill. Record a task once; replay it as a skill that adapts to new inputs.
This skill teaches you how to drive galdr. It ships with galdr and is regenerated by
`galdr setup skill`, so it never drifts from the CLI.

## When to use

- You just completed a repeatable, multi-step task and want to make it reusable.
- The user says "record this", "make a skill from this", "save this workflow", or asks
  to capture a recipe for next time.
- A task is likely to recur: start a recording before you begin, so the run itself
  becomes the skill.
- The task involves driving a GUI through Computer Use — those clicks, types, and
  screenshots are tool calls too, so galdr records them (keeping the action, dropping
  the screenshot) and distills them into a semantic GUI skill.

Do not use for one-off throwaway work, or for secret-heavy sessions unless asked.

## The loop: record → distill → replay

1. Check nothing is already recording: `galdr rec status`. Never start a nested recording.
2. Start: `galdr rec start <short-slug>`.
3. Do the task normally. Your tool calls are captured automatically by the PostToolUse
   hook — no extra steps, nothing to narrate.
4. Stop before writing your final report: `galdr rec stop` (prints the rec_id).
5. Distill: `galdr distill <rec_id> --name <name>` → a complete SKILL.md, installed and
   linked into every installed harness. It must clear the content gate first (secrets,
   personal paths, and broken skills are refused), so what lands is safe to share.
   **You choose the name** — galdr does not guess one. Pick something descriptive,
   memorable, and original (e.g. `cargo-preflight`, `rust-greenlight`), not a mechanical
   label. Omit `--name` and galdr falls back to `galdr-<recording-slug>`.
6. Replay: the new skill is discoverable by name in this harness and every other one
   galdr detected. Invoke it later with new inputs; interpret it, don't replay verbatim.

## Inputs

- `<short-slug>` — a short name for the recording (becomes the skill name `galdr-<slug>`).
- `<rec_id>` — the id `galdr rec stop` prints; pass it to `distill`, `show`, `export`.

## Commands (the CLI is AI-first: add `--json` to any read command for structured output)

- `galdr rec status` — is a recording active, and how many steps so far.
- `galdr rec start <slug>` / `galdr rec stop` — open and close a recording.
- `galdr list [--json]` — closed recordings.
- `galdr show <rec_id> [--json]` — a recording's steps.
- `galdr distill <rec_id> [--name <name>]` — render and install a complete skill in one
  step. `--name` sets the skill name (you bring the naming judgment; galdr does not guess);
  without it the name is `galdr-<recording-slug>`. Variants: `--draft` writes scaffolding
  for you to refine then install with `--from <file>`; `--auto` lets a local model write it.
- `galdr skills [--json]` — installed skills, each marked `galdr` (distilled) or `external`.
- `galdr link` — re-link galdr skills into every installed harness's skills directory.
- `galdr harnesses [--json]` — which harnesses are installed and whether galdr is wired in.
- `galdr outcome usage --skill <name> --rec <rec_id> --outcome success|partial|failed` —
  after you later USE a distilled skill, record how it went. This is the training signal
  that tells galdr which skills are worth keeping; record it honestly.
- `galdr suggest [--min-count <n>] [--top <n>] [--json]` — skill opportunities: repeated
  tasks (same step shape across recordings) not yet distilled, deduped against installed
  skills and ranked by repeatability. Turns "worth a skill?" into a queryable signal.
- `galdr bench [--skill <name>] [--json]` — replay reliability: aggregates the outcomes
  you recorded into a per-skill clean-replay hit-rate and effort cost (retries,
  interventions). Measures the production hit-rate, not just the skill's shape.
- `galdr validate [<skill> | --all | --file <path>] [--strict] [--json]` — run the
  install-time content gate over a skill (or a file): security (secrets, personal/PII
  paths, dangerous commands), practicality (a real, complete skill), and optimization
  (a precise description, no recording noise). Exits non-zero if anything blocks. Add
  `--strict` to also treat warnings (documented dangers, a weak description) as blocking.
- `galdr doctor` — diagnose config, catalog, sensor wiring, and skill discoverability.
  It also flags any installed skill that would fail the content gate.

## Steps (the recipe, generalized)

1. `galdr rec status` → confirm no active recording.
2. `galdr rec start <slug>`.
3. Perform the task using your normal tools.
4. `galdr rec stop` → note the rec_id.
5. `galdr distill <rec_id>` → a complete, discoverable skill.
6. (Later, on reuse) `galdr outcome usage --skill galdr-<slug> --rec <new_rec_id> --outcome <result>`.

## Verification

- During work, `galdr rec status` shows the step count climbing.
- After distilling, `galdr skills` lists the new skill as `final`, readiness high, origin `galdr`.
- The skill is reachable in each harness's skills directory; `galdr doctor` reports
  "galdr skill(s) discoverable across N harness(es)".

## Rules and robustness

- One recording at a time. Always `galdr rec status` before `rec start`.
- Stop the recording before your final summary, so the report's own tool calls are not recorded.
- The sensor never breaks your session: if galdr fails internally it exits cleanly and records nothing.
- Everything is local. The raw recording lives only under `~/.galdr`; nothing leaves the machine.
- Every install passes a content gate: a leaked secret, a personal path, or a skill that
  is not really a skill is refused, so what you distill is safe to share. Check any skill
  with `galdr validate`.
- If a distilled skill is not showing up in a harness, run `galdr link`, then `galdr doctor`.

<!-- galdr-skill-version: {{VERSION}} -->
"#;

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn render_is_a_valid_complete_skill() {
        let md = render();
        // It must satisfy galdr's own skill validator (open-standard anatomy).
        assert!(crate::distill::validate_skill_md(&md).is_ok(), "{md}");
        // No leftover placeholder.
        assert!(!md.contains("{{VERSION}}"));
    }

    #[test]
    fn render_documents_the_core_commands() {
        let md = render();
        for cmd in [
            "galdr rec start",
            "galdr rec stop",
            "galdr distill",
            "galdr link",
            "galdr outcome usage",
            "galdr doctor",
        ] {
            assert!(md.contains(cmd), "skill should document `{cmd}`");
        }
    }

    #[test]
    fn self_skill_passes_gate() {
        // galdr's own skill must clear the same content gate it imposes on every
        // distilled skill — and clear it impeccably (no warnings, strict-clean).
        let md = render();
        let ctx = crate::validate::ValidationCtx::new(false, false);
        let report = crate::validate::validate_skill(&md, &ctx);
        assert!(!report.has_blocking(false), "{report}");
        assert!(
            !report.has_blocking(true),
            "self-skill should be strict-clean:\n{report}"
        );
    }

    #[test]
    fn version_stamp_round_trips() {
        // The stamp render writes is the one `installed_version` parses back.
        let md = render();
        assert_eq!(
            parse_version(&md).as_deref(),
            Some(env!("CARGO_PKG_VERSION"))
        );
    }
}