harness

Agent = Model + Harness. This crate is the Harness — the modular scaffolding around an LLM that turns it into an autonomous coding agent.

Rust framework for building production coding agents, based on the harness engineering discipline as written up by Böckeler (Thoughtworks, 2026) and Lopopolo (OpenAI, 2026). See DESIGN.md for the full architectural rationale.

What you get

Layer	What it does	Crate
Model	OpenAI-compatible + Anthropic-native + scriptable mock	`harness-models`
Tools	fs (read/write/edit/list), shell (risk-classified allowlist)	`harness-tools-fs`, `harness-tools-shell`
Sensors	`cargo check` + `cargo clippy` produce LLM-friendly `Signal`s; auto-fix patches apply automatically	`harness-sensors-rust`
Skills	strict agentskills.io validator + `#[skill]` proc-macro + export to spec-compliant directory	`harness-skills`, `harness-macros`
Guides	feedforward Markdown context, scoped by task	`#[guide]` + `harness-templates`
Hooks	27-event lifecycle bus with deny/inject/mutate	`harness-hooks` + `#[hook]`
Compactor	5-stage progressive compaction (auto-triggered by budget)	`harness-compactor`
Loop	ReAct + tool-call dispatch + sensor feedback + auto-fix	`harness-loop`
Blueprint	deterministic + agent state machine with retry/fallback	`harness-blueprint`
Sandbox	git worktree isolation (container/VM in v0.2)	`harness-sandbox`
CLI	`harness skills validate / list / export`, `harness new`	`harness-cli`

60-second tour

1. Build a minimal agent

use harness::prelude::*;
use harness_loop::AgentLoop;
use harness_models::{OpenAiCompat, providers::DEEPSEEK};
use harness_tools_fs::{ListDir, ReadFile};
use harness_context::default_world;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), harness::HarnessError> {
    let model = OpenAiCompat::with_key(
        DEEPSEEK,
        "deepseek-chat",
        std::env::var("DEEPSEEK_API_KEY").unwrap(),
    );
    let mut world = default_world(".");
    let outcome = AgentLoop::new(model)
        .with_tool(Arc::new(ReadFile))
        .with_tool(Arc::new(ListDir))
        .run(
            Task { description: "Find Cargo.toml and tell me the workspace name".into(),
                   source: None, deadline: None },
            &mut world,
        )
        .await?;
    println!("{outcome:?}");
    Ok(())
}

2. Register a skill, tool, guide, sensor, or hook with a proc-macro

/// Greet the user politely. Use when the user explicitly asks for a friendly hello.
#[harness::skill(name = "polite-hello", harness(kind = "inferential", risk = "read-only"))]
async fn polite_hello(_ctx: &mut Context, _w: &mut World) -> Result<(), harness::SkillError> {
    Ok(())
}

#[harness::tool(name = "reverse", risk = "read-only",
    schema = r#"{"type":"object","properties":{"text":{"type":"string"}}}"#)]
async fn reverse(args: serde_json::Value, _w: &mut World)
    -> Result<ToolResult, ToolError> { /* ... */ }

#[harness::guide(scope = "always", kind = "inferential")]
async fn project_intro(ctx: &mut Context, _w: &World) -> Result<(), GuideError> {
    ctx.guides.push(Block::Text("Always reply in two sentences.".into()));
    Ok(())
}

#[harness::sensor(stage = "self-correct", kind = "computational")]
async fn no_unwrap(action: &Action, w: &World) -> Result<Vec<Signal>, SensorError> { /* ... */ }

#[harness::hook(event = "PreToolUse", name = "audit")]
fn audit(ev: &Event<'_>, _w: &mut World) -> HookOutcome {
    tracing::info!(?ev); HookOutcome::Allow
}

All five auto-register via inventory; AgentLoop::with_macro_hooks() and SkillRegistry::with_macro_skills() pick them up at runtime.

3. Hybrid deterministic + agent state machine

use harness_blueprint::{Blueprint, Node, NodeOutput, Transition};

let bp = Blueprint::new()
    .add("fmt",    Node::deterministic(|w| Box::pin(async move {
        w.runner.exec("cargo", &["fmt", "--all"], Some(w.repo.root.as_path())).await?;
        Ok(NodeOutput { transition: Transition::Next, data: Default::default() })
    })))
    .add("work",   Node::agent(|w| Box::pin(async move { /* run AgentLoop */ })))
    .add("test",   Node::deterministic(|w| Box::pin(async move { /* cargo test */ })))
    .edge("fmt", "work").edge("work", "test")
    .branch_on_failure("test", "work", 2);

4. Validate / export skills for any spec-compliant agent

$ harness skills validate ./skills/format-rust
✓ valid: format-rust — Run cargo fmt across the workspace.

$ harness skills export ./out --from ./skills
✓ ./out/format-rust/SKILL.md
✓ ./out/review-axum/SKILL.md
exported 2 skill(s) to ./out

The exported directory is consumable by Claude Code, Cursor, Codex, or any agent that follows the agentskills.io spec.

5. Scaffold a new agent project

$ harness new my-agent
✓ created my-agent/
  └─ Cargo.toml
  └─ src/main.rs   # minimal agent with one tool and one skill

Testing & verification

$ cargo test --workspace
... 123 tests passing

Three layers of verification:

Unit tests (per crate) — pure logic, no I/O.
AgentLoop integration tests (harness-loop/tests/agent_loop.rs) — MockModel drives the full pipeline with scripted responses; zero network, deterministic.
Golden-path test (harness-loop/tests/golden_path.rs) — every component (guide, tool, sensor, auto-fix, hook, compactor) exercised at once against a tmp workspace, final on-disk state asserted.
Live demo (examples/crate-keeper) — runs against DeepSeek (flash or pro tier) for wire-format validation that mocks can't catch.

Examples

See examples/README.md for full descriptions. In order of increasing surface area:

examples/deepseek-hello — smallest possible Hello-world against DeepSeek.
examples/crate-keeper — MockModel smoke test; no network.
examples/personal-assistant — scheduling agent with UserProfile, REPL, brief mode.
examples/investor-bot — autonomous web research with multi-engine search fallback + retry.

Status

v0.0.1 — ✅ initial publish (15 of 18 crates).
v0.0.2 — ✅ shipped. UserProfile + ProfileGuide, optional harness-rs-daemon scheduler, retry/backoff in model adapters, MCP server with resources + prompts, session record/replay, multi-engine search, #[non_exhaustive] sweep, security gates on FixPatch::RunCommand + shell_read. See CHANGELOG.
v0.1+ — ContainerSandbox / VmSandbox / first-class blueprint Node::Agent are on the road.

License

Dual-licensed under MIT OR Apache-2.0.

harness-rs-hooks 0.0.3