crux

An agentic DSL for Rust -- inspectable, serializable, replayable agent orchestration.

cruxx is not a standalone language. It's a set of macros, traits, and types that make agentic control flow explicit in the Rust type system.

Quick example

use cruxx::prelude::*;

#[cruxx::agent]
async fn plan_trip(goal: String) -> Crux<Itinerary> {
    // Run a named step; the call and its outcome are recorded in the trace.
    let research = x.step("research", || async {
        Ok(search_web(&goal).await?)
    }).await?;

    // Delegate drafting to another agent, capped at 4000 tokens.
    let draft = x.delegate::<DraftAgent>("draft", research)
        .with_budget(Budget::tokens(4000))
        .run().await?;

    // Try three finalizers and keep the result with the highest confidence.
    x.speculate("finalize", vec![
        ("cheap", Box::pin(async { finalize_cheap(&draft).await })),
        ("fast",  Box::pin(async { finalize_fast(&draft).await })),
        ("safe",  Box::pin(async { finalize_safe(&draft).await })),
    ]).pick_best_by(|r| r.confidence).await
}

Extended example

use cruxx::prelude::*;

#[cruxx::agent]
async fn review_pr(pr: PullRequest) -> Crux<ReviewReport> {
    // Fan out: fetch diff and CI results in parallel
    let (diff, ci) = x.join_all([
        x.step("fetch_diff", || git::diff(&pr.base, &pr.head)),
        x.step("fetch_ci",   || ci::latest_run(&pr.repo, &pr.head)),
    ]).await?;

    // Delegate deep analysis to a specialist; escalate if confidence is low
    let analysis = x.delegate::<SecurityAnalysisAgent>("security", &diff)
        .with_budget(Budget::tokens(8000))
        .on_low_confidence(0.75, escalate_to_human)
        .on_step_failure(Recovery::Retry(2))
        .await?;

    // Race three review styles; keep whichever scores highest
    let review = x.speculate("style", [
        ("strict",  || apply_strict_style(&analysis, &ci)),
        ("lenient", || apply_lenient_style(&analysis, &ci)),
        ("summary", || apply_summary_style(&analysis, &ci)),
    ]).pick_best_by(|r| r.confidence).await?;

    x.step("emit", || build_report(pr, analysis, review)).await
}

Every x.step, x.delegate, and x.speculate call is recorded in the Crux<T> value the function returns. That value is:

Inspectable: cruxx.causal_chain(), cruxx.delegations(), cruxx.rejected_branches()
Serializable: serde_json::to_string(&cruxx) just works
Replayable: Crux::replay_from(snapshot) resumes after a crash
Composable: cruxx_a | cruxx_b, Crux::join_all([...])
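
The idea of a value that carries the record of the calls that produced it can be sketched with the standard library alone. The following is a minimal model, not the cruxx API: `StepRecord`, `Ctx`, `Trace`, and `run_pipeline` are hypothetical names chosen for illustration, and serialization and replay are left out.

```rust
// Hypothetical model of a recorded execution trace; not the cruxx API.
#[derive(Debug, Clone, PartialEq)]
struct StepRecord {
    name: String,
    ok: bool,
}

#[derive(Debug, Default)]
struct Ctx {
    steps: Vec<StepRecord>,
}

impl Ctx {
    /// Run `f`, record its name and outcome, and pass the result through.
    fn step<T, E>(&mut self, name: &str, f: impl FnOnce() -> Result<T, E>) -> Result<T, E> {
        let result = f();
        self.steps.push(StepRecord { name: name.to_string(), ok: result.is_ok() });
        result
    }
}

/// A value bundled with the trace of the steps that produced it.
#[derive(Debug)]
struct Trace<T> {
    value: T,
    steps: Vec<StepRecord>,
}

fn run_pipeline(input: &str) -> Result<Trace<usize>, String> {
    let mut x = Ctx::default();
    let trimmed = x.step("trim", || Ok::<_, String>(input.trim().to_string()))?;
    let len = x.step("measure", || Ok::<_, String>(trimmed.len()))?;
    Ok(Trace { value: len, steps: x.steps })
}

fn main() {
    let trace = run_pipeline("  hello  ").expect("pipeline should succeed");
    println!("value = {}", trace.value);
    for s in &trace.steps {
        println!("{} ok={}", s.name, s.ok);
    }
}
```

In this sketch the trace is an ordinary struct, so inspecting it is just iterating over `steps`. Note that an early `?` return drops the partial trace here; a fuller model would keep it.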

Crates

Crate	Description
`cruxx`	Facade crate, re-exports `cruxx-core` + `cruxx-macros`
`cruxx-core`	Core types, traits, and runtime
`cruxx-types`	Serializable wire-format types (`Crux<T>`, `Step`, `Budget`, `RecoveryKind`)
`cruxx-macros`	`#[cruxx::agent]`, `#[cruxx::harness]`, `#[cruxx::evolve]` macros
`cruxx-script`	YAML-driven pipeline scripting
`cruxx-agentic`	Step handlers: shell, fs, git, json, llm, container, harness
`cruxx-model`	Canonical model ID types and provider-specific parsers
`cruxx-plugin`	Subprocess plugin host for pipelines
`cruxx-planner`	`EvolutionPlanner`: metrics-driven harness profile evolution

Features

Enable features on the cruxx facade crate:

Feature	Default	Description
`tokio-runtime`	yes	Async runtime support via tokio + futures
`redb`	no	Persistent `TaskRegistry` backend via redb (pure-Rust)
`tracing`	no	Instrument with `tracing` spans
`baml`	no	BAML-backed LLM extraction (`llm::extract`, `llm::decompose`, `llm::plan`)

Core concepts

Crux<T>: the execution trace. Every step, delegation, speculation, and failure is a first-class value you can inspect, serialize, and replay.

CruxCtx: the runtime context threaded through agent execution. Provides step(), delegate(), speculate(), pipe(), join_all(), route_on_confidence().
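
To make the speculate-then-pick and confidence-routing ideas concrete, here is a small standalone sketch. It is an assumption-laden model rather than the CruxCtx implementation: `Branch`, `pick_best_by`, and `route_on_confidence` below are hypothetical free functions, the branches are already-computed values rather than futures, and the "accept"/"escalate" labels are invented for the example.

```rust
// Hypothetical model of "speculate, then pick the best branch"; not the cruxx API.
#[derive(Debug, Clone, PartialEq)]
struct Branch {
    label: &'static str,
    confidence: f64,
}

/// Return the highest-scoring branch plus the branches that lost.
/// Returns `None` when no candidates are given. On ties the earliest wins.
fn pick_best_by(
    mut candidates: Vec<Branch>,
    score: impl Fn(&Branch) -> f64,
) -> Option<(Branch, Vec<Branch>)> {
    if candidates.is_empty() {
        return None;
    }
    let mut best_idx = 0;
    for (i, c) in candidates.iter().enumerate().skip(1) {
        if score(c) > score(&candidates[best_idx]) {
            best_idx = i;
        }
    }
    let best = candidates.remove(best_idx);
    Some((best, candidates))
}

/// Accept a result at or above `threshold`; otherwise escalate.
fn route_on_confidence(best: &Branch, threshold: f64) -> &'static str {
    if best.confidence >= threshold { "accept" } else { "escalate" }
}

fn main() {
    let candidates = vec![
        Branch { label: "strict", confidence: 0.62 },
        Branch { label: "lenient", confidence: 0.91 },
        Branch { label: "summary", confidence: 0.78 },
    ];
    if let Some((best, rejected)) = pick_best_by(candidates, |b| b.confidence) {
        println!("picked {} -> {}", best.label, route_on_confidence(&best, 0.75));
        println!("rejected {} branch(es)", rejected.len());
    }
}
```

Returning the losing branches alongside the winner is what would make something like a rejected-branches view possible later.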

Agent trait: the single-method interface all agents implement. The #[cruxx::agent] macro generates this for you.

TaskRegistry<B>: typed task management with submit, checkpoint, replay, and status transitions. Pluggable backend (InMemoryBackend, RedbBackend).
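
The pluggable-backend design can be illustrated with a registry that is generic over a storage trait. This is a sketch under assumptions, not the TaskRegistry API: the `Backend` trait, the `InMemory` store, the `Status` variants, and the allowed transitions are all hypothetical, and checkpointing and replay are omitted.

```rust
use std::collections::HashMap;

// Hypothetical model of a registry generic over its storage backend; not the cruxx API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Pending,
    Running,
    Done,
}

trait Backend {
    fn put(&mut self, id: u64, status: Status);
    fn get(&self, id: u64) -> Option<Status>;
}

#[derive(Default)]
struct InMemory {
    tasks: HashMap<u64, Status>,
}

impl Backend for InMemory {
    fn put(&mut self, id: u64, status: Status) {
        self.tasks.insert(id, status);
    }
    fn get(&self, id: u64) -> Option<Status> {
        self.tasks.get(&id).copied()
    }
}

struct Registry<B: Backend> {
    backend: B,
    next_id: u64,
}

impl<B: Backend> Registry<B> {
    fn new(backend: B) -> Self {
        Self { backend, next_id: 0 }
    }

    /// Register a new task in the Pending state and return its id.
    fn submit(&mut self) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.backend.put(id, Status::Pending);
        id
    }

    /// Allow only Pending -> Running -> Done; reject anything else.
    fn transition(&mut self, id: u64, to: Status) -> Result<(), String> {
        let from = self.backend.get(id).ok_or_else(|| format!("unknown task {id}"))?;
        match (from, to) {
            (Status::Pending, Status::Running) | (Status::Running, Status::Done) => {
                self.backend.put(id, to);
                Ok(())
            }
            _ => Err(format!("illegal transition {from:?} -> {to:?}")),
        }
    }

    fn status(&self, id: u64) -> Option<Status> {
        self.backend.get(id)
    }
}

fn main() {
    let mut registry = Registry::new(InMemory::default());
    let id = registry.submit();
    println!("{:?}", registry.transition(id, Status::Done)); // skips Running, so rejected
    println!("{:?}", registry.transition(id, Status::Running));
    println!("{:?}", registry.status(id));
}
```

Because the registry only talks to the trait, swapping the in-memory map for a persistent store would not change the transition logic.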

Lifecycle hooks: on_low_confidence, on_step_failure, on_budget_exceeded with recovery actions (skip, retry, escalate, substitute).
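
The four recovery actions named above can be modelled as an enum applied to a failing step. This sketch is illustrative only: the `Recovery` enum and `run_with_recovery` function are hypothetical, and the exact semantics (for example, that a skipped step yields no value, or that a retry count excludes the first attempt) are assumptions rather than documented cruxx behaviour.

```rust
// Hypothetical model of recovery actions applied to a failing step; not the cruxx API.
#[derive(Debug, Clone, PartialEq)]
enum Recovery<T> {
    Skip,          // give up on the step and continue without a value
    Retry(u32),    // re-run the step up to N more times
    Escalate,      // surface the error to the caller
    Substitute(T), // use a fallback value instead
}

/// Run `step` once; if it fails, apply `recovery`.
/// `Ok(None)` means the step was skipped.
fn run_with_recovery<T>(
    mut step: impl FnMut() -> Result<T, String>,
    recovery: Recovery<T>,
) -> Result<Option<T>, String> {
    let first_err = match step() {
        Ok(v) => return Ok(Some(v)),
        Err(e) => e,
    };
    match recovery {
        Recovery::Skip => Ok(None),
        Recovery::Substitute(v) => Ok(Some(v)),
        Recovery::Escalate => Err(first_err),
        Recovery::Retry(n) => {
            let mut last = first_err;
            for _ in 0..n {
                match step() {
                    Ok(v) => return Ok(Some(v)),
                    Err(e) => last = e,
                }
            }
            Err(last)
        }
    }
}

fn main() {
    let mut attempts = 0;
    let flaky = || {
        attempts += 1;
        if attempts < 3 { Err(format!("attempt {attempts} failed")) } else { Ok("done") }
    };
    println!("{:?}", run_with_recovery(flaky, Recovery::Retry(2)));
    println!(
        "{:?}",
        run_with_recovery(|| Err("down".to_string()), Recovery::Substitute("cached"))
    );
}
```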

Replay: strict or lenient mode. Strict rejects hash mismatches; lenient skips removed steps and returns cache misses for changed ones.
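
The difference between the two modes can be sketched as a lookup of each current step against a recorded run. This is a simplified model, not the cruxx replay engine: `Recorded`, `Lookup`, and `replay` are hypothetical, steps are keyed by name, and two behaviours are assumptions made for the sketch (strict mode also rejects removed steps, and steps absent from the recording count as misses).

```rust
use std::collections::HashMap;

// Hypothetical model of strict vs lenient replay; not the cruxx API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Strict,
    Lenient,
}

#[derive(Debug, PartialEq)]
enum Lookup {
    Hit(String), // cached output can be reused
    Miss,        // step must be executed again
}

struct Recorded {
    hash: u64,
    output: String,
}

/// Resolve each step of the current program against a recorded run.
fn replay(
    recorded: &HashMap<&str, Recorded>,
    current: &[(&str, u64)],
    mode: Mode,
) -> Result<Vec<Lookup>, String> {
    // Steps present in the recording but absent from the current program.
    let removed = recorded
        .keys()
        .filter(|k| !current.iter().any(|(n, _)| *n == **k))
        .count();
    if mode == Mode::Strict && removed > 0 {
        return Err(format!("{removed} recorded step(s) no longer exist"));
    }
    let mut out = Vec::new();
    for (name, hash) in current {
        match recorded.get(*name) {
            Some(r) if r.hash == *hash => out.push(Lookup::Hit(r.output.clone())),
            Some(_) if mode == Mode::Strict => {
                return Err(format!("hash mismatch at step '{name}'"));
            }
            _ => out.push(Lookup::Miss), // changed or new step
        }
    }
    Ok(out)
}

fn main() {
    let mut recorded = HashMap::new();
    recorded.insert("fetch", Recorded { hash: 1, output: "diff".into() });
    recorded.insert("analyze", Recorded { hash: 2, output: "report".into() });
    let current = [("fetch", 1u64), ("analyze", 7u64)]; // "analyze" changed
    println!("{:?}", replay(&recorded, &current, Mode::Strict));
    println!("{:?}", replay(&recorded, &current, Mode::Lenient));
}
```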

HarnessProfile: resource specification for a container or process harness (image, env, limits). Paired with ResourceHints for advisory scheduling metadata and HarnessDiff to describe incremental profile changes.

SafetyPolicy trait: port for user-defined approval logic. Receives a proposed HarnessDiff and returns Approved, Rejected, or RequiresApproval. Two adapters ship in cruxx-agentic: AutoApproveGate (always approves) and TerminalApprovalGate (interactive stdin prompt).
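
A user-defined policy of this kind might look roughly like the following. The shapes here are guesses for illustration: the real trait's method name and signature, the fields of HarnessDiff, and the `MemoryCap` policy with its thresholds are all hypothetical and may differ from what cruxx ships.

```rust
// Hypothetical shapes; the real trait and type signatures in cruxx may differ.
#[derive(Debug, Clone, PartialEq)]
struct HarnessDiff {
    memory_mb_delta: i64,
}

#[derive(Debug, PartialEq)]
enum Decision {
    Approved,
    Rejected,
    RequiresApproval,
}

trait SafetyPolicy {
    fn review(&self, diff: &HarnessDiff) -> Decision;
}

/// Always approves, in the spirit of the auto-approve adapter described above.
struct AutoApprove;

impl SafetyPolicy for AutoApprove {
    fn review(&self, _diff: &HarnessDiff) -> Decision {
        Decision::Approved
    }
}

/// Example policy: small increases pass, larger ones need a human,
/// and anything above the hard limit is refused.
struct MemoryCap {
    auto_limit_mb: i64,
    hard_limit_mb: i64,
}

impl SafetyPolicy for MemoryCap {
    fn review(&self, diff: &HarnessDiff) -> Decision {
        let delta = diff.memory_mb_delta;
        if delta > self.hard_limit_mb {
            Decision::Rejected
        } else if delta > self.auto_limit_mb {
            Decision::RequiresApproval
        } else {
            Decision::Approved
        }
    }
}

/// Gate a diff behind a policy before it would be applied.
fn apply(policy: &dyn SafetyPolicy, diff: &HarnessDiff) -> Result<&'static str, &'static str> {
    match policy.review(diff) {
        Decision::Approved => Ok("applied"),
        Decision::RequiresApproval => Ok("paused for approval"),
        Decision::Rejected => Err("rejected"),
    }
}

fn main() {
    let cap = MemoryCap { auto_limit_mb: 128, hard_limit_mb: 1024 };
    for delta in [64, 512, 4096] {
        let diff = HarnessDiff { memory_mb_delta: delta };
        println!("{delta} MB -> {:?}", apply(&cap, &diff));
    }
    println!("{:?}", apply(&AutoApprove, &HarnessDiff { memory_mb_delta: 4096 }));
}
```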

EvolutionPlanner (cruxx-planner): drives deterministic, metrics-based profile evolution. Accepts RunMetrics and emits a HarnessDiff describing resource adjustments. EvolutionOutcome records the result of applying a diff.
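
"Deterministic, metrics-based" evolution can be pictured as a pure function from metrics to a diff. The sketch below is not the cruxx-planner algorithm: the field names on `RunMetrics` and `HarnessDiff`, the `suggest` function, and the 90% / 30% thresholds are all assumptions invented for the example.

```rust
// Hypothetical field names and thresholds; not the cruxx-planner API.
#[derive(Debug, Clone, PartialEq)]
struct RunMetrics {
    peak_memory_mb: u64,
    memory_limit_mb: u64,
}

#[derive(Debug, Clone, PartialEq)]
struct HarnessDiff {
    /// `None` means "leave the limit unchanged".
    new_memory_limit_mb: Option<u64>,
}

/// Pure function: the same metrics always yield the same diff.
/// Integer arithmetic avoids floating-point rounding surprises.
fn suggest(m: &RunMetrics) -> HarnessDiff {
    let new_limit = if m.peak_memory_mb * 100 >= m.memory_limit_mb * 90 {
        // Peak usage at or above 90% of the limit: grow the limit by half.
        Some(m.memory_limit_mb + m.memory_limit_mb / 2)
    } else if m.peak_memory_mb * 100 <= m.memory_limit_mb * 30 {
        // Peak usage at or below 30% of the limit: halve the limit.
        Some(m.memory_limit_mb / 2)
    } else {
        None
    };
    HarnessDiff { new_memory_limit_mb: new_limit }
}

fn main() {
    for peak in [950, 500, 200] {
        let metrics = RunMetrics { peak_memory_mb: peak, memory_limit_mb: 1000 };
        println!("peak {peak} MB -> {:?}", suggest(&metrics));
    }
}
```

Keeping the planner free of randomness and I/O means a given diff can be reproduced from stored metrics, which fits the replay-oriented design described elsewhere in this README.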

Orchestrator patterns

The harness::evolve and harness::canary pipeline handlers expose container lifecycle management as first-class pipeline steps.

steps:
  - name: evolve_profile
    handler: harness::evolve
    args:
      profile: base
      metrics_from: run_metrics

  - name: canary
    handler: harness::canary
    args:
      image: myapp:next
      traffic_percent: 10

Use #[cruxx::harness] to annotate a struct as a managed harness, and #[cruxx::evolve] to mark an async fn as an evolution entry point. The #[cruxx::evolve] macro injects an EvolutionPlanner and a CruxCtx into the function body:

#[cruxx::harness]
struct ApiServer { image: String, replicas: u32 }

#[cruxx::evolve]
async fn scale_on_p99(metrics: RunMetrics) -> Crux<EvolutionOutcome> {
    let diff = planner.suggest(&metrics).await?;
    x.step("apply", || harness.apply_diff(&diff)).await
}

The on_approval_required lifecycle hook fires when SafetyPolicy returns RequiresApproval, giving agents an opportunity to pause, log, or escalate before a diff is applied.

Installation

[dependencies]
cruxx = "0.1"

# With persistent storage (redb, pure-Rust):
# cruxx = { version = "0.1", features = ["redb"] }

Requires Rust 1.85+ (edition 2024).

Running pipelines

cruxx run executes YAML pipelines using the built-in handler registry. Build it with the baml feature to enable LLM extraction:

cargo build -p cruxx-agentic --features baml --bin cruxx-run

Set your API key; BAML picks it up automatically:

export ANTHROPIC_API_KEY=sk-ant-...   # Claude (default BAML client)
# or
export OPENAI_API_KEY=sk-...          # OpenAI

Summarize text:

cruxx run examples/extract_summary.crux examples/input_summary.json

Pipeline: extract_summary
Status:   OK
Duration: 1823.4ms
Steps:    2

Trace:
   1. [  OK] summarize (1821ms)
   2. [  OK] log_output (1ms)

Output:
{
  "summary": "Crux is an agentic DSL for Rust that makes control flow explicit in the type system via Crux<T> values.",
  "key_points": [
    "Every execution unit is a first-class Crux<T> value",
    "CruxCtx provides step(), delegate(), speculate(), pipe(), join_all()",
    "TaskRegistry supports InMemoryBackend and RedbBackend"
  ],
  "word_count": 89
}

Extract named entities:

cruxx run examples/extract_entities.crux examples/input_entities.json

Pipeline: extract_entities
Status:   OK
Duration: 1540.2ms
Steps:    2

Trace:
   1. [  OK] extract (1538ms)
   2. [  OK] log_output (1ms)

Output:
{
  "entities": [
    { "name": "Crux",        "entity_type": "Software",   "description": "Agentic DSL for Rust" },
    { "name": "CruxCtx",     "entity_type": "Component",  "description": "Runtime context" },
    { "name": "RedbBackend", "entity_type": "Component",  "description": "Persistent KV adapter" }
  ]
}

Available handlers

Always available:

Handler	Key args	Description
`shell::exec`	`cmd`	Run shell command, ignore exit code
`shell::capture`	`cmd`	Run shell command, fail on non-zero exit
`fs::read`	`path`	Read a file to string
`fs::write`	`path`, `content`	Write a string to a file
`fs::glob`	`pattern`	Glob pattern match
`fs::exists`	`path`	Check path existence
`git::staged_files`	—	`git diff --cached --name-only`
`git::diff`	`revision`	`git diff [revision]`
`git::log`	`count`	`git log -N --format=%H\t%s`
`git::status`	—	`git status --porcelain`
`json::pick`	`fields`	Extract named fields from input object
`json::merge`	`with`	Merge static object into input
`json::jq`	`expr`	Dot-path traversal (e.g. `".foo.bar"`)
`ctrl::noop`	—	Pass input through unchanged
`ctrl::log`	—	Log to stderr and pass through
`ctrl::assert`	`condition`	Assert condition is truthy or fail
`llm::invoke`	`prompt`, `provider`, `model`	Raw LLM completion (OpenAI/Anthropic/Ollama)
`container::run`	`image`, `env`, `limits`	Start a container from a `HarnessProfile`
`container::wait`	`timeout_ms`	Block until container exits, emit exit code/logs
`harness::evolve`	`profile`, `metrics_from`	Run `EvolutionPlanner` and apply resulting diff
`harness::canary`	`image`, `traffic_percent`	Deploy canary alongside current harness
`rx::run`	`name`, `args?`, `registry?`	Run a script registered in the rx registry
`rx::list`	`registry?`	List all commands in the rx registry

Behind --features baml:

Handler	Key args	Description
`llm::extract`	`function`, `input`	BAML structured extraction
`llm::decompose`	`spec`	Spec decomposition into task list
`llm::plan`	`goal`	Pipeline generation from natural language

See docs/crux-capabilities.md for the full support matrix including combinators and known gaps.
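
The dot-path traversal that the `json::jq` handler is described as performing can be sketched in a few lines. This is not the handler's code: the `Value` enum is a minimal stand-in for a JSON value (a real pipeline would use a JSON library), `dot_path` is a hypothetical helper, and the treatment of an empty path as "return the root" is an assumption.

```rust
use std::collections::BTreeMap;

// Minimal stand-in for a JSON value; hypothetical, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Num(f64),
    Str(String),
    Obj(BTreeMap<String, Value>),
}

/// Follow a path such as ".foo.bar" through nested objects.
/// Returns `None` if a segment is missing or a non-object is indexed.
fn dot_path<'a>(root: &'a Value, expr: &str) -> Option<&'a Value> {
    expr.split('.')
        .filter(|seg| !seg.is_empty())
        .try_fold(root, |node, seg| match node {
            Value::Obj(map) => map.get(seg),
            _ => None,
        })
}

/// Build an object value from key/value pairs.
fn obj(pairs: Vec<(&str, Value)>) -> Value {
    Value::Obj(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

fn main() {
    let doc = obj(vec![
        ("name", Value::Str("cruxx".to_string())),
        ("foo", obj(vec![("bar", Value::Num(7.0))])),
    ]);
    println!("{:?}", dot_path(&doc, ".foo.bar"));
    println!("{:?}", dot_path(&doc, ".name"));
    println!("{:?}", dot_path(&doc, ".foo.missing"));
}
```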

Examples

Rust agents

cargo run --example basic_agent

See examples/ for pipeline .crux files and input fixtures.

Documentation

See the tutorial for a chapter-by-chapter walkthrough.

License

MIT -- see LICENSE.
