# agent-runtime
[CI](https://github.com/Mattbusel/agent-runtime/actions/workflows/ci.yml) ·
[crates.io](https://crates.io/crates/llm-agent-runtime) ·
[docs.rs](https://docs.rs/llm-agent-runtime) ·
[License](LICENSE) ·
[Rust](https://www.rust-lang.org)
**agent-runtime** is a unified Tokio async agent runtime for Rust. It combines orchestration
primitives, episodic and semantic memory, an in-memory knowledge graph, and a ReAct
(Thought-Action-Observation) agent loop in a single crate.
The library consolidates the public APIs of `tokio-prompt-orchestrator`, `tokio-agent-memory`,
`mem-graph`, and `wasm-agent`, and extends them with pluggable LLM providers, optional
file-based persistence with per-step checkpointing, lock-free runtime metrics, and a
compile-time typestate builder that prevents misconfiguration at zero runtime cost.
---
## What it does
- **ReAct agent loop** — runs Thought → Action → Observation cycles with a pluggable tool
registry. Terminates on `FINAL_ANSWER` or `max_iterations`, whichever comes first.
- **Episodic memory** — per-agent event store with configurable decay, hybrid recall scoring,
per-agent capacity limits, and recall-count tracking.
- **Semantic memory** — key-value store with tag-based retrieval and cosine-similarity vector
search.
- **Working memory** — bounded LRU key-value store injected into the agent prompt.
- **Knowledge graph** — directed in-memory graph with BFS, DFS, Dijkstra shortest-path,
transitive closure, degree and betweenness centrality, community detection, cycle detection,
and subgraph extraction.
- **Circuit breaker** — configurable failure threshold and recovery window with a pluggable
backend trait for distributed state (e.g., Redis).
- **Retry policy** — exponential backoff capped at 60 s.
- **Deduplicator** — TTL-based request deduplication with in-flight tracking.
- **Backpressure guard** — hard and soft capacity limits with tracing warnings.
- **Pipeline** — composable string-transform stage chain.
- **LLM providers** — built-in `AnthropicProvider` and `OpenAiProvider` with SSE streaming
(behind feature flags).
- **Persistence** — async `PersistenceBackend` trait and `FilePersistenceBackend` for
session and per-step checkpointing.
- **Metrics** — atomic counters for active/total sessions, steps, tool calls, backpressure
sheds, memory recalls, and checkpoint errors.
- **`types` module** — `AgentId` and `MemoryId` are always available without enabling the
`memory` feature, allowing the runtime to compile in minimal configurations.
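
The TTL-based deduplication described above can be sketched in a few lines. The struct and method names here are illustrative only, not the crate's actual API:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Illustrative sketch of TTL-based request deduplication (not the crate's API).
struct DedupSketch {
    ttl: Duration,
    seen: HashMap<String, Instant>,
}

impl DedupSketch {
    fn new(ttl: Duration) -> Self {
        Self { ttl, seen: HashMap::new() }
    }

    /// Returns true if the key is new (or expired) and records it as in-flight.
    fn try_claim(&mut self, key: &str) -> bool {
        let now = Instant::now();
        // Drop expired entries lazily before checking for duplicates.
        self.seen.retain(|_, t| now.duration_since(*t) < self.ttl);
        if self.seen.contains_key(key) {
            false // duplicate within the TTL window
        } else {
            self.seen.insert(key.to_string(), now);
            true
        }
    }
}

fn main() {
    let mut d = DedupSketch::new(Duration::from_secs(60));
    assert!(d.try_claim("req-1"));  // first sighting: admitted
    assert!(!d.try_claim("req-1")); // duplicate: rejected
    assert!(d.try_claim("req-2")); // different key: admitted
}
```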
---
## How it works
```
                           User Code
                               |
                               v
                  +--------------------+      compile-time typestate
                  |    AgentRuntime    |      AgentRuntimeBuilder<HasConfig>
                  +----+-----+----+----+      .build() (infallible)
                       |     |    |
            +----------+     |    +----------------------+
            |                |                           |
            v                v                           v
+--------------------+  +---------------------+  +--------------------+
| EpisodicStore      |  | GraphStore          |  | CircuitBreaker     |
|   DecayPolicy      |  |   BFS / DFS         |  | RetryPolicy        |
|   RecallPolicy     |  |   Dijkstra          |  | Deduplicator       |
|   per-agent cap    |  |   transitive close  |  | BackpressureGuard  |
| SemanticStore      |  |   centrality        |  | Pipeline           |
|   cosine search    |  |   community detect  |  +--------------------+
| WorkingMemory      |  |   cycle detection   |
|   LRU eviction     |  +---------------------+
+--------------------+
          |
          v
+--------------------+
| ReActLoop          |<--- ToolRegistry (ToolSpec, per-tool CircuitBreaker)
| AgentConfig        |
| AgentSession       |
+--------------------+
          |
          +---------------------------+
          |                           |
          v                           v
+--------------------+      +--------------------+
| AnthropicProvider  |      | FilePersistence    |
| OpenAiProvider     |      |  session checkpoint|
+--------------------+      |  per-step snapshot |
          |                 +--------------------+
          v
+--------------------+
| metrics.rs         |
| RuntimeMetrics     |
| (atomic counters)  |
+--------------------+
```
### Data flow inside `run_agent`
1. `BackpressureGuard` is checked; sessions exceeding capacity are rejected immediately with
`AgentRuntimeError::BackpressureShed`.
2. `EpisodicStore` is recalled for the agent; matching items are injected into the prompt,
subject to `max_memory_recalls` and the optional `max_memory_tokens` token budget.
3. `WorkingMemory` key-value pairs are appended to the enriched prompt.
4. `GraphStore` entity count is captured for session metadata.
5. `ReActLoop` runs Thought-Action-Observation cycles, dispatching tool calls through
`ToolRegistry`.
6. Per-tool `CircuitBreaker` (optional) fast-fails unhealthy tools and records structured
error observations with `kind` classification (`not_found`, `transient`, `permanent`).
7. On completion an `AgentSession` is returned; if a `PersistenceBackend` is configured,
the final session and every per-step snapshot are saved atomically.
8. `RuntimeMetrics` counters are updated atomically throughout.
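The Thought/Action line format consumed in step 5 can be parsed with a sketch like the following. This is an illustrative parser, not the crate's `parse_react_step`:

```rust
/// Illustrative ReAct step parser (not the crate's `parse_react_step`).
/// Returns (thought, action) when both lines are present.
fn parse_step(output: &str) -> Option<(String, String)> {
    let mut thought = None;
    let mut action = None;
    for line in output.lines() {
        if let Some(t) = line.strip_prefix("Thought:") {
            thought = Some(t.trim().to_string());
        } else if let Some(a) = line.strip_prefix("Action:") {
            action = Some(a.trim().to_string());
        }
    }
    // Both markers must be present for a well-formed step.
    Some((thought?, action?))
}

fn main() {
    let (t, a) = parse_step("Thought: done.\nAction: FINAL_ANSWER 42").unwrap();
    assert_eq!(t, "done.");
    // A FINAL_ANSWER action terminates the loop.
    assert!(a.starts_with("FINAL_ANSWER"));
    assert!(parse_step("no react markers here").is_none());
}
```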
---
## Quickstart
### 1. Add to `Cargo.toml`
```toml
[dependencies]
llm-agent-runtime = "1.58"
tokio = { version = "1", features = ["full"] }
```
To enable built-in LLM providers:
```toml
llm-agent-runtime = { version = "1.58", features = ["anthropic", "openai"] }
```
To opt in to only the subsystems you need:
```toml
llm-agent-runtime = { version = "1.58", default-features = false, features = ["memory", "orchestrator"] }
```
### 2. Environment variables
```sh
export ANTHROPIC_API_KEY="sk-ant-..." # required for AnthropicProvider
export OPENAI_API_KEY="sk-..." # required for OpenAiProvider
export RUST_LOG="agent_runtime=debug" # optional structured logging
```
### 3. Minimal example (no external services)
The default feature set (`orchestrator`, `memory`, `graph`, `wasm`) runs entirely in-process
with no API keys, no Redis, and no database.
```rust
use llm_agent_runtime::prelude::*;

#[tokio::main]
async fn main() -> Result<(), AgentRuntimeError> {
    // Seed episodic memory.
    let memory = EpisodicStore::new();
    let agent_id = AgentId::new("demo");
    memory.add_episode(agent_id.clone(), "Rust is fast and memory-safe.", 0.9)?;
    memory.add_episode(agent_id.clone(), "Tokio is an async runtime for Rust.", 0.8)?;

    // Build the runtime. The typestate builder enforces that
    // with_agent_config() is called before build() — at compile time.
    let runtime = AgentRuntime::builder()
        .with_memory(memory)
        .with_agent_config(
            AgentConfig::new(5, "stub-model")
                .with_system_prompt("You are a demo agent.")
                .with_max_memory_recalls(3),
        )
        .register_tool(ToolSpec::new("double", "Doubles a number", |args| {
            let n = args.get("n").and_then(|v| v.as_i64()).unwrap_or(0);
            serde_json::json!(n * 2)
        }))
        .build();

    // The `infer` closure acts as the model.
    // Replace with a real provider call in production.
    let mut step = 0usize;
    let session = runtime
        .run_agent(agent_id, "Double the number 21.", move |_ctx: String| {
            step += 1;
            let s = step;
            async move {
                if s == 1 {
                    r#"Thought: I will use the double tool.
Action: double {"n":21}"#
                        .to_string()
                } else {
                    "Thought: The answer is 42.\nAction: FINAL_ANSWER 42".to_string()
                }
            }
        })
        .await?;

    println!(
        "Done in {} step(s), {} memory hit(s), {}ms",
        session.step_count(),
        session.memory_hits,
        session.duration_ms,
    );
    Ok(())
}
```
### 4. Using a built-in provider
```rust,no_run
use llm_agent_runtime::providers::{AnthropicProvider, LlmProvider};

#[tokio::main]
async fn main() {
    let api_key = std::env::var("ANTHROPIC_API_KEY").expect("ANTHROPIC_API_KEY not set");
    let provider = AnthropicProvider::new(api_key);

    let reply = provider
        .complete("Say hello in one sentence.", "claude-sonnet-4-6")
        .await
        .expect("provider call failed");
    println!("{reply}");
}
```
### 5. Wiring a provider into the runtime
`run_agent_with_provider` removes the boilerplate closure when using a built-in
provider:
```rust,no_run
use llm_agent_runtime::prelude::*;
use llm_agent_runtime::providers::AnthropicProvider;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), AgentRuntimeError> {
    let api_key = std::env::var("ANTHROPIC_API_KEY").expect("ANTHROPIC_API_KEY not set");
    let provider = Arc::new(AnthropicProvider::new(api_key));

    let runtime = AgentRuntime::builder()
        .with_agent_config(AgentConfig::new(10, "claude-sonnet-4-6"))
        .build();

    let session = runtime
        .run_agent_with_provider(AgentId::new("agent-1"), "What is 6 * 7?", provider)
        .await?;
    println!("Answer: {}", session.final_answer().unwrap_or("no answer"));
    Ok(())
}
```
---
## Feature Flags
| Feature | Default | Provides |
|---|---|---|
| `orchestrator` | yes | `CircuitBreaker` with pluggable backends, `RetryPolicy`, `Deduplicator`, `BackpressureGuard` with soft limit, `Pipeline` |
| `memory` | yes | `EpisodicStore` with `DecayPolicy`, `RecallPolicy::Hybrid`, per-agent capacity; `SemanticStore` with cosine search; `WorkingMemory` LRU |
| `graph` | yes | `GraphStore` — BFS, DFS, Dijkstra, transitive closure, degree/betweenness centrality, community detection, subgraph, cycle detection |
| `wasm` | yes | `ReActLoop` with sync and streaming inference, `ToolRegistry`, `ToolSpec`, `parse_react_step`, `AgentConfig`, observer callbacks, and step-level metrics |
| `persistence` | no | `PersistenceBackend` trait + `FilePersistenceBackend`; session and per-step checkpointing |
| `providers` | no | `LlmProvider` async trait |
| `anthropic` | no | Built-in Anthropic Messages API provider with SSE streaming (implies `providers` + `reqwest`) |
| `openai` | no | Built-in OpenAI Chat Completions API provider with SSE streaming and custom base-URL support (implies `providers` + `reqwest`) |
| `redis-circuit-breaker` | no | Distributed `CircuitBreakerBackend` via Redis |
| `full` | no | All features simultaneously |
---
## API Reference
### `AgentRuntime` builder
```rust
let runtime = AgentRuntime::builder()         // AgentRuntimeBuilder<NeedsConfig>
    .with_memory(EpisodicStore::new())
    .with_working_memory(WorkingMemory::new(64)?)
    .with_graph(GraphStore::new())
    .with_backpressure(BackpressureGuard::new(100)?)
    .register_tool(my_tool)
    .with_metrics(metrics_arc)
    .with_checkpoint_backend(backend_arc)     // persistence feature
    .with_agent_config(config)                // --> AgentRuntimeBuilder<HasConfig>
    .build();                                 // infallible
```
| Method | Argument | Description |
|---|---|---|
| `.with_agent_config(cfg)` | `AgentConfig` | **Required.** Transitions builder to `HasConfig`. |
| `.with_memory(store)` | `EpisodicStore` | Episodic memory recalled and injected into the prompt. |
| `.with_working_memory(wm)` | `WorkingMemory` | Bounded key-value working memory appended to the prompt. |
| `.with_graph(graph)` | `GraphStore` | Knowledge graph; entity count captured in session metadata. |
| `.with_backpressure(guard)` | `BackpressureGuard` | Rejects sessions when in-flight count exceeds capacity. |
| `.register_tool(spec)` | `ToolSpec` | Adds a callable tool to the ReAct loop. |
| `.with_metrics(m)` | `Arc<RuntimeMetrics>` | Shares a custom metrics instance. |
| `.with_checkpoint_backend(b)` | `Arc<dyn PersistenceBackend>` | Enables checkpointing (`persistence` feature). |
### `AgentConfig`
| Item | Type | Default | Description |
|---|---|---|---|
| `max_iterations` | `usize` | required | Maximum Thought-Action-Observation cycles |
| `model` | `String` | required | Model identifier forwarded to the `infer` closure |
| `.with_system_prompt(s)` | `String` | `"You are a helpful AI agent."` | Injected at the head of every context string |
| `.with_max_memory_recalls(n)` | `usize` | `3` | Maximum episodic items injected per run |
| `.with_max_memory_tokens(n)` | `usize` | `None` | Approximate token budget (~4 chars/token) |
| `.with_stop_sequences(v)` | `Vec<String>` | `[]` | Stop sequences forwarded to the provider on every call |
| `.with_loop_timeout_ms(n)` | `u64` | `None` | Wall-clock deadline for the entire ReAct loop in milliseconds |
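
The `max_memory_tokens` budget uses a rough estimate of about 4 characters per token. A sketch of that truncation logic (illustrative, not the crate's internals):

```rust
/// Rough token estimate used for memory budgeting (~4 chars per token).
fn approx_tokens(s: &str) -> usize {
    s.len().div_ceil(4)
}

/// Keep recalled items, in recall order, until the token budget is exhausted.
fn apply_budget<'a>(items: &[&'a str], max_tokens: usize) -> Vec<&'a str> {
    let mut used = 0;
    let mut kept = Vec::new();
    for item in items {
        let cost = approx_tokens(item);
        if used + cost > max_tokens {
            break; // budget exhausted; drop the rest
        }
        used += cost;
        kept.push(*item);
    }
    kept
}

fn main() {
    // 8-char strings estimate to 2 tokens each; "xy" estimates to 1.
    let items = ["12345678", "abcdefgh", "xy"];
    assert_eq!(apply_budget(&items, 4), vec!["12345678", "abcdefgh"]);
}
```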
### `EpisodicStore` constructors
| Constructor | Behavior |
|---|---|
| `EpisodicStore::new()` | Unbounded, no decay, importance-ranked |
| `EpisodicStore::with_decay(policy)` | `DecayPolicy::exponential(half_life_hours)` |
| `EpisodicStore::with_recall_policy(p)` | `RecallPolicy::Hybrid { recency_weight, frequency_weight }` |
| `EpisodicStore::with_per_agent_capacity(n)` | Evicts lowest-importance item when agent exceeds `n` memories |
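
Exponential decay with a half-life can be sketched as follows. This is the standard half-life formula; the crate's exact scoring may differ:

```rust
/// Half-life decay: effective importance halves every `half_life_hours`.
fn decayed_importance(importance: f64, age_hours: f64, half_life_hours: f64) -> f64 {
    importance * 0.5_f64.powf(age_hours / half_life_hours)
}

fn main() {
    // After exactly one half-life, importance is halved.
    assert!((decayed_importance(0.8, 24.0, 24.0) - 0.4).abs() < 1e-12);
    // Fresh memories keep their full importance.
    assert!((decayed_importance(0.8, 0.0, 24.0) - 0.8).abs() < 1e-12);
}
```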
### `BackpressureGuard`
```rust
let guard = BackpressureGuard::new(100)?  // hard limit
    .with_soft_limit(75)?;                // warn when depth reaches 75
```
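
The hard/soft semantics can be sketched as below. This is illustrative, not the crate's implementation:

```rust
/// Illustrative admit decision for a hard/soft capacity guard.
/// Err means the session is shed; Ok(true) signals a soft-limit warning.
fn admit(in_flight: usize, hard: usize, soft: Option<usize>) -> Result<bool, ()> {
    if in_flight >= hard {
        return Err(()); // shed: reject the session immediately
    }
    // At or past the soft limit, admit but warn (the crate emits a tracing warning).
    Ok(soft.is_some_and(|s| in_flight >= s))
}

fn main() {
    assert_eq!(admit(10, 100, Some(75)), Ok(false)); // normal
    assert_eq!(admit(80, 100, Some(75)), Ok(true));  // admitted with warning
    assert_eq!(admit(100, 100, Some(75)), Err(())); // shed
}
```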
### `CircuitBreaker`
```rust
let cb = CircuitBreaker::new("my-service", 5, Duration::from_secs(30))?;
```
### `ToolSpec`
```rust
// Synchronous handler (handler names and bodies here are illustrative)
let spec = ToolSpec::new("echo", "Echoes its arguments", |args| {
    serde_json::json!(args)
});

// Async handler
let spec = ToolSpec::new_async("fetch", "Fetches a URL", |args| {
    Box::pin(async move { serde_json::json!({ "status": "ok" }) })
});

// With validation and circuit breaker
let spec = ToolSpec::new("search", "Searches a corpus", |args| {
    serde_json::json!({ "results": [] })
})
.with_required_fields(vec!["q".to_string()])
.with_circuit_breaker(cb_arc);
### `AgentSession`
Every call to `run_agent` / `run_agent_with_provider` returns an `AgentSession` with rich
introspection methods:
| Method | Returns | Description |
|---|---|---|
| `step_count()` | `usize` | Total Thought-Action-Observation cycles |
| `has_final_answer()` | `bool` | Whether the session ended with `FINAL_ANSWER` |
| `final_answer()` | `Option<&str>` | The final answer text, if any |
| `duration_secs()` | `f64` | Wall-clock duration in seconds |
| `failed_tool_call_count()` | `usize` | Steps with error-bearing observations |
| `step_success_count()` | `usize` | Steps without tool failures |
| `checkpoint_error_count()` | `usize` | Number of checkpoint save errors |
| `all_thoughts()` | `Vec<&str>` | All thought strings in step order |
| `all_actions()` | `Vec<&str>` | All action strings in step order |
| `all_observations()` | `Vec<&str>` | All observation strings in step order |
| `action_lengths()` | `Vec<usize>` | Byte length of each action string |
| `thought_lengths()` | `Vec<usize>` | Byte length of each thought string |
| `total_thought_bytes()` | `usize` | Sum of all thought byte lengths |
| `total_observation_bytes()` | `usize` | Sum of all observation byte lengths |
| `avg_action_length()` | `f64` | Mean byte length of action strings |
| `avg_thought_length()` | `f64` | Mean byte length of thought strings |
| `longest_thought()` | `Option<&str>` | Thought string with the most bytes |
| `shortest_action()` | `Option<&str>` | Action string with the fewest bytes |
| `first_step_action()` | `Option<&str>` | Action of the first step |
| `last_step_action()` | `Option<&str>` | Action of the last step |
| `most_common_action()` | `Option<String>` | Most frequently used action string |
| `count_steps_with_action(s)` | `usize` | Steps whose action matches `s` exactly |
| `thought_contains_count(s)` | `usize` | Steps whose thought contains substring `s` |
| `observation_contains_count(s)` | `usize` | Steps whose observation contains `s` |
| `count_nonempty_thoughts()` | `usize` | Steps with a non-empty thought string |
| `steps_above_thought_length(n)` | `usize` | Steps whose thought exceeds `n` bytes |
### `MetricsSnapshot`
Obtained via `runtime.metrics().snapshot()`. All values are plain numbers (integer counters
and derived `f64` rates), safe to log or serialize.
| Method | Returns | Description |
|---|---|---|
| `tool_call_count(name)` | `u64` | Total calls for a named tool |
| `tool_failure_count(name)` | `u64` | Total failures for a named tool |
| `tool_success_count(name)` | `u64` | Calls minus failures for a tool |
| `tool_names()` | `Vec<&str>` | Sorted names of tools with recorded calls |
| `failure_rate()` | `f64` | Overall failure rate (0.0–1.0) |
| `success_rate()` | `f64` | Overall success rate (0.0–1.0) |
| `failed_tool_ratio_for(name)` | `f64` | Per-tool failure rate |
| `most_called_tool()` | `Option<String>` | Tool name with the highest call count |
| `tool_names_with_failures()` | `Vec<String>` | Sorted names of tools with ≥1 failure |
| `avg_failures_per_session()` | `f64` | Failed tool calls per completed session |
| `steps_per_tool_call()` | `f64` | Ratio of total steps to total tool calls |
| `tool_diversity()` | `usize` | Number of distinct tools called |
| `total_agent_count()` | `usize` | Distinct agents with recorded call data |
| `agent_with_most_calls()` | `Option<String>` | Agent id with the highest total calls |
| `backpressure_shed_rate()` | `f64` | Ratio of shed events to total tool calls |
| `delta(after, before)` | `MetricsSnapshot` | Per-request delta (saturating subtraction) |
| `to_json()` | `serde_json::Value` | Serialize for logging or export |
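
The per-request delta boils down to a field-wise saturating subtraction, so a counter reset between snapshots never underflows. A sketch with illustrative field names:

```rust
/// Illustrative snapshot with two of the counters (field names are examples).
#[derive(Debug, PartialEq)]
struct SnapshotSketch {
    total_steps: u64,
    tool_calls: u64,
}

/// Field-wise saturating subtraction: `after - before`, clamped at zero.
fn delta(after: &SnapshotSketch, before: &SnapshotSketch) -> SnapshotSketch {
    SnapshotSketch {
        total_steps: after.total_steps.saturating_sub(before.total_steps),
        tool_calls: after.tool_calls.saturating_sub(before.tool_calls),
    }
}

fn main() {
    let before = SnapshotSketch { total_steps: 10, tool_calls: 4 };
    let after = SnapshotSketch { total_steps: 13, tool_calls: 5 };
    let d = delta(&after, &before);
    assert_eq!(d, SnapshotSketch { total_steps: 3, tool_calls: 1 });
    // A reset counter saturates to 0 instead of wrapping.
    assert_eq!(delta(&before, &after).total_steps, 0);
}
```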
### `EpisodicStore` helpers
Beyond the core `add_episode` / `recall` / `clear` API:
| Helper | Description |
|---|---|
| `episode_count_for(agent)` | Episode count for a specific agent |
| `agents_with_episodes()` | Sorted list of agents with at least one episode |
| `content_lengths(agent)` | Byte lengths of episode contents in insertion order |
| `total_content_bytes(agent)` | Sum of byte lengths of all episode contents |
| `avg_content_length(agent)` | Mean byte length of episode contents |
| `max_content_length(agent)` | Byte length of the longest episode |
| `min_content_length(agent)` | Byte length of the shortest episode |
| `high_importance_count(agent, t)` | Episodes with importance > `t` |
| `episodes_by_importance(agent)` | Content strings sorted by descending importance |
| `content_contains_count(agent, s)` | Episodes whose content contains substring `s` |
### `SemanticStore` helpers
| Helper | Description |
|---|---|
| `count_by_tag(tag)` | Entries that contain `tag` |
| `list_tags()` | Sorted list of all distinct tags |
| `most_common_tag()` | Tag appearing most often across all entries |
| `all_keys()` | Sorted list of all stored keys |
| `total_value_bytes()` | Sum of byte lengths of all stored values |
| `avg_value_bytes()` | Mean byte length of stored values |
| `max_value_bytes()` | Byte length of the longest stored value |
| `min_value_bytes()` | Byte length of the shortest stored value |
| `tag_count()` | Number of distinct tags across all entries |
| `entry_count_with_embedding()` | Entries with an associated embedding vector |
### `GraphStore` query methods
Beyond graph construction and traversal:
| Method | Description |
|---|---|
| `relationship_kinds()` | Sorted list of distinct relationship kind strings |
| `relationship_kind_count()` | Number of distinct relationship kinds |
| `relationship_count_between(a, b)` | Directed relationships from `a` to `b` |
| `edges_from(id)` | All `Relationship` objects originating from an entity |
| `neighbors_of(id)` | Sorted `EntityId`s reachable in one hop |
| `entities_with_self_loops()` | Entities where `from == to` |
| `bidirectional_count()` | Pairs with edges in both directions |
| `isolated_entity_count()` | Entities with no incoming or outgoing edges |
| `total_in_degree()` | Sum of in-degrees (equals relationship count) |
| `avg_relationship_weight()` | Mean weight of all relationships |
| `avg_out_degree()` | Mean number of outgoing relationships per entity |
| `avg_in_degree()` | Mean number of incoming relationships per entity |
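
The degree identity noted in the table (the sum of in-degrees equals the relationship count) holds for any directed graph. A quick sketch over a plain edge list, with illustrative types rather than the crate's `GraphStore`:

```rust
use std::collections::HashMap;

/// Edges as (from, to) pairs; returns the in-degree map and the edge count.
fn in_degrees(edges: &[(&str, &str)]) -> (HashMap<String, usize>, usize) {
    let mut deg = HashMap::new();
    for (_, to) in edges {
        // Each directed edge contributes exactly one in-degree to its target.
        *deg.entry(to.to_string()).or_insert(0) += 1;
    }
    (deg, edges.len())
}

fn main() {
    let edges = [("a", "b"), ("a", "c"), ("b", "c")];
    let (deg, total) = in_degrees(&edges);
    assert_eq!(deg["c"], 2);
    // Sum of in-degrees equals the relationship count.
    assert_eq!(deg.values().sum::<usize>(), total);
}
```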
### `ToolRegistry`
```rust
// Check, remove, and query tools at runtime
if registry.contains("old-tool") {
    let removed: Option<ToolSpec> = registry.remove("old-tool");
}

// Introspect the registry
let sorted = registry.tool_names_sorted();                 // Vec<&str>
let avg_len = registry.avg_description_length();           // f64
let shortest = registry.shortest_description();            // Option<&str>
let longest = registry.longest_description();              // Option<&str>
let count = registry.description_contains_count("search"); // usize
let matching = registry.names_containing("calc");          // Vec<&str>
```
### `FilePersistenceBackend`
```rust
// Conditional save — only checkpoint if one doesn't exist yet
if !backend.exists("agent-1-session").await? {
    backend.save("agent-1-session", &data).await?;
}
```
---
## Error Handling
All public APIs return `Result<T, AgentRuntimeError>`. Match only the variants you care about:
```rust
use llm_agent_runtime::prelude::*;

fn handle(err: AgentRuntimeError) {
    match err {
        AgentRuntimeError::CircuitOpen { service } =>
            eprintln!("Circuit open for {service} — backing off"),
        AgentRuntimeError::BackpressureShed { depth, capacity } =>
            eprintln!("Shed: {depth}/{capacity} in-flight — try again later"),
        AgentRuntimeError::AgentLoop(msg) =>
            eprintln!("Agent loop failed: {msg}"),
        AgentRuntimeError::Memory(msg) =>
            eprintln!("Memory subsystem error: {msg}"),
        AgentRuntimeError::Graph(msg) =>
            eprintln!("Graph subsystem error: {msg}"),
        AgentRuntimeError::Persistence(msg) =>
            eprintln!("Persistence error: {msg}"),
        AgentRuntimeError::Provider(msg) =>
            eprintln!("LLM provider error: {msg}"),
        other => eprintln!("Other error: {other}"),
    }
}
```
All production code paths are panic-free. Clippy denies `unwrap_used`, `expect_used`,
`panic`, and `todo` in `src/`.
---
## Running Tests
```sh
# Default feature set
cargo test
# All features including persistence and providers
cargo test --all-features
# A specific module
cargo test --lib memory
cargo test --lib graph
# With structured log output
RUST_LOG=agent_runtime=debug cargo test -- --nocapture
```
Run the full CI suite locally:
```sh
cargo fmt --all -- --check
cargo clippy --all-features -- -D warnings
cargo test --all-features
cargo doc --no-deps --all-features
cargo build --release --all-features
```
---
## Contributing
1. Fork the repository and create a descriptive feature branch.
2. Add tests for every new public function, struct, and trait. The project maintains over
1,460 tests across all modules and targets a minimum 1:1 test-to-production line ratio.
3. All production paths must be panic-free. Use `Result` for every fallible operation.
Clippy denies `unwrap_used`, `expect_used`, `panic`, and `todo` in `src/`.
4. Run `cargo test --all-features` and `cargo clippy --all-features -- -D warnings` with
zero failures before opening a pull request.
5. New public items require `///` doc comments. The crate enforces `#![deny(missing_docs)]`.
6. Describe the motivation and design decisions in the PR body.
**Checklist before opening a PR:**
- [ ] `cargo test --all-features` passes
- [ ] `cargo clippy --all-features -- -D warnings` passes
- [ ] `cargo fmt --all -- --check` passes
- [ ] `cargo doc --no-deps --all-features` passes with `RUSTDOCFLAGS="-D warnings"`
- [ ] New public items have `///` doc comments
---
## License
Licensed under the MIT License. See [LICENSE](LICENSE) for details.