agent-runtime

agent-runtime is a unified Tokio async agent runtime for Rust. It combines orchestration primitives, episodic and semantic memory, an in-memory knowledge graph, and a ReAct (Thought-Action-Observation) agent loop in a single crate.

The library consolidates the public APIs of tokio-prompt-orchestrator, tokio-agent-memory, mem-graph, and wasm-agent, and extends them with pluggable LLM providers, optional file-based persistence with per-step checkpointing, lock-free runtime metrics, and a compile-time typestate builder that prevents misconfiguration at zero runtime cost.

What it does

ReAct agent loop — runs Thought → Action → Observation cycles with a pluggable tool registry. Terminates on FINAL_ANSWER or max_iterations, whichever comes first.
Episodic memory — per-agent event store with configurable decay, hybrid recall scoring, per-agent capacity limits, and recall-count tracking.
Semantic memory — key-value store with tag-based retrieval and cosine-similarity vector search.
Working memory — bounded LRU key-value store injected into the agent prompt.
Knowledge graph — directed in-memory graph with BFS, DFS, Dijkstra shortest-path, transitive closure, degree and betweenness centrality, community detection, cycle detection, and subgraph extraction.
Circuit breaker — configurable failure threshold and recovery window with a pluggable backend trait for distributed state (e.g., Redis).
Retry policy — exponential backoff capped at 60 s.
Deduplicator — TTL-based request deduplication with in-flight tracking.
Backpressure guard — hard and soft capacity limits with tracing warnings.
Pipeline — composable string-transform stage chain.
LLM providers — built-in AnthropicProvider and OpenAiProvider with SSE streaming (behind feature flags).
Persistence — async PersistenceBackend trait and FilePersistenceBackend for session and per-step checkpointing.
Metrics — atomic counters for active/total sessions, steps, tool calls, backpressure sheds, memory recalls, and checkpoint errors.
types module — AgentId and MemoryId are always available without enabling the memory feature, allowing the runtime to compile in minimal configurations.

How it works

  User Code
     |
     v
+--------------------+      compile-time typestate
|  AgentRuntime      |<---- AgentRuntimeBuilder<NeedsConfig>
|  runtime.rs        |          .with_agent_config()  -->
+----+----+----+-----+      AgentRuntimeBuilder<HasConfig>
     |    |    |                  .build()             (infallible)
     |    |    |
     |    |    +--------------------------------------------+
     |    |                                                 |
     |    +-------------------+                            |
     |                        |                            |
     v                        v                            v
+--------------------+  +---------------------+  +--------------------+
|  memory.rs         |  |  graph.rs           |  |  orchestrator.rs   |
|                    |  |                     |  |                    |
|  EpisodicStore     |  |  GraphStore         |  |  CircuitBreaker    |
|    DecayPolicy     |  |    BFS / DFS        |  |  RetryPolicy       |
|    RecallPolicy    |  |    Dijkstra         |  |  Deduplicator      |
|    per-agent cap   |  |    transitive close |  |  BackpressureGuard |
|  SemanticStore     |  |    centrality       |  |  Pipeline          |
|    cosine search   |  |    community detect |  +--------------------+
|  WorkingMemory     |  |    cycle detection  |
|    LRU eviction    |  +---------------------+
+--------------------+
     |
     v
+--------------------+
|  agent.rs          |
|                    |
|  ReActLoop         |<--- ToolRegistry (ToolSpec, per-tool CircuitBreaker)
|  AgentConfig       |
|  AgentSession      |
+--------------------+
     |
     +---------------------------+
     |                           |
     v                           v
+--------------------+  +--------------------+
|  providers.rs      |  |  persistence.rs    |
|  LlmProvider trait |  |  PersistenceBackend|
|  AnthropicProvider |  |  FilePersistence   |
|  OpenAiProvider    |  |  session checkpoint|
+--------------------+  |  per-step snapshot |
                         +--------------------+
                                  |
                        +---------+
                        v
              +--------------------+
              |  metrics.rs        |
              |  RuntimeMetrics    |
              |  (atomic counters) |
              +--------------------+

Data flow inside `run_agent`

BackpressureGuard is checked; sessions exceeding capacity are rejected immediately with AgentRuntimeError::BackpressureShed.
EpisodicStore is recalled for the agent; matching items are injected into the prompt, subject to max_memory_recalls and the optional max_memory_tokens token budget.
WorkingMemory key-value pairs are appended to the enriched prompt.
GraphStore entity count is captured for session metadata.
ReActLoop runs Thought-Action-Observation cycles, dispatching tool calls through ToolRegistry.
Per-tool CircuitBreaker (optional) fast-fails unhealthy tools and records structured error observations with kind classification (not_found, transient, permanent).
On completion an AgentSession is returned; if a PersistenceBackend is configured, the final session and every per-step snapshot are saved atomically.
RuntimeMetrics counters are updated atomically throughout.

Quickstart

1. Add to `Cargo.toml`

[dependencies]
llm-agent-runtime = "1.58"
tokio = { version = "1", features = ["full"] }

To enable built-in LLM providers:

llm-agent-runtime = { version = "1.58", features = ["anthropic", "openai"] }

To opt in to only the subsystems you need:

llm-agent-runtime = { version = "1.58", default-features = false, features = ["memory", "orchestrator"] }

2. Environment variables

export ANTHROPIC_API_KEY="sk-ant-..."   # required for AnthropicProvider
export OPENAI_API_KEY="sk-..."          # required for OpenAiProvider
export RUST_LOG="agent_runtime=debug"   # optional structured logging

3. Minimal example (no external services)

The default feature set (orchestrator, memory, graph, wasm) runs entirely in-process with no API keys, no Redis, and no database.

use llm_agent_runtime::prelude::*;

#[tokio::main]
async fn main() -> Result<(), AgentRuntimeError> {
    // Seed episodic memory.
    let memory = EpisodicStore::new();
    let agent_id = AgentId::new("demo");
    memory.add_episode(agent_id.clone(), "Rust is fast and memory-safe.", 0.9)?;
    memory.add_episode(agent_id.clone(), "Tokio is an async runtime for Rust.", 0.8)?;

    // Build the runtime.  The typestate builder enforces that
    // with_agent_config() is called before build() — at compile time.
    let runtime = AgentRuntime::builder()
        .with_memory(memory)
        .with_agent_config(
            AgentConfig::new(5, "stub-model")
                .with_system_prompt("You are a demo agent.")
                .with_max_memory_recalls(3),
        )
        .register_tool(ToolSpec::new("double", "Doubles a number", |args| {
            let n = args.get("n").and_then(|v| v.as_i64()).unwrap_or(0);
            serde_json::json!(n * 2)
        }))
        .build();

    // The `infer` closure acts as the model.
    // Replace with a real provider call in production.
    let mut step = 0usize;
    let session = runtime
        .run_agent(agent_id, "Double the number 21.", move |_ctx: String| {
            step += 1;
            let s = step;
            async move {
                if s == 1 {
                    r#"Thought: I will use the double tool.
Action: double {"n":21}"#
                        .to_string()
                } else {
                    "Thought: The answer is 42.\nAction: FINAL_ANSWER 42".to_string()
                }
            }
        })
        .await?;

    println!(
        "Done in {} step(s), {} memory hit(s), {}ms",
        session.step_count(),
        session.memory_hits,
        session.duration_ms,
    );
    Ok(())
}

4. Using a built-in provider

use llm_agent_runtime::providers::{AnthropicProvider, LlmProvider};

#[tokio::main]
async fn main() {
    let api_key = std::env::var("ANTHROPIC_API_KEY").expect("ANTHROPIC_API_KEY not set");
    let provider = AnthropicProvider::new(api_key);

    let reply = provider
        .complete("Say hello in one sentence.", "claude-sonnet-4-6")
        .await
        .expect("provider call failed");

    println!("{reply}");
}

5. Wiring a provider into the runtime

run_agent_with_provider removes the boilerplate closure when using a built-in provider:

use llm_agent_runtime::prelude::*;
use llm_agent_runtime::providers::AnthropicProvider;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), AgentRuntimeError> {
    let api_key = std::env::var("ANTHROPIC_API_KEY").expect("ANTHROPIC_API_KEY not set");
    let provider = Arc::new(AnthropicProvider::new(api_key));

    let runtime = AgentRuntime::builder()
        .with_agent_config(AgentConfig::new(10, "claude-sonnet-4-6"))
        .build();

    let session = runtime
        .run_agent_with_provider(AgentId::new("agent-1"), "What is 6 * 7?", provider)
        .await?;

    println!("Answer: {}", session.final_answer().unwrap_or("no answer"));
    Ok(())
}

Feature Flags

Feature	Default	Description
`orchestrator`	yes	`CircuitBreaker` with pluggable backends, `RetryPolicy`, `Deduplicator`, `BackpressureGuard` with soft limit, `Pipeline`
`memory`	yes	`EpisodicStore` with `DecayPolicy`, `RecallPolicy::Hybrid`, per-agent capacity; `SemanticStore` with cosine search; `WorkingMemory` LRU
`graph`	yes	`GraphStore` — BFS, DFS, Dijkstra, transitive closure, degree/betweenness centrality, community detection, subgraph, cycle detection
`wasm`	yes	`ReActLoop` with sync and streaming inference, `ToolRegistry`, `ToolSpec`, `parse_react_step`, `AgentConfig`, observer callbacks, and step-level metrics
`persistence`	no	`PersistenceBackend` trait + `FilePersistenceBackend`; session and per-step checkpointing
`providers`	no	`LlmProvider` async trait
`anthropic`	no	Built-in Anthropic Messages API provider with SSE streaming (implies `providers` + `reqwest`)
`openai`	no	Built-in OpenAI Chat Completions API provider with SSE streaming and custom base-URL support (implies `providers` + `reqwest`)
`redis-circuit-breaker`	no	Distributed `CircuitBreakerBackend` via Redis
`full`	no	All features simultaneously

API Reference

`AgentRuntime` builder

let runtime = AgentRuntime::builder()       // AgentRuntimeBuilder<NeedsConfig>
    .with_memory(EpisodicStore::new())
    .with_working_memory(WorkingMemory::new(64)?)
    .with_graph(GraphStore::new())
    .with_backpressure(BackpressureGuard::new(100)?)
    .register_tool(my_tool)
    .with_metrics(metrics_arc)
    .with_checkpoint_backend(backend_arc)   // persistence feature
    .with_agent_config(config)              // --> AgentRuntimeBuilder<HasConfig>
    .build();                               // infallible

Method	Argument	Description
`.with_agent_config(cfg)`	`AgentConfig`	Required. Transitions builder to `HasConfig`.
`.with_memory(store)`	`EpisodicStore`	Episodic memory recalled and injected into the prompt.
`.with_working_memory(wm)`	`WorkingMemory`	Bounded key-value working memory appended to the prompt.
`.with_graph(graph)`	`GraphStore`	Knowledge graph; entity count captured in session metadata.
`.with_backpressure(guard)`	`BackpressureGuard`	Rejects sessions when in-flight count exceeds capacity.
`.register_tool(spec)`	`ToolSpec`	Adds a callable tool to the ReAct loop.
`.with_metrics(m)`	`Arc<RuntimeMetrics>`	Shares a custom metrics instance.
`.with_checkpoint_backend(b)`	`Arc<dyn PersistenceBackend>`	Enables checkpointing (`persistence` feature).

`AgentConfig`

Field / Builder	Type	Default	Description
`max_iterations`	`usize`	required	Maximum Thought-Action-Observation cycles
`model`	`String`	required	Model identifier forwarded to the `infer` closure
`.with_system_prompt(s)`	`String`	`"You are a helpful AI agent."`	Injected at the head of every context string
`.with_max_memory_recalls(n)`	`usize`	`3`	Maximum episodic items injected per run
`.with_max_memory_tokens(n)`	`usize`	`None`	Approximate token budget (~4 chars/token)
`.with_stop_sequences(v)`	`Vec<String>`	`[]`	Stop sequences forwarded to the provider on every call
`.with_loop_timeout_ms(n)`	`u64`	`None`	Wall-clock deadline for the entire ReAct loop in milliseconds

`EpisodicStore` constructors

Constructor	Description
`EpisodicStore::new()`	Unbounded, no decay, importance-ranked
`EpisodicStore::with_decay(policy)`	`DecayPolicy::exponential(half_life_hours)`
`EpisodicStore::with_recall_policy(p)`	`RecallPolicy::Hybrid { recency_weight, frequency_weight }`
`EpisodicStore::with_per_agent_capacity(n)`	Evicts lowest-importance item when agent exceeds `n` memories

`BackpressureGuard`

let guard = BackpressureGuard::new(100)?   // hard limit
    .with_soft_limit(75)?;                 // warn when depth reaches 75

`CircuitBreaker`

let cb = CircuitBreaker::new("my-service", 5, Duration::from_secs(30))?;
let result = cb.call(|| my_fallible_operation())?;

`ToolSpec`

// Synchronous handler
let spec = ToolSpec::new("greet", "Greets someone", |args| {
    serde_json::json!({ "message": "hello" })
});

// Async handler
let spec = ToolSpec::new_async("fetch", "Fetches a URL", |args| {
    Box::pin(async move { serde_json::json!({ "status": "ok" }) })
});

// With validation and circuit breaker
let spec = ToolSpec::new("search", "Searches the web", |args| {
    serde_json::json!({ "results": [] })
})
.with_required_fields(vec!["q".to_string()])
.with_circuit_breaker(cb_arc);

`AgentSession`

Every call to run_agent / run_agent_with_provider returns an AgentSession with rich introspection methods:

Method	Return	Description
`step_count()`	`usize`	Total Thought-Action-Observation cycles
`has_final_answer()`	`bool`	Whether the session ended with `FINAL_ANSWER`
`final_answer()`	`Option<&str>`	The final answer text, if any
`duration_secs()`	`f64`	Wall-clock duration in seconds
`failed_tool_call_count()`	`usize`	Steps with error-bearing observations
`step_success_count()`	`usize`	Steps without tool failures
`checkpoint_error_count()`	`usize`	Number of checkpoint save errors
`all_thoughts()`	`Vec<&str>`	All thought strings in step order
`all_actions()`	`Vec<&str>`	All action strings in step order
`all_observations()`	`Vec<&str>`	All observation strings in step order
`action_lengths()`	`Vec<usize>`	Byte length of each action string
`thought_lengths()`	`Vec<usize>`	Byte length of each thought string
`total_thought_bytes()`	`usize`	Sum of all thought byte lengths
`total_observation_bytes()`	`usize`	Sum of all observation byte lengths
`avg_action_length()`	`f64`	Mean byte length of action strings
`avg_thought_length()`	`f64`	Mean byte length of thought strings
`longest_thought()`	`Option<&str>`	Thought string with the most bytes
`shortest_action()`	`Option<&str>`	Action string with the fewest bytes
`first_step_action()`	`Option<&str>`	Action of the first step
`last_step_action()`	`Option<&str>`	Action of the last step
`most_common_action()`	`Option<String>`	Most frequently used action string
`count_steps_with_action(s)`	`usize`	Steps whose action matches `s` exactly
`thought_contains_count(s)`	`usize`	Steps whose thought contains substring `s`
`observation_contains_count(s)`	`usize`	Steps whose observation contains `s`
`count_nonempty_thoughts()`	`usize`	Steps with a non-empty thought string
`steps_above_thought_length(n)`	`usize`	Steps whose thought exceeds `n` bytes

`MetricsSnapshot`

Obtained via runtime.metrics().snapshot(). All fields are plain integers, safe to log or serialize.

Method	Return	Description
`tool_call_count(name)`	`u64`	Total calls for a named tool
`tool_failure_count(name)`	`u64`	Total failures for a named tool
`tool_success_count(name)`	`u64`	Calls minus failures for a tool
`tool_names()`	`Vec<&str>`	Sorted names of tools with recorded calls
`failure_rate()`	`f64`	Overall failure rate (0.0–1.0)
`success_rate()`	`f64`	Overall success rate (0.0–1.0)
`failed_tool_ratio_for(name)`	`f64`	Per-tool failure rate
`most_called_tool()`	`Option<String>`	Tool name with the highest call count
`tool_names_with_failures()`	`Vec<String>`	Sorted names of tools with ≥1 failure
`avg_failures_per_session()`	`f64`	Failed tool calls per completed session
`steps_per_tool_call()`	`f64`	Ratio of total steps to total tool calls
`tool_diversity()`	`usize`	Number of distinct tools called
`total_agent_count()`	`usize`	Distinct agents with recorded call data
`agent_with_most_calls()`	`Option<String>`	Agent id with the highest total calls
`backpressure_shed_rate()`	`f64`	Ratio of shed events to total tool calls
`delta(after, before)`	`MetricsSnapshot`	Per-request delta (saturating subtraction)
`to_json()`	`serde_json::Value`	Serialize for logging or export

`EpisodicStore` helpers

Beyond the core add_episode / recall / clear API:

Method	Description
`episode_count_for(agent)`	Episode count for a specific agent
`agents_with_episodes()`	Sorted list of agents with at least one episode
`content_lengths(agent)`	Byte lengths of episode contents in insertion order
`total_content_bytes(agent)`	Sum of byte lengths of all episode contents
`avg_content_length(agent)`	Mean byte length of episode contents
`max_content_length(agent)`	Byte length of the longest episode
`min_content_length(agent)`	Byte length of the shortest episode
`high_importance_count(agent, t)`	Episodes with importance > `t`
`episodes_by_importance(agent)`	Content strings sorted by descending importance
`content_contains_count(agent, s)`	Episodes whose content contains substring `s`

`SemanticStore` helpers

Method	Description
`count_by_tag(tag)`	Entries that contain `tag`
`list_tags()`	Sorted list of all distinct tags
`most_common_tag()`	Tag appearing most often across all entries
`all_keys()`	Sorted list of all stored keys
`total_value_bytes()`	Sum of byte lengths of all stored values
`avg_value_bytes()`	Mean byte length of stored values
`max_value_bytes()`	Byte length of the longest stored value
`min_value_bytes()`	Byte length of the shortest stored value
`tag_count()`	Number of distinct tags across all entries
`entry_count_with_embedding()`	Entries with an associated embedding vector

`GraphStore` query methods

Beyond graph construction and traversal:

Method	Description
`relationship_kinds()`	Sorted list of distinct relationship kind strings
`relationship_kind_count()`	Number of distinct relationship kinds
`relationship_count_between(a, b)`	Directed relationships from `a` to `b`
`edges_from(id)`	All `Relationship` objects originating from an entity
`neighbors_of(id)`	Sorted `EntityId`s reachable in one hop
`entities_with_self_loops()`	Entities where `from == to`
`bidirectional_count()`	Pairs with edges in both directions
`isolated_entity_count()`	Entities with no incoming or outgoing edges
`total_in_degree()`	Sum of in-degrees (equals relationship count)
`avg_relationship_weight()`	Mean weight of all relationships
`avg_out_degree()`	Mean number of outgoing relationships per entity
`avg_in_degree()`	Mean number of incoming relationships per entity

`ToolRegistry`

// Check, remove, and query tools at runtime
if registry.contains("old-tool") {
    let removed: Option<ToolSpec> = registry.remove("old-tool");
}

// Introspect the registry
let sorted   = registry.tool_names_sorted();           // Vec<&str>
let avg_len  = registry.avg_description_length();      // f64
let shortest = registry.shortest_description();        // Option<&str>
let longest  = registry.longest_description();         // Option<&str>
let count    = registry.description_contains_count("search"); // usize
let matching = registry.names_containing("calc");      // Vec<&str>

`FilePersistenceBackend`

// Conditional save — only checkpoint if one doesn't exist yet
if !backend.exists("agent-1-session").await? {
    backend.save("agent-1-session", &data).await?;
}

Error Handling

All public APIs return Result<T, AgentRuntimeError>. Match only the variants you care about:

use llm_agent_runtime::prelude::*;

fn handle(err: AgentRuntimeError) {
    match err {
        AgentRuntimeError::CircuitOpen { service } =>
            eprintln!("Circuit open for {service} — backing off"),
        AgentRuntimeError::BackpressureShed { depth, capacity } =>
            eprintln!("Shed: {depth}/{capacity} in-flight — try again later"),
        AgentRuntimeError::AgentLoop(msg) =>
            eprintln!("Agent loop failed: {msg}"),
        AgentRuntimeError::Memory(msg) =>
            eprintln!("Memory subsystem error: {msg}"),
        AgentRuntimeError::Graph(msg) =>
            eprintln!("Graph subsystem error: {msg}"),
        AgentRuntimeError::Persistence(msg) =>
            eprintln!("Persistence error: {msg}"),
        AgentRuntimeError::Provider(msg) =>
            eprintln!("LLM provider error: {msg}"),
        other => eprintln!("Other error: {other}"),
    }
}

All production code paths are panic-free. Clippy denies unwrap_used, expect_used, panic, and todo in src/.

Running Tests

# Default feature set
cargo test

# All features including persistence and providers
cargo test --all-features

# A specific module
cargo test --lib memory
cargo test --lib graph

# With structured log output
RUST_LOG=agent_runtime=debug cargo test -- --nocapture

Run the full CI suite locally:

cargo fmt --all -- --check
cargo clippy --all-features -- -D warnings
cargo test --all-features
cargo doc --no-deps --all-features
cargo build --release --all-features

Contributing

Fork the repository and create a descriptive feature branch.
Add tests for every new public function, struct, and trait. The project maintains over 1,460 tests across all modules and targets a minimum 1:1 test-to-production line ratio.
All production paths must be panic-free. Use Result for every fallible operation. Clippy denies unwrap_used, expect_used, panic, and todo in src/.
Run cargo test --all-features and cargo clippy --all-features -- -D warnings with zero failures before opening a pull request.
New public items require /// doc comments. The crate enforces #![deny(missing_docs)].
Describe the motivation and design decisions in the PR body.

Checklist before opening a PR:

cargo test --all-features passes
cargo clippy --all-features -- -D warnings passes
cargo fmt --all -- --check passes
cargo doc --no-deps --all-features passes with RUSTDOCFLAGS="-D warnings"
New public items have /// doc comments

License

Licensed under the MIT License. See LICENSE for details.

llm-agent-runtime 1.58.0

agent-runtime

What it does

How it works

Data flow inside run_agent

Quickstart

1. Add to Cargo.toml

2. Environment variables

3. Minimal example (no external services)

4. Using a built-in provider

5. Wiring a provider into the runtime

Feature Flags

API Reference

AgentRuntime builder

AgentConfig

EpisodicStore constructors

BackpressureGuard

CircuitBreaker

ToolSpec

AgentSession

MetricsSnapshot

EpisodicStore helpers

SemanticStore helpers

GraphStore query methods

ToolRegistry

FilePersistenceBackend