# mentra

Mentra is an agent runtime for building tool-using LLM applications.

MSRV: Rust 1.85.

## Current Features

- streaming model response handling
- provider-neutral token usage reporting across OpenAI, OpenRouter, Anthropic, and Gemini
- optional tool authorization with structured previews and fail-closed execution blocking
- recoverable malformed tool-call input handling that feeds retry guidance back to the model
- custom tool execution through the async `ExecutableTool` trait
- builtin `shell`, `background_run`, `check_background`, and `files` tools
- builtin `task` subagents with isolated child context and parent-side tracking
- persistent agent teams with `team_spawn`, `team_send`, `broadcast`, `team_read_inbox`, and generic request-response protocols via `team_request`, `team_respond`, and `team_list_requests`
- three-layer context compaction with silent tool-result shrinking, auto-summary compaction, and a builtin `compact` tool
- agent events and snapshots for CLI or UI watchers
- Anthropic provider support
- Gemini Developer API provider support
- OpenAI provider support via the Responses API
- OpenRouter provider support via the Responses API
- image inputs for OpenAI and Anthropic, plus inline image bytes for Gemini

## Quickstart Example

Clone the repository and run the workspace quickstart example:

```bash
cargo run -p mentra-examples --example quickstart -- "Summarize the benefits of tool-using agents."
```

The quickstart example accepts a prompt from CLI args or stdin. Set `MENTRA_MODEL` to force a specific OpenAI model; otherwise it resolves the newest available OpenAI model automatically.

## Building A Runtime

Use `Runtime::builder()` when you want Mentra's builtin runtime tools, or `Runtime::empty_builder()` when you want to opt into every tool explicitly.

```rust,no_run
use mentra::{BuiltinProvider, Runtime};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let runtime = Runtime::builder()
        .with_provider(BuiltinProvider::OpenAI, std::env::var("OPENAI_API_KEY")?)
        .with_optional_provider(
            BuiltinProvider::OpenRouter,
            std::env::var("OPENROUTER_API_KEY").ok(),
        )
        .with_optional_provider(
            BuiltinProvider::Gemini,
            std::env::var("GEMINI_API_KEY").ok(),
        )
        .build()?;

    let _ = runtime;
    Ok(())
}
```
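
If you want the fully explicit surface instead, start from `Runtime::empty_builder()`. A minimal sketch, assuming `empty_builder()` accepts the same provider and tool registration methods as `builder()`:

```rust,no_run
use mentra::{BuiltinProvider, Runtime};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Nothing is pre-registered here: builtin tools are absent until you
    // opt in with `.with_tool(...)` (see Custom Tools below).
    let runtime = Runtime::empty_builder()
        .with_provider(BuiltinProvider::OpenAI, std::env::var("OPENAI_API_KEY")?)
        // .with_tool(MyTool)
        .build()?;

    let _ = runtime;
    Ok(())
}
```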

## Architecture

Mentra is organized around four runtime subsystems:

- execution: model providers, runtime policy, hooks, turn execution, and shell/background command routing
- persistence: agent records, run state, task snapshots, leases, team state, background notifications, and memory
- tooling: builtin and custom tools, optional skills, and typed app context
- collaboration: persistent teammates, team inbox/request flows, and background task wakeups

Persistent teammates are hosted as async actors on a shared Tokio runtime. Live actors are wake-driven rather than steady-state polled: inbox appends, protocol updates, background task completion, explicit resume, and autonomy timers wake the actor to process durable state already written to the store. After a restart, the persisted team inbox, protocol requests, and background notifications remain the source of truth, and `Runtime::resume(...)` revives teammate actors against that stored state.

## Resolving A Model

Use `Runtime::resolve_model(...)` when you want provider-aware model selection without reimplementing discovery or `ModelInfo` construction in application code.

```rust,no_run
use mentra::{BuiltinProvider, ModelSelector, Runtime};

# async fn demo() -> Result<(), Box<dyn std::error::Error>> {
let runtime = Runtime::builder()
    .with_provider(BuiltinProvider::OpenAI, std::env::var("OPENAI_API_KEY")?)
    .build()?;
let model = runtime
    .resolve_model(
        BuiltinProvider::OpenAI,
        std::env::var("MENTRA_MODEL")
            .map(ModelSelector::Id)
            .unwrap_or(ModelSelector::NewestAvailable),
    )
    .await?;

let _ = model;
# Ok(())
# }
```

## Coding Agent Setup

`Runtime::builder()` registers Mentra's builtin tools, including `shell`, `background_run`, `check_background`, `files`, and the runtime/task/team intrinsics. Shell and background execution remain disabled by default, so coding-agent setups must opt in with a runtime policy. If you want semantic review before tools execute, install a `ToolAuthorizer`.

```rust,no_run
use async_trait::async_trait;
use mentra::{BuiltinProvider, Runtime, RuntimePolicy};
use mentra::tool::{
    ToolAuthorizationDecision, ToolAuthorizationRequest, ToolAuthorizer,
};

struct AllowAllAuthorizer;

#[async_trait]
impl ToolAuthorizer for AllowAllAuthorizer {
    async fn authorize(
        &self,
        _request: &ToolAuthorizationRequest,
    ) -> Result<ToolAuthorizationDecision, mentra::error::RuntimeError> {
        Ok(ToolAuthorizationDecision::allow())
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let runtime = Runtime::builder()
        .with_provider(BuiltinProvider::OpenAI, std::env::var("OPENAI_API_KEY")?)
        .with_policy(RuntimePolicy::permissive())
        .with_tool_authorizer(AllowAllAuthorizer)
        .build()?;

    let _ = runtime;
    Ok(())
}
```

## Runtime Policy Defaults

Mentra's builtin runtime tools are available by default, but command execution is not:

- `Runtime::builder()` registers the builtin shell, background, file, task, team, and memory-oriented intrinsics
- foreground shell execution is disabled by default
- background command execution is disabled by default
- `RuntimePolicy::permissive()` enables both shell and background command execution
- runtime policy still enforces hard limits such as working-directory roots, file read/write roots, allowed environment variables, timeouts, output caps, and background task limits
- semantic review is opt-in through `RuntimeBuilder::with_tool_authorizer(...)`

Use the default policy when you want a safer runtime surface, and opt into `RuntimePolicy::permissive()` only when you are intentionally building a coding-agent or automation workflow that should be able to act on the local workspace.
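
As a sketch of that decision, gated behind an explicit host-side flag (the `MENTRA_PERMISSIVE` variable is illustrative, not a builtin):

```rust,no_run
use mentra::{BuiltinProvider, Runtime, RuntimePolicy};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut builder = Runtime::builder()
        .with_provider(BuiltinProvider::OpenAI, std::env::var("OPENAI_API_KEY")?);

    // Default policy: shell and background execution stay disabled.
    // Permissive policy: only when the host explicitly opts in.
    if std::env::var("MENTRA_PERMISSIVE").is_ok() {
        builder = builder.with_policy(RuntimePolicy::permissive());
    }

    let runtime = builder.build()?;
    let _ = runtime;
    Ok(())
}
```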

## Tool Authorization

Mentra can run a caller-provided authorization pass before any tool executes. This is the recommended integration point for LLM-based security review, human approval, or custom policy engines.

- no authorizer installed: tools run under the remaining hard runtime constraints
- authorizer returns `Allow`: the tool executes
- authorizer returns `Prompt` or `Deny`: Mentra blocks execution and returns an error `tool_result`
- authorizer timeout or error: Mentra fails closed and blocks execution

Every authorization request includes a `ToolAuthorizationPreview` with tool metadata plus structured input. Builtin tools provide more specific previews:

- `shell` and `background_run` include the raw command, resolved working directory, timeout, background flag, and justification
- `files` includes resolved paths and operation kinds such as `read`, `search`, `set`, `move`, and `delete`, without file contents

```rust,no_run
use async_trait::async_trait;
use mentra::tool::{
    ToolAuthorizationDecision, ToolAuthorizationRequest, ToolAuthorizer,
};

struct DenyDeletes;

#[async_trait]
impl ToolAuthorizer for DenyDeletes {
    async fn authorize(
        &self,
        request: &ToolAuthorizationRequest,
    ) -> Result<ToolAuthorizationDecision, mentra::error::RuntimeError> {
        let structured = &request.preview.structured_input;
        let contains_delete = structured
            .get("operations")
            .and_then(|value| value.as_array())
            .is_some_and(|ops| {
                ops.iter()
                    .any(|op| op.get("op").and_then(|v| v.as_str()) == Some("delete"))
            });

        if request.tool_name == "files" && contains_delete {
            Ok(ToolAuthorizationDecision::deny("delete operations require manual approval"))
        } else {
            Ok(ToolAuthorizationDecision::allow())
        }
    }
}
```

## Skills

Registering a skills directory makes the builtin `load_skill` tool available:

```rust,no_run
use mentra::{BuiltinProvider, Runtime};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let runtime = Runtime::builder()
        .with_provider(BuiltinProvider::OpenAI, std::env::var("OPENAI_API_KEY")?)
        .with_skills_dir("./skills")?
        .build()?;

    let _ = runtime;
    Ok(())
}
```

## App Context

If your tools need access to typed host-side state, register it on the runtime and retrieve it from `ToolContext` or `ParallelToolContext`:

```rust,no_run
use std::sync::Arc;

use async_trait::async_trait;
use mentra::{
    BuiltinProvider, Runtime,
    tool::{ExecutableTool, ToolContext, ToolResult, ToolSpec},
};
use serde_json::{Value, json};

struct AppState {
    api_base: String,
}

struct InspectStateTool;

#[async_trait]
impl ExecutableTool for InspectStateTool {
    fn spec(&self) -> ToolSpec {
        ToolSpec::builder("inspect_state")
            .description("Return the configured API base URL.")
            .input_schema(json!({
                "type": "object",
                "properties": {}
            }))
            .build()
    }

    async fn execute_mut(&self, ctx: ToolContext<'_>, _input: Value) -> ToolResult {
        let state = ctx.app_context::<AppState>()?;
        Ok(state.api_base.clone())
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let runtime = Runtime::builder()
        .with_provider(BuiltinProvider::OpenAI, std::env::var("OPENAI_API_KEY")?)
        .with_context(Arc::new(AppState {
            api_base: "https://api.example.com".to_string(),
        }))
        .with_tool(InspectStateTool)
        .build()?;

    let _ = runtime;
    Ok(())
}
```

## Custom Tools

Use `ToolSpec::builder(...)` to define custom tools without hand-assembling the metadata struct:

```rust,no_run
use async_trait::async_trait;
use mentra::tool::{
    ExecutableTool, ParallelToolContext, ToolCapability, ToolDurability, ToolResult,
    ToolSideEffectLevel, ToolSpec,
};
use serde_json::{Value, json};

struct UppercaseTool;

#[async_trait]
impl ExecutableTool for UppercaseTool {
    fn spec(&self) -> ToolSpec {
        ToolSpec::builder("uppercase_text")
            .description("Uppercase the provided text")
            .input_schema(json!({
                "type": "object",
                "properties": {
                    "text": { "type": "string" }
                },
                "required": ["text"]
            }))
            .capability(ToolCapability::ReadOnly)
            .side_effect_level(ToolSideEffectLevel::None)
            .durability(ToolDurability::ReplaySafe)
            .execution_timeout(std::time::Duration::from_secs(5))
            .build()
    }

    async fn execute(&self, _ctx: ParallelToolContext, input: Value) -> ToolResult {
        let text = input
            .get("text")
            .and_then(|value| value.as_str())
            .ok_or_else(|| "text is required".to_string())?;
        Ok(text.to_uppercase())
    }
}
```

`ToolSpec::execution_timeout(...)` is enforced by Mentra around the tool future itself, which is useful for network-backed tools that need a tighter budget than the overall agent run.

When a tool needs disposable delegated work, `ParallelToolContext::spawn_subagent()` can create a child agent that inherits the current runtime and model defaults. See the `subagent_tool` example in the workspace examples crate for a complete usage pattern.

Override `ExecutableTool::authorization_preview(...)` when your custom tool needs to expose structured metadata to the installed `ToolAuthorizer`. The default preview includes the resolved working directory, tool capabilities, side-effect level, durability, the raw JSON input, and the same JSON as `structured_input`.

## Hosted Tool Search

Mentra can mark custom tools as deferred and let a provider load them on demand with native hosted tool search.

Mark a tool as deferred in its `ToolSpec`:

```rust,no_run
use async_trait::async_trait;
use mentra::tool::{ExecutableTool, ParallelToolContext, ToolResult, ToolSpec};
use serde_json::{Value, json};

struct LookupOrderTool;

#[async_trait]
impl ExecutableTool for LookupOrderTool {
    fn spec(&self) -> ToolSpec {
        ToolSpec::builder("lookup_order")
            .description("Look up an order by id.")
            .input_schema(json!({
                "type": "object",
                "properties": {
                    "order_id": { "type": "string" }
                },
                "required": ["order_id"]
            }))
            .defer_loading(true)
            .build()
    }

    async fn execute(&self, _ctx: ParallelToolContext, _input: Value) -> ToolResult {
        Ok("order loaded".to_string())
    }
}
```

Enable hosted tool search per agent with `ProviderRequestOptions`:

```rust,no_run
use mentra::agent::AgentConfig;
use mentra::provider::{ProviderRequestOptions, ToolSearchMode};

let config = AgentConfig {
    provider_request_options: ProviderRequestOptions {
        tool_search_mode: ToolSearchMode::Hosted,
        ..Default::default()
    },
    ..Default::default()
};
```

Current provider support:

- OpenAI: supported through the Responses API hosted `tool_search` surface
- Anthropic: supported through the Messages API BM25 tool-search server tool
- Gemini: deferred custom tools are not supported; Mentra returns `InvalidRequest`

Deferred tools are filtered through `ToolProfile` just like immediate tools. If you force a deferred tool with `ToolChoice::Tool { name }`, Mentra serializes that specific tool as immediate for the request so explicit invocation still works.

## Tool Profiles

Register tools once on the runtime, then use `AgentConfig::tool_profile` to expose different subsets for different operating modes.

```rust,no_run
use mentra::{BuiltinProvider, ModelSelector, Runtime};
use mentra::agent::{AgentConfig, ToolProfile};

# async fn demo() -> Result<(), Box<dyn std::error::Error>> {
let runtime = Runtime::builder()
    .with_provider(BuiltinProvider::OpenAI, std::env::var("OPENAI_API_KEY")?)
    .build()?;
let model = runtime
    .resolve_model(
        BuiltinProvider::OpenAI,
        ModelSelector::Id("gpt-5.4-mini".to_string()),
    )
    .await?;

let queue_mode = AgentConfig {
    tool_profile: ToolProfile::only([
        "shell",
        "background_run",
        "check_background",
        "files",
        "task",
    ]),
    ..Default::default()
};

let direct_mode = AgentConfig {
    tool_profile: ToolProfile::hide(["task", "background_run"]),
    ..Default::default()
};

let _queue_agent = runtime.spawn_with_config("Queue Agent", model.clone(), queue_mode)?;
let _direct_agent = runtime.spawn_with_config("Direct Agent", model, direct_mode)?;
# Ok(())
# }
```

This is the recommended pattern when one application needs multiple tool surfaces such as a queue-backed agent with delegation enabled and a direct mode that keeps the same runtime but hides long-running or task-oriented tools.

## CLI Integration Pattern

For CLI-style coding or analysis tools, the usual setup is:

- register a superset of builtin and custom tools on one runtime
- scope shell and file access with `RuntimePolicy`
- keep application-specific output paths in app context for custom tools
- switch behavior per mode by changing `AgentConfig::tool_profile`, not by rebuilding the runtime
- inspect `agent.history()` after the run when you want to render a compact tool log or transcript summary

The `cli_runtime` example in the workspace examples crate shows this pattern end to end with custom tools, policy setup, mode-specific tool surfaces, and transcript inspection.
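
A condensed sketch of that flow; the `Debug` rendering of history entries is illustrative, not a committed transcript format:

```rust,no_run
use mentra::{BuiltinProvider, ContentBlock, ModelSelector, Runtime, RuntimePolicy};
use mentra::agent::{AgentConfig, ToolProfile};

async fn run_cli(prompt: &str) -> Result<(), Box<dyn std::error::Error>> {
    let runtime = Runtime::builder()
        .with_provider(BuiltinProvider::OpenAI, std::env::var("OPENAI_API_KEY")?)
        .with_policy(RuntimePolicy::permissive())
        .build()?;
    let model = runtime
        .resolve_model(BuiltinProvider::OpenAI, ModelSelector::NewestAvailable)
        .await?;

    // One mode-specific tool surface; other modes swap only the profile.
    let config = AgentConfig {
        tool_profile: ToolProfile::only(["shell", "files"]),
        ..Default::default()
    };
    let mut agent = runtime.spawn_with_config("CLI Agent", model, config)?;
    agent.send(vec![ContentBlock::text(prompt)]).await?;

    // Render a compact tool log after the run (assumes history entries
    // are iterable and implement Debug).
    for entry in agent.history() {
        println!("{entry:?}");
    }
    Ok(())
}
```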

## Disposable Tasks vs Persistent Teams

Mentra supports two different delegation models:

- use the builtin `task` tool or `ParallelToolContext::spawn_subagent()` for short-lived disposable delegation that should return a single summary to the parent
- use `team_spawn`, `team_send`, `team_read_inbox`, `team_request`, and `team_respond` when you want a persistent teammate with a durable mailbox and request/response workflow across turns

The `task` path is ideal for one-off decomposition inside a single run. The `team_*` tools are for longer-lived collaborators that should keep state, receive follow-up work, and participate in approval or shutdown flows.

## Sending Images

You can attach image blocks alongside text when sending a user turn:

```rust,no_run
# use mentra::{ContentBlock, Agent};
# async fn demo(agent: &mut Agent) -> Result<(), Box<dyn std::error::Error>> {
agent
    .send(vec![
        ContentBlock::text("What is happening in this screenshot?"),
        ContentBlock::image_bytes("image/png", std::fs::read("screenshot.png")?),
    ])
    .await?;
# Ok(())
# }
```

For already-hosted assets, use `ContentBlock::image_url(...)` instead. Gemini currently supports inline `image_bytes(...)` inputs only and rejects `image_url(...)`.
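
For comparison, a hosted-asset sketch, assuming `image_url(...)` takes the asset URL as a single string:

```rust,no_run
# use mentra::{ContentBlock, Agent};
# async fn demo(agent: &mut Agent) -> Result<(), Box<dyn std::error::Error>> {
agent
    .send(vec![
        ContentBlock::text("Describe this hosted diagram."),
        // Assumed signature: one URL string for an already-hosted image.
        ContentBlock::image_url("https://example.com/diagram.png"),
    ])
    .await?;
# Ok(())
# }
```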

## Long-Term Memory

Agents automatically recall from long-term memory by default. When you use `Runtime::builder()`, the builtin runtime intrinsics include:

- `memory_search` for explicit recall
- `memory_pin` for writing important facts
- `memory_forget` for tombstoning a specific memory record

`MemoryConfig` controls recall and write behavior per agent. The default configuration enables automatic recall and memory write tools, which is useful for long-running assistants and teammate workflows. Disable write tools when you want recall without model-initiated mutation.
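
A sketch of the recall-only setup; the `memory` field location, the module path, and the `enable_write_tools` knob are illustrative names rather than the exact `MemoryConfig` surface:

```rust,no_run
use mentra::agent::AgentConfig;
use mentra::memory::MemoryConfig; // assumed module path

let config = AgentConfig {
    memory: MemoryConfig {
        // Hypothetical field name: keep automatic recall, drop the
        // model-facing memory_pin / memory_forget write tools.
        enable_write_tools: false,
        ..Default::default()
    },
    ..Default::default()
};
```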

## Context Compaction

Agents compact context by default:

- old tool results are micro-compacted in outbound requests
- when estimated request context exceeds roughly 50k tokens, Mentra writes the full transcript to the default transcript directory and replaces older history with a model-generated summary
- the model can also call the builtin `compact` tool explicitly

You can tune or disable this per-agent with `ContextCompactionConfig`:

```rust
use mentra::agent::{AgentConfig, ContextCompactionConfig};

let config = AgentConfig {
    context_compaction: ContextCompactionConfig {
        auto_compact_threshold_tokens: Some(75_000),
        ..Default::default()
    },
    ..Default::default()
};
```

## Data And Persistence Defaults

For non-test builds, Mentra keeps all default persisted state under a workspace-scoped app-data directory:

- store: `<platform data dir>/mentra/workspaces/<workspace-hash>/runtime.sqlite`
- runtime-scoped stores: `<platform data dir>/mentra/workspaces/<workspace-hash>/runtime-<runtime-id>.sqlite`
- team state: `<platform data dir>/mentra/workspaces/<workspace-hash>/team/`
- task state: `<platform data dir>/mentra/workspaces/<workspace-hash>/tasks/`
- transcripts: `<platform data dir>/mentra/workspaces/<workspace-hash>/transcripts/`

If the platform data directory cannot be resolved, Mentra falls back to `.mentra/workspaces/<workspace-hash>/...` inside the current workspace.

Override these defaults when needed:

- use `Runtime::builder().with_store(...)` for the SQLite store
- customize `AgentConfig::task.tasks_dir`, `AgentConfig::team.team_dir`, and `AgentConfig::context_compaction.transcript_dir` for task, team, and transcript storage, as sketched below
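
A sketch of those overrides; the field paths follow the names above, while the nested struct types and `PathBuf` fields are assumptions:

```rust,no_run
use std::path::PathBuf;
use mentra::agent::AgentConfig;

let mut config = AgentConfig::default();
// Assumed PathBuf fields on the nested task/team/compaction configs.
config.task.tasks_dir = PathBuf::from("./state/tasks");
config.team.team_dir = PathBuf::from("./state/team");
config.context_compaction.transcript_dir = PathBuf::from("./state/transcripts");
```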

## Persistence Extension Points

The public persistence surface is intentionally split into narrower traits:

- `AgentStore` for agent records and working-memory snapshots
- `RunStore` for turn and run lifecycle tracking
- `TaskStore` for the dependency-aware task board
- `LeaseStore` for runtime ownership and resume coordination

`RuntimeStore` composes those traits with `TeamStore`, `BackgroundStore`, and `MemoryStore`. `SqliteRuntimeStore` is the default all-in-one backend. `HybridRuntimeStore` keeps SQLite runtime state and swaps in the hybrid memory engine for richer long-term memory behavior.

## Testing With MockRuntime

Enable the `test-utils` feature when you want a deterministic scripted runtime for unit and integration tests.

`mentra::test::MockRuntime` wraps a real runtime with:

- a scripted provider
- a temporary SQLite-backed runtime store
- deterministic per-turn helper methods for assistant text, streamed text, tool-call turns, and provider failures

This is the recommended way to test Mentra-based agents and tools without live API keys.

The common pattern is:

- build a `MockRuntime`
- register the same custom tools you use in production
- spawn an agent with the `AgentConfig` or `ToolProfile` you want to verify
- assert against `mock.recorded_requests()` to confirm the runtime exposed the expected tools and tool-choice hints

See `mentra::test` and the crate tests for a full example of asserting runtime assembly with custom tools and filtered tool surfaces.
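
A sketch of that pattern with the `test-utils` feature enabled; `MockRuntime::new()` and `script_assistant_text(...)` are hypothetical stand-ins for the real `mentra::test` helpers:

```rust,no_run
# fn demo() -> Result<(), Box<dyn std::error::Error>> {
use mentra::test::MockRuntime;

// Hypothetical constructor and scripting helper; consult mentra::test
// for the actual method names.
let mock = MockRuntime::new()?;
mock.script_assistant_text("scripted reply");

// ... register production tools and spawn the agent under test here ...

// recorded_requests() is the documented assertion surface: verify the
// runtime exposed the expected tools and tool-choice hints.
for request in mock.recorded_requests() {
    println!("{request:?}");
}
# Ok(())
# }
```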

## Interactive Repo Example

Clone the repository when you want the richer interactive demo with provider selection, persisted runtime inspection, skills loading, and team/task visibility.

Set at least one of `OPENAI_API_KEY`, `OPENROUTER_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY`, then run the command below. The example lets you choose a provider and lists up to 10 of its models, ordered newest to oldest.

```bash
cargo run -p mentra-examples --example chat
```

Additional focused examples live in the same crate:

```bash
cargo run -p mentra-examples --example custom_tool
cargo run -p mentra-examples --example subagent_tool
cargo run -p mentra-examples --example team_collaboration
cargo run -p mentra-examples --example cli_runtime -- --mode direct
```

`cli_runtime` is the closest example to a real integration. It combines runtime policy setup, custom tools, mode-specific `ToolProfile` selection, and transcript inspection after the run.

## Run Checks

```bash
cargo fmt --all --check
cargo check --workspace
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace
```