# Awaken

[English](./README.md) | [中文](./README.zh-CN.md)

[![CI](https://github.com/AwakenWorks/awaken/actions/workflows/test.yml/badge.svg)](https://github.com/AwakenWorks/awaken/actions/workflows/test.yml) [![crates.io](https://img.shields.io/crates/v/awaken-agent.svg?label=crates.io)](https://crates.io/crates/awaken-agent) ![License](https://img.shields.io/badge/license-MIT%2FApache--2.0-blue) ![MSRV](https://img.shields.io/badge/MSRV-1.85-orange)

Production AI agent runtime for Rust — type-safe state, multi-protocol serving, plugin extensibility.

The crate is published on crates.io as `awaken-agent`; use the `package` rename in `Cargo.toml` so Rust code imports it as `awaken`.
The workspace uses Rust 1.93.0 for development; the crate MSRV is 1.85.

Docs: [GitHub Pages](https://awakenworks.github.io/awaken/) | [Chinese docs](https://awakenworks.github.io/awaken/zh-CN/)

<p align="center">
  <img src="./docs/assets/demo.svg" alt="Awaken demo — tool call + LLM streaming" width="800">
</p>

## Highlights

- **Rust-first agent runtime**: typed tools, generated JSON Schema, typed state keys, scoped snapshots, and atomic state commits.
- **One runtime, many clients**: HTTP/SSE run API, AI SDK v6, AG-UI/CopilotKit, A2A, and MCP JSON-RPC from the same backend.
- **Configuration-first optimization**: choose models/providers, tune prompts, reminders, permissions, generative UI, and deferred tools through `/v1/config/*`, `/v1/capabilities`, and the admin console.
- **Production control paths**: mailbox-backed background runs, HITL decisions, cancellation/interrupt, SSE replay, retries, fallback models, circuit breakers, metrics, and health probes.
- **Plugin surface**: permission gates, reminders, OpenTelemetry, MCP tools, skills, generative UI, and deferred tool loading with an explicit probability model.

## 30-second mental model

1. **Tools** — implement `Tool` directly or `TypedTool` with `schemars`-generated JSON Schema
2. **Agents** — each agent has a system prompt, a model, and a set of allowed tools; the LLM drives orchestration through natural language — no predefined graphs
3. **State** — typed run/thread state plus persistent profile/shared state for cross-thread or cross-agent coordination
4. **Plugins** — lifecycle hooks for permissions, observability, context management, skills, MCP, and more

Your agent picks tools, calls them, reads and updates state, and repeats — all orchestrated by the runtime through 9 typed phases, including a pure `ToolGate` before tool execution. Every state change is committed atomically after the gather phase.

## Try it in 5 minutes

Prerequisites:

- Rust 1.85 or newer for the published crate; when working from this repository, use the toolchain pinned in `rust-toolchain.toml` (Rust 1.93.0).
- An OpenAI-compatible LLM provider API key.

Add the dependencies to your `Cargo.toml`:

```toml
[dependencies]
awaken = { package = "awaken-agent", version = "0.2" }
tokio = { version = "1.51.0", features = ["full"] }
async-trait = "0.1.89"
serde_json = "1.0.149"
```

```bash
export OPENAI_API_KEY=<your-key>
```

Copy this into `src/main.rs` and run `cargo run`:

```rust,no_run
use std::sync::Arc;
use serde_json::{json, Value};
use async_trait::async_trait;
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};
use awaken::contract::message::Message;
use awaken::engine::GenaiExecutor;
use awaken::registry_spec::AgentSpec;
use awaken::registry::ModelBinding;
use awaken::{AgentRuntimeBuilder, RunRequest};

struct EchoTool;

#[async_trait]
impl Tool for EchoTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("echo", "Echo", "Echo input back to the caller")
            .with_parameters(json!({
                "type": "object",
                "properties": { "text": { "type": "string" } },
                "required": ["text"]
            }))
    }

    async fn execute(&self, args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        let text = args["text"].as_str().unwrap_or_default();
        Ok(ToolResult::success("echo", json!({ "echoed": text })).into())
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let agent_spec = AgentSpec::new("assistant")
        .with_model_id("gpt-4o-mini")
        .with_system_prompt("You are a helpful assistant. Use the echo tool when asked.")
        .with_max_rounds(5);

    let runtime = AgentRuntimeBuilder::new()
        .with_agent_spec(agent_spec)
        .with_tool("echo", Arc::new(EchoTool))
        .with_provider("openai", Arc::new(GenaiExecutor::new()))
        .with_model_binding("gpt-4o-mini", ModelBinding {
            provider_id: "openai".into(),
            upstream_model: "gpt-4o-mini".into(),
        })
        .build()?;

    let request = RunRequest::new(
        "thread-1",
        vec![Message::user("Say hello using the echo tool")],
    )
    .with_agent_id("assistant");

    // The quickstart only needs the final result. Use run(..., sink) when
    // streaming events to SSE, WebSocket, protocol adapters, or tests.
    let result = runtime.run_to_completion(request).await?;
    println!("response: {}", result.response);
    println!("termination: {:?}", result.termination);

    Ok(())
}
```

The quickstart path is exercised by a test that needs no network access:

```bash
cargo test -p awaken-agent --test readme_quickstart
```

Live provider validation is opt-in so CI does not depend on external model services:

```bash
OPENAI_API_KEY=<your-key> cargo test -p awaken-agent --test readme_live_provider -- --ignored
```

## Serve over any protocol

Start the built-in server and connect from React, Next.js, or another agent — no code changes:

```rust,no_run
use awaken::prelude::*;
use awaken::stores::{InMemoryMailboxStore, InMemoryStore};
use std::sync::Arc;

let store = Arc::new(InMemoryStore::new());
let runtime = Arc::new(runtime);
let mailbox = Arc::new(Mailbox::new(
    runtime.clone(),
    Arc::new(InMemoryMailboxStore::new()),
    store.clone(),
    "default-consumer".into(),
    MailboxConfig::default(),
));

let state = AppState::new(
    runtime.clone(),
    mailbox,
    store,
    runtime.resolver_arc(),
    ServerConfig::default(),
);
serve(state).await?;
```

### Frontend protocols

| Protocol | Endpoint | Frontend |
|---|---|---|
| AI SDK v6 | `POST /v1/ai-sdk/chat` | React `useChat()` |
| AG-UI | `POST /v1/ag-ui/run` | CopilotKit `<CopilotKit>` |
| A2A | `POST /v1/a2a/message:send` | Other agents |
| MCP | `POST /v1/mcp` | JSON-RPC 2.0 |

The optional admin console uses `/v1/capabilities` and `/v1/config/*` to edit
agents, models, providers, MCP servers, and plugin config sections in the
browser. Plugins expose JSON Schema through the same typed `PluginConfigKey`
path used by runtime hooks, so saving an agent section such as `permission`,
`reminder`, `generative-ui`, or `deferred_tools` publishes a new registry
snapshot and applies to subsequent `/v1/runs` requests. OpenAI-compatible
providers, including BigModel, use the `openai` adapter with a provider-specific
`base_url`.

The design intent is that agent optimization stays data-driven: model choice,
provider endpoints, base prompts, system reminders, generated-UI instructions,
permission policy, and tool-loading policy should be configured through the
same schema-backed path rather than hard-coded into the agent loop.

| Tuning surface | Configuration path |
|---|---|
| Base prompt | `AgentSpec.system_prompt` on the agent entry |
| Model and provider routing | `AgentSpec.model_id`, `/v1/config/models`, `/v1/config/providers` |
| System reminders and prompt context injection | `reminder` plugin section, using `system` or `suffix_system` targets |
| Generative UI prompt guidance | `generative-ui` plugin section (`catalog_id`, `examples`, or full `instructions`) |
| Tool policy and context cost | `permission` and `deferred_tools` plugin sections |
| Prompt semantic hooks | Not a built-in plugin yet; add them as typed `PluginConfigKey` sections with schema-backed hooks |

**React + AI SDK v6:**

```typescript
import { useChat } from "ai/react";

const { messages, input, handleSubmit } = useChat({
  api: "http://localhost:3000/v1/ai-sdk/chat",
});
```

**Next.js + CopilotKit:**

```typescript
import { CopilotKit } from "@copilotkit/react-core";

<CopilotKit runtimeUrl="http://localhost:3000/v1/ag-ui/run">
  <YourApp />
</CopilotKit>
```

### Managed configuration

Wire a `ConfigStore` into `AppState` to manage agents, models, providers, and MCP servers through `/v1/config/*`. Use the [configuration-driven tuning guide](https://awakenworks.github.io/awaken/how-to/configure-agent-behavior.html) to tune providers, model bindings, tools, and plugin sections. The Admin Console in [`apps/admin-console`](./apps/admin-console/) uses the same API and reads `VITE_BACKEND_URL` for the server base URL.

## Built-in plugins

Facade features are enabled by default via the `full` feature. Use
`default-features = false` to opt out. Workspace extension crates that are not
re-exported by the facade, such as deferred tools, are added as direct
dependencies.

| Plugin | What it does | Feature flag |
|---|---|---|
| **Permission** | Firewall-style tool access control with Deny/Allow/Ask rules, glob/regex matching, and HITL suspension via mailbox. | `permission` |
| **Reminder** | Injects system or conversation-level context messages when tool calls match configured patterns. | `reminder` |
| **Observability** | OpenTelemetry telemetry aligned with GenAI Semantic Conventions; supports OTLP, file, and in-memory export. | `observability` |
| **MCP** | Connects to external MCP servers and registers their tools as native Awaken tools. | `mcp` |
| **Skills** | Discovers skill packages and injects a catalog before inference so the LLM can activate skills on demand. | `skills` |
| **Generative UI** | Streams declarative UI components to frontends via A2UI, JSON Render, and OpenUI Lang integrations. | `generative-ui` |
| **Deferred Tools** | Hides large tool schemas behind `ToolSearch`, then uses a discounted Beta probability model to re-defer idle promoted tools. | direct crate: `awaken-ext-deferred-tools` |

Deferred tools are configured through the `deferred_tools` agent section when
the `ext-deferred-tools` plugin is registered. See
[Use Deferred Tools](./docs/book/src/how-to/use-deferred-tools.md) for setup,
the activation heuristic, and the DiscBeta probability model.
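
The real DiscBeta model lives in the extension crate, but the underlying idea — a Beta posterior over "was this tool used this round?", with older observations discounted so recent rounds dominate — can be sketched in a few lines. Everything below (the field names, the 0.9 discount, the re-defer threshold) is illustrative, not the crate's actual implementation:

```rust
/// Illustrative discounted-Beta tracker for one promoted tool.
/// Each round, both counts decay by `discount` before the new
/// outcome (used / idle) is added, so old evidence fades out.
struct DiscBeta {
    alpha: f64, // discounted count of rounds the tool was used
    beta: f64,  // discounted count of rounds it sat idle
    discount: f64,
}

impl DiscBeta {
    fn new(discount: f64) -> Self {
        Self { alpha: 1.0, beta: 1.0, discount } // uniform prior
    }

    fn observe(&mut self, used: bool) {
        self.alpha *= self.discount;
        self.beta *= self.discount;
        if used { self.alpha += 1.0 } else { self.beta += 1.0 }
    }

    /// Posterior mean probability that the tool is still in use.
    fn usage_probability(&self) -> f64 {
        self.alpha / (self.alpha + self.beta)
    }

    /// Re-defer (hide the schema behind `ToolSearch` again) once
    /// estimated usage drops below a threshold.
    fn should_redefer(&self, threshold: f64) -> bool {
        self.usage_probability() < threshold
    }
}

fn main() {
    let mut t = DiscBeta::new(0.9);
    t.observe(true); // promoted and used once...
    for _ in 0..8 {
        t.observe(false); // ...then idle for eight rounds
    }
    println!("p(use) = {:.3}", t.usage_probability());
    println!("re-defer at 0.2? {}", t.should_redefer(0.2));
}
```

The discount is what distinguishes this from a plain Beta update: a tool that was heavily used early but has gone idle sees its usage estimate decay quickly, so its schema can leave the context window again.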

Custom interception hooks should use `ToolGateHook` via `PluginRegistrar::register_tool_gate_hook()`. `BeforeToolExecute` is reserved for execution-time hooks that run only when a tool is actually about to execute.
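
A gate hook's job is a pure decision over a proposed tool call, with no side effects. The sketch below is not the real `ToolGateHook` trait — its actual signature lives in the runtime — it only illustrates the firewall-style Deny/Allow/Ask shape the permission plugin uses, with a trailing-`*` prefix glob standing in for the real pattern matcher:

```rust
#[derive(Debug, Clone, PartialEq)]
enum GateDecision {
    Allow,
    Deny,
    Ask, // suspend the run and wait for a HITL decision via the mailbox
}

/// First-match-wins rule list, like a firewall.
struct Rule {
    pattern: &'static str, // exact tool name, or a trailing-`*` prefix glob
    decision: GateDecision,
}

fn pattern_matches(pattern: &str, tool: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => tool.starts_with(prefix),
        None => pattern == tool,
    }
}

fn gate(rules: &[Rule], tool: &str) -> GateDecision {
    rules
        .iter()
        .find(|r| pattern_matches(r.pattern, tool))
        .map(|r| r.decision.clone())
        .unwrap_or(GateDecision::Deny) // default-deny when nothing matches
}

fn main() {
    let rules = [
        Rule { pattern: "shell_*", decision: GateDecision::Ask },
        Rule { pattern: "delete_db", decision: GateDecision::Deny },
        Rule { pattern: "*", decision: GateDecision::Allow },
    ];
    println!("{:?}", gate(&rules, "shell_exec")); // first rule wins
    println!("{:?}", gate(&rules, "echo"));       // falls through to `*`
}
```

Because the decision is pure, the runtime can evaluate it in the `ToolGate` phase before any execution-time hook runs — which is exactly why interception belongs here rather than in `BeforeToolExecute`.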

## Why Awaken

- One backend serves multiple protocols: AI SDK v6, AG-UI, A2A, MCP, plus native HTTP/SSE routes.
- Configuration is the control plane: model/provider routing, prompts, reminders, permissions, and tool-loading policy use schema-backed config that can be validated and applied at runtime.
- The LLM orchestrates: define the agent identity, model binding, and tool access; no hand-coded DAG is required.
- Runtime-managed configuration updates agents, model bindings, providers, and MCP servers through the Config API or Admin Console.
- Plugin extension points are typed: 9 lifecycle phases, `PhaseHook`, `ToolGateHook`, scheduled actions, effects, request transforms, and plugin-provided tools.
- State is type-safe: `StateKey` binds each key to Rust value/update types, scopes it to run/thread/profile, and applies declared merge strategies before commit.
- Operational surfaces are built in: LLM retry/backoff, per-model circuit breaker, request timeout, graceful shutdown, Prometheus metrics, health probes, and mailbox retry/backoff.
- Zero `unsafe` — the entire workspace forbids `unsafe` and relies on the Rust compiler for memory safety.
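
The typed-state bullet deserves a concrete shape. The sketch below is not Awaken's actual `StateKey` API — the names and signatures here are invented for illustration, and it omits scopes and merge strategies — but it shows the core trick: a key whose type parameter fixes the stored value type, so reads and writes are checked at compile time:

```rust
use std::any::Any;
use std::collections::HashMap;
use std::marker::PhantomData;

/// A key whose type parameter pins the value type stored under it.
struct Key<T> {
    name: &'static str,
    _marker: PhantomData<T>,
}

impl<T: 'static> Key<T> {
    const fn new(name: &'static str) -> Self {
        Self { name, _marker: PhantomData }
    }
}

#[derive(Default)]
struct State {
    slots: HashMap<&'static str, Box<dyn Any>>,
}

impl State {
    fn set<T: 'static>(&mut self, key: &Key<T>, value: T) {
        self.slots.insert(key.name, Box::new(value));
    }

    fn get<T: 'static>(&self, key: &Key<T>) -> Option<&T> {
        // The downcast cannot fail for values written through `set`,
        // because the key's type parameter pins the stored type.
        self.slots.get(key.name)?.downcast_ref::<T>()
    }
}

fn main() {
    const TURN_COUNT: Key<u32> = Key::new("turn_count");
    let mut state = State::default();
    state.set(&TURN_COUNT, 3);
    // `state.get(&TURN_COUNT)` is Option<&u32>; reading the slot as a
    // String would be a compile error, not a runtime surprise.
    println!("turns: {:?}", state.get(&TURN_COUNT));
}
```

A mismatched read simply does not type-check, which is the compile-time safety the bullet refers to; Awaken layers scoping (run/thread/profile) and declared merge strategies on top of this idea.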

## When to use Awaken

- You want a **Rust backend** for AI agents with compile-time safety
- You need to serve **multiple frontend or agent protocols** from one backend
- Your tools need to **safely share state** during concurrent execution
- You need **auditable thread history**, checkpoints, and resumable control paths
- You are comfortable wiring your own tools, providers, and model registry instead of relying on batteries-included defaults

## When NOT to use Awaken

- You need **built-in file/shell/web tools** out of the box — consider OpenAI Agents SDK, Dify, or CrewAI
- You want a **visual workflow builder** — consider Dify, LangGraph Studio
- You want **Python** and rapid prototyping — consider LangGraph, AG2, PydanticAI
- You need a **stable, slow-moving surface area** more than an evolving runtime platform
- You need **LLM-managed memory** (agent decides what to remember) — consider Letta

## Architecture

Awaken is split into three runtime layers. `awaken-contract` defines the shared contracts: agent specs, model/provider specs, tools, events, transport traits, and the typed state model. `awaken-runtime` resolves an `AgentSpec` into `ResolvedExecution`: local agents become a `ResolvedAgent` with an `ExecutionEnv` built from plugins, while endpoint-backed agents run through an `ExecutionBackend`. It also executes the phase loop and manages active runs plus external control such as cancellation and HITL decisions. `awaken-server` exposes that same runtime through HTTP routes, SSE replay, mailbox-backed background execution, and protocol adapters for AI SDK v6, AG-UI, A2A, and MCP.

Around those layers sit storage and extensions. `awaken-stores` provides memory, file, and PostgreSQL persistence for threads and runs; memory, file, and PostgreSQL config stores; memory and SQLite mailbox stores; and memory/file profile stores. `awaken-ext-*` crates extend the runtime at phase and tool boundaries.

```text
awaken                   Facade crate with feature flags
├─ awaken-contract       Contracts: specs, tools, events, transport, state model
├─ awaken-runtime        Resolver, phase engine, loop runner, runtime control
├─ awaken-server         Routes, mailbox, SSE transport, protocol adapters
├─ awaken-stores         Memory, file, PostgreSQL, and SQLite-backed stores
├─ awaken-tool-pattern   Glob/regex matching used by extensions
└─ awaken-ext-*          Optional runtime extensions
```

## Examples and learning paths

| Example | What it shows |
|---|---|
| [`live_test`](./crates/awaken/examples/live_test.rs) | Basic LLM integration |
| [`multi_turn`](./crates/awaken/examples/multi_turn.rs) | Multi-turn with persistent threads |
| [`tool_call_live`](./crates/awaken/examples/tool_call_live.rs) | Tool calling with calculator |
| [`ai-sdk-starter`](./examples/ai-sdk-starter/) | React + AI SDK v6 full-stack |
| [`copilotkit-starter`](./examples/copilotkit-starter/) | Next.js + CopilotKit full-stack |
| [`openui-chat`](./examples/openui-chat/) | OpenUI Lang chat frontend |
| [`admin-console`](./apps/admin-console/) | Config API management UI |

```bash
export OPENAI_API_KEY=<your-key>
cargo run --package awaken-agent --example multi_turn

cd examples/ai-sdk-starter && npm install && npm run dev

# Terminal 1: starter backend for admin console
AWAKEN_STORAGE_DIR=./target/admin-sessions cargo run -p ai-sdk-starter-agent

# Terminal 2: admin console
npm --prefix apps/admin-console install
npm --prefix apps/admin-console run dev
```

| Goal | Start with | Then |
|---|---|---|
| Build your first agent | [Get Started](https://awakenworks.github.io/awaken/get-started.html) | [Build Agents](https://awakenworks.github.io/awaken/build-agents.html) |
| See a full-stack app | [AI SDK starter](./examples/ai-sdk-starter/) | [CopilotKit starter](./examples/copilotkit-starter/) |
| Manage runtime config | [Admin Console](./apps/admin-console/) | [Configure Agent Behavior](https://awakenworks.github.io/awaken/how-to/configure-agent-behavior.html) |
| Explore the API | [Reference docs](https://awakenworks.github.io/awaken/reference/overview.html) | `cargo doc --workspace --no-deps --open` |
| Understand the runtime | [Architecture](https://awakenworks.github.io/awaken/explanation/architecture.html) | [Run Lifecycle and Phases](https://awakenworks.github.io/awaken/explanation/run-lifecycle-and-phases.html) |
| Migrate from tirea | [Migration guide](https://awakenworks.github.io/awaken/appendix/migration-from-tirea.html) | |

## Contributing

See [CONTRIBUTING.md](./CONTRIBUTING.md) and [DEVELOPMENT.md](./DEVELOPMENT.md) for setup details.

[Good first issues](https://github.com/AwakenWorks/awaken/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) are a great entry point. Quick contribution flow: fork → create a branch → write tests → open a PR.

Areas where contributions are especially welcome:

- Additional mailbox, config, and storage backends beyond the built-in memory/file/PostgreSQL/SQLite options
- Built-in tool implementations (file read/write, web search)
- Token cost tracking and budget enforcement
- Model fallback/degradation chains

Join the conversation on [GitHub Discussions](https://github.com/AwakenWorks/awaken/discussions).

---

Awaken is a ground-up rewrite of [tirea](../../tree/tirea-0.5); it is not backwards-compatible. The tirea 0.5 codebase is archived on the [`tirea-0.5`](../../tree/tirea-0.5) branch.

## License

Dual-licensed under [MIT](./LICENSE-MIT) or [Apache-2.0](./LICENSE-APACHE).