symbi 1.12.0

AI-native agent framework for building autonomous, policy-aware agents that can safely collaborate with humans, other agents, and large language models
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
# Reasoning Loop Guide


Complete guide to the Symbiont agentic reasoning loop: a typestate-enforced Observe-Reason-Gate-Act (ORGA) cycle for autonomous agent behavior.



---

## Overview

The reasoning loop is the core execution engine for autonomous agents in Symbiont. It drives a multi-turn conversation between an LLM, a policy gate, and external tools through a structured cycle:

1. **Observe** — Collect results from previous tool executions
2. **Reason** — LLM produces proposed actions (tool calls or text responses)
3. **Gate** — Policy engine evaluates each proposed action
4. **Act** — Approved actions are dispatched to tool executors

The loop continues until the LLM produces a final text response, hits iteration/token limits, or times out.

### Design Principles

- **Compile-time safety**: Invalid phase transitions are caught at compile time via Rust's type system
- **Opt-in complexity**: The loop works with just a provider and policy gate; knowledge bridge, Cedar policies, and human-in-the-loop are all optional
- **Backward compatible**: Adding new features (like the knowledge bridge) never breaks existing code
- **Observable**: Every phase emits journal events and tracing spans

---

## Quick Start

### Minimal Example

```rust
use std::sync::Arc;
use symbi_runtime::reasoning::circuit_breaker::CircuitBreakerRegistry;
use symbi_runtime::reasoning::context_manager::DefaultContextManager;
use symbi_runtime::reasoning::conversation::{Conversation, ConversationMessage};
use symbi_runtime::reasoning::executor::DefaultActionExecutor;
use symbi_runtime::reasoning::loop_types::{BufferedJournal, LoopConfig};
use symbi_runtime::reasoning::policy_bridge::DefaultPolicyGate;
use symbi_runtime::reasoning::reasoning_loop::ReasoningLoopRunner;
use symbi_runtime::types::AgentId;

// Set up the runner with default components
let runner = ReasoningLoopRunner {
    provider: Arc::new(my_inference_provider),
    policy_gate: Arc::new(DefaultPolicyGate::permissive()),
    executor: Arc::new(DefaultActionExecutor::default()),
    context_manager: Arc::new(DefaultContextManager::default()),
    circuit_breakers: Arc::new(CircuitBreakerRegistry::default()),
    journal: Arc::new(BufferedJournal::new(1000)),
    knowledge_bridge: None,
};

// Build a conversation
let mut conv = Conversation::with_system("You are a helpful assistant.");
conv.push(ConversationMessage::user("What is 6 * 7?"));

// Run the loop
let result = runner.run(AgentId::new(), conv, LoopConfig::default()).await;

println!("Output: {}", result.output);
println!("Iterations: {}", result.iterations);
println!("Tokens used: {}", result.total_usage.total_tokens);
```

### With Tool Definitions

```rust
use symbi_runtime::reasoning::inference::ToolDefinition;

let config = LoopConfig {
    max_iterations: 10,
    tool_definitions: vec![
        ToolDefinition {
            name: "web_search".into(),
            description: "Search the web for information".into(),
            parameters: serde_json::json!({
                "type": "object",
                "properties": {
                    "query": { "type": "string" }
                },
                "required": ["query"]
            }),
        },
    ],
    ..Default::default()
};

let result = runner.run(agent_id, conv, config).await;
```

---

## Phase System

### Typestate Pattern

The loop uses Rust's type system to enforce valid phase transitions at compile time. Each phase is a zero-sized type marker:

```rust
pub struct Reasoning;      // LLM produces proposed actions
pub struct PolicyCheck;    // Each action evaluated by the gate
pub struct ToolDispatching; // Approved actions executed
pub struct Observing;      // Results collected for next iteration
```

The `AgentLoop<Phase>` struct carries the loop state and can only call methods appropriate to its current phase. For example, `AgentLoop<Reasoning>` only exposes `produce_output()`, which consumes self and returns `AgentLoop<PolicyCheck>`.

This means the following mistakes are **compile errors**, not runtime bugs:
- Skipping the policy check
- Dispatching tools without reasoning first
- Observing results without dispatching

### Phase Flow

```mermaid
graph TD
    R["Reasoning Phase\nproduce_output"]
    P["Policy Check Phase\ncheck_policy"]
    T["Tool Dispatching Phase\ndispatch_tools"]
    O["Observing Phase\nobserve_results"]
    LR["LoopResult"]

    R --> P
    P --> T
    T --> O
    O -->|Continue| R
    O -->|Complete| LR
```

**Dynamic branching within the typestate:** The LLM's output during the Reasoning phase may indicate continuation (more tool calls needed) or completion (final text response). The `observe_results()` method returns a `LoopContinuation` enum — either `Continue(AgentLoop<Reasoning>)` to re-enter the reasoning phase, or `Complete(LoopResult)` to terminate. This enum is the only point where the loop branches dynamically; all other phase transitions are strictly linear and enforced at compile time. No dynamic dispatch or trait objects are used — the branching is a standard Rust `match` on a concrete enum.

---

## Inference Providers

The `InferenceProvider` trait abstracts over LLM backends:

```rust
#[async_trait]
pub trait InferenceProvider: Send + Sync {
    async fn complete(
        &self,
        conversation: &Conversation,
        options: &InferenceOptions,
    ) -> Result<InferenceResponse, InferenceError>;

    fn provider_name(&self) -> &str;
    fn default_model(&self) -> &str;
    fn supports_native_tools(&self) -> bool;
    fn supports_structured_output(&self) -> bool;
}
```

### Cloud Provider (OpenRouter)

The `CloudInferenceProvider` connects to OpenRouter (or any OpenAI-compatible endpoint):

```bash
export OPENROUTER_API_KEY="sk-or-..."
export OPENROUTER_MODEL="google/gemini-2.0-flash-001"  # optional
```

```rust
use symbi_runtime::reasoning::providers::cloud::CloudInferenceProvider;

let provider = CloudInferenceProvider::from_env()
    .expect("OPENROUTER_API_KEY must be set");
```

---

## Policy Gate

Every proposed action passes through the policy gate before execution:

```rust
#[async_trait]
pub trait ReasoningPolicyGate: Send + Sync {
    async fn evaluate_action(
        &self,
        agent_id: &AgentId,
        action: &ProposedAction,
        state: &LoopState,
    ) -> LoopDecision;
}

pub enum LoopDecision {
    Allow,
    Deny { reason: String },
    Modify { modified_action: Box<ProposedAction>, reason: String },
}
```

### Built-in Gates

- **`DefaultPolicyGate::permissive()`** — Allows all actions (development/testing)
- **`DefaultPolicyGate::new()`** — Default policy rules
- **`OpaPolicyGateBridge`** — Bridges to the OPA-based policy engine
- **`CedarGate`** — Cedar policy language integration

### Policy Denial Feedback

When an action is denied, the denial reason is fed back to the LLM as a policy feedback observation, allowing it to adjust its approach on the next iteration.

---

## Action Execution

### ActionExecutor Trait

```rust
#[async_trait]
pub trait ActionExecutor: Send + Sync {
    async fn execute_actions(
        &self,
        actions: &[ProposedAction],
        config: &LoopConfig,
        circuit_breakers: &CircuitBreakerRegistry,
    ) -> Vec<Observation>;
}
```

### Built-in Executors

| Executor | Description |
|----------|-------------|
| `DefaultActionExecutor` | Parallel dispatch with per-tool timeouts |
| `EnforcedActionExecutor` | Delegates through `ToolInvocationEnforcer` → MCP pipeline |
| `KnowledgeAwareExecutor` | Intercepts knowledge tools, delegates rest to inner executor |

### Circuit Breakers

Each tool has an associated circuit breaker that tracks failures:

- **Closed** (normal): Tool calls proceed normally
- **Open** (tripped): Too many consecutive failures; calls rejected immediately
- **Half-open** (probing): Limited calls allowed to test recovery

```rust
let circuit_breakers = CircuitBreakerRegistry::new(CircuitBreakerConfig {
    failure_threshold: 3,
    recovery_timeout: Duration::from_secs(60),
    half_open_max_calls: 1,
});
```

---

## Knowledge-Reasoning Bridge

The `KnowledgeBridge` connects the agent's knowledge store (hierarchical memory, knowledge base, vector search) to the reasoning loop.

### Setup

```rust
use symbi_runtime::reasoning::knowledge_bridge::{KnowledgeBridge, KnowledgeConfig};

let bridge = Arc::new(KnowledgeBridge::new(
    context_manager.clone(),  // Arc<dyn context::ContextManager>
    KnowledgeConfig {
        max_context_items: 5,
        relevance_threshold: 0.3,
        auto_persist: true,
    },
));

let runner = ReasoningLoopRunner {
    // ... other fields ...
    knowledge_bridge: Some(bridge),
};
```

### How It Works

**Before each reasoning step:**
1. Search terms are extracted from recent user/tool messages
2. `query_context()` and `search_knowledge()` retrieve relevant items
3. Results are formatted and injected as a system message (replacing the previous injection)

**During tool dispatch:**
The `KnowledgeAwareExecutor` intercepts two special tools:

- **`recall_knowledge`** — Searches the knowledge base and returns formatted results
  ```json
  { "query": "capital of France", "limit": 5 }
  ```

- **`store_knowledge`** — Stores a new fact as a subject-predicate-object triple
  ```json
  { "subject": "Earth", "predicate": "has", "object": "one moon", "confidence": 0.95 }
  ```

All other tool calls are delegated to the inner executor unchanged. This includes [ToolClad](/toolclad) tools, which are validated against their manifest contracts before execution.

**After loop completion:**
If `auto_persist` is enabled, the bridge extracts assistant responses and stores them as working memory for future conversations.

### Backward Compatibility

Setting `knowledge_bridge: None` makes the runner behave identically to before — no context injection, no knowledge tools, no persistence.

---

## Conversation Management

### Conversation Type

`Conversation` manages an ordered sequence of messages with serialization to both OpenAI and Anthropic API formats:

```rust
let mut conv = Conversation::with_system("You are a helpful assistant.");
conv.push(ConversationMessage::user("Hello"));
conv.push(ConversationMessage::assistant("Hi there!"));

// Serialize for API calls
let openai_msgs = conv.to_openai_messages();
let (system, anthropic_msgs) = conv.to_anthropic_messages();
```

### Token Budget Enforcement

The in-loop `ContextManager` (not to be confused with the knowledge `ContextManager`) manages the conversation token budget:

- **Sliding Window**: Remove oldest messages first
- **Observation Masking**: Hide verbose tool results
- **Anchored Summary**: Keep system message + N recent messages

---

## Durable Journal

Every phase transition emits a `JournalEntry` to the configured `JournalWriter`:

```rust
pub struct JournalEntry {
    pub sequence: u64,
    pub timestamp: DateTime<Utc>,
    pub agent_id: AgentId,
    pub iteration: u32,
    pub event: LoopEvent,
}

pub enum LoopEvent {
    Started { agent_id, config },
    ReasoningComplete { iteration, actions, usage },
    PolicyEvaluated { iteration, action_count, denied_count },
    ToolsDispatched { iteration, tool_count, duration },
    ObservationsCollected { iteration, observation_count },
    Terminated { reason, iterations, total_usage, duration },
    RecoveryTriggered { iteration, tool_name, strategy, error },
}
```

The default `BufferedJournal` stores entries in memory. Production deployments can implement `JournalWriter` for persistent storage.

---

## Configuration

### LoopConfig

```rust
pub struct LoopConfig {
    pub max_iterations: u32,        // Default: 25
    pub max_total_tokens: u32,      // Default: 100,000
    pub timeout: Duration,          // Default: 5 minutes
    pub default_recovery: RecoveryStrategy,
    pub tool_timeout: Duration,     // Default: 30 seconds
    pub max_concurrent_tools: usize, // Default: 5
    pub context_token_budget: usize, // Default: 32,000
    pub tool_definitions: Vec<ToolDefinition>,
}
```

### Recovery Strategies

When tool execution fails, the loop can apply different recovery strategies:

| Strategy | Description |
|----------|-------------|
| `Retry` | Retry with exponential backoff |
| `Fallback` | Try alternative tools |
| `CachedResult` | Use a cached result if fresh enough |
| `LlmRecovery` | Ask the LLM to find an alternative approach |
| `Escalate` | Route to a human operator queue |
| `DeadLetter` | Give up and log the failure |

---

## Testing

### Unit Tests (No API Key Required)

```bash
cargo test -j2 -p symbi-runtime --lib -- reasoning::knowledge
```

### Integration Tests with Mock Provider

```bash
cargo test -j2 -p symbi-runtime --test knowledge_reasoning_tests
```

### Live Tests with Real LLM

```bash
OPENROUTER_API_KEY="sk-or-..." OPENROUTER_MODEL="google/gemini-2.0-flash-001" \
  cargo test -j2 -p symbi-runtime --features http-input --test reasoning_live_tests -- --nocapture
```

---

## Implementation Phases

The reasoning loop was built in five phases, each adding capabilities:

| Phase | Focus | Key Components |
|-------|-------|----------------|
| **1** | Core loop | `conversation`, `inference`, `phases`, `reasoning_loop` |
| **2** | Resilience | `circuit_breaker`, `executor`, `context_manager`, `policy_bridge` |
| **3** | DSL integration | `human_critic`, `pipeline_config`, REPL builtins |
| **4** | Multi-agent | `agent_registry`, `critic_audit`, `saga` |
| **5** | Observability | `cedar_gate`, `journal`, `metrics`, `scheduler`, `tracing_spans` |
| **Bridge** | Knowledge | `knowledge_bridge`, `knowledge_executor` |
| **orga-adaptive** | Advanced | `tool_profile`, `progress_tracker`, `pre_hydrate`, extended `knowledge_bridge` |

---

## Advanced Primitives (orga-adaptive)

The `orga-adaptive` feature gate adds four advanced capabilities. See the [full guide](orga-adaptive.md) for details.

| Primitive | Purpose |
|-----------|---------|
| **Tool Profile** | Glob-based filtering of tools visible to the LLM |
| **Progress Tracker** | Per-step reattempt limits with stuck-loop detection |
| **Pre-Hydration** | Deterministic context pre-fetch from task input references |
| **Scoped Conventions** | Directory-aware convention retrieval via `recall_knowledge` |

```rust
let config = LoopConfig {
    tool_profile: Some(ToolProfile::include_only(&["search_*", "file_*"])),
    pre_hydration: Some(PreHydrationConfig::default()),
    ..Default::default()
};
```

---

## Next Steps

- **[Runtime Architecture](runtime-architecture.md)** — Full system architecture overview
- **[Security Model](security-model.md)** — Policy enforcement and audit trails
- **[DSL Guide](dsl-guide.md)** — Agent definition language
- **[API Reference](api-reference.md)** — Complete API documentation