# The Agent Loop
The agent loop is the core of phi-core. It implements the fundamental cycle:
```
User prompt → LLM call → Tool execution → LLM call → ... → Final response
```
The `agent_loop` module contains the core loop logic in `mod.rs` and the `evaluation` sub-module for evaluational parallelism strategies.
## How It Works
```
┌──────────────────────────────────────────────┐
│  agent_loop()                                │
│                                              │
│  1. Add prompts to context                   │
│  2. Emit AgentStart + TurnStart              │
│                                              │
│  ┌─────────── Inner Loop ──────────────┐     │
│  │ • Check steering messages           │     │
│  │ • Check execution limits            │     │
│  │ • Compact context (if configured)   │     │
│  │ • Stream LLM response               │     │
│  │ • Extract tool calls                │     │
│  │ • Execute tools (with steering)     │     │
│  │ • Emit TurnEnd                      │     │
│  │ • Continue if tool_calls or steer   │     │
│  └─────────────────────────────────────┘     │
│                                              │
│  3. Check follow-up messages                 │
│  4. If follow-ups exist, loop again          │
│  5. Emit AgentEnd                            │
└──────────────────────────────────────────────┘
```
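The nesting of the two loops can be sketched with a small synchronous model. This is an illustration only, not phi-core's implementation: `Event` and `run_agent` are hypothetical stand-ins, the model is scripted as a queue of tool-call counts, and steering, limits, and compaction are left out.

```rust
use std::collections::VecDeque;

// Hypothetical stand-in for the events named in the diagram.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    AgentStart,
    TurnStart,
    TurnEnd,
    AgentEnd,
}

/// `responses` scripts the model: each entry is the number of tool calls the
/// next LLM response requests. `follow_ups` holds queued follow-up prompts.
fn run_agent(mut responses: VecDeque<usize>, mut follow_ups: VecDeque<&str>) -> Vec<Event> {
    let mut events = vec![Event::AgentStart];
    loop {
        // Inner loop: keep taking turns while the model asks for tools.
        loop {
            events.push(Event::TurnStart);
            let tool_calls = responses.pop_front().unwrap_or(0);
            // Tool execution would happen here.
            events.push(Event::TurnEnd);
            if tool_calls == 0 {
                break;
            }
        }
        // Outer loop: a queued follow-up starts another round of turns.
        if follow_ups.pop_front().is_none() {
            break;
        }
    }
    events.push(Event::AgentEnd);
    events
}
```

In this model a response with tool calls always leads to another turn, and the agent only reaches `AgentEnd` once both the tool calls and the follow-up queue are exhausted.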
## Entry Points
### `agent_loop()`
Starts a new agent run with prompt messages:
```rust
pub async fn agent_loop(
    prompts: Vec<AgentMessage>,
    context: &mut AgentContext,
    config: &AgentLoopConfig,
    tx: mpsc::UnboundedSender<AgentEvent>,
    cancel: CancellationToken,
) -> Vec<AgentMessage>
```
The prompts are added to context, then the loop runs. Returns all new messages generated during the run.
### `agent_loop_continue()`
Resumes from existing context (e.g., after an error, retry, or branch):
```rust
pub async fn agent_loop_continue(
    context: &mut AgentContext,
    config: &AgentLoopConfig,
    tx: mpsc::UnboundedSender<AgentEvent>,
    cancel: CancellationToken,
) -> Vec<AgentMessage>
```
**Preconditions:** `context.agent_id` and `context.session_id` must be `Some` — the function panics with a descriptive message otherwise. In practice, any context that passed through `agent_loop()` at least once already has these set. When constructing a context manually (e.g., from a persisted snapshot), set them explicitly before calling this function.
The last message in context must also **not** be an assistant message.
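
When restoring a context from a snapshot, it can be convenient to validate these preconditions up front instead of hitting the panic. The sketch below shows one way to do that. `Role`, `Context`, and `check_continue_preconditions` are hypothetical stand-ins, not phi-core types, and the error strings are illustrative.

```rust
// Hypothetical, simplified stand-ins for the real context and message types.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Assistant,
    ToolResult,
}

struct Context {
    agent_id: Option<String>,
    session_id: Option<String>,
    messages: Vec<Role>,
}

/// Mirrors the documented preconditions as a recoverable check.
fn check_continue_preconditions(ctx: &Context) -> Result<(), String> {
    if ctx.agent_id.is_none() {
        return Err("agent_id must be set before continuing".to_string());
    }
    if ctx.session_id.is_none() {
        return Err("session_id must be set before continuing".to_string());
    }
    if ctx.messages.last() == Some(&Role::Assistant) {
        return Err("last message must not be an assistant message".to_string());
    }
    Ok(())
}
```

A caller could run a check like this right after deserializing a snapshot and fill in the missing ids before resuming.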
## AgentLoopConfig
```rust
pub struct AgentLoopConfig {
    /// REQUIRED: complete provider identity (model id, api_key, base_url, protocol, cost rates).
    pub model_config: ModelConfig,
    /// Optional override: bypasses ProviderRegistry, used for MockProvider in tests.
    pub provider_override: Option<Arc<dyn StreamProvider>>,
    pub config_id: Option<String>,
    pub thinking_level: ThinkingLevel,
    pub max_tokens: Option<u32>,
    pub temperature: Option<f32>,
    pub convert_to_llm: Option<ConvertToLlmFn>,
    pub transform_context: Option<TransformContextFn>,
    pub get_steering_messages: Option<GetMessagesFn>,
    pub get_follow_up_messages: Option<GetMessagesFn>,
    pub context_config: Option<ContextConfig>,
    pub execution_limits: Option<ExecutionLimits>,
    pub cache_config: CacheConfig,
    pub tool_execution: ToolExecutionStrategy,
    pub retry_config: RetryConfig,
    pub before_loop: Option<BeforeLoopFn>,
    pub after_loop: Option<AfterLoopFn>,
    pub before_turn: Option<BeforeTurnFn>,
    pub after_turn: Option<AfterTurnFn>,
    pub on_error: Option<OnErrorFn>,
    pub before_tool_execution: Option<BeforeToolExecutionFn>,
    pub after_tool_execution: Option<AfterToolExecutionFn>,
    pub before_tool_execution_update: Option<BeforeToolExecutionUpdateFn>,
    pub after_tool_execution_update: Option<AfterToolExecutionUpdateFn>,
    pub before_compaction_start: Option<BeforeCompactionStartFn>,
    pub after_compaction_end: Option<AfterCompactionEndFn>,
    pub input_filters: Vec<Arc<dyn InputFilter>>,
    pub first_turn_trigger: TurnTrigger,
    pub context_translation: Option<Arc<dyn ContextTranslationStrategy>>,
    pub prun_pending: Option<Arc<Mutex<Vec<PrunRequest>>>>,
}
```
| Field | Description |
|-------|-------------|
| `model_config` | **Required.** Complete provider identity: model id, api_key, base_url, api protocol, cost rates, compat flags. The provider is resolved from `model_config.api` via `ProviderRegistry`. |
| `provider_override` | Custom `Arc<dyn StreamProvider>` — bypasses registry when `Some`. Used for `MockProvider` in tests or fully custom backends. |
| `config_id` | Optional stable identity for this config; auto-derived as `"{provider_id}.{model_slug}[.thinking]"` when `None`. Used as the middle segment of `loop_id`. |
| `thinking_level` | `Off`, `Minimal`, `Low`, `Medium`, `High` |
| `convert_to_llm` | Custom `AgentMessage[] → Message[]` conversion |
| `transform_context` | Pre-processing hook for context pruning |
| `get_steering_messages` | Returns user interruptions during tool execution |
| `get_follow_up_messages` | Returns queued work after agent would stop |
| `context_config` | Token budget and compaction settings |
| `execution_limits` | Max turns, tokens, duration |
| `cache_config` | Prompt caching behavior (see [Prompt Caching](prompt-caching.md)) |
| `tool_execution` | Parallel, Sequential, or Batched (see [Tools](tools.md#execution-strategies)) |
| `retry_config` | Retry behavior for transient errors (see [Retry](retry.md)) |
| `before_loop` | Called once before `AgentStart`; return `false` to abort the entire run (see [Callbacks](callbacks.md)) |
| `after_loop` | Called once after `AgentEnd` with all new messages and accumulated usage (see [Callbacks](callbacks.md)) |
| `before_turn` | Called before each LLM call; return `false` to abort (see [Callbacks](callbacks.md)) |
| `after_turn` | Called after each turn with messages and usage (see [Callbacks](callbacks.md)) |
| `on_error` | Called on `StopReason::Error` with the error string (see [Callbacks](callbacks.md)) |
| `before_tool_execution` | Called before each tool call; return `false` to skip it (see [Callbacks](callbacks.md)) |
| `after_tool_execution` | Called after each tool call completes (see [Callbacks](callbacks.md)) |
| `before_tool_execution_update` | Called before each streaming tool update; return `false` to suppress the event (see [Callbacks](callbacks.md)) |
| `after_tool_execution_update` | Called after each streaming tool update event (see [Callbacks](callbacks.md)) |
| `before_compaction_start` | Called before compaction starts with `(estimated_tokens, message_count)`; return `false` to skip compaction for this cycle (see [Callbacks](callbacks.md)) |
| `after_compaction_end` | Called after compaction completes with `(messages_before, messages_after, tokens_before, tokens_after)` (see [Callbacks](callbacks.md)) |
| `input_filters` | Input filters applied to user messages before the LLM call (see [Tools](tools.md)) |
| `first_turn_trigger` | The `TurnTrigger` for the first `TurnStart` event; defaults to `TurnTrigger::User`, set to `SubAgent` by sub-agent callers |
| `context_translation` | Optional `ContextTranslationStrategy` for cross-provider compatibility — translates content types (e.g., `Content::Thinking`) when targeting a different provider (G8) |
| `prun_pending` | Shared state for `PrunTool` to communicate pruning requests to the loop; set automatically by `with_prun_tool()` |
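
The auto-derived `config_id` format can be illustrated with a small sketch. The helper names are hypothetical, the rule that `.thinking` is appended whenever thinking is enabled is an assumption read off the bracketed suffix in the table, and how phi-core turns a model id into a slug is not specified here, so the slug is passed in as-is.

```rust
/// Builds "{provider_id}.{model_slug}" and appends ".thinking" when enabled.
/// Assumption: the bracketed suffix is present only when thinking is on.
fn derive_config_id(provider_id: &str, model_slug: &str, thinking_enabled: bool) -> String {
    let mut id = format!("{provider_id}.{model_slug}");
    if thinking_enabled {
        id.push_str(".thinking");
    }
    id
}

/// An explicit `config_id` wins; otherwise fall back to the derived one.
fn resolve_config_id(
    explicit: Option<&str>,
    provider_id: &str,
    model_slug: &str,
    thinking_enabled: bool,
) -> String {
    match explicit {
        Some(id) => id.to_string(),
        None => derive_config_id(provider_id, model_slug, thinking_enabled),
    }
}
```

Setting `config_id` explicitly is the safer choice when the identity has to stay stable across model or provider renames.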
## Steering & Follow-Ups
### Steering
**Steering messages** interrupt the agent between tool executions. When the agent is executing multiple tool calls from a single LLM response, steering is checked after each tool completes. If a steering message is found:
1. The current tool finishes normally
2. All remaining tool calls are **skipped** with `is_error: true` and "Skipped due to queued user message"
3. The steering message is injected into context
4. The loop continues with a new LLM call that sees the interruption
```rust
// While agent is running tools, redirect it:
agent.steer(AgentMessage::Llm(Message::user("Stop that. Instead, explain what you found.")));
```
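The skip behavior described in the steps above can be sketched as follows. This is a simplified synchronous model under stated assumptions: `ToolResult` and `execute_with_steering` are hypothetical stand-ins, tools are represented by name only, and the steering source is a plain closure.

```rust
// Hypothetical stand-in for a tool result record.
#[derive(Debug, Clone, PartialEq)]
struct ToolResult {
    name: String,
    content: String,
    is_error: bool,
}

/// Runs `calls` in order and polls for steering after each tool completes.
/// Once steering arrives, every remaining call is recorded as skipped.
fn execute_with_steering(
    calls: &[&str],
    mut poll_steering: impl FnMut() -> Vec<String>,
) -> (Vec<ToolResult>, Vec<String>) {
    let mut results = Vec::new();
    let mut steering: Vec<String> = Vec::new();
    for name in calls {
        if !steering.is_empty() {
            results.push(ToolResult {
                name: name.to_string(),
                content: "Skipped due to queued user message".to_string(),
                is_error: true,
            });
            continue;
        }
        // The current tool always finishes normally.
        results.push(ToolResult {
            name: name.to_string(),
            content: format!("ran {name}"),
            is_error: false,
        });
        steering = poll_steering();
    }
    (results, steering)
}
```

The returned steering messages are what the loop would inject into context before the next LLM call.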
### Follow-Ups
**Follow-up messages** are checked after the agent would normally stop (no more tool calls, no steering). If follow-ups exist, the loop continues with them as new input — the agent doesn't need to be re-prompted.
```rust
// Queue work for after the agent finishes its current task:
agent.follow_up(AgentMessage::Llm(Message::user("Now run the tests.")));
agent.follow_up(AgentMessage::Llm(Message::user("Then commit the changes.")));
```
### Queue Modes
Both queues support two delivery modes:

| Mode | Behavior |
|------|----------|
| `QueueMode::OneAtATime` | Delivers one message per turn (default) |
| `QueueMode::All` | Delivers all queued messages at once |
```rust
agent.set_steering_mode(QueueMode::All);
agent.set_follow_up_mode(QueueMode::OneAtATime);
```
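The difference between the two modes comes down to how much of the queue is drained per check. A minimal sketch, assuming a FIFO queue of strings; the `QueueMode` enum here is a local stand-in that mirrors the documented variants, and `take_messages` is a hypothetical helper.

```rust
use std::collections::VecDeque;

// Local stand-in mirroring the two documented variants.
#[derive(Debug, Clone, Copy, PartialEq)]
enum QueueMode {
    OneAtATime,
    All,
}

/// Removes and returns the messages that one check would deliver.
fn take_messages(queue: &mut VecDeque<String>, mode: QueueMode) -> Vec<String> {
    match mode {
        // At most one message, oldest first.
        QueueMode::OneAtATime => queue.pop_front().into_iter().collect(),
        // Everything currently queued, in order.
        QueueMode::All => queue.drain(..).collect(),
    }
}
```

With `OneAtATime`, three queued follow-ups produce three separate rounds; with `All`, the agent sees them together in a single round.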
### Queue Management
```rust
agent.clear_steering_queue(); // Drop all pending steers
agent.clear_follow_up_queue(); // Drop all pending follow-ups
agent.clear_all_queues(); // Drop everything
```
### Low-Level API
When using `agent_loop()` directly, steering and follow-ups are provided via callback functions:
```rust
let config = AgentLoopConfig {
    get_steering_messages: Some(Box::new(|| {
        // Return Vec<AgentMessage>; checked between tool calls
        vec![]
    })),
    get_follow_up_messages: Some(Box::new(|| {
        // Return Vec<AgentMessage>; checked when the agent would stop
        vec![]
    })),
    // ...
};
```
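In practice the callbacks usually read from a queue that another part of the program writes to. The sketch below shows one way to wire that up with a shared `Arc<Mutex<VecDeque<_>>>`. The `AgentMessage` struct and the `GetMessagesFn` alias are local stand-ins; the alias's exact shape (including the `Send + Sync` bounds) is an assumption based on the closure form shown above.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Local stand-in for the real message type.
#[derive(Debug, Clone, PartialEq)]
struct AgentMessage(String);

// Assumed shape of the callback type: a boxed, argument-free closure.
type GetMessagesFn = Box<dyn Fn() -> Vec<AgentMessage> + Send + Sync>;

/// Builds a callback that drains everything currently in the shared queue.
fn queue_backed_callback(queue: Arc<Mutex<VecDeque<AgentMessage>>>) -> GetMessagesFn {
    Box::new(move || {
        let mut guard = queue.lock().expect("queue mutex poisoned");
        guard.drain(..).collect()
    })
}
```

The producer keeps its own clone of the `Arc` and pushes messages at any time; the loop picks them up the next time it invokes the callback.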
## Custom Compaction
By default, when context exceeds the token budget in `ContextConfig`, phi-core runs a 3-level compaction strategy: truncate tool outputs → summarize old turns → drop middle messages. This is the legacy in-memory path, implemented by `compact_messages()`. When a Session is available, the modern system uses non-destructive `CompactionBlock` overlays instead (see [compaction](compaction.md)). You can replace the default with your own `CompactionStrategy`.
> **`CompactionStrategy` vs `BlockCompactionStrategy`**
>
> - **`CompactionStrategy`** — Legacy in-memory approach. Destructive: it mutates
> the message list directly. Used when `AgentContext.session` is `None` (no
> session persistence).
> - **`BlockCompactionStrategy`** — New overlay approach. Non-destructive: it
> creates a `CompactionBlock` on the `LoopRecord` rather than altering the
> original messages. Used when `AgentContext.session` is `Some` (session-backed
> execution). Original messages remain authoritative for replay and branching.
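
To make the tiered idea concrete, here is a minimal sketch of escalating compaction on a toy message type. It is not `compact_messages()`: `Msg`, `Limits`, `estimate_tokens`, and `compact` are hypothetical, the token estimate is a crude four-characters-per-token heuristic, and the middle level (summarizing old turns) is omitted because it depends on how summaries are produced.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Msg {
    User(String),
    Assistant(String),
    ToolOutput(String),
}

impl Msg {
    fn text(&self) -> &str {
        match self {
            Msg::User(t) | Msg::Assistant(t) | Msg::ToolOutput(t) => t,
        }
    }
}

struct Limits {
    budget_tokens: usize,
    max_tool_chars: usize,
    keep_first: usize,
    keep_last: usize,
}

/// Crude size estimate: roughly four characters per token, plus one.
fn estimate_tokens(messages: &[Msg]) -> usize {
    messages.iter().map(|m| m.text().chars().count() / 4 + 1).sum()
}

/// Escalates only as far as needed to get under the budget.
fn compact(mut messages: Vec<Msg>, limits: &Limits) -> Vec<Msg> {
    if estimate_tokens(&messages) <= limits.budget_tokens {
        return messages;
    }
    // Level 1: truncate oversized tool outputs.
    for m in messages.iter_mut() {
        if let Msg::ToolOutput(t) = m {
            if t.chars().count() > limits.max_tool_chars {
                let shortened: String = t.chars().take(limits.max_tool_chars).collect();
                *t = shortened;
            }
        }
    }
    if estimate_tokens(&messages) <= limits.budget_tokens {
        return messages;
    }
    // Level 2 (summarize old turns) is intentionally left out of this sketch.
    // Level 3: drop the middle, keeping the first and last few messages.
    if messages.len() > limits.keep_first + limits.keep_last {
        let tail_start = messages.len() - limits.keep_last;
        messages.drain(limits.keep_first..tail_start);
    }
    messages
}
```

Cheaper, less lossy steps run first, and the sketch returns as soon as the estimate fits, so messages are only dropped when truncation alone is not enough.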
Example of a custom `CompactionStrategy`:
```rust
use phi_core::context::{CompactionStrategy, ContextConfig, CompactionConfig, compact_messages};
use phi_core::types::*;
use std::sync::Arc;

struct MyCompaction;

impl CompactionStrategy for MyCompaction {
    fn compact(
        &self,
        messages: Vec<AgentMessage>,
        config: &ContextConfig,
    ) -> Vec<AgentMessage> {
        // Your logic here, then optionally delegate to the default:
        compact_messages(messages, config)
    }
}

// Modern pattern: set strategies via ContextConfig.compaction
let context_config = ContextConfig {
    compaction: CompactionConfig {
        // in_memory_strategy: used when AgentContext.session is None (sub-agents, tests)
        in_memory_strategy: Some(Arc::new(MyCompaction)),
        // block_strategy: used when AgentContext.session is Some (session-backed execution)
        // block_strategy: Some(Arc::new(MyBlockCompaction)),
        ..CompactionConfig::default()
    },
    ..ContextConfig::default()
};

let agent = BasicAgent::new(model_config)
    .with_context_config(context_config);
```
The in-memory strategy is called once per turn, right before the LLM call, whenever `context_config` is `Some` and `AgentContext.session` is `None`. When `in_memory_strategy` is `None`, `DefaultCompaction` (which wraps `compact_messages()`) is used automatically. When a session is present, `block_strategy` is used instead (defaulting to `DefaultBlockCompaction`).
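
The selection rules in the paragraph above can be summarized as a small decision function. This is a sketch of the described behavior, not phi-core code: strategies are represented by name strings instead of `Arc<dyn ...>` values, and `CompactionConfig` and `select_strategy` here are local stand-ins.

```rust
// Local stand-in: names take the place of the real strategy objects.
#[derive(Default)]
struct CompactionConfig {
    in_memory_strategy: Option<String>,
    block_strategy: Option<String>,
}

/// Returns the name of the strategy that would run, or `None` when no
/// context configuration is present and compaction is skipped.
fn select_strategy(config: Option<&CompactionConfig>, has_session: bool) -> Option<String> {
    let config = config?;
    let chosen = if has_session {
        // Session-backed execution uses the block (overlay) strategy.
        config
            .block_strategy
            .clone()
            .unwrap_or_else(|| "DefaultBlockCompaction".to_string())
    } else {
        // Without a session, the in-memory strategy mutates the message list.
        config
            .in_memory_strategy
            .clone()
            .unwrap_or_else(|| "DefaultCompaction".to_string())
    };
    Some(chosen)
}
```

One consequence worth noting: a custom `in_memory_strategy` has no effect on session-backed runs unless a `block_strategy` is also provided.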
### Use Cases
**Memory-aware compaction** — Index messages into a vector store before they're dropped, so the agent can recall them later via a search tool:
```rust
struct MemoryAwareCompaction {
    memory: Arc<dyn MemoryStore>,
}

impl CompactionStrategy for MemoryAwareCompaction {
    fn compact(
        &self,
        messages: Vec<AgentMessage>,
        config: &ContextConfig,
    ) -> Vec<AgentMessage> {
        let compacted = compact_messages(messages.clone(), config);
        // Index what was dropped
        let dropped: Vec<_> = messages.iter()
            .filter(|m| !compacted.contains(m))
            .collect();
        if !dropped.is_empty() {
            self.memory.index(dropped);
        }
        compacted
    }
}
```
**Semantic pointer compaction** — Replace dropped messages with a marker so the agent knows context was lost:
```rust
struct SemanticPointerCompaction;

impl CompactionStrategy for SemanticPointerCompaction {
    fn compact(
        &self,
        messages: Vec<AgentMessage>,
        config: &ContextConfig,
    ) -> Vec<AgentMessage> {
        let compacted = compact_messages(messages.clone(), config);
        // saturating_sub avoids an underflow panic if the result is ever longer
        let dropped_count = messages.len().saturating_sub(compacted.len());
        if dropped_count == 0 {
            return compacted;
        }
        // Insert a marker after the first kept messages
        let mut result = compacted;
        let insert_at = config.compaction.keep_first_turns.min(result.len());
        result.insert(insert_at, AgentMessage::Extension(
            ExtensionMessage::new("compaction_marker", serde_json::json!({
                "dropped": dropped_count,
                "note": format!("{} earlier messages were compacted", dropped_count),
            }))
        ));
        result
    }
}
```
**Priority-preserving compaction** — Never drop messages containing important keywords:
```rust
struct PriorityPreservingCompaction {
    preserve_keywords: Vec<String>,
}

impl CompactionStrategy for PriorityPreservingCompaction {
    fn compact(
        &self,
        messages: Vec<AgentMessage>,
        config: &ContextConfig,
    ) -> Vec<AgentMessage> {
        // `is_priority` is a helper you define on this struct (not shown); it
        // should return true when a message matches any of `preserve_keywords`.
        let (priority, normal): (Vec<_>, Vec<_>) = messages.into_iter()
            .partition(|m| self.is_priority(m));
        let mut compacted = compact_messages(normal, config);
        // Re-insert priority messages so they are never dropped. Note that
        // appending moves them to the end, after the surviving normal messages.
        for msg in priority {
            compacted.push(msg);
        }
        compacted
    }
}
```
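Appending priority messages at the end changes the conversation order, which can matter when tool calls and their results must stay adjacent. One way to keep the original order is to tag each message with its index before partitioning and sort after merging. The sketch below shows that idea on plain strings; `compact_default` is a hypothetical stand-in for the default compaction (it simply keeps the last few items), and `compact_preserving` is likewise illustrative.

```rust
/// Stand-in for the default compaction: keeps only the last `keep_last` items.
fn compact_default<T>(items: Vec<T>, keep_last: usize) -> Vec<T> {
    let skip = items.len().saturating_sub(keep_last);
    items.into_iter().skip(skip).collect()
}

/// Compacts non-priority messages, then merges priority ones back in their
/// original positions relative to the survivors.
fn compact_preserving(messages: Vec<String>, keywords: &[&str], keep_last: usize) -> Vec<String> {
    // Tag each message with its original index before splitting.
    let (priority, normal): (Vec<(usize, String)>, Vec<(usize, String)>) = messages
        .into_iter()
        .enumerate()
        .partition(|(_, m)| keywords.iter().any(|k| m.contains(*k)));
    let mut kept = compact_default(normal, keep_last);
    kept.extend(priority);
    // Restore chronological order using the saved indices.
    kept.sort_by_key(|(i, _)| *i);
    kept.into_iter().map(|(_, m)| m).collect()
}
```

The same index-tagging approach should carry over to `Vec<AgentMessage>`, provided the delegated compaction step does not reorder or rewrite the messages it keeps.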
## Evaluational Parallelism
`agent_loop_parallel` runs the same prompt through multiple `AgentLoopConfig`s concurrently, evaluates the results with a pluggable `EvaluationStrategy`, and returns the winning branch. This is useful for multi-model comparison, A/B prompt testing, and selecting the best response among different reasoning approaches.
```rust
use phi_core::{agent_loop_parallel, PickFirstEvaluation, AgentContext, AgentLoopConfig};
use std::sync::Arc;
let result = agent_loop_parallel(
    prompts,
    base_context,            // cloned per branch; Arc tools shared
    vec![config_a, config_b],
    Arc::new(PickFirstEvaluation),
    tx,
    cancel,
).await;
// result.selected_context feeds directly into agent_loop_continue()
// result.selected_messages is the winning branch's output
```
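The "pluggable strategy" part can be pictured as a trait that receives all branch results and returns the index of the winner. The sketch below is a simplified, synchronous model with hypothetical names (`BranchResult`, `PickFirstSuccessful`, `PickLongest`, `pick_winner`); the trait here reuses the `EvaluationStrategy` name for readability, but its shape is an assumption and the real trait may differ.

```rust
// Hypothetical summary of one finished branch.
#[derive(Debug, Clone, PartialEq)]
struct BranchResult {
    config_id: String,
    final_text: String,
    failed: bool,
}

// Assumed shape of a pluggable evaluator.
trait EvaluationStrategy {
    /// Returns the index of the winning branch, or `None` if none qualifies.
    fn select(&self, branches: &[BranchResult]) -> Option<usize>;
}

/// Picks the first branch that did not fail.
struct PickFirstSuccessful;

impl EvaluationStrategy for PickFirstSuccessful {
    fn select(&self, branches: &[BranchResult]) -> Option<usize> {
        branches.iter().position(|b| !b.failed)
    }
}

/// Picks the successful branch with the longest final answer.
struct PickLongest;

impl EvaluationStrategy for PickLongest {
    fn select(&self, branches: &[BranchResult]) -> Option<usize> {
        branches
            .iter()
            .enumerate()
            .filter(|(_, b)| !b.failed)
            .max_by_key(|(_, b)| b.final_text.len())
            .map(|(i, _)| i)
    }
}

/// Applies any strategy and returns the winning branch, if there is one.
fn pick_winner<'a>(
    strategy: &dyn EvaluationStrategy,
    branches: &'a [BranchResult],
) -> Option<&'a BranchResult> {
    strategy.select(branches).and_then(|i| branches.get(i))
}
```

Because the caller only depends on the trait, swapping a simple heuristic for something heavier such as an LLM judge does not change the calling code.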
See [Evaluational Parallelism](./evaluational-parallelism.md) for the full guide including built-in strategies, the LLM judge, and session continuity.