# baml-agent

Shared Rust crate for building BAML-powered SGR (Schema-Guided Reasoning) agents. Reusable across multiple agent projects — just implement the `SgrAgent` trait and wire up your BAML-generated types.
## What is SGR?

Schema-Guided Reasoning — the LLM generates structured JSON (not function calls) guided by a schema that BAML injects into the prompt via `{{ ctx.output_format }}`. The model fills in a discriminator field (`task`) to pick which tool to use, and the agent loop executes it.
```
User request → [SGR Loop] → decide (LLM) → execute (tools) → push result → repeat
                               ↑                                  |
                               └──────────────────────────────────┘
```
## Modules

| Module | What |
|---|---|
| `config` | `AgentConfig`, `ProviderConfig` — multi-provider LLM config (Vertex AI, Google AI, OpenAI-compatible) |
| `engine` | `BamlRegistry` trait, `AgentEngine` — builds a BAML `ClientRegistry` from config |
| `session` | `Session<M>`, `AgentMessage`, `MessageRole`, `EntryType`, `MessageBody`, `MessageContent`, `ContentBlock`, `SessionMeta`, `list_sessions`, `search_sessions` — JSONL persistence with typed structs, UUID v7 IDs, Claude Code compatible format, history trimming, session browsing. Split into submodules: `traits` (message traits), `format` (serialization/deserialization), `time` (UUID v7 timestamp extraction, UTF-8 safe truncation), `store` (`Session` struct, persistence), `meta` (`SessionMeta`, listing, search) |
| `loop_detect` | `LoopDetector`, `LoopStatus`, `normalize_signature` — 3-tier loop detection (exact, semantic, output) |
| `agent_loop` | `SgrAgent`, `SgrAgentStream`, `run_loop`, `run_loop_stream` — the core agent loop |
| `prompt` | `BASE_SYSTEM_PROMPT`, `build_system_prompt()` — STAR system prompt template |
| `helpers` | `norm`, `action_result_from`, `truncate_json_array`, `AgentContext` — reusable patterns + context loading |
## Quick Start

### 1. Add dependency

In your project's `Cargo.toml` (or use a symlink for local dev):

```toml
[dependencies]
baml-agent = { path = "../baml-agent" }
```
### 2. Implement the traits

```rust
use baml_agent::agent_loop::SgrAgent;
use baml_agent::engine::BamlRegistry;
use std::collections::HashMap;

// --- Wrap your BAML-generated ClientRegistry ---
struct MyRegistry(/* your BAML-generated ClientRegistry */);

// --- Wrap your BAML-generated message types ---
struct MyMessage(/* your BAML-generated message type */);

// --- Implement SgrAgent ---
// impl SgrAgent for MyAgent { /* decide(), execute(), action_signature() */ }
```
### 3. Run the loop

```rust
// inside an async fn; exact arguments depend on your agent/session types
let steps = run_loop(/* &mut agent, &mut session, … */).await?;
```
## Streaming (TUI / progressive output)

For streaming tokens during the LLM decision phase, implement `SgrAgentStream` and use `run_loop_stream`:

```rust
use baml_agent::agent_loop::run_loop_stream;

// Use run_loop_stream instead of run_loop
let steps = run_loop_stream(/* agent, session, on_token, … */).await.unwrap();
```
## Trait hierarchy

```
SgrAgent                       SgrAgentStream : SgrAgent
  decide()                       decide_stream(on_token)
  execute()
  action_signature()             (inherits all from SgrAgent)
  action_category() [default]

run_loop(impl SgrAgent)        run_loop_stream(impl SgrAgentStream)
  calls decide()                 calls decide_stream()
  no StreamToken events          emits StreamToken events
  3-tier loop detection          3-tier loop detection
```
- CLI agents — `SgrAgent` only, `run_loop()`. No streaming needed for autonomous CLI.
- TUI agents — implement both `SgrAgent` + `SgrAgentStream`. Headless mode uses `run_loop_stream()`. TUI uses `step_stream()` + manual loop with `process_step()`.
## Session persistence

`Session<M>` saves every message to a JSONL file using UUID v7 session IDs (time-sortable). Messages use typed structs (`EntryType`, `MessageBody`, `MessageContent`, `ContentBlock`) with a Claude Code compatible format: user/system entries have plain string content, assistant/tool entries use content block arrays. Supports resume:

```rust
// New session
let session = Session::new(/* session dir, … */)?;

// Resume specific session
let session = Session::resume(/* session id or path */)?;

// Resume most recent
let session = Session::resume_last(/* session dir */)?;

// Auto-trim when history exceeds max (preserves system messages)
let trimmed = session.trim(/* max */); // returns number of trimmed messages
```
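The trimming policy described above — drop the oldest non-system messages first, never touch system messages — can be sketched with plain tuples standing in for `AgentMessage`. This is an illustration, not the crate's implementation:

```rust
// Illustrative sketch: (role, text) pairs stand in for AgentMessage.
// Drops the oldest non-system messages until `max` is reached;
// returns how many messages were removed.
fn trim_history(msgs: &mut Vec<(String, String)>, max: usize) -> usize {
    let mut removed = 0;
    while msgs.len() > max {
        // system messages are preserved; trim the oldest non-system entry
        match msgs.iter().position(|(role, _)| role != "system") {
            Some(i) => {
                msgs.remove(i);
                removed += 1;
            }
            None => break, // only system messages left — stop trimming
        }
    }
    removed
}
```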
## Session management

List and search past sessions without loading full message history:

```rust
use baml_agent::session::{list_sessions, SessionMeta};

// List all sessions (newest first)
let sessions: Vec<SessionMeta> = list_sessions(/* session dir */)?;
for s in &sessions {
    // display s.topic, s.created, s.message_count, …
}

// Resume by selection
let picked = &sessions[0];
let session = Session::resume(/* &picked.path */)?;
```
`SessionMeta` fields:

- `path` — JSONL file path
- `created` — unix timestamp (extracted from UUID v7 in filename)
- `message_count` — number of messages (line count)
- `topic` — first user message (truncated to 120 chars)
- `size_bytes` — file size
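Recovering `created` needs no UUID library, because UUID v7 stores the Unix timestamp in milliseconds in its first 48 bits (12 hex digits). A minimal sketch of the idea behind the crate's `time` submodule:

```rust
// UUID v7 layout: the first 48 bits (12 hex chars) are unix milliseconds.
fn uuid_v7_millis(id: &str) -> Option<u64> {
    let hex: String = id.chars().filter(|c| *c != '-').take(12).collect();
    if hex.len() != 12 {
        return None; // too short to be a UUID
    }
    u64::from_str_radix(&hex, 16).ok()
}
```

For the RFC 9562 example UUID v7 `017f22e2-79b0-7cc3-98c4-dc0c0c07398f`, this yields 1645557742000 ms (2022-02-22).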
### Fuzzy search (feature `search`)

Requires `baml-agent = { features = ["search"] }` (adds a `nucleo-matcher` dep):

```rust
use baml_agent::session::search_sessions;

// Fuzzy match on topic (first user message)
let results = search_sessions(/* session dir, query */)?;
for m in &results {
    // display match, …
}
```
## System prompt template

```rust
use baml_agent::prompt::build_system_prompt;

let prompt = build_system_prompt(/* … */);
// Use in BAML: replace {output_format} with {{ ctx.output_format }}
```
## Provider config

`AgentConfig::vertex_from_env()` reads `GOOGLE_CLOUD_PROJECT` and sets up:

- `vertex` — Gemini 3.1 Flash Lite (primary)
- `vertex_fallback` — Gemini 3 Flash
- `local` — Ollama llama3.2 at localhost:11434

Custom providers:

```rust
let mut config = AgentConfig::vertex_from_env()?;
config.add_provider(/* name, ProviderConfig */);
config.default_provider = "openai".into();
```
## Stateful executors

If `execute()` needs mutable state (MCP connections, DB handles), use interior mutability:
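A minimal, self-contained illustration of the pattern (hypothetical `MyAgent` type — the real trait signatures live in `agent_loop`): wrap the state in a `Mutex` so a `&self` method can still mutate it.

```rust
use std::sync::Mutex;

// Hypothetical executor with mutable state behind &self.
struct MyAgent {
    call_count: Mutex<u32>, // stands in for MCP connections, DB handles, …
}

impl MyAgent {
    // Mirrors an execute(&self, …) shape: shared reference, updated state.
    fn execute(&self, _action: &str) -> u32 {
        let mut n = self.call_count.lock().unwrap();
        *n += 1;
        *n
    }
}
```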
## STAR reasoning framework

The agent loop uses STAR (Situation → Task → Action → Result) as the structured reasoning pattern. `StepDecision` maps directly:

| STAR | Field | What the LLM fills |
|---|---|---|
| S — Situation | `situation` | Current state, what's done, what blocks progress |
| T — Task | `task` | 1-5 remaining steps, first = execute now |
| A — Action | `actions` | Tool calls to run (parallel if independent) |
| R — Result | `completed` | `true` only when goal is fully achieved |
### BAML field design rules (critical for union actions)

All optional fields in task classes MUST be `string | null`, not `string`.

LLMs (Gemini, GPT, Claude) struggle to generate union-typed arrays when task classes have many required fields. If a task has 6 required string fields but only 2 are relevant for the current operation, the model often skips the entire `next_actions` array rather than filling irrelevant fields with empty strings.
```baml
// BAD — model skips next_actions because it can't fill all required fields
class ProjectTask {
  task "project_operation" @stream.not_null
  operation "create" | "open" | "add_files"
  project_path string
  input_path string    // required but unused for "create"
  meta_key string      // required but unused for "create"
  meta_value string    // required but unused for "create"
}

// GOOD — model can emit the action with only relevant fields
class ProjectTask {
  task "project_operation" @stream.not_null
  operation "create" | "open" | "add_files"
  project_path string @description("Path to .l2f project file")
  input_path string | null @description("File path for add_files")
  meta_key string | null @description("Key for set_meta/get_meta")
  meta_value string | null @description("Value for set_meta")
}
```
Symptoms of this bug: `current_state` and `plan` are populated correctly, but `next_actions` is always `[]`. The agent describes what it wants to do but never emits tool calls. Affects all models (Gemini Flash Lite, Flash, Pro, GPT-4o).

The empty-actions guard in `process_step()` detects this and nudges the model with a system message: "You MUST emit at least one tool call." After `loop_abort_threshold` empty steps, the loop aborts.
### Prompt tips for STAR

Place this near `{{ ctx.output_format }}` in your BAML prompt:

```
CRITICAL: The `next_actions` array MUST contain at least one action.
Never return an empty array. Pick the tool for the next phase.
```
Define a phase-based workflow (ORIENT → PROJECT → ANALYZE → ...) so the model always knows which tool to emit next. Add "NEVER go back to a completed phase" to prevent loops.
## Loop detection (3-tier)

`LoopDetector` catches three types of agent loops, each tracked independently:

| Tier | Signal | Catches | Example |
|---|---|---|---|
| 1. Exact | Identical `action_signature()` | Trivial loops (same tool, same args) | `inspect:/path` × 6 |
| 2. Category | Normalized `action_category()` | Semantic loops (same intent, different syntax) | `rg -n 'TODO' src/` vs `grep -rn "TODO" src/` |
| 3. Output | Identical tool output (hash) | Stagnation (different commands, same result) | "No matches found" × 4 |

Thresholds: warns at ⌈abort/2⌉, aborts at `abort_threshold`. Default: warn at 3, abort at 6.
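The threshold arithmetic for the exact tier can be sketched as a consecutive-repeat counter — a simplified stand-in for `LoopDetector` covering tier 1 only (names and shape are illustrative):

```rust
#[derive(Debug, PartialEq)]
enum Status {
    Ok,
    Warning,
    Abort,
}

// Tier-1 sketch: counts consecutive identical signatures.
struct ExactDetector {
    last: Option<String>,
    count: u32,
    abort: u32, // abort_threshold
}

impl ExactDetector {
    fn check(&mut self, sig: &str) -> Status {
        if self.last.as_deref() == Some(sig) {
            self.count += 1;
        } else {
            self.last = Some(sig.to_string());
            self.count = 1;
        }
        if self.count >= self.abort {
            Status::Abort
        } else if self.count >= (self.abort + 1) / 2 {
            Status::Warning // warns at ⌈abort/2⌉
        } else {
            Status::Ok
        }
    }
}
```

With the default `abort = 6`, the third identical signature warns and the sixth aborts.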
### How it works in the loop

```
decide() → action_signature() + action_category()
         → check_with_category(sig, cat)   ← Tier 1+2
           → if Warning: inject "try different approach" system message
           → if Abort:   terminate loop

execute() → tool output
          → record_output(output)          ← Tier 3
            → if Warning: inject "result is definitive" system message
            → if Abort:   terminate loop
```
All three tiers are automatic — `process_step()` handles everything. No per-project wiring needed.
### Signature normalization (`normalize_signature`)

Tier 2 uses `normalize_signature()` to collapse bash command variations into a canonical form:

```rust
use baml_agent::loop_detect::normalize_signature;

// All normalize to "bash-search:TODO|FIXME crates/src"
normalize_signature(/* rg variant */);
normalize_signature(/* grep variant */);
normalize_signature(/* quoted/flagged variant */);

// Non-bash signatures pass through unchanged
normalize_signature("inspect:/path/video.mp4"); // → "inspect:/path/video.mp4"
```
Rules for bash signatures:

- Strip fallback chains (`||`, `&&`, `;`, `|`)
- Remove flags (`-n`, `-i`, `--long-flag`)
- Strip quotes and trailing slashes from args
- Search tools (`rg`, `grep`, `ag`, `ack`) → `bash-search:args`
- Other commands → `bash:cmd:args`
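As a rough illustration of the flag, quote, and search-tool rules, a naive re-implementation might look like the sketch below. It assumes a `bash:cmd:args` signature shape and deliberately omits what the real `normalize_signature` also handles (fallback-chain stripping, quoting edge cases):

```rust
// Naive sketch of the flag/quote/search-tool rules; illustration only.
fn normalize_sketch(sig: &str) -> String {
    let Some(rest) = sig.strip_prefix("bash:") else {
        return sig.to_string(); // non-bash signatures pass through unchanged
    };
    let (cmd, args) = rest.split_once(':').unwrap_or((rest, ""));
    let kept: Vec<String> = args
        .split_whitespace()
        .filter(|t| !t.starts_with('-')) // remove flags (-n, -i, --long-flag)
        .map(|t| {
            t.trim_matches(|c| c == '\'' || c == '"') // strip quotes
                .trim_end_matches('/') // strip trailing slashes
                .to_string()
        })
        .collect();
    if ["rg", "grep", "ag", "ack"].contains(&cmd) {
        format!("bash-search:{}", kept.join(" ")) // search tools collapse together
    } else {
        format!("bash:{}:{}", cmd, kept.join(" "))
    }
}
```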
### Custom category (optional)

Override `action_category()` on `SgrAgent` for project-specific normalization.
## Helpers (`helpers` module)

Reusable utilities extracted from real agent implementations. Import directly or via re-exports:

```rust
use baml_agent::helpers::{norm, action_result_from, truncate_json_array, AgentContext};
```
### BAML enum normalization

BAML generates Rust enum variants with a `K` prefix (`Ksystem`, `Kdefault`). `norm()` strips it:

```rust
use baml_agent::helpers::norm;

let op = norm("Kdefault");         // → "default"
let role = norm("Ksystem");        // → "system"
let clean = norm("already_clean"); // → "already_clean"

// norm_owned() takes an owned String (convenience for format!("{:?}", variant))
use baml_agent::helpers::norm_owned;
let op = norm_owned("Kcreate".to_string()); // → "create"
```
### ActionResult builders

Every `execute()` arm follows the same pattern: call IO → wrap JSON → `ActionResult`. Helpers eliminate the boilerplate:

```rust
use baml_agent::helpers::action_result_from;

// From Result<Value, E> — wraps the error in {"error": "..."}
let res = action_result_from(/* name, io_result */);
```
### JSON array truncation

Keep the context window manageable by truncating large arrays in tool results:

```rust
use baml_agent::helpers::truncate_json_array;

let mut res = json!({ /* large tool result */ });
truncate_json_array(/* &mut res, key, 10 */); // keeps 10 + "... showing 10 of 500 total"
```
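The underlying idea — keep the first N items and replace the rest with a count marker — in a self-contained form (a plain `Vec<String>` instead of `serde_json::Value`, so this is not the crate's signature):

```rust
// Keep the first `keep` items; append a note about what was dropped.
fn truncate_with_note(items: &mut Vec<String>, keep: usize) {
    let total = items.len();
    if total > keep {
        items.truncate(keep);
        items.push(format!("... showing {keep} of {total} total"));
    }
}
```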
### AgentContext — layered memory system

Two loading modes that merge into a single system message:

#### 1. Agent home dir (`load`)

Each agent has a configurable home dir (e.g. `.my-agent/`). All files are optional — use only what your agent needs:

| File | Label | What |
|---|---|---|
| `SOUL.md` | Soul | Who the agent is: values, boundaries, tone (user-customizable persona) |
| `IDENTITY.md` | Identity | Name, role, stack, domain (optional — prefer baking into BAML prompt) |
| `MANIFESTO.md` | Manifesto | Dev principles, harness engineering (optional) |
| `RULES.md` | Rules | Coding rules, workflow constraints (optional — prefer baking into BAML prompt) |
| `MEMORY.md` | Memory (user notes) | Human-editable free-form notes (semi-manual) |
| `MEMORY.jsonl` | Memory (learned) | Typed agent memory — auto-written, auto-GC'd |
| `context/*.md` | (filename) | User-extensible extras |
Recommended pattern: Bake domain logic (pipeline phases, tools, rules) into the BAML prompt. Use home dir files only for user-customizable content (persona, preferences, learned patterns). This prevents users from accidentally breaking agent behavior by editing logic files.
#### 2. Project dir (`load_project`) — Claude Code compatible

| Priority | File | Scope |
|---|---|---|
| 1 | `AGENTS.md` > `CLAUDE.md` > `.claude/CLAUDE.md` | Project instructions (git) |
| 2 | `AGENTS.local.md` > `CLAUDE.local.md` | Local instructions (gitignored) |
| 3 | `.agents/rules/*.md` > `.claude/rules/*.md` | Rules by topic |
Supports `@path/to/file` imports (Claude Code compatible, recursive up to depth 5).

```rust
use baml_agent::helpers::AgentContext;

// Load agent-specific context + project context
let mut ctx = AgentContext::load(/* home dir */)?;
ctx.merge(AgentContext::load_project(/* project dir */)?);

// Inject into session
if let Some(msg) = ctx.to_system_message() {
    // push msg into the session
}

// With a token budget (drops low-priority parts first)
if let Some(msg) = ctx.to_system_message_with_budget(/* max tokens */) {
    // …
}
```
#### Typed memory (MEMORY.jsonl)

The agent writes structured entries via a `MemoryTask` tool (defined in each agent's BAML schema):

```jsonl
{"category":"preference","section":"User Rules","content":"Always use film profile for travel videos","confidence":"confirmed","created":1772700000}
{"category":"pattern","section":"Scoring","content":"Garbage filter 0.3 works better for short clips","confidence":"tentative","created":1772700100}
{"category":"decision","section":"Build System","content":"Use cargo, not make","confidence":"confirmed","created":1772700200}
```

Two confidence levels:

- `confirmed` — user-confirmed rules (via `store_rule`). Live forever.
- `tentative` — agent-learned patterns (via `learn`). Auto-expire after 7 days if not confirmed.
Loaded into the system message as:

```
### Build System
- [✓|decision] Use cargo
### Testing
- [?|pattern] Run check before test
```
Garbage collection: tentative entries older than 7 days are auto-removed on load. Confirmed entries live forever.
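The GC rule is simple enough to sketch directly (field names mirror the JSONL example above; this is an illustration, not the crate's loader):

```rust
// An entry is kept if it is confirmed, or tentative and younger than 7 days.
struct MemoryEntry {
    confidence: &'static str, // "confirmed" | "tentative"
    created: u64,             // unix seconds
}

const SEVEN_DAYS: u64 = 7 * 24 * 60 * 60;

fn gc(entries: Vec<MemoryEntry>, now: u64) -> Vec<MemoryEntry> {
    entries
        .into_iter()
        .filter(|e| e.confidence == "confirmed" || now.saturating_sub(e.created) <= SEVEN_DAYS)
        .collect()
}
```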
Token budget priority (highest kept, lowest dropped first):
| Priority | Label | Droppable? |
|---|---|---|
| 10 | Soul | Never |
| 9 | Memory (user notes) | Never |
| 8 | Identity, Rules | Yes |
| 7 | Project/Local Instructions | Yes |
| 6 | Memory (learned) | Yes |
| 5 | Manifesto | Yes |
| 3 | context/* extras, rules/* | Yes (first to go) |
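The drop order can be sketched as: remove the lowest-priority droppable part until the budget fits. Triples stand in for the context parts here — a hypothetical shape, not the crate's API:

```rust
// Parts are (priority, droppable, text); lowest priority is dropped first.
fn fit_to_budget(mut parts: Vec<(u8, bool, String)>, budget: usize) -> Vec<(u8, bool, String)> {
    let size = |ps: &[(u8, bool, String)]| ps.iter().map(|(_, _, s)| s.len()).sum::<usize>();
    while size(&parts) > budget {
        // pick the lowest-priority part that is allowed to be dropped
        let victim = parts
            .iter()
            .enumerate()
            .filter(|(_, (_, droppable, _))| *droppable)
            .min_by_key(|(_, (prio, _, _))| *prio)
            .map(|(i, _)| i);
        match victim {
            Some(i) => {
                parts.remove(i);
            }
            None => break, // only never-droppable parts (Soul, user notes) left
        }
    }
    parts
}
```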
### Agent manifesto loader (legacy)

Simple loader for `agent.md` / `.director/agent.md` in the CWD. Use `AgentContext` for new agents.

```rust
use baml_agent::helpers::load_manifesto;

let manifesto = load_manifesto(); // from CWD
```
## Tests

```bash
cargo test

# 81 tests: session (typed structs, UUID v7, format, store, meta),
# trimming, 3-tier loop detection, agent loop, streaming,
# empty actions guard, helpers, AgentContext, memory GC,
# token budget, @import, project loading
```