TinyAgents is a recursive language-model (RLM) harness for Rust. It is a typed, durable runtime where language models call models, agents call agents, graphs run graphs, and a model can author, compile, and run the very workflow it is standing inside — all as inspectable, checkpointed, policy-checked Rust.
What is an RLM, and why recursive?
Most agent frameworks stuff everything into one ever-growing context window and hope the model copes. Recursive Language Models (RLMs) take a different stance: a long prompt is treated as an external environment that the model explores through a REPL — examining it, decomposing it, and recursively calling itself (or sub-models) over snippets instead of swallowing the whole thing at once. This mitigates "context rot" and lets effective context exceed the raw window.
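The recursive pattern can be illustrated with a small standard-library sketch. Everything here is hypothetical and is not a TinyAgents API: the `Model` trait, the `CountingModel` stub (which stands in for a real model call), and `recursive_count`. The point is only the shape: a context larger than the window is split, each half is handled by a recursive call, and the partial answers are combined.

```rust
/// Hypothetical stand-in for a model call; not a TinyAgents API.
trait Model {
    /// Answers `query` over a snippet small enough to fit the context window.
    fn call(&self, query: &str, snippet: &str) -> String;
}

/// Stub "model" that counts how often the query word occurs in the snippet.
struct CountingModel;

impl Model for CountingModel {
    fn call(&self, query: &str, snippet: &str) -> String {
        let hits = snippet.split_whitespace().filter(|w| *w == query).count();
        hits.to_string()
    }
}

/// Treats `words` as an external environment: a slice that fits in `window`
/// goes straight to the model, a larger one is split in half and each half
/// is answered by a recursive call. `max_depth` bounds the call tree.
fn recursive_count(
    model: &dyn Model,
    query: &str,
    words: &[&str],
    window: usize,
    depth: usize,
    max_depth: usize,
) -> usize {
    if words.len() <= window || depth >= max_depth {
        return model.call(query, &words.join(" ")).parse().unwrap_or(0);
    }
    let mid = words.len() / 2;
    recursive_count(model, query, &words[..mid], window, depth + 1, max_depth)
        + recursive_count(model, query, &words[mid..], window, depth + 1, max_depth)
}

fn main() {
    let context = "a needle b needle c d e needle f g";
    let words: Vec<&str> = context.split_whitespace().collect();
    // A window of 3 words forces several recursive sub-calls.
    let total = recursive_count(&CountingModel, "needle", &words, 3, 0, 8);
    println!("needle occurrences: {total}");
}
```

Splitting on word boundaries keeps the stub's answers additive; a real system would let the model choose how to decompose the context.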
The idea comes from recent research:
- Paper: "Recursive Language Models," Alex L. Zhang, Tim Kraska, Omar Khattab (MIT CSAIL), 2025 — arXiv:2512.24601
- Blog: Alex L. Zhang, "Recursive Language Models" — https://alexzhang13.github.io/blog/2025/rlm/
- Reference implementation: https://github.com/alexzhang13/rlm
TinyAgents is inspired by and architected around the RLM execution model — a production-shaped Rust harness for building RLM-style systems. It does not claim to reproduce the paper's benchmark numbers; instead it brings the execution model to Rust as concrete, implemented surfaces:
- Sub-agents (agents calling agents). A harness agent is exposed as a tool to another agent, so orchestration is literally a model calling a model (SubAgent, SubAgentSession, SubAgentTool).
- Recursion policy + depth tracking. The runtime tracks root_run_id / parent_run_id, enforces a recursion limit, and rolls child runs' events, usage, and cost up to the parent as first-class observable runs.
- Graphs that run graphs. A node can embed another compiled graph, and the .ragsh REPL can drive a graph from inside a graph node (graph → REPL → graph).
- The REPL as the RLM core. In .ragsh, context and prompts are runtime values, not just prompt text. The model writes small programs, inspects their output, calls sub-models / sub-agents / sub-graphs as functions, and iterates: this is the RLM/CodeAct loop.
- Self-authoring (the deepest recursion). A model can emit a .rag blueprint that compiles through the same registry-bound compiler path as a human-authored file, then runs on the same runtime the model is already executing in. The harness can describe and re-enter itself.
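The recursion-policy item in the list above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the TinyAgents implementation: `Run`, `RunTracker`, and `RecursionError` are hypothetical names, and only the `root_run_id` / `parent_run_id` field names come from the text. The sketch shows a depth check when a child run starts and usage being added to every ancestor.

```rust
use std::collections::HashMap;

/// Hypothetical run record; only the id field names come from the README.
#[derive(Debug, Clone)]
struct Run {
    parent_run_id: Option<u64>,
    root_run_id: u64,
    depth: usize,
    /// Tokens used by this run plus everything rolled up from its children.
    tokens: u64,
}

#[derive(Debug, PartialEq)]
enum RecursionError {
    LimitExceeded { depth: usize, limit: usize },
    UnknownParent(u64),
}

/// Hypothetical tracker that enforces a maximum depth for child runs.
struct RunTracker {
    limit: usize,
    next_id: u64,
    runs: HashMap<u64, Run>,
}

impl RunTracker {
    fn new(limit: usize) -> Self {
        RunTracker { limit, next_id: 1, runs: HashMap::new() }
    }

    /// Starts a root run (`parent == None`) or a child run one level deeper.
    fn start(&mut self, parent: Option<u64>) -> Result<u64, RecursionError> {
        let (root_run_id, depth) = match parent {
            None => (self.next_id, 0),
            Some(p) => {
                let run = self.runs.get(&p).ok_or(RecursionError::UnknownParent(p))?;
                (run.root_run_id, run.depth + 1)
            }
        };
        if depth > self.limit {
            return Err(RecursionError::LimitExceeded { depth, limit: self.limit });
        }
        let id = self.next_id;
        self.next_id += 1;
        self.runs.insert(id, Run { parent_run_id: parent, root_run_id, depth, tokens: 0 });
        Ok(id)
    }

    /// Adds `tokens` to the run and to every ancestor up to the root.
    fn record_usage(&mut self, run_id: u64, tokens: u64) {
        let mut current = Some(run_id);
        while let Some(id) = current {
            match self.runs.get_mut(&id) {
                Some(run) => {
                    run.tokens += tokens;
                    current = run.parent_run_id;
                }
                None => break,
            }
        }
    }
}

fn main() {
    let mut tracker = RunTracker::new(2);
    let root = tracker.start(None).unwrap();
    let child = tracker.start(Some(root)).unwrap();
    let grandchild = tracker.start(Some(child)).unwrap();
    tracker.record_usage(grandchild, 40);
    tracker.record_usage(child, 10);
    println!("root tokens: {}", tracker.runs[&root].tokens);
    // A fourth level would be depth 3, which exceeds the limit of 2.
    println!("next level: {:?}", tracker.start(Some(grandchild)));
}
```

Rolling usage up eagerly keeps the parent's totals readable at any time; a real runtime might instead aggregate from child events when a run finishes.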
Two languages, one runtime: .rag (declarative blueprint) and .ragsh
(imperative REPL) both lower into the exact same graph + harness types as
hand-written Rust — a language whose programs are the runtime that interprets
them.
Features
- Harness: provider-neutral model calls, typed tools, middleware, structured output, streaming, usage/cost accounting, retries and limits, response caching, memory/embeddings, summarization, steering, and a testkit.
- Graph runtime: LangGraph-style durable, typed state graphs with START/END, nodes, edges, conditional routing, commands, Send fanout, reducers/channels, checkpoints, interrupts, subgraphs, streaming, topology export, and time travel.
- Registry: a named capability catalog (models, tools, agents, graphs, stores, middleware, policy) that .rag and .ragsh bind by name.
- .rag expressive language: a declarative, side-effect-free blueprint format that compiles (lexer → parser → compiler) into the runtime; the safe boundary for agent-authored plans.
- .ragsh REPL language: imperative, capability-bound interactive orchestration; the RLM/CodeAct loop surface.
- Recursion & sub-agents: agents-as-tools, subgraphs, depth tracking, and a recursion policy so deep call trees stay bounded and observable.
- Durability & checkpoints: resume long runs, replay history, and travel back in time across superstep boundaries.
- Provider-neutral: one interface across hosted and local providers; swap models without rewriting workflows.
- Observability: normalized events, usage, and cost that roll up across recursive child runs.
- Structured output & streaming: typed responses and incremental token streams at the harness boundary.
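To make the graph-runtime and checkpoint items above concrete, here is a minimal standard-library sketch of a typed state graph with conditional routing and a checkpoint after every step. It is not the TinyAgents graph API: `State`, `Next`, `Checkpoint`, `Graph`, and `run_from` are hypothetical names, and a "superstep" is simplified to a single node execution.

```rust
use std::collections::HashMap;

/// Typed graph state for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct State {
    count: u32,
    log: Vec<String>,
}

/// Routing decision returned by a node; `End` plays the role of END.
enum Next {
    Goto(&'static str),
    End,
}

/// Snapshot taken after each step: the state plus where to continue.
#[derive(Debug, Clone, PartialEq)]
struct Checkpoint {
    state: State,
    next: Option<&'static str>,
}

type Node = fn(&mut State) -> Next;

struct Graph {
    nodes: HashMap<&'static str, Node>,
}

impl Graph {
    /// Runs from `start`, recording a checkpoint after every node. Stops at
    /// `Next::End`, at an unknown node name, or after `max_steps` steps.
    fn run_from(&self, start: &'static str, mut state: State, max_steps: usize) -> (State, Vec<Checkpoint>) {
        let mut checkpoints = Vec::new();
        let mut current = start;
        for _ in 0..max_steps {
            let Some(node) = self.nodes.get(current) else { break };
            let next = match node(&mut state) {
                Next::Goto(name) => Some(name),
                Next::End => None,
            };
            checkpoints.push(Checkpoint { state: state.clone(), next });
            match next {
                Some(name) => current = name,
                None => break,
            }
        }
        (state, checkpoints)
    }
}

fn increment(state: &mut State) -> Next {
    state.count += 1;
    state.log.push("increment".to_string());
    Next::Goto("check")
}

/// Conditional routing: loop back until the counter reaches 3.
fn check(state: &mut State) -> Next {
    state.log.push("check".to_string());
    if state.count < 3 { Next::Goto("increment") } else { Next::End }
}

fn demo_graph() -> Graph {
    let mut nodes: HashMap<&'static str, Node> = HashMap::new();
    nodes.insert("increment", increment as Node);
    nodes.insert("check", check as Node);
    Graph { nodes }
}

fn main() {
    let graph = demo_graph();
    let initial = State { count: 0, log: Vec::new() };
    let (final_state, checkpoints) = graph.run_from("increment", initial, 100);
    println!("final count {} after {} checkpoints", final_state.count, checkpoints.len());

    // "Time travel": resume from an earlier checkpoint instead of the start.
    let earlier = checkpoints[2].clone();
    if let Some(next) = earlier.next {
        let (resumed, _) = graph.run_from(next, earlier.state, 100);
        println!("resumed run matches: {}", resumed == final_state);
    }
}
```

Storing the next node alongside the state is what makes a checkpoint resumable; the state alone would not say where execution should continue.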
Architecture
              +-----------------------+         +-----------------------+
              |    .rag blueprint     |         |      .ragsh REPL      |
              | declarative workflow  |         |  imperative RLM loop  |
              +-----------+-----------+         +-----------+-----------+
                           \                               /
                            \  compile / lower (by name)  /
                             v                           v
+-------------+        +-------------------------------------------+
| Application |------->|            Capability Registry            |
|  Rust code  |        | models | tools | agents | graphs | policy |
+------+------+        +---------------------+---------------------+
       |                                     |
       |                                     v
       |               +-------------------------------------------+
       +-------------->|           Durable Graph Runtime           |
                       | typed state | nodes | edges | checkpoints |
                       +---------------------+---------------------+
                                             |
                                             v
                       +-------------------------------------------+
                       |               Agent Harness               |
                       | prompts | tools | middleware | usage/cost |
                       +----+--------------------------+-----------+
                            |                          |
                            v                          v
                   +------------------+       +------------------+
                   | Model Providers  |       |   Typed Tools    |
                   | OpenAI/Anthropic |       | local functions  |
                   |   Ollama/etc.    |       | external systems |
                   +------------------+       +------------------+
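The "by name" arrow in the diagram can be illustrated with a small sketch: a plan refers to capabilities only by name, and binding fails if a name was never registered. This uses the standard library only and is an assumption-laden illustration, not the TinyAgents registry: `Registry`, `register_tool`, `bind`, and `run_plan` are hypothetical names, and tools are reduced to string-to-string functions.

```rust
use std::collections::HashMap;

/// Hypothetical tool type for this sketch: a string-to-string function.
type Tool = Box<dyn Fn(&str) -> String>;

/// Hypothetical capability catalog holding tools only.
#[derive(Default)]
struct Registry {
    tools: HashMap<String, Tool>,
}

impl Registry {
    fn register_tool(&mut self, name: &str, tool: Tool) {
        self.tools.insert(name.to_string(), tool);
    }

    /// Resolves every name in `plan`. The first unknown name aborts binding,
    /// so a plan can only reach capabilities the application registered.
    fn bind<'a>(&'a self, plan: &[&str]) -> Result<Vec<&'a Tool>, String> {
        plan.iter()
            .map(|name| {
                self.tools
                    .get(*name)
                    .ok_or_else(|| format!("unknown tool: {name}"))
            })
            .collect()
    }
}

/// Binds the plan first, then pipes `input` through the tools in order.
fn run_plan(registry: &Registry, plan: &[&str], input: &str) -> Result<String, String> {
    let bound = registry.bind(plan)?;
    Ok(bound
        .into_iter()
        .fold(input.to_string(), |acc, tool| tool(&acc)))
}

fn main() {
    let mut registry = Registry::default();
    registry.register_tool("upper", Box::new(|s: &str| s.to_uppercase()));
    registry.register_tool("exclaim", Box::new(|s: &str| format!("{s}!")));

    println!("{:?}", run_plan(&registry, &["upper", "exclaim"], "hi"));
    println!("{:?}", run_plan(&registry, &["shell"], "hi"));
}
```

Binding before running means an unknown name is rejected before any tool executes, which is one way to keep model-authored plans inside the registered capability set.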
The recursion loop — agents call agents, and graphs run graphs:
                     +-------+
                     | START |
                     +---+---+
                         |
                         v
                  +-------------+     a sub-agent is just a tool,
                  | Agent Node  |     and a tool may itself be a
                  +------+------+     whole compiled graph...
                         |
       +-----------------+---------------+
       |                 |               |
  needs tool      calls sub-agent      done
       |                 |               |
       v                 v               v
 +-----------+   +---------------+    +-----+
 | Tool Node |   |  SubAgent /   |    | END |
 +-----+-----+   | Subgraph Node |    +-----+
       |         +-------+-------+
       |                 |  depth +1, recursion policy,
       |                 |  child run rolls up usage/cost
       +-- loops back ---+--- re-enters the runtime ---+
          to Agent Node      (graph -> REPL -> graph)
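The loop in the diagram can be sketched in a few lines of standard-library Rust. This is a hedged illustration rather than the harness itself: `Action`, `Agent`, `run_agent`, and the two scripted policies are hypothetical, the policies stand in for model decisions, and the only "tool" is an uppercase conversion. A sub-agent call re-enters the same function one level deeper, subject to a depth limit.

```rust
/// What the (stubbed) model decides on each turn of the loop.
enum Action {
    CallTool(String),
    CallSubAgent(String),
    Done(String),
}

/// Hypothetical agent: a scripted policy instead of a real model.
struct Agent {
    name: &'static str,
    decide: fn(&[String]) -> Action,
    sub_agent: Option<Box<Agent>>,
}

/// Runs the agent loop. Tool results and sub-agent results are appended to
/// `observations`; a sub-agent call recurses with `depth + 1`.
fn run_agent(
    agent: &Agent,
    task: &str,
    depth: usize,
    max_depth: usize,
    trace: &mut Vec<String>,
) -> Result<String, String> {
    if depth > max_depth {
        return Err(format!("recursion limit {max_depth} exceeded"));
    }
    let mut observations = vec![task.to_string()];
    // Bounded loop so a misbehaving policy cannot spin forever.
    for _ in 0..8 {
        match (agent.decide)(&observations) {
            Action::CallTool(input) => {
                trace.push(format!("depth {depth}: {} -> tool", agent.name));
                observations.push(input.to_uppercase());
            }
            Action::CallSubAgent(sub_task) => {
                trace.push(format!("depth {depth}: {} -> sub-agent", agent.name));
                let child = agent.sub_agent.as_deref().ok_or("no sub-agent registered")?;
                observations.push(run_agent(child, &sub_task, depth + 1, max_depth, trace)?);
            }
            Action::Done(answer) => return Ok(answer),
        }
    }
    Err("step limit reached".to_string())
}

/// Delegates the task once, then wraps the sub-agent's result.
fn orchestrator_policy(obs: &[String]) -> Action {
    if obs.len() == 1 {
        Action::CallSubAgent(obs[0].clone())
    } else {
        Action::Done(format!("final: {}", obs[obs.len() - 1]))
    }
}

/// Calls the tool once, then returns the tool's output.
fn worker_policy(obs: &[String]) -> Action {
    if obs.len() == 1 {
        Action::CallTool(obs[0].clone())
    } else {
        Action::Done(obs[obs.len() - 1].clone())
    }
}

fn orchestrator() -> Agent {
    let worker = Agent { name: "worker", decide: worker_policy, sub_agent: None };
    Agent { name: "orchestrator", decide: orchestrator_policy, sub_agent: Some(Box::new(worker)) }
}

fn main() {
    let mut trace = Vec::new();
    let answer = run_agent(&orchestrator(), "ping", 0, 2, &mut trace);
    println!("{answer:?}");
    for line in &trace {
        println!("{line}");
    }
}
```

Because the sub-agent is reached through the same `run_agent` entry point, the depth check and the trace apply uniformly at every level of the call tree.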
Quick start
Add TinyAgents to your project:
    [dependencies]
    tinyagents = "0.1"
The default build is offline. To enable hosted providers, turn on the openai
feature:
    [dependencies]
    tinyagents = { version = "0.1", features = ["openai"] }
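To explore locally, run one of the offline examples with Cargo, for instance cargo run --example basic_graph.
OpenAI-backed examples need the feature flag and an API key: set OPENAI_API_KEY in the environment and pass --features openai, for instance cargo run --features openai --example openai_chat.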
Examples to explore
All live in examples/:
- basic_graph: a minimal typed state graph (START, nodes, edges, END).
- complex_graph: conditional routing, fanout, and richer topology.
- durable_graph: checkpoints, resume, and time-travel over supersteps.
- agent_loop_tools: the agent ↔ tool loop the harness runs.
- orchestrator_subagents: recursion in action, with an orchestrator agent that calls sub-agents as tools, depth tracking, and rolled-up usage.
- openai_self_blueprint: the deepest recursion, where a model authors a .rag blueprint that is compiled and run on the same runtime.
- rag_blueprint: load and run a declarative .rag workflow.
- openai_chat: a single provider-backed chat turn.
- openai_tools: tool calling against a hosted model.
- openai_structured: typed structured output.
- openai_graph_agent: a provider-backed agent driven inside a graph.
OpenAI-backed examples require --features openai and OPENAI_API_KEY.
Documentation
Contributors working directly in the repository should also read the checked-in
architecture specification under docs/spec/README.md.
Development
Contributing
TinyAgents welcomes focused contributions that improve the graph runtime,
harness contracts, the registry, the .rag / .ragsh languages, provider
adapters, tests, examples, and documentation.
Read CONTRIBUTING.md before opening a pull request.
License
TinyAgents is licensed under GPL-3.0-only.
Built by TinyHumans for the Rust agent ecosystem.