langcontinuation
langcontinuation is a continuation-passing workflow engine for durable Rust
programs, especially AI agent systems that need to call models, run
client-side tools, wait for people, fork into concurrent branches, serialize
their exact state, and resume later.
It is not a large agent framework. The crate gives you a small set of durable control-flow primitives and leaves prompts, tools, storage, scheduling, and provider policy in your code. When a workflow matters enough to checkpoint, inspect, replay, or route through provider batch APIs, the control flow should be visible.
At A Glance
- Workflows are ordinary Rust functions made durable through explicit continuations.
- A [Workflow] stores a stable run id, JSON environment, current step, and continuation stack.
- A [Trampoline] runs local steps until the workflow halts or reaches a suspension boundary.
- A [WorkflowResult] is the runtime boundary for model calls, tool calls, human input, fork/join, and custom schedulers.
- The live executor is available by default. The Postgres-backed batch executor is behind the batch feature.
Use langcontinuation when you want the building blocks behind an agent
runtime without handing over the shape of the application. If all you need is a
single prompt sent to one provider, call the provider API directly. If you need
durable chains, tool loops, human review, branch isolation, retries, or a
choice between live and batch execution, those concerns are first-class here.
Status Matrix
| Boundary | Built-in live executor | batch feature | Custom runtime |
|---|---|---|---|
| Local Rust calls | Runs through [Trampoline] | Runs through [Trampoline] | Runs through [Trampoline] |
| Anthropic messages | Supported with registered [Clients] | Supported through Anthropic Message Batches | Handle [WorkflowResult::Anthropic] |
| Client-side tools | Runs registered [Tool] values inline | Runs registered [Tool] values inline | Handle [WorkflowResult::ToolCall] |
| Human input | Stops with human-input-required | Persists blocked continuation rows | Handle [WorkflowResult::Human] |
| Fork/join | Runs branches concurrently | Persists parent and branch workflows | Handle [WorkflowResult::ForkJoin] |
| OpenAI | Not implemented | Persists blocked continuation rows | Handle [WorkflowResult::OpenAI] |
OpenAI is represented as a workflow suspension point, but bundled live OpenAI
execution is intentionally not implemented yet. Custom runtimes can recognize
[WorkflowResult::OpenAI] and resume with [Trampoline::resume_open_ai].
Quick Start
Run the local example, which does not require a model provider:
Run the test suite:
Run the minimal tool-calling agent live. The live model examples use the
claudius Anthropic client and expect ANTHROPIC_API_KEY in the environment:
Run the larger fork/join tool-calling demo live:
Or, with DATABASE_URL configured, run that same workflow through the
Postgres-backed batch executor:
Durable Human Example
This example shows the durability boundary without requiring a model API key.
The workflow suspends for human review, the paused [Workflow] is serialized,
then it is deserialized and resumed with the answer.
use ;
generate_goto!
generate_goto!
async
Execution Model
The crate treats a workflow as ordinary Rust control flow made explicit. A
registered function receives a [Workflow], reads typed inputs from a JSON
environment, writes typed outputs back into that environment, and returns a
[ContinuationChoice] describing what should happen next. A [Trampoline]
executes local calls until it reaches a boundary that needs another actor:
halting, an inference request, human input, a client-side tool call, or a
fork/join point.
A [Workflow] contains four durable pieces of state:
- a stable run id,
- a JSON environment,
- the current active or suspended step,
- a LIFO continuation stack.
The state derives Serde traits, so it can be serialized and deserialized through
Serde-compatible formats and storage systems. The environment values themselves
are stored as serde_json::Value; persistence of the surrounding workflow is
not limited to JSON as long as the chosen Serde format preserves the values the
workflow needs.
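The loop described above can be modeled in std-only Rust. This is a simplified sketch under stated assumptions, not the crate's implementation: `ToyWorkflow`, `ToyTrampoline`, `Choice`, and `Outcome` are hypothetical stand-ins for [Workflow], [Trampoline], [ContinuationChoice], and [WorkflowResult]; the environment holds plain strings instead of serde_json::Value; only the human suspension is modeled; and the fixed `"answer"` key is invented for illustration.

```rust
use std::collections::HashMap;

/// What a step asks the runtime to do next (hypothetical, simplified).
#[derive(Debug, Clone, PartialEq)]
enum Choice {
    Goto,                                       // continue with the continuation stack
    Call(String),                               // schedule another registered function
    Human { prompt: String, receiver: String }, // suspend until a person answers
    Halt,                                       // clear pending work and stop
}

/// The four durable pieces of state, modeled with plain std types.
#[derive(Debug, Clone, PartialEq)]
struct ToyWorkflow {
    run_id: String,
    env: HashMap<String, String>,
    current: Option<String>,
    stack: Vec<String>, // LIFO continuation stack
}

/// Where the toy trampoline stopped.
#[derive(Debug, PartialEq)]
enum Outcome {
    Halted(ToyWorkflow),
    NeedsHuman { workflow: ToyWorkflow, prompt: String },
}

type Step = fn(&mut ToyWorkflow) -> Choice;

struct ToyTrampoline {
    steps: HashMap<String, Step>,
}

impl ToyTrampoline {
    fn new() -> Self {
        Self { steps: HashMap::new() }
    }

    fn register(&mut self, name: &str, step: Step) {
        self.steps.insert(name.to_string(), step);
    }

    /// Run local steps until the workflow halts or needs another actor.
    fn run(&self, mut wf: ToyWorkflow) -> Outcome {
        loop {
            // Prefer the explicitly scheduled step, else pop the stack.
            let Some(name) = wf.current.take().or_else(|| wf.stack.pop()) else {
                return Outcome::Halted(wf);
            };
            let step = self.steps.get(&name).copied().expect("step not registered");
            match step(&mut wf) {
                Choice::Goto => {}
                Choice::Call(next) => wf.current = Some(next),
                Choice::Human { prompt, receiver } => {
                    // Remember who receives the answer, then hand control back.
                    wf.current = Some(receiver);
                    return Outcome::NeedsHuman { workflow: wf, prompt };
                }
                Choice::Halt => {
                    wf.stack.clear();
                    return Outcome::Halted(wf);
                }
            }
        }
    }

    /// Resume a paused workflow by writing the answer into the environment.
    fn resume_human(&self, mut wf: ToyWorkflow, answer: &str) -> Outcome {
        wf.env.insert("answer".to_string(), answer.to_string());
        self.run(wf)
    }
}

// Example steps: start -> draft -> (human) -> publish -> finish.
fn start(_wf: &mut ToyWorkflow) -> Choice {
    Choice::Call("draft".to_string())
}

fn draft(wf: &mut ToyWorkflow) -> Choice {
    wf.env.insert("draft".to_string(), "hello".to_string());
    Choice::Human { prompt: "Approve the draft?".to_string(), receiver: "publish".to_string() }
}

fn publish(wf: &mut ToyWorkflow) -> Choice {
    let approved = wf.env.get("answer").map(String::as_str) == Some("yes");
    wf.env.insert("published".to_string(), approved.to_string());
    Choice::Goto
}

fn finish(_wf: &mut ToyWorkflow) -> Choice {
    Choice::Halt
}
```

Because the paused `ToyWorkflow` is plain data, a caller could store it between `run` and `resume_human`; that gap is where the real crate's Serde support comes in.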
The important separation is between program shape and execution policy. The
same continuation-passing interface can drive live inference APIs through
[live::Executor], or, with a scheduler around [WorkflowResult], decompose
the same program into batch execution. In the batch shape, inference requests
can be collected, submitted through provider batch APIs, and resumed later,
which makes it possible to use lower-cost batch execution where a provider
offers it.
Agent System Patterns
If you are coming from LangChain-style systems, the mapping is direct but lower level. There is no hidden agent loop: each transition is a continuation choice, and each external dependency appears as a workflow result that a runtime can perform, persist, retry, or route elsewhere.
| Application pattern | langcontinuation primitive |
|---|---|
| Sequential chain | Registered functions plus [Continuation::call] |
| Durable memory | The serialized [Workflow] environment |
| Model call | [Continuation::anthropic] or a custom runtime around [WorkflowResult::OpenAI] |
| Client-side tool loop | [Continuation::tool_call] with tools registered through [Trampoline::register_tool], often driven by [dispatch_tool_uses] |
| Human approval | [Continuation::human] and [Trampoline::resume_human] |
| Parallel specialist agents | [Continuation::fork_join] with isolated branch workflows |
| Checkpointing | Persist the [Workflow] returned in any [WorkflowResult] |
| Live or batch execution | Choose [live::Executor], [batch::Executor], or your own scheduler |
That gives agent applications a compact shape: local Rust functions own business logic, continuations mark durable suspension points, and executors decide when and where external work happens. A support automation flow, a document analysis job, a human-reviewed coding assistant, or a parallel research pipeline can all use the same state machine.
Typed Environment
The environment is typed at the boundary. [Workflow::into_env] serializes a
Rust value into JSON, and [Workflow::from_env] deserializes it back into the
type requested by the caller.
The macros add a naming convention on top of that map:
- [push_env!] writes a key from the variable name and type spelling.
- [from_env!] reads the same kind of key.
- [generate_goto!] reads function arguments from those generated keys.
For example, push_env!(workflow.answer: String = value) writes the key
"answer: String". This is part of the public contract. Different type-token
spellings can produce different keys.
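To see why type spelling matters, here is a minimal sketch of how such a key could be built from tokens. The `env_key!` macro is hypothetical and only mirrors the "name: Type" convention described above; it is not the crate's macro, and the real macros may build their keys differently.

```rust
/// Hypothetical helper: build an environment key from a variable name and
/// the type tokens exactly as they are written at the call site.
macro_rules! env_key {
    ($name:ident : $ty:ty) => {
        concat!(stringify!($name), ": ", stringify!($ty))
    };
}

/// Two spellings of the same Rust type yield two different string keys.
fn answer_keys() -> (&'static str, &'static str) {
    (
        env_key!(answer: String),
        env_key!(answer: std::string::String),
    )
}
```

Under a convention like this, a writer that spells the type `String` and a reader that spells it `std::string::String` would look up different keys, so it helps to keep one spelling per value across a workflow.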
Suspension And Resumption
Continuation choices make the next step explicit:
- [Continuation::goto] continues with whatever is already on the continuation stack.
- [Continuation::halt] clears pending work and ends the workflow.
- [Continuation::call] schedules another named function.
- [Continuation::anthropic] records an Anthropic request and the function that should receive its response.
- [Continuation::human] records a [HumanRequest] and the function that should receive the human answer.
- [Continuation::tool_call] records client-side tool_use blocks and the function that should receive the tool results.
- [Continuation::fork_join] creates two independent branches and rejoins them through a named function.
Anthropic is first-class in the current live executor. Register providers with
[Clients::register_anthropic] or [live::Executor::register_anthropic], and
the executor will send the request and resume the workflow with the returned
message.
Human input is first-class at the trampoline boundary. A scheduler can match
[WorkflowResult::Human], persist the paused workflow, show the
[HumanRequest] to an operator, and later call [Trampoline::resume_human] with
the answer. The live executor does not choose a human-routing policy; it returns
human-input-required when a workflow reaches a human request.
Client-side tool calls are first-class. Register a [Tool] with
[Trampoline::register_tool] under the name the model uses. When an Anthropic
response calls tools, the receiver raises a [Continuation::tool_call]
suspension, usually through [dispatch_tool_uses]. The runtime resolves each
tool_use by name, runs it, and resumes with the Vec<ToolResultBlock> stored
under the output key. Each tool receives a [ToolCallId] derived from the run id
and the tool_use id; the value is durable and replay-deterministic, so a
side-effecting tool can skip work it already performed.
The tool contract is the Anthropic wire contract (ToolUseBlock in,
ToolResultBlock out): the crate makes no filesystem assumption, and the
receiver owns the conversation history. Each model round-trip and each batch of
tool calls is a visible, checkpointable boundary rather than a hidden in-memory
loop.
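The by-name resolution step can be sketched without the crate. Everything here is a hypothetical model: `ToolUse` and `ToolResult` are cut-down stand-ins for ToolUseBlock and ToolResultBlock, the `run_id:tool_use_id` call-id format is an assumption about how a deterministic id might be derived, and turning an unknown tool into an error result is one possible policy, not necessarily the crate's.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a model-issued tool call.
#[derive(Debug, Clone, PartialEq)]
struct ToolUse {
    id: String,
    name: String,
    input: String,
}

/// Simplified stand-in for the result sent back to the model.
#[derive(Debug, Clone, PartialEq)]
struct ToolResult {
    tool_use_id: String,
    content: String,
    is_error: bool,
}

/// A tool receives a stable call id plus its input.
type ToolFn = fn(&str, &str) -> Result<String, String>;

struct ToolRegistry {
    tools: HashMap<String, ToolFn>,
}

impl ToolRegistry {
    fn new() -> Self {
        Self { tools: HashMap::new() }
    }

    fn register(&mut self, name: &str, tool: ToolFn) {
        self.tools.insert(name.to_string(), tool);
    }

    /// Resolve each tool use by name, run it, and collect one result per use.
    fn dispatch(&self, run_id: &str, uses: &[ToolUse]) -> Vec<ToolResult> {
        uses.iter()
            .map(|tool_use| {
                // Assumed derivation: the same inputs always give the same id.
                let call_id = format!("{}:{}", run_id, tool_use.id);
                let outcome = match self.tools.get(&tool_use.name) {
                    Some(tool) => tool(&call_id, &tool_use.input),
                    None => Err(format!("unknown tool: {}", tool_use.name)),
                };
                let (content, is_error) = match outcome {
                    Ok(text) => (text, false),
                    Err(message) => (message, true),
                };
                ToolResult { tool_use_id: tool_use.id.clone(), content, is_error }
            })
            .collect()
    }
}

/// Example tool: upper-cases its input and echoes the call id it was given.
fn shout(call_id: &str, input: &str) -> Result<String, String> {
    Ok(format!("{} [{}]", input.to_uppercase(), call_id))
}
```

Returning one result per tool use, errors included, keeps the conversation history the receiver owns in step with what the model asked for.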
Examples
examples/durable_tool_call.rs is the smallest
end-to-end tool-calling agent. It registers a local calculator [Tool], asks
Anthropic a question, dispatches tool_use blocks through the trampoline, and
loops until the model returns plain text.
examples/tool_calling_and_web_searching.rs
shows the larger agent pattern the crate is meant to support:
- the entrypoint forks into two independent research branches,
- one branch researches LangChain with Anthropic web search,
- the other branch inspects this repository through a client-side text editor tool,
- both branches write required artifacts into /work,
- the join step receives both reports as typed environment values and asks the model to synthesize a final recommendation.
The text editor is a single [Tool] implementation registered once with
[Trampoline::register_tool]. It owns filesystem policy, mounts /repo as
read-only, mounts /work as writable scratch space, and returns Anthropic
ToolResultBlock values to the workflow. The workflow functions own the
conversation and continuation choices; the tool owns local side effects.
Batch Execution
The optional batch feature adds [batch::Executor], a SQLx/Postgres-backed
executor for Anthropic Message Batches. Callers provide a configured
sqlx::PgPool, run [batch::migrate], register the same workflow functions
they would use with the live executor, enqueue workflows, and call
[batch::Executor::run].
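    use langcontinuation::{Trampoline, Workflow, batch};
    use sqlx::PgPool;

    // Migrate the database, enqueue one workflow, and drive the queue.
    async fn run_batch(
        pool: PgPool,
        trampoline: Trampoline,
        workflow: Workflow,
    ) -> Result<(), handled::SError> {
        // Run the crate's migrations against the caller-provided pool.
        batch::migrate(&pool).await?;
        // The trampoline must already have the workflow functions registered.
        let executor = batch::Executor::with_default_config(trampoline, pool);
        executor.enqueue_workflow(workflow).await?;
        executor.run().await?;
        Ok(())
    }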
run is intentionally ephemeral: it is a loop around cancellation-safe
[batch::Executor::poll], so the future can be dropped and recreated later
from the same database. The database stores workflows, external continuation
rows, provider batch ids, provider message ids, usage, failures, timestamps,
and quiescent markers for external cleanup. Human and low-level OpenAI
suspensions are persisted as blocked continuation rows and can be resumed by
their generated continuation ids.
Batch submission uses a Nagle-style policy. If a provider has no outstanding
batch, ready Anthropic requests flush immediately. If a provider batch is in
flight, pending requests are held until either 100 requests accumulate or the
oldest pending request is five minutes old. These defaults are configurable in
[batch::Config].
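The flush rule above reduces to a small pure function. The sketch below is illustrative only: `FlushPolicy` and its field names are hypothetical and are not the fields of [batch::Config]; only the two default values (100 requests, five minutes) come from the description above.

```rust
use std::time::Duration;

/// Hypothetical knobs mirroring the defaults described above.
#[derive(Debug, Clone, Copy)]
struct FlushPolicy {
    max_pending: usize,
    max_age: Duration,
}

impl Default for FlushPolicy {
    fn default() -> Self {
        Self {
            max_pending: 100,
            max_age: Duration::from_secs(5 * 60),
        }
    }
}

impl FlushPolicy {
    /// Decide whether pending requests should be submitted now.
    fn should_flush(
        &self,
        batch_in_flight: bool,
        pending: usize,
        oldest_age: Option<Duration>,
    ) -> bool {
        if pending == 0 {
            return false; // nothing to send
        }
        if !batch_in_flight {
            return true; // idle provider: flush immediately
        }
        // A batch is in flight: hold until the size or age threshold is hit.
        pending >= self.max_pending
            || oldest_age.map_or(false, |age| age >= self.max_age)
    }
}
```

As with Nagle's algorithm, the first request goes out without delay, while later requests are coalesced while earlier work is still outstanding.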
For the batch observability and event-ledger design, see
docs/batching.md.
Fork/Join
Fork/join branches inherit the parent environment and start at caller-provided
function names. Each branch is a separate [Workflow] with its own run id. To
resume the parent, both branches must halt.
This is the continuation-native form of a parallel agent pattern. A coordinator can split work into two specialists, let each branch call models or tools independently, and rejoin only when both have produced durable state. The live executor runs the branches concurrently. The batch executor stores the parent, tracks the children, and resumes the join when both branches halt.
Merging compares each branch against the original parent environment:
- one-sided changes are accepted,
- identical writes from both sides are accepted,
- conflicting writes to the same key are rejected.
This gives branch code a simple rule: independent keys compose, shared keys must intentionally converge.
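The three merge rules amount to a three-way merge keyed on the parent environment. Here is a minimal sketch, assuming string values instead of serde_json::Value and treating a missing key as just another value (so a one-sided deletion is accepted); `Env`, `Conflict`, `merge`, and `env` are hypothetical names, and the crate's handling of deletions may differ.

```rust
use std::collections::{BTreeMap, BTreeSet};

type Env = BTreeMap<String, String>;

/// Returned when both branches changed the same key to different values.
#[derive(Debug, PartialEq)]
struct Conflict {
    key: String,
}

/// Merge two branch environments against the parent they were forked from.
fn merge(parent: &Env, left: &Env, right: &Env) -> Result<Env, Conflict> {
    let keys: BTreeSet<&String> = parent
        .keys()
        .chain(left.keys())
        .chain(right.keys())
        .collect();

    let mut merged = Env::new();
    for key in keys {
        let (base, l, r) = (parent.get(key), left.get(key), right.get(key));
        let chosen = if l == r {
            l // identical on both sides (unchanged, or the same write)
        } else if l == base {
            r // only the right branch changed this key
        } else if r == base {
            l // only the left branch changed this key
        } else {
            return Err(Conflict { key: key.clone() });
        };
        if let Some(value) = chosen {
            merged.insert(key.clone(), value.clone());
        }
    }
    Ok(merged)
}

/// Small helper for building environments in examples.
fn env(pairs: &[(&str, &str)]) -> Env {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

For instance, with a parent of `{a: 1}`, a left branch that sets `a` to `2` and a right branch that adds `c` merge cleanly, while branches that set `a` to `2` and `3` produce a `Conflict` on `a`.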
Production Shape
A production runtime usually has four small pieces:
- a [Trampoline] with all local workflow functions registered,
- durable storage for serialized [Workflow] values,
- provider or human adapters that consume [WorkflowResult] values,
- observability around each suspension, resume, retry, and merge.
The bundled [live::Executor] is the simplest policy: send Anthropic requests
immediately, run fork branches concurrently, and return when the workflow halts.
The optional [batch::Executor] is the durable queue policy for Anthropic
Message Batches: store work in Postgres, flush provider batches according to
configuration, and resume workflows as results arrive.
Custom runtimes can sit at the same boundary. For example, a service can store
[WorkflowResult::Human] rows until an operator approves them, or implement
OpenAI execution by recognizing [WorkflowResult::OpenAI] and resuming with
[Trampoline::resume_open_ai].
Known Limits
The current macros encode keys from identifiers and type-token spelling. That keeps examples compact, but it is a string convention rather than a type-level schema.
[generate_goto!] currently supports the forms implemented in the macro: one
or two typed environment inputs plus a continuation argument.
The live executor handles Anthropic calls and fork/join. OpenAI live execution is intentionally blocked until the crate has a suitable Rust OpenAI client.
The batch executor assumes a single worker process per database queue. Its database transitions are recoverable, but provider submission is at-least-once: if an Anthropic batch is created and the future is canceled before the provider batch id is committed, an orphan provider batch can exist and the saved requests may be submitted again.
Client-side tool calls run inline in both executors, because tools are local
Rust code rather than provider-batchable work. The batch executor therefore does
not checkpoint a workflow between running its tools and resuming; if the worker
crashes mid-tool, the workflow replays from its last persisted Anthropic resume
and the tools execute again. Tool execution is thus at-least-once. Tools that
have side effects should be idempotent with respect to their [ToolCallId],
which is stable across replay for exactly this purpose.
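One way to make a side-effecting tool safe under at-least-once execution is to key a ledger of completed work on the call id. The sketch below is hypothetical: `EmailTool` is invented for illustration, the call id is a plain string rather than the crate's [ToolCallId] type, and the ledger is an in-memory map, whereas a real tool would need durable storage for it to survive a crash.

```rust
use std::collections::HashMap;

/// Hypothetical side-effecting tool that remembers which calls it has served.
struct EmailTool {
    /// Ledger of completed calls: call id -> receipt returned the first time.
    sent: HashMap<String, String>,
    /// Number of times the side effect actually happened.
    deliveries: usize,
}

impl EmailTool {
    fn new() -> Self {
        Self { sent: HashMap::new(), deliveries: 0 }
    }

    /// Perform the side effect at most once per call id.
    fn run(&mut self, call_id: &str, body: &str) -> String {
        // Replay: this call id was already handled, so return the old receipt.
        if let Some(receipt) = self.sent.get(call_id) {
            return receipt.clone();
        }
        // First execution: do the work and record it under the call id.
        self.deliveries += 1;
        let receipt = format!("sent #{}: {}", self.deliveries, body);
        self.sent.insert(call_id.to_string(), receipt.clone());
        receipt
    }
}
```

A replayed call then returns the original receipt without repeating the delivery, which is the behavior a stable call id is meant to enable.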