
# duroxide

> **Notice:** This is an experimental project exploring how AI can help build and iteratively improve a moderately complex framework. It's intended for learning and fun, not for production use (yet).

Deterministic task orchestration in Rust, inspired by Durable Task.

**Latest Release:** v0.1.5 — Worker queue visibility control, reduced default long poll timeout. See `CHANGELOG.md` for release notes.
## What you can build with this (inspired by .NET Durable Task/Durable Functions patterns)
- Function chaining: model a multi-step process as sequential awaits where each step depends on prior results.
- Fan-out/fan-in: schedule many activities in parallel and deterministically aggregate results.
- Human interaction (external events): wait for out-of-band approvals, callbacks, or webhooks and then resume.
- Durable timers and deadlines: sleep for minutes/hours/days without holding threads; resume exactly once after timeouts. Use `ctx.schedule_timer()` for orchestration-level delays. Activities can sleep/poll internally as part of their work (e.g., provisioning resources).
- Saga-style compensation: on failure, run compensating actions to roll back prior steps.
- Built-in activity retry: `ctx.schedule_activity_with_retry()` with configurable backoff (exponential, linear, fixed) and per-attempt timeouts.
These scenarios mirror the officially documented Durable Task/Durable Functions application patterns and are enabled here by deterministic replay, correlation IDs, durable timers, and external event handling.
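The retry backoff strategies listed above can be sketched in plain Rust. Note that `BackoffPolicy` and `delay_for_attempt` are illustrative names for this sketch, not duroxide's actual retry API:

```rust
use std::time::Duration;

/// Hypothetical backoff policy mirroring the three strategies the README
/// names (exponential, linear, fixed); the shape is an assumption.
enum BackoffPolicy {
    Fixed { delay: Duration },
    Linear { base: Duration },
    Exponential { base: Duration, factor: u32 },
}

impl BackoffPolicy {
    /// Delay to wait before retry attempt `n` (1-based).
    fn delay_for_attempt(&self, n: u32) -> Duration {
        match self {
            BackoffPolicy::Fixed { delay } => *delay,
            BackoffPolicy::Linear { base } => *base * n,
            BackoffPolicy::Exponential { base, factor } => *base * factor.pow(n - 1),
        }
    }
}

fn main() {
    let exp = BackoffPolicy::Exponential { base: Duration::from_secs(1), factor: 2 };
    assert_eq!(exp.delay_for_attempt(1), Duration::from_secs(1));
    assert_eq!(exp.delay_for_attempt(4), Duration::from_secs(8));

    let lin = BackoffPolicy::Linear { base: Duration::from_secs(2) };
    assert_eq!(lin.delay_for_attempt(3), Duration::from_secs(6));
}
```

A fixed upper bound on the delay (cap) is a common addition to such policies; whether duroxide applies one is not stated here.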
## Getting started samples

- Start here: See `docs/ORCHESTRATION-GUIDE.md` for a complete guide to writing workflows
- Quick start: Run `cargo run --example hello_world` to see Duroxide in action
- Advanced patterns: Check `tests/e2e_samples.rs` for comprehensive usage patterns
- Provider implementation: See `docs/provider-implementation-guide.md` for building custom providers
- Provider testing: See `docs/provider-testing-guide.md` for testing custom providers
- Observability: See `docs/observability-guide.md` for structured logging and metrics
## What it is
- Deterministic orchestration core with correlated event IDs and replay safety
- Message-driven runtime built on Tokio, with two dispatchers:
  - OrchestrationDispatcher (orchestrator queue) - processes workflow turns and durable timers
  - WorkDispatcher (worker queue) - executes activities with automatic lock renewal
- Storage-agnostic via the `Provider` trait (SQLite provider with in-memory and file-based modes)
- Configurable polling frequency, lock timeouts, and automatic lock renewal for long-running activities
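To make the storage-agnostic idea concrete, here is a deliberately simplified sketch of a provider-style abstraction. The real `Provider` trait has async, queue, and locking methods; the `HistoryStore` trait and `InMemoryStore` type below are illustrative names invented for this sketch:

```rust
use std::collections::HashMap;

/// Simplified stand-in for a storage abstraction: the runtime only talks to
/// this trait, so any backend (SQLite, in-memory, ...) can plug in.
trait HistoryStore {
    fn append(&mut self, instance: &str, event: String) -> Result<(), String>;
    fn read(&self, instance: &str) -> Vec<String>;
}

/// In-memory implementation, analogous in spirit to the SQLite provider's
/// in-memory mode.
struct InMemoryStore {
    histories: HashMap<String, Vec<String>>,
}

impl HistoryStore for InMemoryStore {
    fn append(&mut self, instance: &str, event: String) -> Result<(), String> {
        self.histories.entry(instance.to_string()).or_default().push(event);
        Ok(())
    }

    fn read(&self, instance: &str) -> Vec<String> {
        self.histories.get(instance).cloned().unwrap_or_default()
    }
}

fn main() {
    let mut store = InMemoryStore { histories: HashMap::new() };
    store.append("inst-1", "ActivityScheduled".into()).unwrap();
    store.append("inst-1", "ActivityCompleted".into()).unwrap();
    assert_eq!(store.read("inst-1").len(), 2);
    assert!(store.read("unknown").is_empty());
}
```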
## How it works (brief)

- The orchestrator runs turn-by-turn: each turn, it is polled once, may schedule actions, and then the runtime waits for completions.
- Every operation has a correlation id. Scheduling is recorded as history events (e.g., `ActivityScheduled`) and completions are matched by id (e.g., `ActivityCompleted`).
- The runtime appends new events and advances turns; work is routed through provider-backed queues (Orchestrator and Worker) using peek-lock semantics. Timers are handled via the orchestrator queue with delayed visibility.
- Deterministic future aggregation: `ctx.select2`, `ctx.select(Vec)`, and `ctx.join(Vec)` resolve by earliest completion index in history (not polling order).
- Logging is replay-safe via `tracing` and the `ctx.trace_*` helpers (implemented as a deterministic system activity). We do not persist trace events in history.
- Providers enforce a history cap (default 1024; tests use a smaller cap). If an append would exceed the cap, they return an error; the runtime fails the run to preserve determinism (no truncation).
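The "resolve by earliest completion index in history" rule above can be sketched with a toy event model. The `Event` variants mirror the README's `ActivityScheduled`/`ActivityCompleted` naming, but the fields and the `select_winner` helper are illustrative, not duroxide's API:

```rust
/// Toy history events with correlation ids.
#[derive(Debug, PartialEq)]
enum Event {
    ActivityScheduled { id: u64 },
    ActivityCompleted { id: u64 },
}

/// Resolve a "select" over several pending correlation ids the way the
/// README describes: the winner is whichever pending id completed earliest
/// in *history order*, regardless of the order the caller listed them in.
fn select_winner(history: &[Event], pending: &[u64]) -> Option<u64> {
    history.iter().find_map(|e| match e {
        Event::ActivityCompleted { id } if pending.contains(id) => Some(*id),
        _ => None,
    })
}

fn main() {
    let history = vec![
        Event::ActivityScheduled { id: 1 },
        Event::ActivityScheduled { id: 2 },
        Event::ActivityCompleted { id: 2 },
        Event::ActivityCompleted { id: 1 },
    ];
    // Even though the caller lists id 1 first, id 2 completed earlier in
    // history, so it deterministically wins on every replay.
    assert_eq!(select_winner(&history, &[1, 2]), Some(2));
    // Nothing pending has completed yet -> no winner.
    assert_eq!(select_winner(&history, &[3]), None);
}
```

Because the decision depends only on the persisted history, replaying the same history always picks the same winner, which is what makes `select`/`select2` replay-safe.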
## Key types

- `OrchestrationContext`: schedules work (`schedule_activity`, `schedule_timer`, `schedule_wait`, `schedule_sub_orchestration`, `schedule_orchestration`) and exposes deterministic `select2`/`select`/`join`, `trace_*`, and `continue_as_new`.
- `DurableFuture`: returned by `schedule_*`; use `into_activity()`, `into_timer()`, `into_event()`, `into_sub_orchestration()` (and `_typed` variants) to await.
- `Event`/`Action`: immutable history entries and host-side actions, including `ContinueAsNew`.
- `Provider`: persistence + queues abstraction with atomic operations and lock renewal (`SqliteProvider` with in-memory and file-based modes).
- `RuntimeOptions`: configure concurrency, lock timeouts, and the lock renewal buffer for long-running activities.
- `OrchestrationRegistry`/`ActivityRegistry`: register orchestrations/activities in-memory.
## Project layout

- `src/lib.rs` — orchestration primitives and single-turn executor
- `src/runtime/` — runtime, registries, workers, and polling engine
- `src/providers/` — SQLite history store with transactional support
- `tests/` — unit and e2e tests (see `e2e_samples.rs` to learn by example)
## Install (when published)

```toml
[dependencies]
duroxide = { version = "0.1", features = ["sqlite"] } # With bundled SQLite provider
# OR
duroxide = "0.1" # Core only, bring your own Provider implementation
```
## Hello world (activities + runtime)

Note: This example uses the bundled SQLite provider. Enable the `sqlite` feature in your `Cargo.toml`.
The full listing is in `examples/hello_world.rs` (run it with `cargo run --example hello_world`); it wires an `ActivityRegistry` and a `SqliteProvider` into the runtime.
## Parallel fan-out (DTF-style greetings)

See `sample_dtf_legacy_gabbar_greetings` in `tests/e2e_samples.rs` for the full fan-out/fan-in listing.
## Control flow + timers + externals

The full listing (which uses `DurableOutput`) is in `tests/e2e_samples.rs`.
## Error handling (`Result<String, String>`)

See `tests/e2e_samples.rs` for the full listing.
## ContinueAsNew and multi-execution

- Use `ctx.continue_as_new(new_input)` to end the current execution and immediately start a fresh execution with the provided input.
- Providers keep all executions' histories in the SQLite database with full ACID transactional support.
- The initial `start_orchestration` handle resolves with an empty success when `ContinueAsNew` occurs; the latest execution can be observed via status APIs.
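The execution-rollover behavior can be modeled in a few lines. This is a toy model of the bookkeeping, not duroxide's implementation; `Instance`, its fields, and the event strings are invented for illustration:

```rust
use std::collections::HashMap;

/// Toy model of ContinueAsNew bookkeeping: each execution has its own
/// history, and the provider retains every execution's history.
struct Instance {
    executions: HashMap<u32, Vec<String>>, // execution id -> history
    current: u32,
}

impl Instance {
    fn start(input: &str) -> Self {
        let mut executions = HashMap::new();
        executions.insert(1, vec![format!("Started({input})")]);
        Instance { executions, current: 1 }
    }

    /// End the current execution and immediately begin a fresh one with the
    /// provided input, as `ctx.continue_as_new(new_input)` does.
    fn continue_as_new(&mut self, new_input: &str) {
        self.executions
            .get_mut(&self.current)
            .expect("current execution exists")
            .push("ContinuedAsNew".into());
        self.current += 1;
        self.executions
            .insert(self.current, vec![format!("Started({new_input})")]);
    }
}

fn main() {
    let mut inst = Instance::start("n=0");
    inst.continue_as_new("n=1");
    inst.continue_as_new("n=2");
    assert_eq!(inst.current, 3);          // third execution is now active
    assert_eq!(inst.executions.len(), 3); // all histories retained
}
```

Because each execution restarts with a short history, this pattern keeps long-lived loops under the provider's history cap.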
## Status and control-plane

- `Client::get_orchestration_status(instance)` -> `Result<OrchestrationStatus, ClientError>`, where `OrchestrationStatus` is Running | Completed { output } | Failed { details: ErrorDetails } | NotFound
- `Client::wait_for_orchestration(instance, timeout)` - wait for completion with a timeout; returns `Result<OrchestrationStatus, ClientError>`
- The SQLite provider exposes execution-aware methods (`list_executions`, `read_with_execution`, etc.) for diagnostics.
## Error classification
- Infrastructure errors: Provider failures, data corruption (retryable by runtime, abort turn)
- Configuration errors: Unregistered orchestrations/activities, nondeterminism (require deployment fix, abort turn)
- Application errors: Business logic failures (handled by orchestration code, propagate normally)
- Use `ErrorDetails::category()` for metrics, `is_retryable()` for retry logic, `display_message()` for logging
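The three-way classification above maps naturally onto an enum. The method names (`category`, `is_retryable`) follow the README, but the variant shapes here are an assumption made for this sketch, not duroxide's actual `ErrorDetails` type:

```rust
/// Illustrative mirror of the three error categories.
#[derive(Debug)]
enum ErrorDetails {
    Infrastructure(String), // provider failures, corruption: retryable by runtime
    Configuration(String),  // unregistered names, nondeterminism: needs a deploy fix
    Application(String),    // business-logic failures: propagate to the caller
}

impl ErrorDetails {
    /// Stable label, e.g. for metrics dimensions.
    fn category(&self) -> &'static str {
        match self {
            ErrorDetails::Infrastructure(_) => "infrastructure",
            ErrorDetails::Configuration(_) => "configuration",
            ErrorDetails::Application(_) => "application",
        }
    }

    /// Only infrastructure errors are worth retrying; configuration and
    /// application errors would fail identically on every attempt.
    fn is_retryable(&self) -> bool {
        matches!(self, ErrorDetails::Infrastructure(_))
    }
}

fn main() {
    let e = ErrorDetails::Infrastructure("sqlite busy".into());
    assert_eq!(e.category(), "infrastructure");
    assert!(e.is_retryable());
    assert!(!ErrorDetails::Application("payment declined".into()).is_retryable());
}
```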
## Local development

- Build: `cargo build`
- Test everything: `cargo test --all -- --nocapture`
- Run a specific test: `cargo test --test e2e_samples sample_dtf_legacy_gabbar_greetings -- --nocapture`
## Stress testing

- Run stress tests: `./run-stress-tests.sh`
- Run with result tracking: `./run-stress-tests.sh --track` (saves to `stress-test-results.md`)
- Tracked results include commit history, performance metrics, and rolling averages
- See `sqlite-stress/README.md` for details
## Runtime Configuration

- Configure lock timeouts, concurrency, and polling via `RuntimeOptions`
- Worker lock renewal is automatically enabled for long-running activities (no configuration needed)
- Example: `RuntimeOptions { worker_lock_timeout: Duration::from_secs(300), worker_lock_renewal_buffer: Duration::from_secs(30), ... }`
- Lock renewal happens at `(timeout - buffer)` for timeouts ≥ 15s, or at `0.5 * timeout` for shorter timeouts
- See the API docs for `RuntimeOptions` for complete configuration options
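The renewal schedule described above is a simple piecewise rule; a minimal sketch (the function name `renewal_point` is invented here):

```rust
use std::time::Duration;

/// When to renew a worker lock, per the rule above: at (timeout - buffer)
/// for timeouts of at least 15s, otherwise at half the timeout.
fn renewal_point(timeout: Duration, buffer: Duration) -> Duration {
    if timeout >= Duration::from_secs(15) {
        // saturating_sub guards against a buffer larger than the timeout.
        timeout.saturating_sub(buffer)
    } else {
        timeout / 2
    }
}

fn main() {
    // Long-running activity: 300s lock, 30s buffer -> renew at 270s.
    assert_eq!(
        renewal_point(Duration::from_secs(300), Duration::from_secs(30)),
        Duration::from_secs(270)
    );
    // Short lock: 10s -> renew at 5s, regardless of the buffer.
    assert_eq!(
        renewal_point(Duration::from_secs(10), Duration::from_secs(30)),
        Duration::from_secs(5)
    );
}
```

The buffer exists so the renewal lands comfortably before the lock expires even under scheduling jitter; the half-timeout fallback keeps very short locks from renewing too late.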
## Observability

- Enable structured logging: `RuntimeOptions { observability: ObservabilityConfig { log_format: LogFormat::Compact, ... }, ... }`
- Default logs show only orchestration/activity traces at the configured level; runtime internals log at warn and above
- Override with `RUST_LOG` for additional targets (e.g., `RUST_LOG=duroxide::runtime=debug`)
- All logs include `instance_id`, `execution_id`, `orchestration_name`, and `activity_name` for correlation
- Optional OpenTelemetry metrics via the `observability` feature flag
- Run `cargo run --example with_observability` to see structured logging in action
- Run `cargo run --example metrics_cli` to see the observability dashboard
- See `docs/observability-guide.md` for the complete guide
## Notes

- Import as `duroxide` in Rust source.
- Timers are real time (Tokio sleep). External events are raised via `Runtime::raise_event`.
- Unknown-instance messages are logged and dropped. Providers persist history only (queues are in-memory runtime components).
- Logging is replay-safe by treating it as a system activity via the `ctx.trace_*` helpers; logs are emitted through tracing at completion time (not persisted as history events).
## Provenance

- This codebase was generated entirely by AI, with explicit guidance and reviews from a human.