durable-execution-sdk 0.1.0-alpha2

AWS Durable Execution SDK for Lambda Rust Runtime

AWS Durable Execution SDK for Rust

Build reliable, long-running workflows on AWS Lambda with automatic checkpointing and replay.

Overview

The AWS Durable Execution SDK lets Rust developers build workflows that survive Lambda restarts, timeouts, and failures. Every operation is checkpointed automatically: if Lambda restarts, execution resumes from the last completed step, and operations that already completed return their cached results instead of running again. Workflows can run for up to one year, pausing and resuming across invocations.
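
The checkpoint-and-replay behavior described above can be pictured with a small in-memory model. This sketch is not the SDK's implementation: `Journal`, its `step` method, and `run_workflow` are hypothetical names, and a `HashMap` stands in for the durable checkpoint store. It only illustrates why a resumed run skips work that has already completed.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the durable checkpoint store (hypothetical).
#[derive(Default)]
struct Journal {
    results: HashMap<String, String>,
}

impl Journal {
    /// Runs `work` only if no checkpoint exists for `name`; otherwise
    /// returns the cached result without calling `work` again.
    fn step<F>(&mut self, name: &str, work: F) -> String
    where
        F: FnOnce() -> String,
    {
        if let Some(cached) = self.results.get(name) {
            return cached.clone(); // replay: skip the work
        }
        let result = work();
        self.results.insert(name.to_string(), result.clone()); // checkpoint
        result
    }
}

/// Runs a two-step workflow and returns how many steps actually executed.
fn run_workflow(journal: &mut Journal) -> u32 {
    let mut executed = 0;
    journal.step("validate", || {
        executed += 1;
        "ok".to_string()
    });
    journal.step("charge", || {
        executed += 1;
        "pay_123".to_string()
    });
    executed
}
```

Calling `run_workflow` twice against the same `Journal` executes both steps on the first call and none on the second, because every result is already checkpointed.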

Installation

[dependencies]
durable-execution-sdk = "0.1.0-alpha2"
tokio = { version = "1.0", features = ["full"] }
serde = { version = "1.0", features = ["derive"] }

Quick start

use durable_execution_sdk::{durable_execution, DurableContext, DurableError, Duration};
use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
struct OrderEvent {
    order_id: String,
    amount: f64,
}

#[derive(Serialize)]
struct OrderResult {
    status: String,
    order_id: String,
}

#[durable_execution]
async fn process_order(event: OrderEvent, ctx: DurableContext) -> Result<OrderResult, DurableError> {
    // Step 1: Validate (automatically checkpointed)
    let is_valid: bool = ctx.step(|_| Ok(true), None).await?;

    // Step 2: Process payment (automatically checkpointed)
    let payment_id: String = ctx.step(|_| Ok("pay_123".to_string()), None).await?;

    // Step 3: Wait for confirmation (suspends Lambda, resumes later)
    ctx.wait(Duration::from_seconds(5), Some("payment_confirmation")).await?;

    Ok(OrderResult {
        status: "completed".to_string(),
        order_id: event.order_id,
    })
}

Core operations

Operation                                Description
step() / step_named()                    Execute and checkpoint a unit of work
wait()                                   Pause execution for a duration (suspends Lambda)
wait_for_condition()                     Poll a condition with configurable retry
create_callback() / wait_for_callback()  Wait for external systems to signal completion
invoke()                                 Call other durable Lambda functions
map()                                    Process collections in parallel with concurrency limits
parallel()                               Execute multiple independent operations concurrently
run_in_child_context()                   Create isolated nested workflows
all() / any() / race() / all_settled()   Promise-style combinators

Step semantics

Two execution modes for different use cases:

  • AtLeastOncePerRetry (default): the checkpoint is written after execution. The step may run again if it is interrupted before the checkpoint, but a completed result is always persisted.
  • AtMostOncePerRetry: the checkpoint is written before execution. Guarantees at-most-once execution per retry attempt, which suits non-idempotent operations.

use durable_execution_sdk::{StepConfig, StepSemantics};

let config = StepConfig {
    step_semantics: StepSemantics::AtMostOncePerRetry,
    ..Default::default()
};
let result = ctx.step(|_| Ok(42), Some(config)).await?;
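
To make the difference between the two modes concrete, the following sketch simulates one step that is interrupted after its side effect runs but before its result is persisted. It is an illustrative model only: `Semantics`, `StepRecord`, `attempt`, and `side_effects_after_interruption` are hypothetical names, and what the SDK does after detecting an interrupted at-most-once attempt (for example, handing it to the retry strategy) is not modeled here.

```rust
/// Mirrors the two modes named above (hypothetical local type).
#[derive(Clone, Copy, PartialEq)]
enum Semantics {
    AtLeastOncePerRetry,
    AtMostOncePerRetry,
}

/// Checkpoint state for a single step (in-memory model).
#[derive(Default)]
struct StepRecord {
    started: bool,
    result: Option<u32>,
}

/// Attempts the step once. `crash` simulates an interruption that happens
/// after the side effect ran but before the result was persisted.
fn attempt(
    rec: &mut StepRecord,
    semantics: Semantics,
    side_effects: &mut u32,
    crash: bool,
) -> Result<u32, &'static str> {
    if let Some(value) = rec.result {
        return Ok(value); // replay: result already checkpointed
    }
    if semantics == Semantics::AtMostOncePerRetry {
        if rec.started {
            return Err("step was started earlier; not re-executing");
        }
        rec.started = true; // checkpoint before execution
    }
    *side_effects += 1; // the non-idempotent work
    if crash {
        return Err("interrupted before result checkpoint");
    }
    rec.result = Some(42); // checkpoint after execution
    Ok(42)
}

/// Runs an interrupted attempt followed by a resumed one and returns
/// how many times the side effect ran.
fn side_effects_after_interruption(semantics: Semantics) -> u32 {
    let mut rec = StepRecord::default();
    let mut side_effects = 0;
    let _ = attempt(&mut rec, semantics, &mut side_effects, true);
    let _ = attempt(&mut rec, semantics, &mut side_effects, false);
    side_effects
}
```

In this model the side effect runs twice under `AtLeastOncePerRetry` and once under `AtMostOncePerRetry`, which is why the latter is the safer choice for operations such as charging a card.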

Parallel processing

use durable_execution_sdk::MapConfig;

let results = ctx.map(
    vec![1, 2, 3, 4, 5],
    |child_ctx, item, _index| async move {
        child_ctx.step(|_| Ok(item * 2), None).await
    },
    Some(MapConfig {
        max_concurrency: Some(3),
        ..Default::default()
    }),
).await?;
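
The effect of a concurrency limit can be sketched with the standard library alone. The helper below is a hypothetical `map_bounded`, not part of the SDK: it uses OS threads instead of async child contexts, writes no checkpoints, and processes items in fixed batches of `max_concurrency` rather than starting a new item as soon as one finishes. It only shows the idea of capping in-flight work while keeping results in input order.

```rust
use std::thread;

/// Applies `f` to every item, running at most `max_concurrency` items at a
/// time, and returns the results in input order.
fn map_bounded<T, R, F>(items: Vec<T>, max_concurrency: usize, f: F) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(T, usize) -> R + Sync,
{
    let limit = max_concurrency.max(1); // treat 0 as 1 to guarantee progress
    let mut results = Vec::with_capacity(items.len());
    let mut indexed = items.into_iter().enumerate().peekable();

    while indexed.peek().is_some() {
        // Take the next batch of at most `limit` items.
        let batch: Vec<(usize, T)> = indexed.by_ref().take(limit).collect();
        let f = &f;
        let batch_results: Vec<R> = thread::scope(|scope| {
            // Spawn one scoped thread per item in the batch.
            let handles: Vec<_> = batch
                .into_iter()
                .map(|(index, item)| scope.spawn(move || f(item, index)))
                .collect();
            // Join in spawn order so results stay in input order.
            handles
                .into_iter()
                .map(|handle| handle.join().expect("worker panicked"))
                .collect()
        });
        results.extend(batch_results);
    }
    results
}
```

With the inputs from the example above, `map_bounded(vec![1, 2, 3, 4, 5], 3, |item, _index| item * 2)` runs a batch of three items and then a batch of two.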

Callbacks

use durable_execution_sdk::CallbackConfig;

// Recommended: wait_for_callback combines creation with replay-safe notification
let approval: ApprovalResponse = ctx.wait_for_callback(
    |callback_id| async move {
        notify_approver(&callback_id).await?;
        Ok::<(), Box<dyn std::error::Error + Send + Sync>>(())
    },
    Some(CallbackConfig {
        timeout: Duration::from_hours(24),
        ..Default::default()
    }),
).await?;

Retry strategies

use durable_execution_sdk::{StepConfig, ExponentialBackoff, FixedDelay, LinearBackoff};

let config = StepConfig {
    retry_strategy: Some(Box::new(ExponentialBackoff::new(3, 1000, 30000))),
    ..Default::default()
};
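
The arguments to `ExponentialBackoff::new` are not documented in this section. As a rough illustration, the sketch below assumes they mean maximum attempts, base delay in milliseconds, and maximum delay in milliseconds, and computes a plain doubling schedule with a cap. `backoff_delay_ms` and `backoff_schedule` are hypothetical helpers; the SDK's actual formula (including any jitter) may differ.

```rust
/// Delay in milliseconds before retry number `attempt` (1-based),
/// doubling from `base_ms` and capped at `max_ms`.
/// Assumption: plain doubling with no jitter.
fn backoff_delay_ms(attempt: u32, base_ms: u64, max_ms: u64) -> u64 {
    // Clamp the exponent so the shift below cannot overflow.
    let exponent = attempt.saturating_sub(1).min(63);
    let factor = 1u64 << exponent;
    base_ms.saturating_mul(factor).min(max_ms)
}

/// Delays for retries 1 through `max_attempts`.
fn backoff_schedule(max_attempts: u32, base_ms: u64, max_ms: u64) -> Vec<u64> {
    (1..=max_attempts)
        .map(|attempt| backoff_delay_ms(attempt, base_ms, max_ms))
        .collect()
}
```

Under these assumptions, `backoff_schedule(3, 1000, 30000)` yields delays of 1000, 2000, and 4000 milliseconds, and the 30000 ms cap would first take effect on the sixth retry.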

Error handling

use durable_execution_sdk::DurableError;

// Execution errors (workflow fails, no Lambda retry)
DurableError::execution("Invalid order");

// Invocation errors (triggers Lambda retry)
DurableError::invocation("Temporary failure");

Logging

ctx.log_info("Processing order started");
ctx.log_info_with("Processing order", &[
    ("order_id", "ORD-12345"),
    ("amount", "99.99"),
]);

All log messages automatically include durable_execution_arn, operation_id, parent_id, and is_replay for correlation.

Replay-safe helpers

use durable_execution_sdk::replay_safe::{uuid_string_from_operation, timestamp_from_execution};

// Deterministic UUID (same inputs = same UUID across replays)
let uuid = uuid_string_from_operation(&operation_id, 0);

// Replay-safe timestamp (execution start time, not wall clock)
let ts = timestamp_from_execution(ctx.state());
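
Why a deterministic identifier is replay-safe can be shown with a small sketch: if the value is derived only from the operation ID and an index, every replay computes the same string. The function below is a hypothetical `deterministic_id` built on an FNV-1a hash. It is not the algorithm used by `uuid_string_from_operation`, and its output is UUID-shaped but not an RFC-compliant UUID.

```rust
/// 64-bit FNV-1a hash of `bytes`, starting from `seed`.
fn fnv1a(seed: u64, bytes: &[u8]) -> u64 {
    let mut hash = seed;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV-1a 64-bit prime
    }
    hash
}

/// Derives a UUID-shaped string from an operation id and an index.
/// The same inputs always produce the same output.
fn deterministic_id(operation_id: &str, index: u32) -> String {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a 64-bit offset basis
    let mut input = operation_id.as_bytes().to_vec();
    input.extend_from_slice(&index.to_be_bytes());

    let hi = fnv1a(OFFSET_BASIS, &input);
    let lo = fnv1a(hi, &input); // chain the first hash as the second seed
    let hex = format!("{:016x}{:016x}", hi, lo);

    // Lay the 32 hex digits out in the familiar 8-4-4-4-12 grouping.
    format!(
        "{}-{}-{}-{}-{}",
        &hex[0..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..32]
    )
}
```

By contrast, a random UUID or a wall-clock timestamp generated inside workflow code would change on every replay, so values derived from them would no longer match what earlier runs checkpointed.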

Duration support

use durable_execution_sdk::Duration;

let d = Duration::from_seconds(30);
let d = Duration::from_minutes(5);
let d = Duration::from_hours(2);
let d = Duration::from_days(7);
let d = Duration::from_weeks(2);   // 14 days
let d = Duration::from_months(3);  // 90 days
let d = Duration::from_years(1);   // 365 days
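
The comments above imply fixed-length units: a week is 7 days, a month is 30 days, and a year is 365 days. The sketch below reproduces that arithmetic with a hypothetical `SketchDuration` type; it is not the SDK's `Duration`, whose internal representation is not described here.

```rust
/// Whole-second duration using the fixed-length units from the comments above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SketchDuration {
    seconds: u64,
}

impl SketchDuration {
    const MINUTE: u64 = 60;
    const HOUR: u64 = 60 * Self::MINUTE;
    const DAY: u64 = 24 * Self::HOUR;

    fn from_seconds(n: u64) -> Self {
        Self { seconds: n }
    }

    fn from_days(n: u64) -> Self {
        Self { seconds: n * Self::DAY }
    }

    /// A week is always 7 days.
    fn from_weeks(n: u64) -> Self {
        Self::from_days(7 * n)
    }

    /// A month is always 30 days, regardless of the calendar.
    fn from_months(n: u64) -> Self {
        Self::from_days(30 * n)
    }

    /// A year is always 365 days; leap years are ignored.
    fn from_years(n: u64) -> Self {
        Self::from_days(365 * n)
    }

    fn as_days(self) -> u64 {
        self.seconds / Self::DAY
    }
}
```

One consequence of fixed-length units is that twelve months (360 days) is shorter than one year (365 days), so the two should not be treated as interchangeable when setting long waits.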

Execution limits

Limit                       Value
Maximum execution duration  1 year
Maximum wait duration       1 year
Minimum wait duration       1 second
Maximum response payload    6 MB
Maximum checkpoint payload  256 KB
Maximum history size        25,000 operations
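
Some of these limits can be checked before work is submitted. The sketch below is a hypothetical pre-flight check, not an SDK feature: `validate_wait` and `validate_checkpoint` are illustrative names, one year is taken as 365 days, and 256 KB is assumed to mean 256 × 1024 bytes. Whether the SDK or the service enforces the limits in exactly this way is not stated in this document.

```rust
const MIN_WAIT_SECONDS: u64 = 1;
const MAX_WAIT_SECONDS: u64 = 365 * 24 * 60 * 60; // 1 year, taken as 365 days
const MAX_CHECKPOINT_BYTES: usize = 256 * 1024; // assumes 1 KB = 1024 bytes

/// Checks a wait duration against the minimum and maximum wait limits.
fn validate_wait(seconds: u64) -> Result<(), String> {
    if seconds < MIN_WAIT_SECONDS {
        return Err(format!(
            "wait of {}s is below the {}s minimum",
            seconds, MIN_WAIT_SECONDS
        ));
    }
    if seconds > MAX_WAIT_SECONDS {
        return Err(format!(
            "wait of {}s exceeds the {}s maximum",
            seconds, MAX_WAIT_SECONDS
        ));
    }
    Ok(())
}

/// Checks a serialized step result against the checkpoint payload limit.
fn validate_checkpoint(payload: &[u8]) -> Result<(), String> {
    if payload.len() > MAX_CHECKPOINT_BYTES {
        return Err(format!(
            "checkpoint of {} bytes exceeds the {} byte limit",
            payload.len(),
            MAX_CHECKPOINT_BYTES
        ));
    }
    Ok(())
}
```

A step that produces a large result could, for instance, store the data externally and checkpoint only a reference to stay under the payload limit.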

Crate structure

Crate                          Description
durable-execution-sdk          Core SDK with context, handlers, and client
durable-execution-sdk-macros   #[durable_execution] proc macro
durable-execution-sdk-testing  Local and cloud test runners

License

Apache-2.0