1//! # AWS Durable Execution SDK for Lambda Rust Runtime
2//!
3//! This SDK enables Rust developers to build reliable, long-running workflows
4//! in AWS Lambda with automatic checkpointing, replay, and state management.
5//!
6//! ## Overview
7//!
8//! The AWS Durable Execution SDK provides a framework for building workflows that can
9//! survive Lambda function restarts, timeouts, and failures. It automatically checkpoints
10//! the state of your workflow, allowing it to resume exactly where it left off after
11//! any interruption.
12//!
13//! ### Key Features
14//!
15//! - **Automatic Checkpointing**: Every operation is automatically checkpointed, ensuring
16//!   your workflow can resume from the last completed step.
17//! - **Replay Mechanism**: When a function resumes, completed operations return their
18//!   checkpointed results instantly without re-execution.
19//! - **Concurrent Operations**: Process collections in parallel with configurable
20//!   concurrency limits and failure tolerance.
21//! - **External Integration**: Wait for callbacks from external systems with configurable
22//!   timeouts.
23//! - **Type Safety**: Full Rust type safety with generics, trait-based abstractions,
24//!   and newtype wrappers for domain identifiers.
25//! - **Promise Combinators**: Coordinate multiple durable promises with `all`, `any`, `race`, and `all_settled`.
26//! - **Replay-Safe Helpers**: Generate deterministic UUIDs and timestamps that are safe for replay.
27//! - **Configurable Checkpointing**: Choose between eager, batched, or optimistic checkpointing modes.
28//! - **Trait Aliases**: Cleaner function signatures with [`DurableValue`] and [`StepFn`] trait aliases.
29//! - **Sealed Traits**: Internal traits are sealed to allow API evolution without breaking changes.
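//!
//! The checkpoint-and-replay behaviour listed above can be pictured with a small,
//! SDK-independent sketch. `CheckpointLog`, `restart`, and the `executed` counter are
//! hypothetical names used only for illustration; the SDK itself persists checkpoints
//! through the Lambda service rather than an in-memory map.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! // Hypothetical in-memory stand-in for the checkpoint store.
//! struct CheckpointLog {
//!     results: HashMap<u64, String>, // operation index -> saved result
//!     next_op: u64,                  // position in the operation sequence
//!     executed: u32,                 // how many step bodies actually ran
//! }
//!
//! impl CheckpointLog {
//!     fn new() -> Self {
//!         CheckpointLog { results: HashMap::new(), next_op: 0, executed: 0 }
//!     }
//!
//!     // Simulates a fresh invocation: the sequence restarts, saved results remain.
//!     fn restart(&mut self) {
//!         self.next_op = 0;
//!     }
//!
//!     // Runs `f` only if this position has no saved result yet.
//!     fn step<F: FnOnce() -> String>(&mut self, f: F) -> String {
//!         let id = self.next_op;
//!         self.next_op += 1;
//!         if let Some(saved) = self.results.get(&id) {
//!             return saved.clone(); // replay: skip re-execution
//!         }
//!         let value = f();
//!         self.executed += 1;
//!         self.results.insert(id, value.clone());
//!         value
//!     }
//! }
//! ```
//!
//! After `restart`, calling `step` at the same position returns the saved value and
//! leaves `executed` unchanged, which is why the sequence of operations has to be the
//! same on every run.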
30//!
31//! ## Important Documentation
32//!
33//! Before writing durable workflows, please read:
34//!
35//! - [`docs::determinism`]: **Critical** - Understanding determinism requirements for replay-safe workflows
36//! - [`docs::limits`]: Execution limits and constraints you need to know
37//!
38//! ## Getting Started
39//!
40//! Add the SDK to your `Cargo.toml`:
41//!
42//! ```toml
43//! [dependencies]
44//! durable-execution-sdk = "0.1"
45//! tokio = { version = "1.0", features = ["full"] }
46//! serde = { version = "1.0", features = ["derive"] }
47//! ```
48//!
49//! ### Basic Workflow Example
50//!
51//! Here's a simple workflow that processes an order:
52//!
53//! ```rust,ignore
54//! use durable_execution_sdk::{durable_execution, DurableContext, DurableError, Duration};
55//! use serde::{Deserialize, Serialize};
56//!
57//! #[derive(Deserialize)]
58//! struct OrderEvent {
59//!     order_id: String,
60//!     amount: f64,
61//! }
62//!
63//! #[derive(Serialize)]
64//! struct OrderResult {
65//!     status: String,
66//!     order_id: String,
67//! }
68//!
69//! #[durable_execution]
70//! async fn process_order(event: OrderEvent, ctx: DurableContext) -> Result<OrderResult, DurableError> {
71//!     // Step 1: Validate the order (checkpointed automatically)
72//!     let is_valid: bool = ctx.step(|_step_ctx| {
73//!         // Validation logic here
74//!         Ok(true)
75//!     }, None).await?;
76//!
77//!     if !is_valid {
78//!         return Err(DurableError::execution("Invalid order"));
79//!     }
80//!
81//!     // Step 2: Process payment (checkpointed automatically)
82//!     let payment_id: String = ctx.step(|_step_ctx| {
83//!         // Payment processing logic here
84//!         Ok("pay_123".to_string())
85//!     }, None).await?;
86//!
87//!     // Step 3: Wait for payment confirmation (suspends Lambda, resumes later)
88//!     ctx.wait(Duration::from_seconds(5), Some("payment_confirmation")).await?;
89//!
90//!     // Step 4: Complete the order
91//!     Ok(OrderResult {
92//!         status: "completed".to_string(),
93//!         order_id: event.order_id,
94//!     })
95//! }
96//! ```
97//!
98//! ## Core Concepts
99//!
100//! ### DurableContext
101//!
102//! The [`DurableContext`] is the main interface for durable operations. It provides:
103//!
104//! - [`step`](DurableContext::step): Execute and checkpoint a unit of work
105//! - [`wait`](DurableContext::wait): Pause execution for a specified duration
106//! - [`create_callback`](DurableContext::create_callback): Wait for external systems to signal completion
107//! - [`invoke`](DurableContext::invoke): Call other durable Lambda functions
108//! - [`map`](DurableContext::map): Process collections in parallel
109//! - [`parallel`](DurableContext::parallel): Execute multiple operations concurrently
110//! - [`run_in_child_context`](DurableContext::run_in_child_context): Create isolated nested workflows
111//!
112//! ### Steps
113//!
114//! Steps are the fundamental unit of work in durable executions. Each step is
115//! automatically checkpointed, allowing the workflow to resume from the last
116//! completed step after interruptions.
117//!
118//! ```rust,ignore
119//! // Simple step
120//! let result: i32 = ctx.step(|_| Ok(42), None).await?;
121//!
122//! // Named step for better debugging
123//! let result: String = ctx.step_named("fetch_data", |_| {
124//!     Ok("data".to_string())
125//! }, None).await?;
126//!
127//! // Step with custom configuration
128//! use durable_execution_sdk::{StepConfig, StepSemantics};
129//!
130//! let config = StepConfig {
131//!     step_semantics: StepSemantics::AtMostOncePerRetry,
132//!     ..Default::default()
133//! };
134//! let result: i32 = ctx.step(|_| Ok(42), Some(config)).await?;
135//! ```
136//!
137//! ### Step Semantics
138//!
139//! The SDK supports two execution semantics for steps:
140//!
//! - **AtLeastOncePerRetry** (default): Checkpoint after execution. If the function is
//!   interrupted before that checkpoint is written, the step runs again on the next
//!   attempt, so it may execute more than once; its result is checkpointed once it completes.
143//! - **AtMostOncePerRetry**: Checkpoint before execution. Guarantees the step
144//!   executes at most once per retry, useful for non-idempotent operations.
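//!
//! The difference between the two orderings shows up when an attempt is interrupted.
//! The following is a simplified model, not the SDK's implementation: `Semantics` and
//! `body_runs_with_crash` are hypothetical, and the model assumes the first attempt
//! crashes after the step body ran but before its result was checkpointed. How the SDK
//! reports an interrupted at-most-once step is not covered here.
//!
//! ```rust
//! #[derive(Clone, Copy, PartialEq, Debug)]
//! enum Semantics {
//!     AtLeastOnce,
//!     AtMostOnce,
//! }
//!
//! // Counts how often the step body runs when the first attempt crashes right
//! // after the body finished but before its result was checkpointed.
//! fn body_runs_with_crash(semantics: Semantics) -> u32 {
//!     let mut start_recorded = false;
//!     let mut runs = 0;
//!
//!     // Attempt 1: at-most-once writes a checkpoint BEFORE executing.
//!     if semantics == Semantics::AtMostOnce {
//!         start_recorded = true;
//!     }
//!     runs += 1; // the body executes, then the process dies
//!
//!     // Attempt 2 (replay): no result checkpoint exists.
//!     if !start_recorded {
//!         runs += 1; // at-least-once: nothing on record, so the body runs again
//!     }
//!     // at-most-once: the start marker shows the body already ran, so this
//!     // model does not execute it a second time within the same retry.
//!     runs
//! }
//! ```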
145//!
146//! ### Wait Operations
147//!
//! Wait operations suspend the Lambda execution and resume it after the specified
//! duration. This is efficient because the function does not hold Lambda resources
//! while it waits.
150//!
151//! ```rust,ignore
152//! use durable_execution_sdk::Duration;
153//!
154//! // Wait for 5 seconds
155//! ctx.wait(Duration::from_seconds(5), None).await?;
156//!
157//! // Wait for 1 hour with a name
158//! ctx.wait(Duration::from_hours(1), Some("wait_for_approval")).await?;
159//! ```
160//!
161//! ### Callbacks
162//!
163//! Callbacks allow external systems to signal your workflow. Create a callback,
164//! share the callback ID with an external system, and wait for the result.
165//!
166//! ```rust,ignore
//! use durable_execution_sdk::{CallbackConfig, Duration};
168//!
169//! // Create a callback with 24-hour timeout
170//! let callback = ctx.create_callback::<ApprovalResponse>(Some(CallbackConfig {
171//!     timeout: Duration::from_hours(24),
172//!     ..Default::default()
173//! })).await?;
174//!
175//! // Share callback.callback_id with external system
176//! notify_approver(&callback.callback_id).await?;
177//!
178//! // Wait for the callback result (suspends until callback is received)
179//! let approval = callback.result().await?;
180//! ```
181//!
182//! ### Parallel Processing
183//!
184//! Process collections in parallel with configurable concurrency and failure tolerance:
185//!
186//! ```rust,ignore
187//! use durable_execution_sdk::{MapConfig, CompletionConfig};
188//!
189//! // Process items with max 5 concurrent executions
190//! let results = ctx.map(
191//!     vec![1, 2, 3, 4, 5],
192//!     |child_ctx, item, index| async move {
193//!         child_ctx.step(|_| Ok(item * 2), None).await
194//!     },
195//!     Some(MapConfig {
196//!         max_concurrency: Some(5),
197//!         completion_config: CompletionConfig::all_successful(),
198//!         ..Default::default()
199//!     }),
200//! ).await?;
201//!
202//! // Get all successful results
203//! let values = results.get_results()?;
204//! ```
205//!
206//! ### Parallel Branches
207//!
208//! Execute multiple independent operations concurrently:
209//!
210//! ```rust,ignore
211//! use durable_execution_sdk::ParallelConfig;
212//!
213//! let results = ctx.parallel(
214//!     vec![
215//!         |ctx| Box::pin(async move { ctx.step(|_| Ok("a"), None).await }),
216//!         |ctx| Box::pin(async move { ctx.step(|_| Ok("b"), None).await }),
217//!         |ctx| Box::pin(async move { ctx.step(|_| Ok("c"), None).await }),
218//!     ],
219//!     None,
220//! ).await?;
221//! ```
222//!
223//! ### Promise Combinators
224//!
225//! The SDK provides promise combinators for coordinating multiple durable operations:
226//!
227//! ```rust,ignore
//! use durable_execution_sdk::{DurableContext, DurableError};
229//!
230//! async fn coordinate_operations(ctx: &DurableContext) -> Result<(), DurableError> {
231//!     // Wait for ALL operations to complete successfully
232//!     let results = ctx.all(vec![
233//!         ctx.step(|_| Ok(1), None),
234//!         ctx.step(|_| Ok(2), None),
235//!         ctx.step(|_| Ok(3), None),
236//!     ]).await?;
237//!     // results = [1, 2, 3]
238//!
239//!     // Wait for ALL operations to settle (success or failure)
240//!     let batch_result = ctx.all_settled(vec![
241//!         ctx.step(|_| Ok("success"), None),
242//!         ctx.step(|_| Err("failure".into()), None),
243//!     ]).await;
244//!     // batch_result contains both success and failure outcomes
245//!
246//!     // Return the FIRST operation to settle (success or failure)
247//!     let first = ctx.race(vec![
248//!         ctx.step(|_| Ok("fast"), None),
249//!         ctx.step(|_| Ok("slow"), None),
250//!     ]).await?;
251//!
252//!     // Return the FIRST operation to succeed
253//!     let first_success = ctx.any(vec![
254//!         ctx.step(|_| Err("fail".into()), None),
255//!         ctx.step(|_| Ok("success"), None),
256//!     ]).await?;
257//!
258//!     Ok(())
259//! }
260//! ```
261//!
262//! ### Accessing Original Input
263//!
264//! Access the original input that started the execution:
265//!
266//! ```rust,ignore
//! use durable_execution_sdk::{DurableContext, DurableError};
//! use serde::Deserialize;
268//!
269//! #[derive(Deserialize)]
270//! struct OrderEvent {
271//!     order_id: String,
272//!     amount: f64,
273//! }
274//!
275//! async fn my_workflow(ctx: DurableContext) -> Result<(), DurableError> {
276//!     // Get the original input that started this execution
277//!     let event: OrderEvent = ctx.get_original_input()?;
278//!     println!("Processing order: {}", event.order_id);
279//!     
280//!     // Or get the raw JSON string
281//!     if let Some(raw_input) = ctx.get_original_input_raw() {
282//!         println!("Raw input: {}", raw_input);
283//!     }
284//!     
285//!     Ok(())
286//! }
287//! ```
288//!
289//! ### Replay-Safe Helpers
290//!
291//! Generate deterministic values that are safe for replay:
292//!
293//! ```rust
294//! use durable_execution_sdk::replay_safe::{
295//!     uuid_from_operation, uuid_to_string, uuid_string_from_operation,
296//! };
297//!
298//! // Generate a deterministic UUID from an operation ID
299//! let operation_id = "my-operation-123";
300//! let uuid_bytes = uuid_from_operation(operation_id, 0);
301//! let uuid_string = uuid_to_string(&uuid_bytes);
302//!
303//! // Or use the convenience function
304//! let uuid = uuid_string_from_operation(operation_id, 0);
305//!
306//! // Same inputs always produce the same UUID
307//! let uuid2 = uuid_string_from_operation(operation_id, 0);
308//! assert_eq!(uuid, uuid2);
309//!
310//! // Different seeds produce different UUIDs
311//! let uuid3 = uuid_string_from_operation(operation_id, 1);
312//! assert_ne!(uuid, uuid3);
313//! ```
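//!
//! To see why such helpers are replay-safe, it helps to look at how a value can be
//! derived purely from its inputs. The sketch below is an illustration under stated
//! assumptions, not the SDK's algorithm: it uses a hand-written FNV-1a hash, produces a
//! 16-digit hex string rather than a real UUID, and `deterministic_id` is a hypothetical
//! helper.
//!
//! ```rust
//! // FNV-1a (64-bit): a simple hash with no hidden state or randomness.
//! fn fnv1a(bytes: &[u8]) -> u64 {
//!     let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
//!     for &b in bytes {
//!         hash ^= b as u64;
//!         hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
//!     }
//!     hash
//! }
//!
//! // Hypothetical helper: derives a stable ID from an operation ID and a seed.
//! fn deterministic_id(operation_id: &str, seed: u64) -> String {
//!     let mut input = operation_id.as_bytes().to_vec();
//!     input.extend_from_slice(&seed.to_le_bytes());
//!     format!("{:016x}", fnv1a(&input))
//! }
//! ```
//!
//! Because the output depends only on `operation_id` and `seed`, a replayed run that
//! reaches the same operation derives the same ID, unlike a randomly generated value.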
314//!
315//! For timestamps, use the execution start time instead of current time:
316//!
317//! ```rust,ignore
//! use durable_execution_sdk::{DurableContext, DurableError};
//! use durable_execution_sdk::replay_safe::{
319//!     timestamp_from_execution, timestamp_seconds_from_execution,
320//! };
321//!
322//! async fn my_workflow(ctx: DurableContext) -> Result<(), DurableError> {
323//!     // Get replay-safe timestamp (milliseconds since epoch)
324//!     if let Some(timestamp_ms) = timestamp_from_execution(ctx.state()) {
325//!         println!("Execution started at: {} ms", timestamp_ms);
326//!     }
327//!     
328//!     // Or get seconds since epoch
329//!     if let Some(timestamp_secs) = timestamp_seconds_from_execution(ctx.state()) {
330//!         println!("Execution started at: {} seconds", timestamp_secs);
331//!     }
332//!     
333//!     Ok(())
334//! }
335//! ```
336//!
337//! **Important**: See [`docs::determinism`] for detailed guidance on writing replay-safe code.
338//!
339//! ### Wait Cancellation
340//!
341//! Cancel an active wait operation:
342//!
343//! ```rust,ignore
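//! use durable_execution_sdk::{DurableContext, DurableError};
//!
//! async fn cancellable_workflow(ctx: DurableContext) -> Result<(), DurableError> {
//!     // Record the operation ID of the wait to cancel
//!     // (starting the long wait itself, e.g. in a child context, is omitted here)
//!     let wait_op_id = ctx.next_operation_id();
//!
//!     // From another branch, cancel that wait by its operation ID
//!     ctx.cancel_wait(&wait_op_id).await?;
//!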
351//!     Ok(())
352//! }
353//! ```
354//!
355//! ### Extended Duration Support
356//!
357//! The Duration type supports extended time periods:
358//!
359//! ```rust
360//! use durable_execution_sdk::Duration;
361//!
362//! // Standard durations
363//! let seconds = Duration::from_seconds(30);
364//! let minutes = Duration::from_minutes(5);
365//! let hours = Duration::from_hours(2);
366//! let days = Duration::from_days(7);
367//!
368//! // Extended durations
369//! let weeks = Duration::from_weeks(2);      // 14 days
370//! let months = Duration::from_months(3);    // 90 days (30 days per month)
371//! let years = Duration::from_years(1);      // 365 days
372//!
373//! assert_eq!(weeks.to_seconds(), 14 * 24 * 60 * 60);
374//! assert_eq!(months.to_seconds(), 90 * 24 * 60 * 60);
375//! assert_eq!(years.to_seconds(), 365 * 24 * 60 * 60);
376//! ```
377//!
378//! ## Type-Safe Identifiers (Newtypes)
379//!
380//! The SDK provides newtype wrappers for domain identifiers to prevent accidental
381//! mixing of different ID types at compile time. These types are available in the
382//! [`types`] module and re-exported at the crate root.
383//!
384//! ### Available Newtypes
385//!
386//! - [`OperationId`]: Unique identifier for an operation within a durable execution
387//! - [`ExecutionArn`]: Amazon Resource Name identifying a durable execution
388//! - [`CallbackId`]: Unique identifier for a callback operation
389//!
390//! ### Creating Newtypes
391//!
392//! ```rust
393//! use durable_execution_sdk::{OperationId, ExecutionArn, CallbackId};
394//!
395//! // From String or &str (no validation, for backward compatibility)
396//! let op_id = OperationId::from("op-123");
397//! let op_id2: OperationId = "op-456".into();
398//!
399//! // With validation (rejects empty strings)
400//! let op_id3 = OperationId::new("op-789").unwrap();
401//! assert!(OperationId::new("").is_err());
402//!
403//! // ExecutionArn validates ARN format
404//! let arn = ExecutionArn::new("arn:aws:lambda:us-east-1:123456789012:function:my-func:durable:abc123");
405//! assert!(arn.is_ok());
406//!
407//! // CallbackId for external system integration
408//! let callback_id = CallbackId::from("callback-xyz");
409//! ```
410//!
411//! ### Using Newtypes as Strings
412//!
413//! All newtypes implement `Deref<Target=str>` and `AsRef<str>` for convenient string access:
414//!
415//! ```rust
416//! use durable_execution_sdk::OperationId;
417//!
418//! let op_id = OperationId::from("op-123");
419//!
420//! // Use string methods directly via Deref
421//! assert!(op_id.starts_with("op-"));
422//! assert_eq!(op_id.len(), 6);
423//!
424//! // Use as &str via AsRef
425//! let s: &str = op_id.as_ref();
426//! assert_eq!(s, "op-123");
427//! ```
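//!
//! For readers unfamiliar with the pattern, the following sketch shows how a newtype
//! with this behaviour could be written. `JobId` is a hypothetical type invented for the
//! example; it is not part of the SDK, and the SDK's own newtypes may be implemented
//! differently.
//!
//! ```rust
//! use std::ops::Deref;
//!
//! // Hypothetical newtype, modelled on the identifiers described above.
//! #[derive(Debug, Clone, PartialEq, Eq, Hash)]
//! struct JobId(String);
//!
//! impl JobId {
//!     // Validating constructor: rejects empty strings.
//!     fn new(value: &str) -> Result<Self, String> {
//!         if value.is_empty() {
//!             Err("identifier must not be empty".to_string())
//!         } else {
//!             Ok(JobId(value.to_string()))
//!         }
//!     }
//! }
//!
//! // Deref lets `str` methods be called directly on the newtype.
//! impl Deref for JobId {
//!     type Target = str;
//!     fn deref(&self) -> &str {
//!         &self.0
//!     }
//! }
//!
//! // AsRef allows passing the newtype to APIs that accept `impl AsRef<str>`.
//! impl AsRef<str> for JobId {
//!     fn as_ref(&self) -> &str {
//!         &self.0
//!     }
//! }
//! ```
//!
//! A function that takes a `JobId` will not accept a plain `String` or a different
//! newtype, so mixing up identifiers becomes a compile error instead of a runtime bug.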
428//!
429//! ### Newtypes in Collections
430//!
431//! All newtypes implement `Hash` and `Eq`, making them suitable for use in `HashMap` and `HashSet`:
432//!
433//! ```rust
434//! use durable_execution_sdk::OperationId;
435//! use std::collections::HashMap;
436//!
437//! let mut results: HashMap<OperationId, String> = HashMap::new();
438//! results.insert(OperationId::from("op-1"), "success".to_string());
439//! results.insert(OperationId::from("op-2"), "pending".to_string());
440//!
441//! assert_eq!(results.get(&OperationId::from("op-1")), Some(&"success".to_string()));
442//! ```
443//!
444//! ### Serialization
445//!
446//! All newtypes use `#[serde(transparent)]` for seamless JSON serialization:
447//!
448//! ```rust
449//! use durable_execution_sdk::OperationId;
450//!
451//! let op_id = OperationId::from("op-123");
452//! let json = serde_json::to_string(&op_id).unwrap();
453//! assert_eq!(json, "\"op-123\""); // Serializes as plain string
454//!
455//! let restored: OperationId = serde_json::from_str(&json).unwrap();
456//! assert_eq!(restored, op_id);
457//! ```
458//!
459//! ## Trait Aliases
460//!
461//! The SDK provides trait aliases to simplify common trait bound combinations.
462//! These are available in the [`traits`] module and re-exported at the crate root.
463//!
464//! ### DurableValue
465//!
466//! [`DurableValue`] is a trait alias for types that can be durably stored and retrieved:
467//!
468//! ```rust
469//! use durable_execution_sdk::DurableValue;
470//! use serde::{Deserialize, Serialize};
471//!
472//! // DurableValue is equivalent to: Serialize + DeserializeOwned + Send
473//!
474//! // Any type implementing these traits automatically implements DurableValue
475//! #[derive(Debug, Clone, Serialize, Deserialize)]
476//! struct OrderResult {
477//!     order_id: String,
478//!     status: String,
479//! }
480//!
481//! // Use in generic functions for cleaner signatures
482//! fn process_result<T: DurableValue>(result: T) -> String {
483//!     serde_json::to_string(&result).unwrap_or_default()
484//! }
485//! ```
486//!
487//! ### StepFn
488//!
489//! [`StepFn`] is a trait alias for step function closures:
490//!
491//! ```rust
492//! use durable_execution_sdk::{StepFn, DurableValue};
493//! use durable_execution_sdk::handlers::StepContext;
494//!
495//! // StepFn<T> is equivalent to:
496//! // FnOnce(StepContext) -> Result<T, Box<dyn Error + Send + Sync>> + Send
497//!
498//! // Use in generic functions
499//! fn execute_step<T: DurableValue, F: StepFn<T>>(func: F) {
500//!     // func can be called with a StepContext
501//! }
502//!
503//! // Works with closures
504//! execute_step(|_ctx| Ok(42i32));
505//!
506//! // Works with named functions
507//! fn my_step(ctx: StepContext) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
508//!     Ok(format!("Processed by {}", ctx.operation_id))
509//! }
510//! execute_step(my_step);
511//! ```
512//!
513//! ## Sealed Traits and Factory Functions
514//!
515//! Some SDK traits are "sealed" - they cannot be implemented outside this crate.
516//! This allows the SDK to evolve without breaking external code. Sealed traits include:
517//!
518//! - [`Logger`]: For structured logging in durable executions
519//! - [`SerDes`]: For custom serialization/deserialization
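//!
//! As a rough illustration of the sealing technique in general (not the SDK's actual
//! source), a trait can require a supertrait that lives in a private module, and a
//! factory function can hand out implementations. `SketchLogger`, `FnLogger`, and
//! `logger_from_fn` are hypothetical names used only in this sketch.
//!
//! ```rust
//! mod sealed {
//!     // Downstream crates cannot name this trait, so they cannot implement it.
//!     pub trait Sealed {}
//! }
//!
//! // Public trait: usable by everyone, implementable only where `Sealed` is reachable.
//! pub trait SketchLogger: sealed::Sealed {
//!     fn info(&self, msg: &str) -> String;
//! }
//!
//! // Wrapper that adapts a closure to the sealed trait.
//! pub struct FnLogger<F>(F);
//!
//! impl<F: Fn(&str) -> String> sealed::Sealed for FnLogger<F> {}
//!
//! impl<F: Fn(&str) -> String> SketchLogger for FnLogger<F> {
//!     fn info(&self, msg: &str) -> String {
//!         (self.0)(msg)
//!     }
//! }
//!
//! // Factory function: the supported way to obtain a custom implementation.
//! pub fn logger_from_fn<F: Fn(&str) -> String>(f: F) -> impl SketchLogger {
//!     FnLogger(f)
//! }
//! ```
//!
//! Callers supply behaviour through closures, while the crate keeps the freedom to add
//! methods to the trait later without breaking external implementors.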
520//!
521//! ### Custom Loggers
522//!
523//! Instead of implementing `Logger` directly, use the factory functions:
524//!
525//! ```rust
526//! use durable_execution_sdk::{custom_logger, simple_custom_logger, LogInfo};
527//!
528//! // Simple custom logger with a single function for all levels
529//! let logger = simple_custom_logger(|level, msg, info| {
530//!     println!("[{}] {}: {:?}", level, msg, info);
531//! });
532//!
533//! // Full custom logger with separate functions for each level
534//! let logger = custom_logger(
535//!     |msg, info| println!("[DEBUG] {}", msg),  // debug
536//!     |msg, info| println!("[INFO] {}", msg),   // info
537//!     |msg, info| println!("[WARN] {}", msg),   // warn
538//!     |msg, info| println!("[ERROR] {}", msg),  // error
539//! );
540//! ```
541//!
542//! ### Custom Serializers
543//!
544//! Instead of implementing `SerDes` directly, use the factory function:
545//!
546//! ```rust
547//! use durable_execution_sdk::serdes::{custom_serdes, SerDesContext, SerDesError};
548//!
549//! // Create a custom serializer for a specific type
550//! let serdes = custom_serdes::<String, _, _>(
551//!     |value, _ctx| Ok(format!("custom:{}", value)),  // serialize
552//!     |data, _ctx| {                                   // deserialize
553//!         data.strip_prefix("custom:")
554//!             .map(|s| s.to_string())
555//!             .ok_or_else(|| SerDesError::deserialization("Invalid format"))
556//!     },
557//! );
558//! ```
559//!
560//! ## Configuration Types
561//!
562//! The SDK provides type-safe configuration for all operations:
563//!
564//! - [`StepConfig`]: Configure retry strategy, execution semantics, and serialization
565//! - [`CallbackConfig`]: Configure timeout and heartbeat for callbacks
566//! - [`InvokeConfig`]: Configure timeout and serialization for function invocations
567//! - [`MapConfig`]: Configure concurrency, batching, and completion criteria for map operations
568//! - [`ParallelConfig`]: Configure concurrency and completion criteria for parallel operations
569//! - [`CompletionConfig`]: Define success/failure criteria for concurrent operations
570//!
571//! ### Completion Configuration
572//!
573//! Control when concurrent operations complete:
574//!
575//! ```rust
576//! use durable_execution_sdk::CompletionConfig;
577//!
578//! // Complete when first task succeeds
579//! let first = CompletionConfig::first_successful();
580//!
581//! // Wait for all tasks to complete (regardless of success/failure)
582//! let all = CompletionConfig::all_completed();
583//!
584//! // Require all tasks to succeed (zero failure tolerance)
585//! let strict = CompletionConfig::all_successful();
586//!
587//! // Custom: require at least 3 successes
588//! let custom = CompletionConfig::with_min_successful(3);
589//! ```
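//!
//! One way to read these criteria is as a function of the running success and failure
//! counts. The sketch below is an interpretation, not the SDK's logic: `Policy`,
//! `Verdict`, and `evaluate` are hypothetical, and the exact rules the SDK applies
//! (for example, when `first_successful` gives up) are assumptions.
//!
//! ```rust
//! #[derive(Debug, PartialEq)]
//! enum Verdict {
//!     Pending,
//!     Succeeded,
//!     Failed,
//! }
//!
//! enum Policy {
//!     FirstSuccessful,
//!     AllCompleted,
//!     AllSuccessful,
//!     MinSuccessful(usize),
//! }
//!
//! // Decides whether a batch of `total` tasks can stop, given the counts so far.
//! fn evaluate(policy: &Policy, succeeded: usize, failed: usize, total: usize) -> Verdict {
//!     let finished = succeeded + failed;
//!     match policy {
//!         Policy::FirstSuccessful if succeeded >= 1 => Verdict::Succeeded,
//!         Policy::FirstSuccessful if failed == total => Verdict::Failed,
//!         Policy::AllCompleted if finished == total => Verdict::Succeeded,
//!         Policy::AllSuccessful if failed > 0 => Verdict::Failed,
//!         Policy::AllSuccessful if succeeded == total => Verdict::Succeeded,
//!         Policy::MinSuccessful(n) if succeeded >= *n => Verdict::Succeeded,
//!         // Too many failures: the minimum can no longer be reached.
//!         Policy::MinSuccessful(n) if total.saturating_sub(failed) < *n => Verdict::Failed,
//!         _ => Verdict::Pending,
//!     }
//! }
//! ```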
590//!
591//! ## Error Handling
592//!
593//! The SDK provides a comprehensive error hierarchy through [`DurableError`]:
594//!
595//! - **Execution**: Errors that return FAILED status without Lambda retry
596//! - **Invocation**: Errors that trigger Lambda retry
597//! - **Checkpoint**: Checkpoint failures (retriable or non-retriable)
598//! - **Callback**: Callback-specific failures
599//! - **NonDeterministic**: Replay mismatches (operation type changed between runs)
600//! - **Validation**: Invalid configuration or arguments
601//! - **SerDes**: Serialization/deserialization failures
602//! - **Suspend**: Signal to pause execution and return control to Lambda
603//!
604//! ```rust,ignore
605//! use durable_execution_sdk::DurableError;
606//!
607//! // Create specific error types
608//! let exec_error = DurableError::execution("Something went wrong");
609//! let validation_error = DurableError::validation("Invalid input");
610//!
611//! // Check error properties
//! if exec_error.is_retriable() {
613//!     // Handle retriable error
614//! }
615//! ```
616//!
617//! ## Custom Serialization
618//!
//! The SDK uses JSON serialization by default. Because [`SerDes`] is a sealed trait,
//! provide a custom serializer through the [`custom_serdes`] factory function rather
//! than implementing the trait yourself:
//!
//! ```rust,ignore
//! use durable_execution_sdk::serdes::{custom_serdes, SerDesContext, SerDesError};
//!
//! let my_custom_serdes = custom_serdes::<MyType, _, _>(
//!     // Custom serialization logic
//!     |value, _ctx| Ok(format!("{:?}", value)),
//!     // Custom deserialization logic
//!     |data, _ctx| todo!(),
//! );
//! ```
639//!
640//! ## Logging and Tracing
641//!
642//! The SDK integrates with the `tracing` crate for structured logging. All operations
643//! automatically include execution context (ARN, operation ID, parent ID) in log messages.
644//!
645//! For detailed guidance on configuring tracing for Lambda, log correlation, and best practices,
646//! see the [TRACING.md](../docs/TRACING.md) documentation.
647//!
648//! ### Simplified Logging API
649//!
650//! The [`DurableContext`] provides convenience methods for logging with automatic context:
651//!
652//! ```rust,ignore
653//! use durable_execution_sdk::{DurableContext, DurableError};
654//!
655//! async fn my_workflow(ctx: DurableContext) -> Result<(), DurableError> {
656//!     // Basic logging - context is automatically included
657//!     ctx.log_info("Starting order processing");
658//!     ctx.log_debug("Validating input parameters");
659//!     ctx.log_warn("Retry attempt 2 of 5");
660//!     ctx.log_error("Failed to process payment");
661//!     
662//!     // Logging with extra fields for filtering
663//!     ctx.log_info_with("Processing order", &[
664//!         ("order_id", "ORD-12345"),
665//!         ("customer_id", "CUST-789"),
666//!     ]);
667//!     
668//!     ctx.log_error_with("Payment failed", &[
669//!         ("error_code", "INSUFFICIENT_FUNDS"),
670//!         ("amount", "150.00"),
671//!     ]);
672//!     
673//!     Ok(())
674//! }
675//! ```
676//!
677//! All logging methods automatically include:
678//! - `durable_execution_arn`: The execution ARN for correlation
679//! - `parent_id`: The parent operation ID (for nested operations)
680//! - `is_replay`: Whether the operation is being replayed
681//!
682//! ### Extra Fields in Log Output
683//!
684//! Extra fields passed to `log_*_with` methods are included in the tracing output
685//! as key-value pairs, making them queryable in log aggregation systems like CloudWatch:
686//!
687//! ```rust,ignore
688//! // This log message...
689//! ctx.log_info_with("Order event", &[
690//!     ("event_type", "ORDER_CREATED"),
691//!     ("order_id", "ORD-123"),
692//! ]);
693//!
694//! // ...produces JSON output like:
695//! // {
696//! //   "message": "Order event",
697//! //   "durable_execution_arn": "arn:aws:...",
698//! //   "extra": "event_type=ORDER_CREATED, order_id=ORD-123",
699//! //   ...
700//! // }
701//! ```
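//!
//! The `extra` value in the sample output above is a comma-separated list of
//! `key=value` pairs. A minimal sketch of that formatting follows; `format_extra` is a
//! hypothetical helper written for this example and is not the SDK's internal function.
//!
//! ```rust
//! // Joins key-value pairs into the `key=value, key=value` form shown above.
//! fn format_extra(fields: &[(&str, &str)]) -> String {
//!     fields
//!         .iter()
//!         .map(|(key, value)| format!("{}={}", key, value))
//!         .collect::<Vec<_>>()
//!         .join(", ")
//! }
//! ```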
702//!
703//! ### Replay-Aware Logging
704//!
705//! The SDK supports replay-aware logging that can suppress or filter logs during replay.
706//! This is useful to reduce noise when replaying previously executed operations.
707//!
708//! ```rust
709//! use durable_execution_sdk::{TracingLogger, ReplayAwareLogger, ReplayLoggingConfig};
710//! use std::sync::Arc;
711//!
712//! // Suppress all logs during replay (default)
713//! let logger = ReplayAwareLogger::suppress_replay(Arc::new(TracingLogger));
714//!
715//! // Allow only errors during replay
716//! let logger_errors = ReplayAwareLogger::new(
717//!     Arc::new(TracingLogger),
718//!     ReplayLoggingConfig::ErrorsOnly,
719//! );
720//!
721//! // Allow all logs during replay
722//! let logger_all = ReplayAwareLogger::allow_all(Arc::new(TracingLogger));
723//! ```
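//!
//! Conceptually, each mode answers one question: should this message be emitted, given
//! its level and whether the operation is being replayed? The sketch below models that
//! decision with hypothetical types (`Level`, `ReplayPolicy`, `should_emit`); it assumes
//! that messages outside replay are never filtered.
//!
//! ```rust
//! #[derive(Clone, Copy, PartialEq)]
//! enum Level {
//!     Debug,
//!     Info,
//!     Warn,
//!     Error,
//! }
//!
//! enum ReplayPolicy {
//!     SuppressAll,
//!     ErrorsOnly,
//!     AllowAll,
//! }
//!
//! // Returns true when a message at `level` should reach the underlying logger.
//! fn should_emit(policy: &ReplayPolicy, is_replay: bool, level: Level) -> bool {
//!     if !is_replay {
//!         return true; // first execution: nothing is filtered
//!     }
//!     match policy {
//!         ReplayPolicy::SuppressAll => false,
//!         ReplayPolicy::ErrorsOnly => level == Level::Error,
//!         ReplayPolicy::AllowAll => true,
//!     }
//! }
//! ```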
724//!
725//! ### Custom Logger
726//!
//! You can also provide a custom logger through the [`custom_logger`] and
//! [`simple_custom_logger`] factory functions; see "Custom Loggers" under
//! "Sealed Traits and Factory Functions" for an example.
745//!
746//! ## Duration Type
747//!
748//! The SDK provides a [`Duration`] type with convenient constructors:
749//!
750//! ```rust
751//! use durable_execution_sdk::Duration;
752//!
753//! let five_seconds = Duration::from_seconds(5);
754//! let two_minutes = Duration::from_minutes(2);
755//! let one_hour = Duration::from_hours(1);
756//! let one_day = Duration::from_days(1);
757//!
758//! assert_eq!(five_seconds.to_seconds(), 5);
759//! assert_eq!(two_minutes.to_seconds(), 120);
760//! assert_eq!(one_hour.to_seconds(), 3600);
761//! assert_eq!(one_day.to_seconds(), 86400);
762//! ```
763//!
764//! ## Thread Safety
765//!
766//! The SDK is designed for use in async Rust with Tokio. All core types are
767//! `Send + Sync` and can be safely shared across async tasks:
768//!
769//! - [`DurableContext`] uses `Arc` for shared state
770//! - [`ExecutionState`] uses `RwLock` and atomic operations for thread-safe access
771//! - Operation ID generation uses atomic counters
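//!
//! The atomic-counter approach can be sketched with the standard library alone.
//! `IdGenerator` and `draw_ids` are hypothetical names for this example; the sketch
//! only demonstrates that concurrent callers never receive the same value, and says
//! nothing about how the SDK formats or orders its operation IDs.
//!
//! ```rust
//! use std::sync::atomic::{AtomicU64, Ordering};
//! use std::sync::Arc;
//! use std::thread;
//!
//! // Hypothetical generator: one shared atomic counter, no lock needed.
//! struct IdGenerator {
//!     next: AtomicU64,
//! }
//!
//! impl IdGenerator {
//!     fn next_id(&self) -> u64 {
//!         // fetch_add returns the previous value, so every caller gets a unique ID.
//!         self.next.fetch_add(1, Ordering::Relaxed)
//!     }
//! }
//!
//! // Spawns `workers` threads that each draw `per_worker` IDs; returns them sorted.
//! fn draw_ids(workers: usize, per_worker: usize) -> Vec<u64> {
//!     let generator = Arc::new(IdGenerator { next: AtomicU64::new(0) });
//!     let handles: Vec<_> = (0..workers)
//!         .map(|_| {
//!             let generator = Arc::clone(&generator);
//!             thread::spawn(move || {
//!                 (0..per_worker).map(|_| generator.next_id()).collect::<Vec<u64>>()
//!             })
//!         })
//!         .collect();
//!
//!     let mut ids: Vec<u64> = handles
//!         .into_iter()
//!         .flat_map(|h| h.join().expect("worker thread panicked"))
//!         .collect();
//!     ids.sort_unstable();
//!     ids
//! }
//! ```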
772//!
773//! ## Best Practices
774//!
775//! 1. **Keep steps small and focused**: Each step should do one thing well.
776//!    This makes debugging easier and reduces the impact of failures.
777//!
778//! 2. **Use named operations**: Named steps and waits make logs and debugging
779//!    much easier to understand.
780//!
//! 3. **Handle errors appropriately**: Use `DurableError::execution` for errors
//!    that should fail the workflow without a Lambda retry, and
//!    `DurableError::invocation` for errors that should trigger a Lambda retry.
784//!
785//! 4. **Consider idempotency**: For operations that may be retried, ensure they
786//!    are idempotent or use `AtMostOncePerRetry` semantics.
787//!
788//! 5. **Use appropriate concurrency limits**: When using `map` or `parallel`,
789//!    set `max_concurrency` to avoid overwhelming downstream services.
790//!
791//! 6. **Set reasonable timeouts**: Always configure timeouts for callbacks and
792//!    invocations to prevent workflows from hanging indefinitely.
793//!
794//! 7. **Ensure determinism**: Your workflow must execute the same sequence of
795//!    operations on every run. Avoid using `HashMap` iteration, random numbers,
796//!    or current time outside of steps. See [`docs::determinism`] for details.
797//!
798//! 8. **Use replay-safe helpers**: When you need UUIDs or timestamps, use the
799//!    helpers in [`replay_safe`] to ensure consistent values across replays.
800//!
801//! 9. **Use type-safe identifiers**: Prefer [`OperationId`], [`ExecutionArn`], and
802//!    [`CallbackId`] over raw strings to catch type mismatches at compile time.
803//!
804//! 10. **Use trait aliases**: Use [`DurableValue`] and [`StepFn`] in your generic
805//!     functions for cleaner, more maintainable signatures.
806//!
807//! ## Result Type Aliases
808//!
809//! The SDK provides semantic result type aliases for cleaner function signatures:
810//!
811//! - [`DurableResult<T>`](DurableResult): Alias for `Result<T, DurableError>` - general durable operations
812//! - [`StepResult<T>`](StepResult): Alias for `Result<T, DurableError>` - step operation results
813//! - [`CheckpointResult<T>`](CheckpointResult): Alias for `Result<T, DurableError>` - checkpoint operation results
814//!
815//! ```rust,ignore
816//! use durable_execution_sdk::{DurableResult, StepResult, DurableError};
817//!
818//! // Use in function signatures for clarity
819//! fn process_order(order_id: &str) -> DurableResult<String> {
820//!     Ok(format!("Processed: {}", order_id))
821//! }
822//!
823//! fn execute_step() -> StepResult<i32> {
824//!     Ok(42)
825//! }
826//! ```
827//!
//! ## Module Organization
//!
//! - [`client`]: Lambda service client for checkpoint operations
//! - [`concurrency`]: Concurrent execution types (BatchResult, ConcurrentExecutor)
//! - [`config`]: Configuration types for all operations
//! - [`context`]: DurableContext, operation identifiers, and logging (includes factory functions
//!   [`custom_logger`] and [`simple_custom_logger`] for creating custom loggers)
//! - [`docs`]: **Documentation modules** - determinism requirements and execution limits
//!   - [`docs::determinism`]: Understanding determinism for replay-safe workflows
//!   - [`docs::limits`]: Execution limits and constraints
//! - [`duration`]: Duration type with convenient constructors
//! - [`error`]: Error types, error handling, and result type aliases ([`DurableResult`], [`StepResult`], [`CheckpointResult`])
//! - [`handlers`]: Operation handlers (step, wait, callback, etc.)
//! - [`lambda`]: Lambda integration types (input/output)
//! - [`operation`]: Operation types and status enums (optimized with `#[repr(u8)]` for compact memory layout)
//! - [`replay_safe`]: Replay-safe helpers for deterministic UUIDs and timestamps
//! - [`serdes`]: Serialization/deserialization system (includes [`custom_serdes`] factory function)
//! - [`state`]: Execution state and checkpointing system
//! - [`traits`]: Trait aliases for common bounds ([`DurableValue`], [`StepFn`])
//! - [`types`]: Type-safe newtype wrappers for domain identifiers ([`OperationId`], [`ExecutionArn`], [`CallbackId`])
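//!
//! The newtype pattern that the [`types`] module relies on can be sketched with the
//! standard library alone. `OrderId`, `CustomerId`, and `describe` below are hypothetical
//! stand-ins chosen for this example, not SDK types; the point is only that distinct
//! wrapper types cannot be swapped by accident.
//!
//! ```rust
//! /// Hypothetical identifier newtypes (not part of the SDK).
//! #[derive(Debug, Clone, PartialEq, Eq, Hash)]
//! struct OrderId(String);
//!
//! #[derive(Debug, Clone, PartialEq, Eq, Hash)]
//! struct CustomerId(String);
//!
//! impl OrderId {
//!     /// Rejects empty identifiers at construction time.
//!     fn new(raw: &str) -> Result<Self, String> {
//!         if raw.is_empty() {
//!             Err("order id must not be empty".to_string())
//!         } else {
//!             Ok(OrderId(raw.to_string()))
//!         }
//!     }
//! }
//!
//! fn describe(order: &OrderId, customer: &CustomerId) -> String {
//!     format!("order {} for customer {}", order.0, customer.0)
//! }
//!
//! fn main() {
//!     let order = OrderId::new("order-123").unwrap();
//!     let customer = CustomerId("cust-9".to_string());
//!
//!     // Swapping the arguments, as in describe(&customer, &order),
//!     // would be rejected by the compiler.
//!     assert_eq!(describe(&order, &customer), "order order-123 for customer cust-9");
//!     assert!(OrderId::new("").is_err());
//! }
//! ```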

pub mod client;
pub mod concurrency;
pub mod config;
pub mod context;
pub mod docs;
pub mod duration;
pub mod error;
pub mod handlers;
pub mod lambda;
#[macro_use]
pub mod macros;
pub mod operation;
pub mod replay_safe;
pub mod retry_presets;
pub mod runtime;
pub mod serdes;
pub mod state;
pub mod structured_json_logger;
pub mod summary_generators;
pub mod termination;
pub mod traits;
pub mod types;

// Private module for sealed trait pattern
mod sealed;

// Re-export main types at crate root
pub use client::{
    CheckpointResponse, DurableServiceClient, GetOperationsResponse, LambdaClientConfig,
    LambdaDurableServiceClient, SharedDurableServiceClient,
};
pub use config::*;
pub use context::{
    custom_logger, generate_operation_id, simple_custom_logger, CustomLogger, DurableContext,
    LogInfo, Logger, OperationIdGenerator, OperationIdentifier, ReplayAwareLogger,
    ReplayLoggingConfig, TracingLogger, WaitForConditionConfig, WaitForConditionContext,
};
pub use duration::Duration;
pub use error::{
    AwsError, CheckpointResult, DurableError, DurableResult, ErrorObject, StepResult,
    TerminationReason,
};
pub use lambda::{
    DurableExecutionInvocationInput, DurableExecutionInvocationOutput, InitialExecutionState,
    InvocationStatus,
};
pub use operation::{
    CallbackDetails, CallbackOptions, ChainedInvokeDetails, ChainedInvokeOptions, ContextDetails,
    ContextOptions, ExecutionDetails, Operation, OperationAction, OperationStatus, OperationType,
    OperationUpdate, StepDetails, StepOptions, WaitDetails, WaitOptions,
};
pub use serdes::{custom_serdes, CustomSerDes, JsonSerDes, SerDes, SerDesContext, SerDesError};
pub use state::{
    create_checkpoint_queue, CheckpointBatcher, CheckpointBatcherConfig, CheckpointRequest,
    CheckpointSender, CheckpointedResult, ExecutionState, ReplayStatus,
};

// Re-export newtype wrappers for domain identifiers
pub use types::{CallbackId, ExecutionArn, OperationId, ValidationError};

// Re-export trait aliases for common bounds
pub use traits::{DurableValue, StepFn};

// Re-export concurrency types
pub use concurrency::{
    BatchItem, BatchItemStatus, BatchResult, CompletionReason, ConcurrentExecutor,
    ExecutionCounters,
};

// Re-export handlers
pub use handlers::{
    all_handler, all_settled_handler, any_handler, callback_handler, child_handler, invoke_handler,
    map_handler, parallel_handler, race_handler, step_handler, wait_cancel_handler, wait_handler,
    Callback, StepContext,
};

// Re-export replay-safe helpers
pub use replay_safe::{
    timestamp_from_execution, timestamp_seconds_from_execution, uuid_from_operation,
    uuid_string_from_operation, uuid_to_string,
};

// Re-export runtime support
pub use runtime::run_durable_handler;

// Re-export termination manager
pub use termination::TerminationManager;

// Re-export structured JSON logger
pub use structured_json_logger::{JsonLogContext, LogLevel, StructuredJsonLogger};

// Re-export macro if enabled
#[cfg(feature = "macros")]
pub use durable_execution_sdk_macros::durable_execution;