queue-runtime 0.2.0

# ADR-003: Async-First Design


Status: Accepted
Date: 2026-01-24
Owners: queue-runtime team

## Context


The queue-runtime library is designed for high-throughput, scalable event processing in bot applications that must handle 1000+ messages/second. I/O operations (network calls, queue operations) are inherently asynchronous and non-blocking.

Design options:

- Blocking APIs with async wrapper - forces unnecessary thread allocation, reduces scalability
- Async APIs only - enables true non-blocking I/O, fits modern Rust ecosystem
- Hybrid (blocking + async) - doubles API surface, harder to maintain consistency

The Rust ecosystem has standardized on `tokio` for async runtime and `async-trait` for async trait methods.

## Decision


**All I/O operations must be asynchronous.**

- All `QueueProvider` and `SessionProvider` trait methods are `async`
- All client-facing APIs return `Future` types
- Use `#[async_trait]` from the `async-trait` crate for trait methods
- Require `tokio` runtime for execution (minimum required feature)
- No blocking I/O operations in async context (no `.blocking_*` without explicit separation)

Applications use `#[tokio::main]` or similar for the async runtime.

## Consequences


**Enables:**

- Single thread can handle thousands of concurrent messages (via tokio work-stealing runtime)
- Natural composition with other async libraries (tracing, metrics, etc.)
- Timeouts via `tokio::time::timeout` without thread overhead
- Cancellation patterns via `tokio::select!` for graceful shutdown
- Better resource utilization (fewer threads, more concurrent connections)

**Forbids:**

- Synchronous APIs (e.g., `receive()` returning `T` instead of `Future<Output=T>`)
- Blocking operations in async context (`.unwrap()` on blocking lock, synchronous DNS, etc.)
- Thread allocation for waiting (use `select!` or timeout instead)
- Mixing callback-based APIs with async/await

**Trade-offs:**

- Requires tokio runtime (adds dependency)
- Slightly more complex application code (async/await syntax)
- Harder to debug (stack traces across await points less obvious)
- Some operations may be harder to instrument (tracing setup more involved)

## Alternatives considered


### Option A: Synchronous blocking APIs


**Why not**: Forces one OS thread per concurrent message; cannot scale to 1000+ msg/sec on reasonable hardware; incompatible with modern Rust practices.

### Option B: Hybrid sync + async APIs


**Why not**: Doubles API surface, harder to test, inconsistent error handling between versions, maintenance burden increases.

### Option C: Callback-based async (without async/await)


**Why not**: Error handling becomes verbose; stack traces become unreadable; doesn't compose well with ecosystem libraries.

## Implementation notes


**Timeouts:**

```rust
// Good: Uses async timeout
match tokio::time::timeout(Duration::from_secs(30), client.receive()).await {
    Ok(Ok(msg)) => { /* process */ }
    Ok(Err(e)) => { /* handle error */ }
    Err(_) => { /* handle timeout */ }
}

// Bad: Blocks thread
thread::sleep(Duration::from_secs(5));
```

**Cancellation:**

```rust
// Good: Graceful shutdown via select!
tokio::select! {
    msg = client.receive() => { /* process */ }
    _ = shutdown_signal => { /* cleanup and return */ }
}

// Bad: Spawned task doesn't respect shutdown
tokio::spawn(client.receive());
```

**Resource cleanup:**

```rust
// Good: Cleanup on drop via Drop impl or explicit close
impl Drop for AzureServiceBusProvider {
    fn drop(&mut self) {
        // Close connections cleanly
    }
}

// Bad: Resources held until garbage collection (Rust doesn't have GC)
```

**Trait async methods:**

```rust
// Good: Using async-trait
use async_trait::async_trait;

#[async_trait]

pub trait QueueProvider {
    async fn send(&self, message: Message) -> Result<MessageId, QueueError>;
}

// Bad: Manual impl Future (very verbose)
pub trait QueueProvider {
    fn send(&self, message: Message) -> Box<dyn Future<Output = ...>>;
}
```

**Testing async code:**

```rust
// Good: Use #[tokio::test]
#[tokio::test]

async fn test_receive_message() {
    let provider = InMemoryProvider::new();
    let msg = provider.receive().await.unwrap();
}

// Bad: Blocking .wait() (if available)
#[test]

fn test_receive_message() {
    let msg = block_on(provider.receive()).unwrap();
}
```

## Examples


**Basic async client usage:**

```rust
#[tokio::main]

async fn main() -> Result<()> {
    let config = ProviderConfig::AzureServiceBus(azure_config);
    let client = QueueClientFactory::create(config).await?;

    loop {
        match client.receive().await {
            Ok(message) => {
                if let Err(e) = process_message(&message).await {
                    client.abandon_message(&message.receipt).await?;
                } else {
                    client.complete_message(&message.receipt).await?;
                }
            }
            Err(e) => eprintln!("Receive failed: {}", e),
        }
    }
}
```

**Timeout with graceful handling:**

```rust
let result = tokio::time::timeout(
    Duration::from_secs(30),
    client.receive()
).await;

match result {
    Ok(Ok(msg)) => { /* success */ }
    Ok(Err(QueueError::SessionLocked { locked_until, .. })) => {
        tokio::time::sleep_until(locked_until.to_instant()).await;
    }
    Err(_timeout) => {
        tracing::warn!("Receive operation timed out");
    }
    Ok(Err(e)) => return Err(e),
}
```

## References


- [Tokio Runtime Documentation](https://tokio.rs/)
- [async-trait Crate](https://docs.rs/async-trait/)
- [Rust Async Book](https://rust-lang.github.io/async-book/)
- [Code Standards - Async/Await](../standards/code.md#asyncawait)