duroxide-cdb 0.1.5

A CosmosDB-based provider implementation for Duroxide, a durable task orchestration framework
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
# duroxide-cdb Architecture

> Comprehensive design and implementation reference for the CosmosDB NoSQL provider for [duroxide]https://github.com/microsoft/duroxide.

---

## Table of Contents

1. [Overview]#1-overview
2. [Data Model]#2-data-model
3. [Module Map]#3-module-map
4. [REST Client & Authentication]#4-rest-client--authentication
5. [Dispatch Slot Partitioning]#5-dispatch-slot-partitioning
6. [Core Algorithms]#6-core-algorithms
7. [Transactional Outbox]#7-transactional-outbox
8. [Session Affinity]#8-session-affinity
9. [Error Handling]#9-error-handling
10. [CosmosDB vs Postgres Semantic Differences]#10-cosmosdb-vs-postgres-semantic-differences
11. [Cross-Partition Query Strategy]#11-cross-partition-query-strategy
12. [Infrastructure Bootstrapping]#12-infrastructure-bootstrapping
13. [Configuration]#13-configuration
14. [Testing]#14-testing

---

## 1. Overview

`duroxide-cdb` implements the `Provider` and `ProviderAdmin` traits from the duroxide runtime, backed by Azure CosmosDB NoSQL API. All orchestration state — instance metadata, event history, work queues, sessions, and outbox intents — lives in a **single container** with `/instanceId` as the partition key.

### Design Principles

| Principle | How It Is Applied |
|-----------|-------------------|
| **Single container, partition-per-instance** | All documents for an orchestration instance share the same logical partition. Intra-instance operations use CosmosDB transactional batch for atomicity. |
| **Transactional outbox for cross-partition writes** | Sub-orchestration starts, completions, and detached orchestration starts record intents inside the source partition's transaction, then deliver best-effort + background reconciler. |
| **Optimistic concurrency via ETags** | Instance locking and queue item locking use CosmosDB `If-Match` conditional writes instead of database-level locks. |
| **Lease-based dispatcher partitioning** | Concurrent dispatchers within a runtime partition the 256-slot keyspace so they never contend on the same queue items. |
| **Raw REST API** | Direct HTTP calls to CosmosDB (`reqwest`), avoiding the Azure SDK. Auth uses HMAC-SHA256 master key tokens. |

### Why Not the Azure SDK?

The provider uses raw REST instead of `azure_data_cosmos` for:
- **Full control** over headers (`x-ms-documentdb-partitionkey`, `x-ms-documentdb-isquery`, etc.)
- **Transactional batch** support via the undocumented-but-stable batch endpoint
- **Minimal dependency footprint** — only `reqwest`, `hmac`, `sha2`, `base64`
- **Cross-partition query control** — explicit header management for `enablecrosspartition`

---

## 2. Data Model

All document types coexist in one container, discriminated by a `type` field. The partition key is `/instanceId`.

### 2.1 Document Types

```
┌─────────────────────────────────────────────────────────────────┐
│                     CosmosDB Container                         │
│                    Partition Key: /instanceId                   │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │ Partition: "order-123"                                   │   │
│  │                                                          │   │
│  │  [instance]    order-123:instance                        │   │
│  │  [history]     order-123:history:1:1                     │   │
│  │  [history]     order-123:history:1:2                     │   │
│  │  [orch_queue]  <uuid>  (type=orch_queue)                 │   │
│  │  [worker_queue] <uuid> (type=worker_queue)               │   │
│  │  [outbox]      intent:<key>                              │   │
│  │  [session]     order-123:session:sess-A                  │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │ Partition: "user-456"                                    │   │
│  │  [instance]    user-456:instance                         │   │
│  │  [history]     user-456:history:1:1                      │   │
│  │  ...                                                     │   │
│  └─────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
```

| Document Type | `type` Field | `id` Pattern | Purpose |
|---------------|-------------|-------------|---------|
| **Instance** | `instance` | `<instanceId>:instance` | Orchestration metadata, lock state, custom status |
| **History** | `history` | `<instanceId>:history:<executionId>:<eventId>` | Append-only event log |
| **Orch Queue** | `orch_queue` | `<uuid>` | Messages for the orchestration dispatcher |
| **Worker Queue** | `worker_queue` | `<uuid>` | Activity execution requests for the worker dispatcher |
| **Outbox Intent** | `outbox_intent` | `intent:<idempotencyKey>` | Pending cross-partition writes |
| **Session** | `session` | `<instanceId>:session:<sessionId>` | Session affinity ownership tracking |

### 2.2 Key Relationships

```
StartOrchestration ──▶ orch_queue item ──▶ fetch_orchestration_item
                              ┌────────────────────┘
                    lock instance (ETag)
                    collect all orch_queue messages
                    fetch history
                    ack_orchestration_item
                    ┌─────────────────────────────┐
                    │  Transactional Batch:         │
                    │  - DELETE orch_queue items     │
                    │  - CREATE history events       │
                    │  - CREATE worker_queue items   │
                    │  - CREATE orch_queue items     │
                    │  - CREATE outbox intents       │
                    │  - UPSERT instance (unlock)    │
                    └─────────────────────────────┘
              ┌───────────────┼───────────────┐
              ▼               ▼               ▼
        worker_queue    orch_queue      outbox_intent
        (activities)    (same part.)    (cross-partition)
              │                               │
              ▼                               ▼
        fetch_work_item              deliver to target
        ack_work_item                partition + delete
```

### 2.3 Indexing Policy

The container uses a **selective indexing** strategy — only fields used in queries are indexed; large blob fields are excluded:

**Included paths:** `/type`, `/instanceId`, `/dispatchSlot`, `/visibleAt`, `/enqueuedAt`, `/lockedUntil`, `/executionId`, `/eventId`, `/status`, `/sessionId`, `/ownerId`, `/parentInstanceId`, `/createdAt`, `/customStatusVersion`, `/docType`, `/lockToken`

**Excluded paths:** `/eventData/*`, `/workItem/*`, `/payload/*`, `/output/*`, `/customStatus/*`, `/*` (catch-all)

**Composite indexes:**
1. `(type ASC, dispatchSlot ASC, visibleAt ASC, enqueuedAt ASC)` — serves fetch_orchestration_item / fetch_work_item
2. `(type ASC, executionId ASC, eventId ASC)` — serves history reads
3. `(type ASC, status ASC)` — serves list_instances_by_status

---

## 3. Module Map

```
src/
├── lib.rs          Public API: re-exports CosmosDBProvider, CosmosDBProviderConfig
├── provider.rs     Provider + ProviderAdmin trait implementations (~2000 lines)
├── client.rs       CosmosDBClient: HTTP transport, auth, CRUD + query + batch
├── models.rs       Serde structs for all 6 document types + helper constructors
├── query.rs        Query builders: cross-partition candidate finders, history fetch
├── batch.rs        Transactional batch builder: BatchOperation enum, execute_batch()
├── containers.rs   Database/container creation with retry + indexing policy
├── outbox.rs       Outbox intent delivery + background reconciler task
├── leases.rs       LeaseProvider trait + InMemoryLeaseProvider (256-slot distribution)
└── errors.rs       CosmosDB HTTP status → ProviderError mapping
```

### Module Dependencies

```
provider.rs
  ├── client.rs      (HTTP operations)
  ├── query.rs       (query builders)
  ├── batch.rs       (transactional batch)
  ├── models.rs      (document types)
  ├── outbox.rs      (cross-partition delivery)
  ├── leases.rs      (slot assignment)
  ├── containers.rs  (infrastructure setup)
  └── errors.rs      (error mapping)
```

---

## 4. REST Client & Authentication

### 4.1 Authentication

Every request to CosmosDB requires an `Authorization` header computed from the master key:

```
HMAC-SHA256(key, verb + "\n" + resourceType + "\n" + resourceLink + "\n" + date + "\n\n")
```

The `CosmosDBClient` computes this for each request in `auth_header()`, using:
- `hmac::Hmac<sha2::Sha256>` for signing
- `base64` for encoding
- `urlencoding` for the authorization token

### 4.2 Common Headers

Every request includes:
- `x-ms-date` — RFC 1123 formatted UTC timestamp
- `x-ms-version``2020-07-15` (CosmosDB API version)
- `Authorization` — HMAC token
- `x-ms-documentdb-partitionkey` — JSON array, e.g. `["order-123"]`

### 4.3 Operations

| Operation | HTTP Method | Endpoint |
|-----------|------------|----------|
| Create document | POST | `/dbs/{db}/colls/{coll}/docs` |
| Read document | GET | `/dbs/{db}/colls/{coll}/docs/{id}` |
| Replace document | PUT | `/dbs/{db}/colls/{coll}/docs/{id}` |
| Delete document | DELETE | `/dbs/{db}/colls/{coll}/docs/{id}` |
| Query | POST | `/dbs/{db}/colls/{coll}/docs` (with `x-ms-documentdb-isquery: true`) |
| Transactional batch | POST | `/dbs/{db}/colls/{coll}/docs` (with `x-ms-cosmos-batch-request: true`) |
| Create database | POST | `/dbs` |
| Create container | POST | `/dbs/{db}/colls` |

---

## 5. Dispatch Slot Partitioning

### 5.1 Problem

Multiple dispatcher tasks within a runtime call `fetch_orchestration_item` / `fetch_work_item` concurrently. Without coordination, they all query the same queue items and race on locks, wasting RUs.

### 5.2 Solution: 256-Slot Keyspace

Every queue item gets a precomputed `dispatchSlot` (0-255) derived from `hash(instanceId) % 256`. Dispatchers are assigned non-overlapping subsets of slots via the `LeaseProvider`:

```
With orch_concurrency = 3:

Dispatcher 0 → slots 0, 3, 6, 9, ..., 255  (86 slots)
Dispatcher 1 → slots 1, 4, 7, 10, ..., 253 (85 slots)
Dispatcher 2 → slots 2, 5, 8, 11, ..., 254 (85 slots)
```

Each dispatcher only queries for items in its assigned slots (`AND c.dispatchSlot IN (...)`), completely eliminating intra-runtime contention.

### 5.3 InMemoryLeaseProvider

The current implementation uses an in-memory lease provider with an `AtomicU32` counter. Each `acquire_slots()` call assigns the next slice of the 256-slot keyspace. Assignments are cached per `caller_id` (tokio task ID).

A future Phase 2 will implement `CosmosDBLeaseProvider` using lease documents for multi-runtime coordination.

---

## 6. Core Algorithms

### 6.1 fetch_orchestration_item

The most complex read path. Finds a candidate queue item, locks the owning instance, collects all pending messages, and returns the orchestration context.

```
1. Acquire dispatch slots from lease provider
2. Cross-partition query: find visible, unlocked orch_queue items in my slots
3. Sort results client-side by enqueuedAt (cross-partition ORDER BY unsupported)
4. Loop (up to 20 attempts, excluding locked instances):
   a. Try to lock the instance (conditional replace with ETag)
   b. Check capability filter against instance's pinned version
   c. Collect all pending orch_queue messages for the instance (single-partition)
   d. Tag messages with lock token (transactional batch)
   e. Fetch history (single-partition, ordered by eventId)
   f. Build OrchestrationItem and return
```

### 6.2 ack_orchestration_item

The most complex write path. Atomically commits a completed orchestration turn using a transactional batch within the instance's partition, plus outbox intents for cross-partition effects.

```
1. Validate lock token matches instance
2. Classify output items by partition:
   - same-partition worker items → batch CREATE
   - same-partition orch items → batch CREATE
   - cross-partition items → outbox intents
3. Build transactional batch (single partition):
   - DELETE locked orch_queue messages
   - CREATE history events
   - CREATE same-partition worker/orch queue items
   - CREATE outbox intent documents
   - UPSERT instance document (update metadata, release lock)
4. If batch exceeds 100 operations → split into sequential batches
5. Best-effort delete of cancelled activity worker_queue items (outside transaction)
6. Deliver outbox intents best-effort (outside transaction)
7. Background reconciler handles any failed deliveries
```

> **Cancelled activity deletes are NOT in the transactional batch.** When an
> orchestration cancels an in-flight activity (e.g., `select2` picks a timer/event
> over an activity), the worker dispatcher may have already fetched and deleted the
> activity's `worker_queue` document. A batch DELETE of a non-existent document
> returns 404, which causes the entire transactional batch to fail with 424
> (Failed Dependency). To avoid this race, cancelled activity deletes are performed
> best-effort after the batch commits. If the document is already gone, the 404 is
> silently ignored. This is safe because activity cancellation is itself
> best-effort — a completed activity result that arrives after cancellation is
> simply discarded by the runtime's replay logic.

### 6.3 fetch_work_item & ack_work_item

Simpler than the orchestrator path:
- **fetch_work_item**: Query → lock single queue item (conditional replace) → session claim if applicable → return work item
- **ack_work_item**: Delete worker_queue item + create completion orch_queue item (transactional batch within same partition) + piggyback session `last_activity` update

### 6.4 Optimistic Locking Pattern

All "lock" operations follow the same pattern:

```
1. Read document → get current _etag
2. Modify document (set lockToken, lockedUntil, etc.)
3. Replace with If-Match: <etag> header
4. On 412 Precondition Failed → another writer won; skip or retry
5. On 409 Conflict → document already exists; handle accordingly
```

No database-level locks are ever held. This means operations are non-blocking and scale naturally, at the cost of occasional retry loops.

---

## 7. Transactional Outbox

### 7.1 When It Is Used

Cross-partition writes cannot be atomic in CosmosDB. The outbox pattern ensures eventual delivery:

| Scenario | Source Partition | Target Partition |
|----------|-----------------|-----------------|
| Sub-orchestration start | Parent instance | Child instance |
| Sub-orchestration completion | Child instance | Parent instance |
| Detached orchestration start | Coordinator | New instance |
| Cancel cascade | Parent instance | Child instance |

### 7.2 Lifecycle

```
ack_orchestration_item
    ├── Transactional batch (within source partition):
  │     CREATE outbox_intent { status: "pending", payload: <target doc JSON> }
    ├── Best-effort delivery (immediately after batch):
  │     POST target doc to target partition
  │     On success/409 → DELETE intent
  │     On transient error → leave as "pending"
    └── Background reconciler (every 2s):
        Query pending intents older than 2s
        Retry delivery for each
        On success/409 → DELETE intent
        On transient error → increment attemptCount, retry next cycle
```

### 7.3 Idempotency

Each outbox intent has a deterministic `idempotencyKey` derived from `<sourceInstance>:<executionId>:<eventSequence>`. The target document's `id` is set to `outbox:<idempotencyKey>`. Duplicate deliveries produce a 409 Conflict (treated as success).

### 7.4 Reconciler

Spawned as a background tokio task during provider initialization. Cancelled via `CancellationToken` on provider drop. Pending intents survive provider restarts — they remain in CosmosDB and are picked up by the next reconciler.

---

## 8. Session Affinity

Session affinity routes work items with the same `sessionId` to the same worker, enabling in-memory state reuse across activity invocations.

### 8.1 Session Documents

Stored in the same partition as the orchestration instance:
- `id`: `<instanceId>:session:<sessionId>`
- Tracks `ownerId`, `lockedUntil`, `lastActivity`

### 8.2 Session Lifecycle

1. **Claim**: During `fetch_work_item`, if the work item has a `sessionId`, read or create the session document. Only the owning worker (or a worker claiming an expired session) can process the item.
2. **Heartbeat**: `ack_work_item` and `renew_work_item_lock` piggyback-update `lastActivity` on the session document using a fresh `now_ms()` timestamp (not the stale one from the start of the operation — important for network-latency-sensitive idle window checks).
3. **Renewal**: `renew_session_lock` queries all session documents, filters by owner + non-expired + recent activity, and extends `lockedUntil`.
4. **Cleanup**: `cleanup_orphaned_sessions` deletes expired sessions with no pending work items.

---

## 9. Error Handling

### 9.1 CosmosDB Status → ProviderError Mapping

| Status | Meaning | ProviderError | Action |
|--------|---------|---------------|--------|
| 200-204 | Success | — | Continue |
| 400 | Bad Request | `permanent` | Query malformed |
| 404 | Not Found | `permanent` | Document absent |
| 408 | Timeout | `retryable` | Transient |
| 409 | Conflict | `retryable` | ETag race or duplicate |
| 412 | Precondition Failed | `retryable` | ETag mismatch |
| 413 | Entity Too Large | `permanent` | Batch too large |
| 429 | Rate Limited | `retryable` | Backoff + retry |
| 503 | Unavailable | `retryable` | Transient |

### 9.2 Infrastructure Retry

`ensure_infrastructure` (database + container creation) uses exponential backoff for 429 metadata rate-limiting: up to 10 retries starting at 500ms (500ms → 1s → 2s → 4s → ...).

---

## 10. CosmosDB vs Postgres Semantic Differences

Several SQL operations that are inherently safe in Postgres require different handling in CosmosDB due to fundamental semantic differences in how the two databases treat missing data and conflicts.

### 10.1 Summary Table

| Operation | Postgres Behavior | CosmosDB Behavior | CDB Provider Strategy |
|-----------|------------------|-------------------|----------------------|
| DELETE non-existent row/doc | No-op (0 rows affected), transaction continues | 404 error, fails entire transactional batch (424) | Best-effort delete outside batch, ignore 404 |
| INSERT duplicate primary key | Unique constraint violation, rolls back transaction | CREATE returns 409 (Conflict), fails entire batch (424) | Keep CREATE — atomicity test expects failure on duplicate event_ids |
| INSERT ON CONFLICT DO NOTHING | Silently skips duplicate, transaction continues | No equivalent in batch — CREATE or Upsert only | Use Upsert where idempotency is needed (e.g., instance doc), Create where uniqueness must be enforced (e.g., history) |
| UPDATE with WHERE matching 0 rows | No-op (0 rows affected), transaction continues | Conditional replace with non-matching ETag returns 412 | Treat 412 as "lost the race" — skip or retry depending on context |
| Transaction scope | Entire stored procedure is atomic (all-or-nothing) | Transactional batch is atomic but limited to single partition, max 100 ops | Split cross-partition writes into outbox intents; split >100 ops into sequential batches |
| Cross-partition query with ORDER BY | Works (query planner handles it) | 400 error (REST gateway returns query plan instead of results) | Fetch all matching items, sort client-side |
| Server-side aggregates (COUNT, SUM) | Works across all partitions | 400 error on cross-partition queries | SELECT ids + client-side `.len()` instead of COUNT |

### 10.2 Cancelled Activity Deletes

The most impactful difference. In `ack_orchestration_item`:

- **Postgres:** `DELETE FROM worker_queue WHERE instance_id = ... AND activity_id = ...` — if the worker already consumed the item, the DELETE affects 0 rows silently. The transaction proceeds.
- **CosmosDB:** A batch `DELETE` of a document ID that doesn't exist returns 404, which fails the entire transactional batch with 424 (Failed Dependency). Every other operation in the batch is rolled back.

**Solution:** Cancelled activity deletes are performed best-effort *after* the batch commits. If the document is already gone, the 404 is silently ignored.

### 10.3 History Event Uniqueness

- **Postgres (latest migration):** Plain `INSERT INTO history` — no `ON CONFLICT` clause. A duplicate `(instance_id, execution_id, event_id)` raises a unique constraint violation, rolling back the entire stored procedure.
- **CosmosDB:** Batch `CREATE` with a deterministic document ID. A duplicate ID returns 409, failing the batch with 424.

Both behaviors are equivalent: duplicate event_ids indicate a double-ack attempt, which must fail to preserve atomicity. The `test_atomicity_failure_rollback` validation test verifies this.

> **Note:** Postgres has a *separate* `append_history` function that uses `INSERT ON CONFLICT DO NOTHING` for idempotent history appends in a different code path. This is not the same as the inline history insert within `ack_orchestration_item`.

### 10.4 Session Piggyback Updates

- **Postgres:** `UPDATE sessions SET last_activity_at = ... WHERE locked_until > now` — if the session lock expired, matches 0 rows silently.
- **CosmosDB:** Conditional replace with ETag. A 412 (Precondition Failed) means the session was modified concurrently. Silently ignored since `last_activity_at` is advisory.

### 10.5 Lock Acquisition

- **Postgres:** `INSERT ON CONFLICT DO UPDATE ... WHERE locked_until <= now` — atomic upsert that silently returns 0 rows if the lock is held.
- **CosmosDB:** Read → conditional replace with `If-Match` ETag. A 412 means another dispatcher won the race. Silently skipped.

---

## 11. Cross-Partition Query Strategy

CosmosDB's gateway mode via REST API has limitations on cross-partition queries:

| Feature | Supported via REST Gateway? |
|---------|---------------------------|
| Simple filter (`WHERE c.type = 'x'`) | Yes |
| `ORDER BY` | **No** — returns 400 with query plan |
| Server-side aggregates (`COUNT`, `MAX`) | **No** — requires query plan decomposition |
| `TOP N` with `ORDER BY` | **No** |

### How We Handle This

- **Queue candidate queries** (`find_candidate_orch_item`, `find_candidate_work_item`): No `ORDER BY` or `TOP`. Fetch all matching items, sort client-side by `enqueuedAt`, take the first.
- **Dispatch slot filter**: Omitted when all 256 slots are assigned (the clause is redundant and bloats the query).
- **Count queries** (`count_by_type`): `SELECT c.id` + client-side `.len()` instead of `SELECT VALUE COUNT(1)`.
- **Partition-scoped queries** (history, messages): Can use `ORDER BY` safely since they target a single partition.

---

## 12. Infrastructure Bootstrapping

On provider initialization, `ensure_infrastructure` creates the database and container if they don't exist:

```
ensure_infrastructure (with retry for 429s)
  ├── ensure_database
  │     POST /dbs { "id": "<database>" }
  │     201 = created, 409 = already exists (both OK)
    └── ensure_container
        POST /dbs/<db>/colls {
          "id": "<container>",
          "partitionKey": { "paths": ["/instanceId"], "kind": "Hash" },
          "indexingPolicy": { ... selective indexing ... }
        }
        201 = created, 409 = already exists (both OK)
```

The indexing policy is set at creation time. Changing it after creation requires manual intervention.

---

## 13. Configuration

### CosmosDBProviderConfig

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `endpoint` | `String` | — | CosmosDB endpoint URL |
| `key` | `String` | — | CosmosDB master key |
| `database` | `String` | `"duroxide"` | Database name |
| `container` | `String` | `"duroxide"` | Container name |
| `orch_concurrency` | `u32` | `1` | Number of orchestration dispatchers (for slot partitioning) |
| `worker_concurrency` | `u32` | `1` | Number of worker dispatchers (for slot partitioning) |
| `reconciler_interval` | `Duration` | `2s` | Outbox reconciler poll interval |
| `reconciler_age_threshold` | `Duration` | `2s` | Minimum intent age before reconciler picks it up |

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `COSMOS_ENDPOINT` | CosmosDB account endpoint | `http://localhost:8081` (emulator) |
| `COSMOS_KEY` | CosmosDB account key | Emulator default key |
| `COSMOS_DATABASE` | Database name | `duroxide` |

---

## 14. Testing

### 13.1 Provider Validation Tests

The `cosmosdb_provider_test.rs` test file implements the `ProviderFactory` trait and runs the duroxide provider validation suite (~196 tests covering atomicity, concurrency, sessions, cancellation, error handling, capability filtering, and more).

Each test gets an **isolated container** (unique UUID suffix) to ensure test independence. The container is cleaned up after each test.

**Running:** `cargo nextest run --features provider-test`

**Concurrency:** Limited to 4 threads (via `.config/nextest.toml`) to avoid CosmosDB metadata rate-limiting (429) during container creation.

### 13.2 E2E Sample Tests

`e2e_samples.rs` ports the full duroxide e2e sample suite (hello world, fan-out/fan-in, sub-orchestrations, timers, external events, sessions, continue-as-new, cancellation, etc.) to run against CosmosDB.

### 13.3 Local Development

Use the [CosmosDB Linux Emulator](https://learn.microsoft.com/en-us/azure/cosmos-db/emulator) via Docker for local development:

```bash
docker run -p 8081:8081 -p 10250-10255:10250-10255 \
  mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator:latest
```

Copy `.env.example` to `.env`— the defaults point to the local emulator.

### 13.4 Azure CosmosDB

For cloud testing, create a serverless CosmosDB account and set `COSMOS_ENDPOINT` and `COSMOS_KEY` in `.env`.