duroxide 0.1.27

Durable code execution framework for Rust
# Toygres Improvements Roadmap

This document captures improvements to **toygres**, the test application built on duroxide that simulates a managed Postgres service. Toygres serves as an important feedback loop for duroxide development, exercising real-world orchestration patterns.

**Repository**: [affandar/toygres](https://github.com/affandar/toygres)

## Summary

| # | Feature | Issue | Description |
|---|---------|-------|-------------|
| 1 | [Replica Support](#1-replica-support) | [#24](https://github.com/microsoft/duroxide/issues/24) | Add read replicas with streaming replication |
| 2 | [Manual Failover](#2-manual-failover) | [#25](https://github.com/microsoft/duroxide/issues/25) | Support operator-initiated failover to replica |
| 3 | [Automatic Failover Monitoring](#3-automatic-failover-monitoring) | [#26](https://github.com/microsoft/duroxide/issues/26) | Instance actor monitors health and triggers automatic failover |
| 4 | [Backup & Restore](#4-backup--restore) | [#27](https://github.com/microsoft/duroxide/issues/27) | Point-in-time backup and restore capabilities |
| 5 | [Postgres Parameters](#5-postgres-parameters) | [#28](https://github.com/microsoft/duroxide/issues/28) | Configure and apply Postgres parameters dynamically |

---

## 1. Replica Support

### Current State

Toygres currently manages single-instance Postgres deployments with no replication.

### Proposed Features

**1.1 Add Replica to Instance:**

```rust
// API to add a replica
client.add_replica(AddReplicaRequest {
    instance_id: "pg-prod-1".into(),
    replica_id: "pg-prod-1-replica-1".into(),
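    replica_config: ReplicaConfig {
        size: "small".into(),
        region: "us-east-1".into(),
        availability_zone: Some("us-east-1b".into()),
        // Remaining fields of ReplicaConfig (see 1.2)
        replication_mode: ReplicationMode::Async,
        max_lag_bytes: None,
    },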
}).await?;
```

**1.2 Replica Configuration:**

```rust
pub struct ReplicaConfig {
    /// Compute size for replica
    pub size: String,
    /// Region for replica (can differ from primary for geo-replication)
    pub region: String,
    /// Availability zone (for HA within region)
    pub availability_zone: Option<String>,
    /// Replication mode
    pub replication_mode: ReplicationMode,
    /// Max replication lag before alerts
    pub max_lag_bytes: Option<u64>,
}

pub enum ReplicationMode {
    /// Asynchronous streaming replication (default)
    Async,
    /// Synchronous replication (primary waits for replica confirmation)
    Sync,
}
```

**1.3 Orchestration Flow:**

```
AddReplica Orchestration
├── Provision replica VM/container
├── Take base backup from primary (into the empty replica data directory)
├── Configure replica Postgres
│   ├── Set up recovery.conf (before Postgres 12) / standby.signal (12+)
│   ├── Configure primary_conninfo
│   └── Set replica identity
├── Start replica in standby mode
├── Wait for initial sync
├── Register replica in instance state
└── Start replication lag monitoring
```
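The "Configure primary_conninfo" step could look roughly like the sketch below. The `PrimaryEndpoint` type and both function names are hypothetical and not part of toygres. The sketch assumes host, user, and replica ID contain no spaces or quotes, and it leaves credentials out because the connection string is meant to be stored securely.

```rust
/// Hypothetical description of where the primary can be reached.
pub struct PrimaryEndpoint {
    pub host: String,
    pub port: u16,
    pub replication_user: String,
}

/// Builds a libpq connection string for the standby.
/// `application_name` identifies the replica on the primary
/// (for example in pg_stat_replication).
pub fn primary_conninfo(primary: &PrimaryEndpoint, replica_id: &str) -> String {
    format!(
        "host={} port={} user={} application_name={}",
        primary.host, primary.port, primary.replication_user, replica_id
    )
}

/// Renders the configuration line that would be written on the replica.
pub fn standby_setting(primary: &PrimaryEndpoint, replica_id: &str) -> String {
    format!("primary_conninfo = '{}'", primary_conninfo(primary, replica_id))
}
```

An activity in the AddReplica orchestration would write this line to the replica's configuration before starting it in standby mode.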

**1.4 List Replicas:**

```rust
let instance = client.get_instance("pg-prod-1").await?;
for replica in &instance.replicas {
    println!("{}: lag={}ms, state={:?}", 
        replica.id, 
        replica.replication_lag_ms,
        replica.state);
}
```

### Implementation Considerations

- Primary needs `wal_level = replica` and `max_wal_senders` configured
- Replica connection string stored securely
- Handle replica promotion (see Failover sections)
- Support removing replicas cleanly
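
The first consideration above could be enforced with a pre-flight check before the AddReplica orchestration provisions anything. This is a minimal sketch: `check_primary_ready` and the settings snapshot it takes are hypothetical, and the snapshot is assumed to come from an activity that reads the primary's configuration.

```rust
use std::collections::HashMap;

/// Returns the problems that would prevent attaching `planned_replicas`
/// streaming replicas. `settings` is a hypothetical snapshot of the
/// primary's current configuration (name -> value).
pub fn check_primary_ready(
    settings: &HashMap<String, String>,
    planned_replicas: u32,
) -> Vec<String> {
    let mut problems = Vec::new();

    // Streaming replication needs wal_level = replica; logical is a
    // superset and works as well.
    match settings.get("wal_level").map(String::as_str) {
        Some("replica") | Some("logical") => {}
        Some(other) => problems.push(format!(
            "wal_level is '{other}', expected 'replica' or 'logical'"
        )),
        None => problems.push("wal_level is not set".to_string()),
    }

    // Each connected streaming replica occupies one WAL sender.
    let senders = settings
        .get("max_wal_senders")
        .and_then(|v| v.parse::<u32>().ok())
        .unwrap_or(0);
    if senders < planned_replicas {
        problems.push(format!(
            "max_wal_senders is {senders}, need at least {planned_replicas}"
        ));
    }

    problems
}
```

An empty result means the primary is ready. Otherwise the orchestration could fail fast with the collected messages, or route the missing settings through the parameter workflow in section 5.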

---

## 2. Manual Failover

### Concept

Allow operators to manually trigger failover from the primary to a designated replica. This is useful for:
- Planned maintenance on primary
- Testing disaster recovery procedures
- Upgrading primary with minimal downtime

### API

```rust
// Initiate manual failover
client.failover(FailoverRequest {
    instance_id: "pg-prod-1".into(),
    target_replica_id: "pg-prod-1-replica-1".into(),
    options: FailoverOptions {
        /// Wait for replica to catch up before failover
        wait_for_sync: true,
        /// Max time to wait for sync
        sync_timeout: Duration::from_secs(300),
        /// What to do with old primary after failover
        old_primary_action: OldPrimaryAction::ConvertToReplica,
    },
}).await?;

pub enum OldPrimaryAction {
    /// Convert old primary to replica of new primary
    ConvertToReplica,
    /// Stop old primary but keep data
    StopAndRetain,
    /// Terminate old primary completely
    Terminate,
}
```

### Orchestration Flow

```
ManualFailover Orchestration
├── Validate target replica exists and is healthy
├── If wait_for_sync:
│   ├── Pause writes on primary (optional)
│   ├── Wait for replica to catch up to primary LSN
│   └── Timeout if sync takes too long
├── Fence primary (prevent writes)
├── Promote replica to primary
│   ├── Create trigger file / promote command
│   ├── Wait for promotion complete
│   └── Update connection endpoint
├── Handle old primary per OldPrimaryAction
│   ├── ConvertToReplica: reconfigure as standby
│   ├── StopAndRetain: shutdown gracefully
│   └── Terminate: destroy instance
├── Update DNS / connection routing
├── Notify connected clients (if possible)
└── Update instance metadata (new primary, replica list)
```
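
The "wait for replica to catch up to primary LSN" step comes down to comparing two log sequence numbers. Postgres prints an LSN as two hexadecimal numbers separated by a slash (for example `0/3000060`), which together form a 64-bit byte position. The sketch below shows one way to compute the remaining lag. The function names are hypothetical, and the LSN strings are assumed to be fetched by activities.

```rust
/// Parses a textual LSN such as "16/B374D848" into a 64-bit byte position.
/// Returns None if the text is not in `hex/hex` form.
pub fn parse_lsn(text: &str) -> Option<u64> {
    let (hi, lo) = text.split_once('/')?;
    let hi = u32::from_str_radix(hi, 16).ok()?;
    let lo = u32::from_str_radix(lo, 16).ok()?;
    Some((u64::from(hi) << 32) | u64::from(lo))
}

/// Bytes of WAL the replica still has to replay. Zero means the replica
/// has reached (or passed) the given primary position.
pub fn lag_bytes(primary_lsn: &str, replica_lsn: &str) -> Option<u64> {
    let primary = parse_lsn(primary_lsn)?;
    let replica = parse_lsn(replica_lsn)?;
    Some(primary.saturating_sub(replica))
}
```

The orchestration would poll until `lag_bytes` returns `Some(0)` or `sync_timeout` elapses, whichever comes first.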

### Failover Status

```rust
pub struct FailoverStatus {
    pub state: FailoverState,
    pub started_at: DateTime<Utc>,
    pub completed_at: Option<DateTime<Utc>>,
    pub old_primary_id: String,
    pub new_primary_id: String,
    pub replication_lag_at_start: u64,
    pub data_loss_bytes: Option<u64>,  // If any
}

pub enum FailoverState {
    WaitingForSync,
    FencingPrimary,
    PromotingReplica,
    UpdatingRouting,
    CleaningUpOldPrimary,
    Completed,
    Failed { reason: String },
}
```
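
One way to keep status reporting consistent is to encode the happy-path order of `FailoverState` in a single place. The sketch below assumes the states advance in the order they are declared above. The `next` and `is_terminal` helpers are hypothetical additions.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum FailoverState {
    WaitingForSync,
    FencingPrimary,
    PromotingReplica,
    UpdatingRouting,
    CleaningUpOldPrimary,
    Completed,
    Failed { reason: String },
}

impl FailoverState {
    /// Successor on the happy path, or None for terminal states.
    /// Any state may instead move to `Failed`, which is not modeled here.
    pub fn next(&self) -> Option<FailoverState> {
        use FailoverState::*;
        match self {
            WaitingForSync => Some(FencingPrimary),
            FencingPrimary => Some(PromotingReplica),
            PromotingReplica => Some(UpdatingRouting),
            UpdatingRouting => Some(CleaningUpOldPrimary),
            CleaningUpOldPrimary => Some(Completed),
            Completed | Failed { .. } => None,
        }
    }

    /// True once the failover can make no further progress.
    pub fn is_terminal(&self) -> bool {
        self.next().is_none()
    }
}
```

A failover that skips the sync wait (`wait_for_sync: false`) would simply start at `FencingPrimary`.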

---

## 3. Automatic Failover Monitoring

### Concept

The instance actor (a long-running orchestration that manages the instance lifecycle) monitors primary health and automatically triggers failover when issues are detected.

### Health Monitoring

```rust
pub struct HealthMonitorConfig {
    /// Interval between health checks
    pub check_interval: Duration,
    /// Number of consecutive failures before failover
    pub failure_threshold: u32,
    /// Timeout for each health check
    pub check_timeout: Duration,
    /// Maximum replication lag (in bytes) a replica may have to be eligible for automatic failover
    pub max_eligible_lag_bytes: u64,
}

pub struct HealthCheck {
    /// Can connect to Postgres
    pub connectivity: bool,
    /// Postgres is accepting queries
    pub query_responsive: bool,
    /// Replication is active (if replicas exist)
    pub replication_healthy: bool,
    /// Disk space OK
    pub disk_space_ok: bool,
    /// Custom health query passes
    pub custom_check_ok: bool,
}
```
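
The instance actor needs to collapse a `HealthCheck` into a single verdict and count consecutive failures. A minimal sketch follows, assuming every individual check must pass for the primary to count as healthy. That policy, `is_healthy`, and `update_failures` are assumptions for illustration. A real policy might treat low disk space as a warning rather than a failover trigger.

```rust
#[derive(Debug, Clone, Copy)]
pub struct HealthCheck {
    pub connectivity: bool,
    pub query_responsive: bool,
    pub replication_healthy: bool,
    pub disk_space_ok: bool,
    pub custom_check_ok: bool,
}

impl HealthCheck {
    /// Assumed policy: healthy only if every individual check passed.
    pub fn is_healthy(&self) -> bool {
        self.connectivity
            && self.query_responsive
            && self.replication_healthy
            && self.disk_space_ok
            && self.custom_check_ok
    }
}

/// Returns the updated consecutive-failure count and whether the
/// failure threshold has been reached.
pub fn update_failures(
    consecutive_failures: u32,
    check: &HealthCheck,
    failure_threshold: u32,
) -> (u32, bool) {
    if check.is_healthy() {
        // Any healthy result resets the streak.
        (0, false)
    } else {
        let failures = consecutive_failures.saturating_add(1);
        (failures, failures >= failure_threshold)
    }
}
```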

### Instance Actor Integration

The instance actor uses durable timers and activities for monitoring:

```rust
async fn instance_actor(ctx: OrchestrationContext) -> Result<(), String> {
    let mut state: InstanceState = ctx.get_state()?;
    
    loop {
        // Wait for next check interval
        ctx.schedule_timer(state.health_config.check_interval).await;
        
        // Perform health check
        let health = ctx.schedule_activity("check_primary_health", &state.primary_id)
            .await?;
        
        if !health.is_healthy() {
            state.consecutive_failures += 1;
            
            if state.consecutive_failures >= state.health_config.failure_threshold {
                // Find best replica for failover
                // select_failover_target is synchronous and returns Option<&Replica>
                if let Some(target) = select_failover_target(&state) {
                    // Trigger automatic failover
                    ctx.schedule_sub_orchestration(
                        "automatic_failover",
                        AutoFailoverInput {
                            instance_id: state.instance_id.clone(),
                            target_replica_id: target.id.clone(),
                            reason: "Health check failures exceeded threshold".into(),
                        },
                    ).await?;
                } else {
                    // No eligible replica - alert only
                    ctx.schedule_activity("send_alert", AlertInput {
                        severity: AlertSeverity::Critical,
                        message: "Primary unhealthy but no replica eligible for failover".into(),
                    }).await?;
                }
            }
        } else {
            state.consecutive_failures = 0;
        }
        
        // Continue-as-new periodically to manage history size
        if ctx.should_continue_as_new() {
            ctx.continue_as_new(state)?;
        }
    }
}
```

### Failover Target Selection

```rust
fn select_failover_target(state: &InstanceState) -> Option<&Replica> {
    state.replicas
        .iter()
        .filter(|r| r.state == ReplicaState::Streaming)
        .filter(|r| r.replication_lag_bytes <= state.health_config.max_eligible_lag_bytes)
        .min_by_key(|r| r.replication_lag_bytes)
}
```

### Automatic Failover Safeguards

- **Cooldown period**: Minimum time between automatic failovers
- **Manual override**: Ability to disable automatic failover temporarily
- **Quorum check**: Ensure network partition isn't causing false positives
- **Notification**: Alert on-call before/during automatic failover
- **Audit log**: Record all automatic failover decisions and outcomes
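
The cooldown and manual-override safeguards can be expressed as a pure decision function that the instance actor consults before scheduling the failover sub-orchestration. This is a sketch only. `FailoverGuard`, `GuardDecision`, and `evaluate` are hypothetical names, and the elapsed time is passed in as an argument so that the orchestration can derive it deterministically from its own recorded timestamps.

```rust
use std::time::Duration;

/// Hypothetical safeguard settings stored in the instance state.
pub struct FailoverGuard {
    /// Minimum time between automatic failovers.
    pub cooldown: Duration,
    /// Manual override: operators can switch automatic failover off.
    pub auto_failover_enabled: bool,
}

#[derive(Debug, PartialEq)]
pub enum GuardDecision {
    Allow,
    BlockedByOverride,
    BlockedByCooldown { remaining: Duration },
}

/// `since_last_failover` is None if no automatic failover has happened yet.
pub fn evaluate(guard: &FailoverGuard, since_last_failover: Option<Duration>) -> GuardDecision {
    if !guard.auto_failover_enabled {
        return GuardDecision::BlockedByOverride;
    }
    match since_last_failover {
        Some(elapsed) if elapsed < guard.cooldown => GuardDecision::BlockedByCooldown {
            remaining: guard.cooldown - elapsed,
        },
        _ => GuardDecision::Allow,
    }
}
```

Every blocked decision is a natural candidate for the audit log, together with the health check results that prompted it.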

---

## 4. Backup & Restore

### Backup Types

```rust
pub enum BackupType {
    /// Full base backup (pg_basebackup)
    Full,
    /// Incremental using WAL archiving
    Incremental,
    /// Logical backup (pg_dump)
    Logical { databases: Vec<String> },
}

pub struct BackupConfig {
    /// Where to store backups
    pub storage: BackupStorage,
    /// Retention policy
    pub retention: RetentionPolicy,
    /// Encryption settings
    pub encryption: Option<EncryptionConfig>,
    /// Compression level
    pub compression: CompressionLevel,
}

pub enum BackupStorage {
    AzureBlob { container: String, connection_string: String },
    S3 { bucket: String, region: String },
    Local { path: String },
}

pub struct RetentionPolicy {
    /// Keep daily backups for N days
    pub daily_retention_days: u32,
    /// Keep weekly backups for N weeks
    pub weekly_retention_weeks: u32,
    /// Keep monthly backups for N months
    pub monthly_retention_months: u32,
}
```
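
To make the retention policy concrete, the sketch below decides whether a single backup should be kept. It rests on assumptions the roadmap does not state: each backup is tagged with a hypothetical `BackupTier`, a week counts as 7 days, and a month counts as 30 days.

```rust
/// Hypothetical tag recording which schedule produced a backup.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackupTier {
    Daily,
    Weekly,
    Monthly,
}

pub struct RetentionPolicy {
    pub daily_retention_days: u32,
    pub weekly_retention_weeks: u32,
    pub monthly_retention_months: u32,
}

/// Returns true if a backup of the given tier and age is still within
/// its retention window.
pub fn should_retain(policy: &RetentionPolicy, tier: BackupTier, age_days: u32) -> bool {
    let limit_days = match tier {
        BackupTier::Daily => policy.daily_retention_days,
        // Assumption: 1 week = 7 days, 1 month = 30 days.
        BackupTier::Weekly => policy.weekly_retention_weeks.saturating_mul(7),
        BackupTier::Monthly => policy.monthly_retention_months.saturating_mul(30),
    };
    age_days <= limit_days
}
```

With incremental backups, a base backup must also be kept for as long as any retained backup depends on it, which this per-backup check does not cover.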

### Backup API

```rust
// Trigger manual backup
let backup = client.create_backup(CreateBackupRequest {
    instance_id: "pg-prod-1".into(),
    backup_type: BackupType::Full,
    label: Some("pre-migration-backup".into()),
}).await?;

// List backups
let backups = client.list_backups("pg-prod-1").await?;

// Schedule automatic backups
client.configure_backup_schedule(BackupScheduleConfig {
    instance_id: "pg-prod-1".into(),
    full_backup_cron: "0 2 * * 0".into(),  // Weekly Sunday 2am
    incremental_backup_cron: "0 2 * * *".into(),  // Daily 2am
    config: BackupConfig { .. },
}).await?;
```

### Restore API

```rust
// Restore to new instance
let restored = client.restore_backup(RestoreRequest {
    backup_id: "backup-123".into(),
    target: RestoreTarget::NewInstance {
        instance_id: "pg-prod-1-restored".into(),
        config: InstanceConfig { .. },
    },
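    // chrono (needs `use chrono::{TimeZone, Utc}`), matching the DateTime<Utc> fields used elsewhere
    point_in_time: Some(Utc.with_ymd_and_hms(2024, 1, 15, 14, 30, 0).unwrap()),  // PITR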
}).await?;

// Restore to existing instance (destructive)
client.restore_backup(RestoreRequest {
    backup_id: "backup-123".into(),
    target: RestoreTarget::ExistingInstance {
        instance_id: "pg-dev-1".into(),
        confirm_destructive: true,
    },
    point_in_time: None,  // Restore to backup time
}).await?;
```

### Backup Orchestration

```
CreateBackup Orchestration
├── Validate instance exists and is healthy
├── If Full backup:
│   ├── Start pg_basebackup on replica (preferred) or primary
│   ├── Stream to backup storage
│   └── Record backup metadata
├── If Incremental:
│   ├── Archive WAL segments since last backup
│   └── Update backup chain metadata
├── If Logical:
│   ├── Run pg_dump for specified databases
│   └── Stream to backup storage
├── Verify backup integrity
├── Apply retention policy (delete old backups)
└── Update backup catalog
```

### Point-in-Time Recovery (PITR)

- Continuous WAL archiving to backup storage
- Restore base backup + replay WAL to target timestamp
- Recovery target options: timestamp, transaction ID, named restore point
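
The three recovery target options map onto the Postgres settings `recovery_target_time`, `recovery_target_xid`, and `recovery_target_name`. The sketch below renders the configuration lines a restore activity might write. `RecoveryTarget` and `recovery_settings` are hypothetical names, and promoting once the target is reached is an assumed default.

```rust
/// Hypothetical representation of the PITR target.
pub enum RecoveryTarget {
    /// Timestamp in a format Postgres accepts, e.g. "2024-01-15 14:30:00+00".
    Time(String),
    TransactionId(u32),
    NamedRestorePoint(String),
}

/// Quotes a value for postgresql.conf; embedded single quotes are doubled.
fn quote(value: &str) -> String {
    format!("'{}'", value.replace('\'', "''"))
}

/// Renders the recovery-related settings for a restore.
/// `restore_command` fetches archived WAL; %f and %p are expanded by Postgres.
pub fn recovery_settings(restore_command: &str, target: &RecoveryTarget) -> Vec<String> {
    let target_line = match target {
        RecoveryTarget::Time(ts) => format!("recovery_target_time = {}", quote(ts)),
        RecoveryTarget::TransactionId(xid) => {
            format!("recovery_target_xid = {}", quote(&xid.to_string()))
        }
        RecoveryTarget::NamedRestorePoint(name) => {
            format!("recovery_target_name = {}", quote(name))
        }
    };
    vec![
        format!("restore_command = {}", quote(restore_command)),
        target_line,
        // Assumption: open the restored instance for writes at the target.
        "recovery_target_action = 'promote'".to_string(),
    ]
}
```

On Postgres 12 and later, the restored data directory also needs a `recovery.signal` file for these settings to take effect.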

---

## 5. Postgres Parameters

### Concept

Allow users to configure Postgres parameters and apply them either dynamically or with a restart, depending on what each parameter requires.

### Parameter Categories

```rust
pub enum ParameterCategory {
    /// Can be changed without restart (SET command)
    Dynamic,
    /// Requires reload (pg_reload_conf)
    Reload,
    /// Requires restart
    Restart,
    /// Cannot be changed after init
    Immutable,
}

pub struct ParameterDefinition {
    pub name: String,
    pub category: ParameterCategory,
    pub data_type: ParameterType,
    pub default_value: String,
    pub min_value: Option<String>,
    pub max_value: Option<String>,
    pub description: String,
}
```

### API

```rust
// Set parameters
client.set_parameters(SetParametersRequest {
    instance_id: "pg-prod-1".into(),
    parameters: vec![
        ("shared_buffers".into(), "4GB".into()),
        ("work_mem".into(), "256MB".into()),
        ("max_connections".into(), "200".into()),
        ("log_statement".into(), "all".into()),
    ],
    apply_mode: ApplyMode::Immediate,  // or Scheduled { at: datetime }
}).await?;

pub enum ApplyMode {
    /// Apply immediately (may restart if needed)
    Immediate,
    /// Apply at scheduled time
    Scheduled { at: DateTime<Utc> },
    /// Apply on next restart
    OnNextRestart,
}

// Get current parameters
let params = client.get_parameters("pg-prod-1").await?;
for (name, value, pending) in params {
    if let Some(pending_value) = pending {
        println!("{}: {} (pending: {})", name, value, pending_value);
    } else {
        println!("{}: {}", name, value);
    }
}

// Get parameter definition
let def = client.describe_parameter("shared_buffers").await?;
println!("{}: {} ({:?})", def.name, def.description, def.category);
```

### Apply Parameters Orchestration

```
SetParameters Orchestration
├── Validate parameter names and values
├── Classify parameters by category
├── For Dynamic parameters:
│   └── Execute SET commands via SQL
├── For Reload parameters:
│   ├── Update postgresql.conf
│   └── Execute pg_reload_conf()
├── For Restart parameters:
│   ├── Update postgresql.conf
│   ├── Schedule restart (if Immediate mode)
│   │   ├── Graceful connection draining
│   │   ├── Stop Postgres
│   │   ├── Start Postgres
│   │   └── Verify parameters applied
│   └── Or mark as pending (if OnNextRestart mode)
├── Replicate parameter changes to replicas
└── Update instance state with current/pending params
```
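
The "classify parameters by category" step determines the most disruptive action a request needs. The sketch below orders the categories by disruption and takes the maximum. The small lookup table in `category_of` is illustrative only. A real implementation would presumably derive categories from the `context` column of `pg_settings` instead of hard-coding them.

```rust
/// Ordered from least to most disruptive so that `max` picks the
/// strongest required action.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ParameterCategory {
    Dynamic,
    Reload,
    Restart,
    Immutable,
}

/// Illustrative, hard-coded classification (hypothetical helper).
fn category_of(name: &str) -> Option<ParameterCategory> {
    match name {
        "work_mem" | "statement_timeout" => Some(ParameterCategory::Dynamic),
        "checkpoint_timeout" | "max_wal_size" => Some(ParameterCategory::Reload),
        "shared_buffers" | "max_connections" | "wal_level" => Some(ParameterCategory::Restart),
        "block_size" => Some(ParameterCategory::Immutable),
        _ => None,
    }
}

/// Returns the most disruptive category among the requested parameters,
/// or an error for unknown or immutable parameters.
pub fn required_action(names: &[&str]) -> Result<ParameterCategory, String> {
    let mut strongest = ParameterCategory::Dynamic;
    for name in names {
        let category =
            category_of(name).ok_or_else(|| format!("unknown parameter: {name}"))?;
        if category == ParameterCategory::Immutable {
            return Err(format!("{name} cannot be changed after init"));
        }
        strongest = strongest.max(category);
    }
    Ok(strongest)
}
```

For the request shown in the API section, `shared_buffers` and `max_connections` alone force a restart, so `ApplyMode::Immediate` would trigger the connection-draining branch.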

### Parameter Profiles

Pre-defined parameter profiles for common workloads:

```rust
pub enum ParameterProfile {
    /// Optimized for OLTP workloads
    Oltp,
    /// Optimized for analytics/OLAP
    Analytics,
    /// Optimized for mixed workloads
    GeneralPurpose,
    /// Minimal resources for development
    Development,
    /// Custom profile
    Custom { parameters: HashMap<String, String> },
}

// Apply a profile
client.apply_parameter_profile(ApplyProfileRequest {
    instance_id: "pg-prod-1".into(),
    profile: ParameterProfile::Oltp,
    override_params: vec![
        ("max_connections".into(), "500".into()),  // Override specific values
    ],
}).await?;
```
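
Resolving a profile together with `override_params` is a plain map merge in which overrides win. A minimal sketch, assuming each profile expands to a map of defaults. `resolve_parameters` is a hypothetical helper, and the actual profile contents are not defined here.

```rust
use std::collections::HashMap;

/// Merges profile defaults with explicit overrides. Overrides replace
/// profile values; if a name is overridden twice, the last entry wins.
pub fn resolve_parameters(
    profile_defaults: &HashMap<String, String>,
    overrides: &[(String, String)],
) -> HashMap<String, String> {
    let mut resolved = profile_defaults.clone();
    for (name, value) in overrides {
        resolved.insert(name.clone(), value.clone());
    }
    resolved
}
```

The resolved map can then be handed to the same SetParameters orchestration, so profiles need no separate apply path.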

---

## Open Questions

1. **Replicas**: Should replicas be promotable by default, or require explicit configuration?
2. **Failover**: How to handle in-flight transactions during failover?
3. **Automatic Failover**: What quorum/witness mechanism prevents split-brain?
4. **Backup**: Should logical backups be schema-only by default for large databases?
5. **Parameters**: How to validate parameter combinations that may conflict?