caxton 0.1.4

A secure WebAssembly runtime for multi-agent systems
# Performance Tuning Guide

This guide provides detailed instructions for optimizing Caxton performance. For performance targets and requirements, see [ADR-0017: Performance Requirements](../adr/0017-performance-requirements.md).

## Performance Targets

Caxton is designed to meet these performance targets:

| Metric | Target P50 | Target P99 | Measurement |
|--------|------------|------------|-------------|
| Local message routing | 100μs | 1ms | `caxton_routing_latency` |
| Remote message routing | 5ms | 50ms | `caxton_remote_routing_latency` |
| Agent startup | 10ms | 100ms | `caxton_agent_startup_time` |
| Message processing | 1ms | 10ms | `caxton_message_processing_time` |
| Gossip convergence | - | 5s | `caxton_gossip_convergence_time` |

## Quick Optimization Checklist

1. [ ] QUIC transport enabled (avoids TCP head-of-line blocking)
2. [ ] MessagePack serialization (smaller and faster to parse than JSON)
3. [ ] Agent pool pre-warming configured
4. [ ] Batch processing enabled where applicable
5. [ ] Resource limits properly set
6. [ ] Gossip parameters tuned for cluster size

## Configuration Tuning

### High Throughput Configuration

For maximum message throughput:

```yaml
# Optimized for throughput (100k+ msgs/sec)
runtime:
  max_agents: 5000
  agent_pool_size: 100  # Pre-warm agents
  max_concurrent_messages: 10000

messaging:
  queue_size: 100000
  batch_size: 1000  # Process in batches
  delivery_timeout: 5s
  enable_persistence: false  # Trade durability for speed

  # Use parallel processing
  parallel_routes: 8
  worker_threads: 16

coordination:
  cluster:
    # Larger gossip intervals for high throughput
    gossip_interval: 500ms
    gossip_fanout: 2  # Reduce gossip overhead

transport:
  type: quic  # Better performance than TCP
  max_streams: 1000
  congestion_control: bbr  # Better for high throughput
```

### Low Latency Configuration

For minimum message latency:

```yaml
# Optimized for latency (< 100μs P50)
runtime:
  agent_pool_size: 200  # More pre-warmed agents
  max_agents: 1000  # Fewer agents, less contention
  cpu_affinity: true  # Pin to CPU cores

messaging:
  queue_size: 10000  # Smaller queue, less queuing delay
  delivery_timeout: 1s
  priority_routing: true  # Priority queue for important messages

  # Direct routing, no batching
  batch_size: 1
  parallel_routes: 1

coordination:
  cluster:
    # Faster gossip for quick convergence
    gossip_interval: 100ms
    gossip_fanout: 4
    probe_interval: 500ms

transport:
  type: quic
  idle_timeout: 100ms  # Quick connection cleanup
  max_concurrent_streams: 100
```

### Memory-Optimized Configuration

For resource-constrained environments:

```yaml
# Optimized for low memory usage
runtime:
  max_agents: 500
  agent_pool_size: 10

  # Strict memory limits
  max_memory_per_agent: 10MB
  agent_heap_size: 5MB
  agent_stack_size: 512KB

messaging:
  queue_size: 5000
  enable_compression: true  # Trade CPU for memory

  # Aggressive cleanup
  message_ttl: 60s
  cleanup_interval: 10s

storage:
  # Compact storage settings
  type: sqlite
  page_cache_size: 10MB
  wal_size_limit: 50MB
  auto_vacuum: full
```
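The `message_ttl`/`cleanup_interval` pair above amounts to a periodic expiry pass over queued messages. A std-only sketch of that behavior (types and field names are hypothetical, not Caxton's internals):

```rust
use std::time::{Duration, Instant};

/// Hypothetical queue entry; Caxton's internal message types will differ.
struct QueuedMessage {
    payload: Vec<u8>,
    enqueued_at: Instant,
}

struct MessageQueue {
    ttl: Duration, // corresponds to message_ttl in the config
    messages: Vec<QueuedMessage>,
}

impl MessageQueue {
    /// Drop messages older than `ttl`; run once per cleanup_interval.
    /// Returns how many messages were expired.
    fn cleanup(&mut self) -> usize {
        let before = self.messages.len();
        let ttl = self.ttl;
        self.messages.retain(|m| m.enqueued_at.elapsed() < ttl);
        before - self.messages.len()
    }
}
```

Shorter TTLs and more frequent cleanup trade a little CPU for a tighter memory ceiling, which is the point of this profile.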

## SWIM Protocol Tuning

### Cluster Size Optimization

Tune SWIM parameters based on cluster size:

```yaml
# Small cluster (< 10 nodes)
coordination:
  cluster:
    gossip_interval: 100ms
    gossip_fanout: 3
    probe_interval: 500ms
    suspicion_multiplier: 3

# Medium cluster (10-50 nodes)
coordination:
  cluster:
    gossip_interval: 200ms
    gossip_fanout: 4
    probe_interval: 1s
    suspicion_multiplier: 4

# Large cluster (> 50 nodes)
coordination:
  cluster:
    gossip_interval: 500ms
    gossip_fanout: 5
    probe_interval: 2s
    suspicion_multiplier: 5
```
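As a rough rule of thumb, epidemic gossip reaches about `fanout`× more nodes each round, so full dissemination takes on the order of log_fanout(N) rounds, and convergence time is roughly rounds × `gossip_interval`. A back-of-envelope sketch of that model (a simplification, not Caxton's membership code):

```rust
/// Rough epidemic-broadcast estimate: with fanout f, the informed set
/// grows about f-fold per round, so dissemination takes ~log_f(n) rounds.
fn estimated_gossip_rounds(nodes: u32, fanout: u32) -> u32 {
    assert!(fanout >= 2 && nodes >= 1);
    ((nodes as f64).ln() / (fanout as f64).ln()).ceil() as u32
}

/// Convergence estimate in milliseconds: rounds * gossip_interval.
fn estimated_convergence_ms(nodes: u32, fanout: u32, gossip_interval_ms: u64) -> u64 {
    estimated_gossip_rounds(nodes, fanout) as u64 * gossip_interval_ms
}
```

For the large-cluster profile above (50+ nodes, fanout 5, 500ms interval) this lands well under the 5s convergence target, which is why the larger interval is an acceptable trade for lower gossip overhead.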

### Network Condition Tuning

Adjust for network conditions:

```yaml
# Low latency network (same datacenter)
coordination:
  cluster:
    probe_timeout: 200ms
    gossip_interval: 100ms
    indirect_probes: 2

# High latency network (cross-region)
coordination:
  cluster:
    probe_timeout: 2s
    gossip_interval: 1s
    indirect_probes: 5

# Unreliable network
coordination:
  cluster:
    probe_timeout: 5s
    suspicion_multiplier: 6
    gossip_to_dead: 5  # More attempts before marking dead
```

## Agent Performance

### Agent Pool Tuning

Pre-warm agents for better startup latency:

```rust
use std::time::Duration;

/// Agent pool configuration.
pub struct AgentPoolConfig {
    /// Number of pre-warmed instances
    pub pool_size: usize,
    /// Warm-up strategy
    pub warmup_strategy: WarmupStrategy,
    /// Recycle an instance after this many reuses
    pub max_reuse_count: u32,
    /// Recycle an instance after this long
    pub recycle_after: Duration,
}

impl Default for AgentPoolConfig {
    fn default() -> Self {
        Self {
            pool_size: 100,
            warmup_strategy: WarmupStrategy::Eager,
            max_reuse_count: 1000,
            recycle_after: Duration::from_secs(3600), // 1 hour
        }
    }
}
```

### Resource Limits

Set appropriate resource limits:

```yaml
agents:
  default_limits:
    memory: 50MB
    cpu_shares: 100m  # 0.1 CPU core
    max_execution_time: 10s

  # Per-agent overrides
  overrides:
    - agent_id: heavy-processor
      memory: 500MB
      cpu_shares: 1000m  # 1 full core

    - agent_id: quick-responder
      memory: 10MB
      cpu_shares: 50m
      max_execution_time: 100ms
```
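`max_execution_time` is a deadline on agent work. A std-only sketch of the waiting side using a channel timeout (illustrative only: this abandons the work rather than cancelling it, whereas a WASM runtime can interrupt the guest directly):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `work` on a separate thread and stop waiting after `limit`.
/// Returns None on timeout; the worker thread is abandoned, not killed.
fn run_with_deadline<T, F>(limit: Duration, work: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Receiver may already be gone on timeout; ignore the send error.
        let _ = tx.send(work());
    });
    rx.recv_timeout(limit).ok()
}
```

Tight limits like the 100ms on `quick-responder` above keep a misbehaving agent from stalling its callers.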

### WebAssembly Optimization

```rust
/// WASM runtime optimization settings.
pub struct WasmOptimization {
    /// Enable JIT compilation
    pub jit_enabled: bool,
    /// Compiler optimization level
    pub optimization_level: OptLevel,
    /// Cache compiled modules between instantiations
    pub cache_compiled_modules: bool,
    /// Maximum number of cached modules
    pub module_cache_size: usize,
    /// Pool linear memories and stacks for reuse
    pub memory_pooling: bool,
    pub stack_pooling: bool,
}

impl Default for WasmOptimization {
    fn default() -> Self {
        Self {
            jit_enabled: true,
            optimization_level: OptLevel::Speed,
            cache_compiled_modules: true,
            module_cache_size: 100,
            memory_pooling: true,
            stack_pooling: true,
        }
    }
}
```

## Message Routing Optimization

### Batch Processing

Process messages in batches for efficiency:

```rust
use rayon::prelude::*; // parallel iterators
use std::time::Duration;

pub struct BatchProcessor {
    batch_size: usize,       // e.g. 1000
    batch_timeout: Duration, // e.g. 10ms
    parallel_batches: usize, // e.g. 4
}

impl BatchProcessor {
    pub async fn process(&self) -> Vec<ProcessResult> {
        // Collect up to batch_size messages, or whatever arrived
        // within batch_timeout
        let batch = self.collect_batch().await;

        // Process the batch in parallel across worker threads
        batch
            .par_iter()
            .map(|msg| self.process_message(msg))
            .collect()
    }
}
```

### Priority Routing

Implement priority queues for important messages:

```yaml
messaging:
  priority_routing:
    enabled: true
    queues:
      - name: critical
        priority: 0
        max_latency: 10ms

      - name: high
        priority: 1
        max_latency: 100ms

      - name: normal
        priority: 2
        max_latency: 1s

      - name: low
        priority: 3
        max_latency: 10s
```
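The queue tiers above can be modeled with a binary heap ordered so that a lower priority number wins and ties stay FIFO. A std-only sketch (names and types are hypothetical, not Caxton's router API):

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Lower `priority` value = more urgent, matching the tier table above.
struct PriorityRouter {
    seq: u64,
    // Reverse flips the max-heap so the smallest (priority, seq) pair pops
    // first: urgent messages win, and equal priorities dequeue in FIFO order.
    heap: BinaryHeap<Reverse<(u8, u64, String)>>,
}

impl PriorityRouter {
    fn new() -> Self {
        Self { seq: 0, heap: BinaryHeap::new() }
    }

    fn enqueue(&mut self, priority: u8, msg: String) {
        self.heap.push(Reverse((priority, self.seq, msg)));
        self.seq += 1;
    }

    fn dequeue(&mut self) -> Option<String> {
        self.heap.pop().map(|Reverse((_, _, msg))| msg)
    }
}
```

Push and pop are both O(log n), so the priority ordering costs little even at high message rates.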

### Connection Pooling

Optimize connection management:

```yaml
transport:
  connection_pool:
    min_idle: 10
    max_idle: 100
    max_lifetime: 300s
    idle_timeout: 60s

    # Per-node connection limits
    max_connections_per_node: 50

    # Connection warming
    pre_warm: true
    warm_connections: 5
```

## Storage Optimization

### SQLite Tuning

Optimize local SQLite storage:

```sql
-- Performance pragmas
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
PRAGMA cache_size = -64000;  -- 64MB cache
PRAGMA page_size = 4096;
PRAGMA mmap_size = 268435456;  -- 256MB memory-mapped I/O
PRAGMA temp_store = MEMORY;

-- Optimize queries
CREATE INDEX idx_agents_status ON agents(status);
CREATE INDEX idx_messages_timestamp ON messages(timestamp);
CREATE INDEX idx_conversations_active ON conversations(active) WHERE active = 1;
```

### Write Batching

Batch database writes:

```rust
use std::time::Duration;

pub struct BatchWriter {
    db: Database,            // handle to the underlying store
    batch_size: usize,       // e.g. 1000
    flush_interval: Duration, // e.g. 100ms
}

impl BatchWriter {
    pub async fn write(&self, records: Vec<Record>) -> Result<(), DbError> {
        // One transaction for the whole write
        let tx = self.db.begin().await?;

        // Insert in chunks to bound statement size
        for chunk in records.chunks(self.batch_size) {
            tx.insert_batch(chunk).await?;
        }

        // Commit once
        tx.commit().await?;
        Ok(())
    }
}
```

## Network Optimization

### QUIC Transport Tuning

```yaml
transport:
  quic:
    # Congestion control
    congestion_control: bbr  # or cubic

    # Stream management
    max_concurrent_streams: 1000
    stream_receive_window: 1MB
    connection_receive_window: 10MB

    # Keep-alive
    keep_alive_interval: 30s
    idle_timeout: 120s

    # 0-RTT for lower latency
    enable_0rtt: true

    # Datagram support
    enable_datagram: true
    max_datagram_size: 1200
```

### TCP Tuning (if not using QUIC)

```yaml
transport:
  tcp:
    # Disable Nagle's algorithm for low latency
    nodelay: true

    # Keep-alive settings
    keepalive: true
    keepalive_time: 30s
    keepalive_interval: 10s
    keepalive_probes: 3

    # Buffer sizes
    send_buffer_size: 256KB
    recv_buffer_size: 256KB

    # Connection settings
    connect_timeout: 5s
    linger: 0
```
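std's `TcpStream` exposes a subset of these knobs directly. A sketch applying `nodelay` plus I/O timeouts; buffer sizes and keep-alive probe tuning are not in std and would need a crate such as socket2 (not shown):

```rust
use std::net::TcpStream;
use std::time::Duration;

/// Apply the latency-oriented settings from the config above to a stream.
fn tune_stream(stream: &TcpStream) -> std::io::Result<()> {
    // Disable Nagle's algorithm so small messages are sent immediately
    stream.set_nodelay(true)?;
    // Bound how long reads and writes may block
    stream.set_read_timeout(Some(Duration::from_secs(5)))?;
    stream.set_write_timeout(Some(Duration::from_secs(5)))?;
    Ok(())
}
```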

## Monitoring Performance

### Key Metrics to Monitor

```bash
# Latency metrics
curl -s localhost:9090/metrics | grep latency
# caxton_routing_latency_seconds{quantile="0.5"} 0.0001
# caxton_routing_latency_seconds{quantile="0.99"} 0.001

# Throughput metrics
curl -s localhost:9090/metrics | grep throughput
# caxton_messages_per_second 45123
# caxton_agents_messages_processed_total 1234567

# Resource utilization
curl -s localhost:9090/metrics | grep resource
# caxton_memory_used_bytes 5242880000
# caxton_cpu_usage_percent 34.5
```
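The P50/P99 figures in the sample output can be reproduced from raw latency samples with a nearest-rank quantile. A minimal sketch; a real metrics exporter uses histogram buckets rather than sorting every sample:

```rust
/// Nearest-rank quantile over latency samples (in seconds).
/// `q` is in [0, 1]; sorts the slice in place.
fn quantile(samples: &mut [f64], q: f64) -> f64 {
    assert!(!samples.is_empty() && (0.0..=1.0).contains(&q));
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    // Nearest-rank: the ceil(q * n)-th smallest sample (1-indexed)
    let rank = ((q * samples.len() as f64).ceil() as usize).max(1) - 1;
    samples[rank]
}
```

This is handy in load-test scripts for sanity-checking the exported `quantile="0.99"` values against independently captured timings.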

### Performance Testing

Run performance benchmarks:

```bash
# Run standard benchmark suite
caxton benchmark run --suite standard

# Custom load test
caxton benchmark custom \
  --agents 1000 \
  --messages-per-second 10000 \
  --duration 60s \
  --pattern request-reply

# Stress test to find limits
caxton benchmark stress \
  --ramp-up-time 60s \
  --max-agents 10000 \
  --find-breaking-point
```

### Profiling

Profile to identify bottlenecks:

```bash
# CPU profiling
caxton profile cpu --duration 30s --output cpu.prof

# Memory profiling
caxton profile memory --interval 1s --output mem.prof

# Trace profiling for latency analysis
caxton profile trace --messages 1000 --output trace.json
```

## Common Performance Issues

### Issue: High Message Latency

**Symptoms:**
- P99 latency > 10ms
- Message queue growing

**Diagnosis:**
```bash
caxton performance diagnose --issue high-latency
```

**Solutions:**
1. Increase worker threads
2. Enable priority routing
3. Reduce batch size
4. Check for slow agents

### Issue: Memory Growth

**Symptoms:**
- Steadily increasing memory usage
- OOM kills

**Diagnosis:**
```bash
caxton memory analyze --duration 1h
```

**Solutions:**
1. Enable message TTL
2. Reduce agent pool size
3. Increase cleanup frequency
4. Check for memory leaks in agents

### Issue: Gossip Storm

**Symptoms:**
- High network traffic
- Slow convergence

**Diagnosis:**
```bash
caxton cluster analyze-gossip
```

**Solutions:**
1. Increase gossip interval
2. Reduce gossip fanout
3. Tune suspicion multiplier
4. Check for network issues

## Performance Best Practices

1. **Measure First**: Always benchmark before optimizing
2. **Monitor Continuously**: Set up alerts for performance regression
3. **Test Under Load**: Test with realistic workloads
4. **Profile Regularly**: Regular profiling catches issues early
5. **Tune Gradually**: Make one change at a time
6. **Document Changes**: Keep records of what worked
7. **Plan Capacity**: Keep 20-30% headroom

## Advanced Optimizations

### CPU Affinity

Pin agents to specific CPU cores:

```yaml
runtime:
  cpu_affinity:
    enabled: true
    strategy: numa_aware

    # Pin critical agents
    pinned_agents:
      - agent_id: router
        cpu_cores: [0, 1]

      - agent_id: coordinator
        cpu_cores: [2, 3]
```

### NUMA Awareness

Optimize for NUMA systems:

```yaml
runtime:
  numa:
    enabled: true
    memory_policy: local_alloc
    cpu_bind: true
    interleave_memory: false
```

### Custom Memory Allocator

Use jemalloc for better performance:

```bash
# Install jemalloc
apt-get install libjemalloc-dev

# Run with jemalloc
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so caxton server start
```

## References

- [ADR-0017: Performance Requirements](../adr/0017-performance-requirements.md)
- [Performance Benchmarking Guide](../benchmarks/performance-benchmarking-guide.md)
- [Testing Strategy](../development/testing-strategy.md)
- [Clustering Guide](../user-guide/clustering.md)