cloudllm 0.10.4

A batteries-included Rust toolkit for building intelligent agents with LLM integration, multi-protocol tool support, and multi-agent orchestration.
# Multi-Agent Orchestration Tutorial: A Practical Cookbook

## Introduction

This tutorial demonstrates how to build multi-agent AI systems using CloudLLM's Orchestration framework. We'll progress through seven collaboration patterns from simple to complex, with a focus on understanding **costs, runtime expectations, and real-world tradeoffs**.

**⚠️ Cost & Runtime Warning**: This tutorial emphasizes cost implications because multi-agent orchestrations can run up bills quickly. We provide concrete examples with token estimates and timing for each mode.

---

## Quick Reference: Modes by Complexity & Cost

| Mode | Complexity | Est. Runtime | Est. Cost (4 agents) | Best For | ⚠️ Cost Risk |
|------|-----------|--------------|---------------------|----------|-------------|
| **AnthropicAgentTeams** | ★★★★★ | 2-5 min | $0.30-$1.00 | Large task pools | HIGH if max_iterations too high |
| **RALPH** | ★★★☆☆ | 3-20 min | $0.40-$9.00 | Checklist completion | MEDIUM (controlled iterations) |
| **Debate** | ★★★★☆ | 5-15 min | $0.60-$2.00 | Consensus building | **VERY HIGH** (exponential with rounds) |
| **Parallel** | ★☆☆☆☆ | 10-20 sec | $0.10-$0.30 | Independent opinions | LOW |
| **RoundRobin** | ★★☆☆☆ | 20-60 sec | $0.15-$0.50 | Sequential refinement | LOW-MEDIUM |
| **Moderated** | ★★★☆☆ | 30-90 sec | $0.20-$0.60 | Q&A sessions | MEDIUM |
| **Hierarchical** | ★★★★☆ | 1-3 min | $0.25-$0.80 | Multi-level problems | MEDIUM |

---

# MODE 1: AnthropicAgentTeams — Decentralized Task Coordination

## Overview

**AnthropicAgentTeams** is a **completely decentralized** orchestration mode where agents autonomously discover, claim, and complete tasks from a shared pool with **no central orchestrator**. This is the most powerful mode for large, complex projects but also the easiest to over-run and waste money.

**Key Insight**: Instead of the orchestration engine assigning tasks (like RALPH), agents use Memory to coordinate work peer-to-peer. This enables true autonomous multi-agent teams.

### ⚠️ COST WARNING

- **Per Iteration Cost**: ~$0.05-$0.15 per agent (4 agents = $0.20-$0.60/iteration)
- **Default Settings**: 4 iterations × 8 tasks = 16-32 LLM calls
- **Worst Case**: Setting `max_iterations: 100` with 4 agents = **400+ LLM calls** = **$20-$60+** in wasted cost
- **How to Avoid**: Cap `max_iterations` at roughly `task_count ÷ agent_count` plus a small buffer. For 8 tasks with 4 agents: use `max_iterations: 5` max.
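
The cap math above is worth sanity-checking before every run. A back-of-envelope sketch (the per-call price range is an assumption for illustration, and `worst_case_cost` is not a CloudLLM API):

```rust
// Worst-case budget check: one LLM call per agent per iteration.
fn worst_case_cost(max_iterations: u32, agent_count: u32, low: f64, high: f64) -> (u32, f64, f64) {
    let calls = max_iterations * agent_count;
    (calls, calls as f64 * low, calls as f64 * high)
}

fn main() {
    // Sane cap: 4 iterations x 4 agents = 16 calls
    let (calls, lo, hi) = worst_case_cost(4, 4, 0.05, 0.15);
    println!("{calls} calls, ${lo:.2}-${hi:.2}"); // 16 calls, $0.80-$2.40

    // Runaway cap: 100 iterations x 4 agents = 400 calls
    let (calls, lo, hi) = worst_case_cost(100, 4, 0.05, 0.15);
    println!("{calls} calls, ${lo:.2}-${hi:.2}"); // 400 calls, $20.00-$60.00
}
```

If the worst-case number surprises you, lower `max_iterations` before launching, not after the bill arrives.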

### Runtime Expectations

- **Best case**: All tasks claimed and completed → ~2-3 minutes
- **Average case**: Agents work through pool → ~3-5 minutes
- **Worst case**: Poor task design, many retries → 10+ minutes

### Example: Research Team with NMN+ Study (8 Tasks)

```rust
use cloudllm::{
    Agent,
    orchestration::{Orchestration, OrchestrationMode, WorkItem},
    clients::openai::OpenAIClient,
    clients::claude::{ClaudeClient, Model},
    event::{EventHandler, OrchestrationEvent},
};
use async_trait::async_trait;
use std::sync::Arc;

/// Event handler for cost monitoring
struct CostTracker {
    iteration: std::sync::atomic::AtomicUsize,
}

#[async_trait]
impl EventHandler for CostTracker {
    async fn on_orchestration_event(&self, event: &OrchestrationEvent) {
        match event {
            OrchestrationEvent::RoundStarted { round, .. } => {
                println!("📍 Iteration {} starting...", round);
            }
            OrchestrationEvent::TaskClaimed {
                agent_name,
                task_id,
                ..
            } => {
                println!("  ✋ {} claimed: {}", agent_name, task_id);
            }
            OrchestrationEvent::TaskCompleted {
                agent_name,
                task_id,
                ..
            } => {
                println!("  ✅ {} completed: {}", agent_name, task_id);
            }
            OrchestrationEvent::RoundCompleted { .. } => {
                println!("  Cost for this iteration: ~$0.30-$0.50");
            }
            _ => {}
        }
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Define task pool
    let tasks = vec![
        WorkItem::new(
            "research_nmn",
            "Research phase — NMN+ mechanisms",
            "Summarize NAD+ pathways, mitochondrial function, sirtuins in 2-3 paragraphs",
        ),
        WorkItem::new(
            "analyze_longevity",
            "Analysis phase — longevity mechanisms",
            "Extract 3-5 key aging reversal pathways; estimate lifespan impact",
        ),
        WorkItem::new(
            "research_alzheimers",
            "Research phase — Alzheimer's pathology",
            "Document amyloid-beta, tau tangles, neuroinflammation; summarize in 2 paragraphs",
        ),
        WorkItem::new(
            "analyze_neuroprotection",
            "Analysis phase — neuroprotective mechanisms",
            "Map how NAD+ restoration combats neurodegeneration (5+ specific mechanisms)",
        ),
        WorkItem::new(
            "memory_recovery",
            "Research phase — memory recovery evidence",
            "Find 3+ studies showing cognitive restoration in AD models; summarize findings",
        ),
        WorkItem::new(
            "clinical_integration",
            "Analysis phase — clinical feasibility",
            "Assess dosing, bioavailability, safety profile; recommend next clinical trial",
        ),
        WorkItem::new(
            "synthesis_report",
            "Writing phase — comprehensive synthesis",
            "Write 3-4 page executive report integrating all findings with clear conclusions",
        ),
        WorkItem::new(
            "final_review",
            "Quality review — peer review assessment",
            "Review report for accuracy, completeness, evidence quality; suggest improvements",
        ),
    ];

    println!("═══════════════════════════════════════════════════════");
    println!("   NMN+ Research Team — AnthropicAgentTeams Mode");
    println!("═══════════════════════════════════════════════════════\n");

    println!("⚠️  COST ESTIMATE:");
    println!("  - 8 tasks × 4 agents = max 32 LLM calls");
    println!("  - At $0.05-0.10/call = $1.60-$3.20 total");
    println!("  - Runtime: ~3-5 minutes\n");

    // Create agents with mixed providers
    let openai_key = std::env::var("OPENAI_API_KEY")?;
    let anthropic_key = std::env::var("ANTHROPIC_API_KEY")?;

    let researcher = Agent::new(
        "researcher",
        "Research Agent (GPT-4o-mini)",
        Arc::new(OpenAIClient::new_with_model_string(&openai_key, "gpt-4o-mini")),
    );

    let analyst = Agent::new(
        "analyst",
        "Analysis Agent (Claude Haiku 4.5)",
        Arc::new(ClaudeClient::new_with_model_enum(&anthropic_key, Model::ClaudeHaiku45)),
    );

    let writer = Agent::new(
        "writer",
        "Writing Agent (GPT-4o-mini)",
        Arc::new(OpenAIClient::new_with_model_string(&openai_key, "gpt-4o-mini")),
    );

    let reviewer = Agent::new(
        "reviewer",
        "Review Agent (Claude Haiku 4.5)",
        Arc::new(ClaudeClient::new_with_model_enum(&anthropic_key, Model::ClaudeHaiku45)),
    );

    // ⚠️ CRITICAL: max_iterations calculation
    // Formula: (task_count / agent_count) * 1.5, capped at 5
    // 8 tasks / 4 agents = 2 * 1.5 = 3, use 4 for safety
    let max_iterations = 4;  // DO NOT SET TO 100!

    let mut orchestration = Orchestration::new(
        "nmn-research-team",
        "NMN+ & Alzheimer's Research Team",
    )
    .with_mode(OrchestrationMode::AnthropicAgentTeams {
        pool_id: "nmn-study-2024".to_string(),
        tasks: tasks.clone(),
        max_iterations,
    })
    .with_system_context(
        "You are a specialized researcher in a coordinated team. \
         Autonomously claim tasks from the shared pool and complete them thoroughly. \
         Build on previous agents' work when relevant. Focus on scientific accuracy \
         and clear communication. When done, report completion.",
    )
    .with_max_tokens(4096)
    .with_event_handler(Arc::new(CostTracker {
        iteration: std::sync::atomic::AtomicUsize::new(0),
    }));

    orchestration.add_agent(researcher)?;
    orchestration.add_agent(analyst)?;
    orchestration.add_agent(writer)?;
    orchestration.add_agent(reviewer)?;

    // Run orchestration
    let prompt = "Prepare a comprehensive scientific report on NMN+ for longevity and \
                   Alzheimer's disease recovery, with specific focus on memory restoration. \
                   The team will autonomously work through the 8 research tasks.";

    println!("👥 Team Members:");
    println!("  1. Researcher (GPT) — finds and summarizes sources");
    println!("  2. Analyst (Claude Haiku) — synthesizes findings");
    println!("  3. Writer (GPT) — drafts comprehensive report");
    println!("  4. Reviewer (Claude Haiku) — ensures quality\n");

    println!("⏱️  Starting orchestration...");

    let start = std::time::Instant::now();
    let response = orchestration.run(prompt, 1).await?;
    let elapsed = start.elapsed();

    println!("\n✨ RESULTS:");
    println!("  ├─ Iterations completed: {}", response.round);
    println!("  ├─ Tasks completed: {:.0}%", response.convergence_score.unwrap_or(0.0) * 100.0);
    println!("  ├─ Total time: {:.1}s", elapsed.as_secs_f32());
    println!("  ├─ Total tokens: {}", response.total_tokens_used);
    println!("  └─ Estimated cost: ${:.2}", (response.total_tokens_used as f64) * 0.00001);

    // Print sample messages
    println!("\n📝 Sample outputs:");
    for (i, msg) in response.messages.iter().take(3).enumerate() {
        if let Some(name) = &msg.agent_name {
            // Truncate on char boundaries; byte slicing (&msg.content[..200]) can panic mid-codepoint
            let preview: String = msg.content.chars().take(200).collect();
            let ellipsis = if msg.content.chars().count() > 200 { "..." } else { "" };
            println!("  {}. [{}]: {}{}", i + 1, name, preview, ellipsis);
        }
    }

    Ok(())
}
```

### Key Parameters to Tune

```rust
// ✅ GOOD: Controls cost effectively
max_iterations: 4,           // 8 tasks ÷ 4 agents × 1.5 buffer = ~4 iterations
with_max_tokens(4096),       // Prevents runaway responses

// ❌ BAD: Will waste money
max_iterations: 100,         // Could run for 30+ minutes, $50+ cost
max_iterations: 50,          // Excessive iterations for 8 tasks
with_max_tokens(32768),      // Allows 100KB responses per agent
```

### Best Practices for AnthropicAgentTeams

1. **Task Design**: Keep task IDs short (`research_nmn` not `research_phase_1_nanoparticle_nmn_mechanism`)
2. **Iteration Cap**: `max_iterations = ceil(task_count / agent_count) + 1`
3. **Agent Count**: 3-6 agents per 8-15 tasks (more agents = more parallelism but higher cost)
4. **Monitoring**: Use event handler to detect stuck agents (same task claimed repeatedly)
5. **Early Exit**: If convergence_score reaches 1.0 before max_iterations, orchestration stops automatically
6. **Starter Content + Read-Modify-Write**: For file-producing tasks (e.g., building an HTML game), seed a working starter to disk and Memory (`current_game_html` key) before `run()`. Instruct agents to READ from Memory, MODIFY, and WRITE back via a custom tool that saves to both disk and Memory. See `examples/breakout_game_agent_teams.rs`.
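
Practice 2 above reduces to a one-line helper (a sketch; `recommended_max_iterations` is not part of CloudLLM):

```rust
// Iteration cap from the best-practices formula:
// max_iterations = ceil(task_count / agent_count) + 1
fn recommended_max_iterations(task_count: u32, agent_count: u32) -> u32 {
    task_count.div_ceil(agent_count) + 1
}

fn main() {
    assert_eq!(recommended_max_iterations(8, 4), 3);  // the NMN+ pool above
    assert_eq!(recommended_max_iterations(3, 4), 2);  // small pool, many agents
    assert_eq!(recommended_max_iterations(18, 4), 6); // a RALPH-sized checklist
}
```

Treat the result as a baseline and add one extra iteration if tasks can fail and need a retry, as the NMN+ example does (`max_iterations: 4`).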

### ⚠️ When AnthropicAgentTeams Gets Expensive

These scenarios can waste $100+:

```rust
// ❌ TOO MANY ITERATIONS
max_iterations: 50,      // Even if tasks complete in 5, runs all 50
tasks: vec![...], // 20 tasks
                         // Result: 50 × 4 agents × 5-10 calls = 1000-2000 calls = $10-50

// ❌ AMBIGUOUS TASKS
WorkItem::new("task1", "Do research", "Complete the task"),  // Agents don't know what "done" is
                         // Result: Agents keep claiming same task, never marking complete

// ❌ TOO MANY AGENTS FOR TASK POOL
max_iterations: 20,
tasks: vec![3_items], // 3 tasks
                         // Result: 4 agents all working on same 3 tasks repeatedly

// ✅ CORRECT
max_iterations: 2,       // 3 tasks ÷ 4 agents + buffer = 2 iterations
tasks: vec![...],
with_max_tokens(4096),   // Reasonable response length
```

---

# MODE 2: RALPH — Iterative Checklist with Agent Turn-Taking

## Overview

**RALPH** (Requirements Addressing Progressive Lite Heuristic) is for problems that can be broken into a **fixed checklist** of tasks. Unlike AnthropicAgentTeams, the orchestration engine manages the task list and agents signal completion via response markers.

**Best For**: Step-by-step project completion where tasks are clearly sequential or grouped.

### ⚠️ COST WARNING

- **Per Iteration**: ~$0.05-$0.15 per agent
- **Typical Cost**: 3-5 iterations × 3-4 agents = $0.45-$2.00
- **Risk**: Setting too high max_iterations for simple tasks
- **How to Avoid**: Monitor completion markers in responses; stop if no progress for 2 iterations
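
Monitoring progress comes down to scanning each response for the completion markers the RALPH examples use (`[TASK_COMPLETE:task_id]`). A minimal parsing sketch (`extract_completed` is illustrative, not a CloudLLM API):

```rust
// Collect the task ids marked complete in an agent response,
// using the [TASK_COMPLETE:task_id] marker format from the RALPH examples.
fn extract_completed(response: &str) -> Vec<String> {
    let mut done = Vec::new();
    let mut rest = response;
    while let Some(start) = rest.find("[TASK_COMPLETE:") {
        rest = &rest[start + "[TASK_COMPLETE:".len()..];
        if let Some(end) = rest.find(']') {
            done.push(rest[..end].trim().to_string());
            rest = &rest[end + 1..];
        } else {
            break; // unterminated marker; ignore the tail
        }
    }
    done
}

fn main() {
    let reply = "Paddle logic added. [TASK_COMPLETE:paddle_control] \
                 Also wired scoring. [TASK_COMPLETE:scoring]";
    assert_eq!(extract_completed(reply), vec!["paddle_control", "scoring"]);
}
```

If two consecutive iterations yield no new ids, that is your signal to stop the run rather than burn the remaining iterations.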

### Runtime Expectations

- **Simple checklist (5 items, 3 agents)**: 2-3 minutes, $0.30-$0.60
- **Medium checklist (10 items, 4 agents)**: 10-20 minutes, $3-9
- **Complex checklist (15+ items)**: 30-80 minutes, $5-10+

### Example: Breakout Game Implementation (18 Tasks)

The full breakout game examples use a **starter HTML + read-modify-write** pattern:

1. **Seed a working starter**: Before orchestration starts, a ~4KB working breakout game skeleton (paddle, ball, bricks, game loop) is written to disk and stored in Memory under `current_game_html`.
2. **Read-Modify-Write loop**: Each agent reads the current HTML from Memory (`G current_game_html`), modifies it to implement their assigned feature, then writes the updated HTML back via the `write_game_file` tool (which persists to both disk and Memory).
3. **Post-run recovery**: After orchestration completes, the code checks Memory first for the latest HTML, falls back to message extraction, then to the starter on disk.

This ensures every agent builds incrementally on the team's cumulative work and there is always a playable game on disk.

```rust
use cloudllm::{
    Agent,
    orchestration::{Orchestration, OrchestrationMode, RalphTask},
    clients::claude::{ClaudeClient, Model},
};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    println!("═══════════════════════════════════════════════════════");
    println!("   Breakout Game Implementation — RALPH Mode");
    println!("═══════════════════════════════════════════════════════\n");

    println!("⚠️  COST ESTIMATE:");
    println!("  - 18 tasks × 4 agents × ~5 iterations = many LLM calls");
    println!("  - At $0.05-0.15/call = $3-$9 total");
    println!("  - Runtime: ~10-20 minutes\n");

    // Define task checklist (18 tasks covering core mechanics, audio, powerups, etc.)
    let tasks = vec![
        RalphTask::new("html_structure", "HTML Structure", "Canvas element and game container"),
        RalphTask::new("game_states", "Game States", "MENU, PLAYING, PAUSED, GAME_OVER, LEVEL_COMPLETE"),
        RalphTask::new("paddle_control", "Paddle Control", "Keyboard and touch controls for paddle"),
        RalphTask::new("ball_physics", "Ball Physics", "Movement, angle reflection, boundary collision"),
        RalphTask::new("brick_grid", "Brick Grid", "Multi-hit bricks (1-5 HP) with color coding"),
        RalphTask::new("collision", "Collision Detection", "Ball-paddle, ball-brick, ball-wall"),
        RalphTask::new("scoring", "Score System", "Points, lives, level progression"),
        RalphTask::new("audio_engine", "Audio Engine", "Web Audio API chiptune music and SFX"),
        RalphTask::new("powerup_system", "Powerup System", "8 powerup types: paddle, speed, lava, etc."),
        RalphTask::new("particle_effects", "Particle Effects", "Fire bursts, paddle jets, 1UP displays"),
        RalphTask::new("brick_patterns", "Brick Patterns", "10+ procedural patterns per level"),
        RalphTask::new("difficulty", "Difficulty Scaling", "Dynamic difficulty by level"),
        RalphTask::new("mobile_controls", "Mobile Controls", "Touch/swipe with responsive canvas"),
        // ... (18 tasks total — see examples/breakout_game_ralph.rs for full list)
    ];

    let anthropic_key = std::env::var("ANTHROPIC_API_KEY")?;
    let make_client = || Arc::new(ClaudeClient::new_with_model_enum(&anthropic_key, Model::ClaudeSonnet45));

    let architect = Agent::new("architect", "Game Architect", make_client());
    let programmer = Agent::new("programmer", "Implementation Specialist", make_client());
    let sound_dev = Agent::new("sound", "Sound Designer", make_client());
    let powerup_dev = Agent::new("powerup", "Powerup Engineer", make_client());

    // Seed starter HTML to disk and Memory before orchestration
    // (see breakout_game_ralph.rs for full starter HTML and Memory setup)

    let mut orchestration = Orchestration::new("breakout-game", "Atari Breakout Implementation")
        .with_mode(OrchestrationMode::Ralph {
            tasks: tasks.clone(),
            max_iterations: 10,  // ⚠️ Safety cap (18 tasks / 4 agents + buffer)
        })
        .with_system_context(
            "You are implementing an Atari Breakout game in a single HTML file. \
             WORKFLOW: 1) READ current_game_html from Memory, 2) MODIFY it to \
             implement your assigned task, 3) WRITE back via write_game_file. \
             Mark done with [TASK_COMPLETE:task_id]. NEVER start from scratch.",
        )
        .with_max_tokens(180_000);

    orchestration.add_agent(architect)?;
    orchestration.add_agent(programmer)?;
    orchestration.add_agent(sound_dev)?;
    orchestration.add_agent(powerup_dev)?;

    let response = orchestration.run("Build an Atari Breakout game", 1).await?;

    println!("Iterations: {}", response.round);
    println!("Progress: {:.0}%", response.convergence_score.unwrap_or(0.0) * 100.0);

    // Post-run: check Memory first for latest HTML, then messages, then starter on disk
    Ok(())
}
```

### RALPH vs. AnthropicAgentTeams: Decision Matrix

| Scenario | Use RALPH | Use AnthropicAgentTeams |
|----------|-----------|------------------------|
| < 8 tasks | ✅ Yes | ❌ No (overkill) |
| 8-20 tasks | ✅ Maybe | ✅ Yes (better) |
| 20+ tasks | ❌ No | ✅ Yes (scales better) |
| Tasks are sequential | ✅ Yes | ✅ Yes (but looser) |
| Need tight orchestration control | ✅ Yes | ❌ No |
| Want agent autonomy | ❌ No | ✅ Yes |
| Building a game/app | ✅ Yes | ✅ Yes (both work) |
| Research/analysis project | ❌ No | ✅ Yes |

---

# MODE 3: Debate — Consensus Through Adversarial Refinement

## Overview

**Debate** mode has agents argue positions and refine their stances based on counterarguments. Agents continue until they reach **convergence** (word-set similarity) or hit max_rounds.

**Best For**: Contested decisions, exploring tradeoff spaces, stress-testing assumptions.
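
The word-set similarity check can be pictured as Jaccard similarity over the words of two consecutive answers. A minimal sketch of the idea (CloudLLM's internal metric may differ in detail):

```rust
use std::collections::HashSet;

// Jaccard similarity over lowercase word sets: |A ∩ B| / |A ∪ B|.
// One plausible reading of the "word-set similarity" convergence check.
fn word_set_similarity(a: &str, b: &str) -> f64 {
    let set = |s: &str| -> HashSet<String> {
        s.split_whitespace().map(|w| w.to_lowercase()).collect()
    };
    let (sa, sb) = (set(a), set(b));
    let inter = sa.intersection(&sb).count() as f64;
    let union = sa.union(&sb).count() as f64;
    if union == 0.0 { 1.0 } else { inter / union }
}

fn main() {
    let round3 = "a carbon price of 75 per ton rising yearly";
    let round4 = "a carbon price of 75 per ton rising annually";
    // 8 shared words out of 10 total -> 0.8, above a 0.70 threshold
    assert!(word_set_similarity(round3, round4) > 0.70);
}
```

This is why a lower `convergence_threshold` ends debates sooner: positions only need to share most of their vocabulary, not match verbatim.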

### ⚠️ COST WARNING — THIS ONE IS EXPENSIVE

- **Per Round**: ~$0.10-$0.30 per agent (5 agents = $0.50-$1.50/round)
- **Typical Run**: 3-5 rounds = $1.50-$7.50
- **Worst Case**: 5 agents × 10 rounds = **$5-15** easily
- **Compounding Risk**: Per-round cost grows as the shared transcript lengthens, since each round re-sends all prior rounds as context. Going from 3 to 5 rounds adds roughly +$1.00-$3.00
- **How to Avoid**: Start with `max_rounds: 3`, increase only if needed; set `convergence_threshold: 0.70` (looser = fewer rounds)

### Runtime Expectations

- **Fast debate (2-3 rounds)**: 3-5 minutes
- **Medium debate (4-5 rounds)**: 6-10 minutes
- **Long debate (6+ rounds)**: 12+ minutes, **$10+ cost**

### Example: Carbon Pricing Debate (5 Positions)

```rust
use cloudllm::{
    Agent,
    orchestration::{Orchestration, OrchestrationMode},
    clients::openai::OpenAIClient,
    clients::claude::{ClaudeClient, Model},
    clients::gemini::GeminiClient,
};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    println!("═══════════════════════════════════════════════════════");
    println!("   Carbon Pricing Debate — Debate Mode");
    println!("═══════════════════════════════════════════════════════\n");

    println!("⚠️  COST WARNING (THIS IS EXPENSIVE!):");
    println!("  - 5 agents × 3 rounds minimum = 15 LLM calls");
    println!("  - Per-call cost: $0.03-0.10");
    println!("  - Estimated total: $0.45-$1.50");
    println!("  - But if agents don't converge, can go to 5 rounds = $0.75-$2.50");
    println!("  - Worst case (no convergence, 10 rounds): $1.50-$5.00\n");

    println!("⏱️  ESTIMATED TIME: 4-10 minutes (watch the clock!)\n");

    // Create agents with distinct perspectives
    let openai_key = std::env::var("OPENAI_API_KEY")?;
    let anthropic_key = std::env::var("ANTHROPIC_API_KEY")?;
    let gemini_key = std::env::var("GEMINI_API_KEY")?;

    let optimist = Agent::new(
        "market-optimist",
        "Dr. Chen (Market Optimist)",
        Arc::new(OpenAIClient::new_with_model_string(&openai_key, "gpt-4o")),
    )
    .with_expertise("Market mechanisms, technology cost curves, innovation economics")
    .with_personality(
        "Believes technology curves will make carbon capture cost-effective. \
         Advocates low carbon price ($25-50/ton) with strong R&D support.",
    );

    let hawk = Agent::new(
        "climate-hawk",
        "Dr. Andersson (Climate Emergency Advocate)",
        Arc::new(ClaudeClient::new_with_model_enum(&anthropic_key, Model::ClaudeSonnet45)),
    )
    .with_expertise("Climate science, tipping points, social cost of carbon")
    .with_personality(
        "Emphasizes climate urgency and intergenerational justice. \
         Advocates high carbon price ($150-200/ton) to reflect true social cost.",
    );

    let pragmatist = Agent::new(
        "pragmatist",
        "Dr. Patel (Economic Pragmatist)",
        Arc::new(GeminiClient::new_with_model_string(&gemini_key, "gemini-1.5-pro")),
    )
    .with_expertise("Development economics, political feasibility, policy design")
    .with_personality(
        "Balances climate urgency with political reality. \
         Advocates moderate, escalating carbon price ($50-100/ton, rising $5/year).",
    );

    let industry = Agent::new(
        "industry-realist",
        "Dr. Mueller (Industrial Engineer)",
        Arc::new(OpenAIClient::new_with_model_string(&openai_key, "gpt-4o-mini")),
    )
    .with_expertise("Industrial capital investment, competitiveness, carbon leakage")
    .with_personality(
        "Represents industry constraints. Warns high prices cause carbon leakage. \
         Advocates $30-60/ton with competitiveness safeguards.",
    );

    let analyst = Agent::new(
        "systems-analyst",
        "Dr. Okonkwo (Systems Analyst)",
        Arc::new(ClaudeClient::new_with_model_enum(&anthropic_key, Model::ClaudeHaiku45)),
    )
    .with_expertise("Policy modeling, feedback loops, unintended consequences")
    .with_personality(
        "Analyzes second- and third-order effects. Seeks price that optimizes \
         multiple objectives: climate action, economic efficiency, equity.",
    );

    // Create orchestration
    let mut orchestration = Orchestration::new("carbon-pricing-debate", "Carbon Pricing Policy Debate")
        .with_mode(OrchestrationMode::Debate {
            max_rounds: 4,                      // ⚠️ CRITICAL: Cap at 4, not 10!
            convergence_threshold: Some(0.70), // Lower threshold = earlier convergence = lower cost
        })
        .with_system_context(
            "You are a policy expert in a rigorous debate. Argue your position with evidence. \
             Acknowledge valid points from others. Seek common ground where possible. \
             Aim for robust consensus, not groupthink.",
        )
        .with_max_tokens(6144);

    orchestration.add_agent(optimist)?;
    orchestration.add_agent(hawk)?;
    orchestration.add_agent(pragmatist)?;
    orchestration.add_agent(industry)?;
    orchestration.add_agent(analyst)?;

    let prompt = "What carbon price ($/ton CO2) should be implemented globally? \
                  Consider: CCS costs ($50-150/ton), social cost of carbon ($75-200/ton), \
                  political feasibility, industrial competitiveness, climate urgency.";

    println!("🎙️  Debate participants: 5 agents with distinct perspectives");
    println!("📊 Max rounds: 4 (prevents runaway costs)");
    println!("⏱️  Starting debate...\n");

    let start = std::time::Instant::now();
    let response = orchestration.run(prompt, 1).await?;
    let elapsed = start.elapsed();

    println!("\n✨ DEBATE RESULTS:");
    println!("  ├─ Rounds completed: {}", response.round);
    println!("  ├─ Converged: {}", response.is_complete);
    if let Some(score) = response.convergence_score {
        println!("  ├─ Convergence score: {:.1}%", score * 100.0);
    }
    println!("  ├─ Time: {:.1}s", elapsed.as_secs_f32());
    println!("  ├─ Tokens: {}", response.total_tokens_used);
    println!("  └─ Cost: ${:.2}", (response.total_tokens_used as f64) * 0.00002);

    println!("\n💡 Interpretation:");
    if response.is_complete {
        println!("  ✅ Agents converged to consensus position");
    } else {
        println!("  ⚠️  Max rounds reached without full convergence (diverse views remain)");
    }

    // Show final positions
    println!("\n📄 Final positions (last 2 messages):");
    for msg in response.messages.iter().rev().take(2) {
        if let Some(name) = &msg.agent_name {
            // Truncate on char boundaries; byte slicing (&msg.content[..250]) can panic mid-codepoint
            let preview: String = msg.content.chars().take(250).collect();
            let ellipsis = if msg.content.chars().count() > 250 { "..." } else { "" };
            println!("\n  [{}]: {}{}", name, preview, ellipsis);
        }
    }

    Ok(())
}
```

### Debate Convergence Tuning

**The convergence_threshold parameter controls cost directly:**

```rust
// ❌ COSTS $5+: Requires high agreement to stop
OrchestrationMode::Debate {
    max_rounds: 10,
    convergence_threshold: Some(0.95),  // Need 95% similarity = many rounds
}

// ✅ COSTS $1-2: Balanced
OrchestrationMode::Debate {
    max_rounds: 5,
    convergence_threshold: Some(0.70),  // 70% similar = stops sooner
}

// ✅ COSTS $0.50: Loose consensus
OrchestrationMode::Debate {
    max_rounds: 3,
    convergence_threshold: Some(0.60),  // 60% = stops very quickly
}
```

---

# MODE 4: Parallel — Independent Expert Analysis

## Overview

**Parallel** mode is the **cheapest and fastest** — all agents respond simultaneously to the same prompt, with no interaction.

**Best For**: Independent opinions, quick polls, parallel processing.

### Cost Profile

- **Cost**: $0.05-$0.15 per agent, regardless of rounds
- **Time**: 15-30 seconds for most responses
- **Example**: 4 agents, 1 round = $0.20-$0.60, 30 seconds

### Example

```rust
let mut orchestration = Orchestration::new("parallel-demo", "Parallel Analysis")
    .with_mode(OrchestrationMode::Parallel);

// Add agents...

let response = orchestration.run(
    "Analyze these three carbon capture technologies independently. \
     1) Direct Air Capture, 2) Point Source Capture, 3) Ocean-based capture",
    1
).await?;

// A typical 4-agent Parallel run finishes in ~20-30 seconds for ~$0.20-$0.60
```

---

# MODE 5: Round-Robin — Sequential Deliberation

## Overview

Each agent speaks in turn, building on previous agents' responses. Useful for brainstorming, iterative refinement, and getting sequential perspectives.

**Best For**: Creative collaboration, iterative problem-solving, building consensus gradually.

### Cost Profile

- **Cost**: $0.10-$0.40 per round (4 agents × 2 rounds = $0.20-$0.80)
- **Time**: 30-90 seconds per round

### Example

```rust
use cloudllm::{
    Agent,
    orchestration::{Orchestration, OrchestrationMode},
    clients::claude::{ClaudeClient, Model},
};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let claude_key = std::env::var("ANTHROPIC_API_KEY")?;

    let analyst1 = Agent::new(
        "analyst1",
        "Data Analyst",
        Arc::new(ClaudeClient::new_with_model_enum(&claude_key, Model::ClaudeHaiku45)),
    );

    let analyst2 = Agent::new(
        "analyst2",
        "Business Strategist",
        Arc::new(ClaudeClient::new_with_model_enum(&claude_key, Model::ClaudeHaiku45)),
    );

    let analyst3 = Agent::new(
        "analyst3",
        "Risk Manager",
        Arc::new(ClaudeClient::new_with_model_enum(&claude_key, Model::ClaudeHaiku45)),
    );

    let mut orchestration = Orchestration::new("roundrobin-demo", "Market Analysis Round-Robin")
        .with_mode(OrchestrationMode::RoundRobin { max_rounds: 3 });

    orchestration.add_agent(analyst1)?;
    orchestration.add_agent(analyst2)?;
    orchestration.add_agent(analyst3)?;

    let response = orchestration.run(
        "Analyze the investment potential of electric vehicle manufacturers. \
         Analyst1: Present market data and trends. \
         Analyst2: Build on that with strategic insights. \
         Analyst3: Then address risks and mitigations.",
        1
    ).await?;

    println!("Round-Robin completed in {} rounds, {} tokens", response.round, response.total_tokens_used);

    Ok(())
}
```

---

# MODE 6: Moderated — Expert Routing

## Overview

A moderator agent receives the prompt and decides which experts to consult. Experts only respond when asked by the moderator, optimizing token usage.

**Best For**: Complex questions requiring selective expert consultation, reducing unnecessary API calls.

### Cost Profile

- **Cost**: $0.15-$0.60 per run (moderator + selected experts only)
- **Time**: 45-120 seconds
- **Best for**: Q&A sessions, dynamic problem routing

### Example

```rust
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let claude_key = std::env::var("ANTHROPIC_API_KEY")?;

    let moderator = Agent::new(
        "moderator",
        "Interview Moderator",
        Arc::new(ClaudeClient::new_with_model_enum(&claude_key, Model::ClaudeHaiku45)),
    )
    .with_expertise("Directing technical interviews and routing questions to specialists");

    let systems_expert = Agent::new(
        "systems_expert",
        "Systems Design Expert",
        Arc::new(ClaudeClient::new_with_model_enum(&claude_key, Model::ClaudeHaiku45)),
    )
    .with_expertise("Large-scale systems architecture, scalability, distributed systems");

    let algo_expert = Agent::new(
        "algo_expert",
        "Algorithms Expert",
        Arc::new(ClaudeClient::new_with_model_enum(&claude_key, Model::ClaudeHaiku45)),
    )
    .with_expertise("Algorithm design, time/space complexity, advanced data structures");

    let mut orchestration = Orchestration::new("moderated-demo", "Technical Interview")
        .with_mode(OrchestrationMode::Moderated {
            moderator_id: "moderator".to_string(),
            respondent_ids: vec!["systems_expert".to_string(), "algo_expert".to_string()],
        });

    orchestration.add_agent(moderator)?;
    orchestration.add_agent(systems_expert)?;
    orchestration.add_agent(algo_expert)?;

    let response = orchestration.run(
        "We're building a real-time recommendation system. \
         Question 1: How should we design the system architecture? \
         Question 2: What algorithms would optimize matching speed?",
        1
    ).await?;

    println!("Moderated run: {} tokens (only moderator + selected experts called)", response.total_tokens_used);

    Ok(())
}
```

---

# MODE 7: Hierarchical — Multi-Layer Decision Making

## Overview

Multi-layer processing: Workers generate initial analysis, Supervisors review and synthesize, Executives make final decisions. Each layer's output feeds into the next.

**Best For**: Complex organizational decisions, multi-stage refinement, hierarchical problem decomposition.

### Cost Profile

- **Cost**: $0.25-$0.80 per run
- **Time**: 1-3 minutes

### Example

```rust
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let claude_key = std::env::var("ANTHROPIC_API_KEY")?;

    // Layer 1: Workers (specialists gather information)
    let researcher1 = Agent::new(
        "researcher1",
        "Market Researcher",
        Arc::new(ClaudeClient::new_with_model_enum(&claude_key, Model::ClaudeHaiku45)),
    )
    .with_expertise("Market analysis, customer trends, competitive landscape");

    let researcher2 = Agent::new(
        "researcher2",
        "Technical Researcher",
        Arc::new(ClaudeClient::new_with_model_enum(&claude_key, Model::ClaudeHaiku45)),
    )
    .with_expertise("Technology feasibility, implementation challenges, engineering effort");

    // Layer 2: Supervisors (synthesize and prioritize)
    let product_lead = Agent::new(
        "product_lead",
        "Product Manager",
        Arc::new(ClaudeClient::new_with_model_enum(&claude_key, Model::ClaudeHaiku45)),
    )
    .with_expertise("Product strategy, feature prioritization, user impact");

    // Layer 3: Executive (final decision)
    let ceo = Agent::new(
        "ceo",
        "CEO",
        Arc::new(ClaudeClient::new_with_model_enum(&claude_key, Model::ClaudeHaiku45)),
    )
    .with_expertise("Business strategy, resource allocation, long-term vision");

    let mut orchestration = Orchestration::new("hierarchical-demo", "Product Feature Decision")
        .with_mode(OrchestrationMode::Hierarchical {
            layers: vec![
                vec!["researcher1".to_string(), "researcher2".to_string()],  // Layer 1: Workers
                vec!["product_lead".to_string()],                             // Layer 2: Supervisor
                vec!["ceo".to_string()],                                      // Layer 3: Executive
            ],
        });

    orchestration.add_agent(researcher1)?;
    orchestration.add_agent(researcher2)?;
    orchestration.add_agent(product_lead)?;
    orchestration.add_agent(ceo)?;

    let response = orchestration.run(
        "Should we invest in building an AI-powered personalization engine? \
         Workers: Analyze market demand, technical complexity, implementation timeline. \
         Product: Synthesize findings, prioritize requirements, estimate ROI. \
         CEO: Make final strategic decision with full context.",
        1
    ).await?;

    println!("Hierarchical decision: {} tokens over {} rounds", response.total_tokens_used, response.round);

    Ok(())
}
```

---

## Cost Comparison Summary

| Mode | 4 Agents, 1 Round | Notes |
|------|------------------|-------|
| Parallel | $0.20-$0.60 | Fastest, cheapest |
| RoundRobin | $0.30-$0.80 | 2-3 rounds recommended |
| Moderated | $0.25-$0.70 | Dynamic routing |
| Hierarchical | $0.35-$0.90 | Multi-layer synthesis |
| RALPH | $0.40-$1.20 | Per iteration |
| Debate | $0.50-$2.00 | ⚠️ Varies by convergence |
| AnthropicAgentTeams | $0.30-$1.00 | Per iteration |
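The single-round ranges in the table scale linearly with agent count. A back-of-the-envelope helper makes that explicit — the per-agent rates are this guide's rough assumptions, not measured values:

```rust
// Back-of-the-envelope cost range for one round, using this guide's
// assumed per-agent dollar rates (estimates, not measured values).
fn round_cost_range(agents: u32, low_per_agent: f64, high_per_agent: f64) -> (f64, f64) {
    (f64::from(agents) * low_per_agent, f64::from(agents) * high_per_agent)
}

fn main() {
    // Parallel mode at $0.05–$0.15 per agent:
    let (low, high) = round_cost_range(4, 0.05, 0.15);
    println!("4 agents, 1 round: ${:.2}–${:.2}", low, high);
}
```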

---

## Avoiding Expensive Mistakes

### ❌ Mistake #1: Infinite Debate

```rust
// BAD: No cap on rounds
OrchestrationMode::Debate {
    max_rounds: 1000,  // Agents keep arguing, $50+ cost
    convergence_threshold: Some(0.99),  // Convergence never reached
}
```

**Fix**: Cap `max_rounds` at 3-5 and set `convergence_threshold` to 0.65-0.75

### ❌ Mistake #2: Too Many Iterations

```rust
// BAD: Excessive iterations for small task pool
max_iterations: 100,   // 100 × 4 agents = 400+ calls
tasks: vec![...],      // Only 5 tasks!
```

**Fix**: Use formula `ceil(task_count / agent_count) + buffer`
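The formula above is small enough to encode directly; a minimal sketch (the `buffer` of one or two extra iterations absorbs retries and stragglers):

```rust
// Sizing helper for the fix above: ceil(task_count / agent_count) + buffer.
// Assumes agent_count > 0.
fn recommended_max_iterations(task_count: usize, agent_count: usize, buffer: usize) -> usize {
    task_count.div_ceil(agent_count) + buffer
}

fn main() {
    // 5 tasks across 4 agents: ceil(5/4) = 2 iterations, +1 buffer = 3 —
    // instead of the runaway `max_iterations: 100` shown above.
    println!("{}", recommended_max_iterations(5, 4, 1));
}
```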

### ❌ Mistake #3: Oversized Token Budget

```rust
// BAD: allows ~32K tokens (~128KB of text) per response
with_max_tokens(32768),  // 4 agents × 32K tokens = runaway costs
```

**Fix**: Use 4096-8192 for normal tasks

### ✅ Best Practice: Always Monitor

```rust
let response = orchestration.run(prompt, rounds).await?;

// Print cost before accepting results
let estimated_cost = (response.total_tokens_used as f64) * 0.00002;
println!("Cost: ${:.2}", estimated_cost);

if estimated_cost > 5.0 {
    eprintln!("⚠️  WARNING: High cost run. Review mode parameters.");
}
```

---

## Complete Multi-Mode Pipeline Example

```rust
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    println!("🚀 Multi-Stage Orchestration Pipeline");
    println!("   Stage 1: Parallel analysis ($0.30)");
    println!("   Stage 2: Debate for selection ($1.50)");
    println!("   Stage 3: Hierarchical planning ($0.50)");
    println!("   Total estimate: $2.30\n");

    // STAGE 1: Parallel independent analysis
    let mut stage1 = Orchestration::new("stage1", "Tech Assessment")
        .with_mode(OrchestrationMode::Parallel);

    stage1.add_agent(Agent::new("tech1", "DAC Expert", ...))?;
    stage1.add_agent(Agent::new("tech2", "Point Source Expert", ...))?;

    let result1 = stage1.run("Evaluate your assigned technology", 1).await?;
    println!("Stage 1: ${:.2}", (result1.total_tokens_used as f64) * 0.00002);

    // STAGE 2: Debate to select winner
    let mut stage2 = Orchestration::new("stage2", "Technology Selection")
        .with_mode(OrchestrationMode::Debate {
            max_rounds: 3,
            convergence_threshold: Some(0.70),
        });

    stage2.add_agent(Agent::new("advocate1", "DAC Advocate", ...))?;
    stage2.add_agent(Agent::new("advocate2", "Point Source Advocate", ...))?;

    let result2 = stage2.run("Argue for your preferred technology", 1).await?;
    println!("Stage 2: ${:.2}", (result2.total_tokens_used as f64) * 0.00002);

    // STAGE 3: Hierarchical deployment planning
    let mut stage3 = Orchestration::new("stage3", "Deployment Planning")
        .with_mode(OrchestrationMode::Hierarchical {
            layers: vec![
                vec!["regional1".to_string(), "regional2".to_string()],
                vec!["executive".to_string()],
            ],
        });

    // Add agents...

    let result3 = stage3.run("Create deployment strategy", 1).await?;
    println!("Stage 3: ${:.2}", (result3.total_tokens_used as f64) * 0.00002);

    let total = result1.total_tokens_used + result2.total_tokens_used + result3.total_tokens_used;
    println!("\nTotal tokens: {}", total);
    println!("Total cost: ${:.2}", (total as f64) * 0.00002);

    Ok(())
}
```

---

## Key Takeaways

1. **Parallel is cheapest** (~$0.30, 30 sec) — use when agents don't need to interact
2. **RALPH is predictable** (~$0.50-$1.00/iteration) — use for fixed checklists
3. **Debate is expensive** (~$1.50-$5.00) — always cap rounds and set convergence threshold
4. **AnthropicAgentTeams is powerful but risky** — cap `max_iterations` strictly
5. **Always monitor tokens** — $0.00002 per token means 50K tokens = $1, 100K tokens = $2
6. **Start conservative** — begin with low iteration counts, increase only if needed

Happy orchestrating! 🤖🤝🤖