# SwarmEngine

A high-throughput, low-latency agent swarm execution engine written in Rust.

SwarmEngine is designed for running multiple AI agents in parallel with tick-based synchronization, optimized for batch LLM inference and real-time exploration scenarios.

## Features

- **Tick-Driven Architecture**: Configurable tick cycles (default 10ms) with deterministic execution
- **Parallel Agent Execution**: Lock-free parallel worker execution using Rayon
- **Batch LLM Inference**: Optimized for batch processing with llama.cpp server, Ollama, and other LLM providers
- **Exploration Space**: Graph-based state exploration with UCB1, Thompson Sampling, and adaptive selection strategies
- **Offline Learning**: Accumulates session data and learns optimal parameters through offline training
- **Scenario-Based Evaluation**: TOML-based scenario definitions with variants support

## Performance

Measured on the troubleshooting scenario (exploration-based, no per-tick LLM calls):

| Metric | Value |
|--------|-------|
| Throughput | ~80 actions/sec |
| Tick latency (exploration) | 0.1-0.2ms per action |
| Task completion | 5 actions in ~60ms |

*Note: LLM-based decision making adds per-call latency on top of these figures. The exploration-based mode uses graph traversal instead of per-tick LLM calls.*

## Architecture

```
┌──────────────────────────────────────────────────────────────────────┐
│                            SwarmEngine                               │
│                                                                      │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                         Orchestrator                           │  │
│  │                                                                │  │
│  │   Tick Loop:                                                   │  │
│  │   1. Collect Async Results                                     │  │
│  │   2. Manager Phase (LLM Decision / Exploration)                │  │
│  │   3. Worker Execution (Parallel)                               │  │
│  │   4. Merge Results                                             │  │
│  │   5. Tick Advance                                              │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                                                                      │
│  ┌─────────────────┐  ┌─────────────────┐  ┌────────────────────┐   │
│  │   SwarmState    │  │ ExplorationSpace│  │   BatchInvoker     │   │
│  │  ├─ SharedState │  │  ├─ GraphMap    │  │  ├─ LlamaCppServer │   │
│  │  └─ WorkerStates│  │  └─ Operators   │  │  └─ Ollama         │   │
│  └─────────────────┘  └─────────────────┘  └────────────────────┘   │
└──────────────────────────────────────────────────────────────────────┘
```
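
To make the tick loop concrete, here is a minimal sketch of the five phases in plain Rust. The type and method names (`Orchestrator`, `collect_async_results`, and so on) are illustrative only, not the actual swarm-engine-core API:

```rust
use std::time::{Duration, Instant};

// Illustrative skeleton only; not the real swarm-engine-core types.
struct Orchestrator {
    tick: u64,
    tick_duration: Duration,
}

impl Orchestrator {
    fn run(&mut self, max_ticks: u64) {
        while self.tick < max_ticks {
            let start = Instant::now();

            self.collect_async_results(); // 1. drain finished async (LLM) calls
            self.manager_phase();         // 2. LLM decision or graph exploration
            self.execute_workers();       // 3. run workers in parallel (e.g. via Rayon)
            self.merge_results();         // 4. fold worker output into shared state
            self.tick += 1;               // 5. advance the tick

            // Sleep off the rest of the tick budget so ticks stay evenly paced.
            if let Some(remaining) = self.tick_duration.checked_sub(start.elapsed()) {
                std::thread::sleep(remaining);
            }
        }
    }

    fn collect_async_results(&mut self) {}
    fn manager_phase(&mut self) {}
    fn execute_workers(&mut self) {}
    fn merge_results(&mut self) {}
}

fn main() {
    let mut orch = Orchestrator { tick: 0, tick_duration: Duration::from_millis(10) };
    orch.run(150);
}
```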

## Crates

| Crate | Description |
|-------|-------------|
| `swarm-engine-core` | Core runtime, orchestrator, state management, exploration, and learning |
| `swarm-engine-llm` | LLM integrations (llama.cpp server, Ollama, prompt building, batch processing) |
| `swarm-engine-eval` | Scenario-based evaluation framework with assertions and metrics |
| `swarm-engine-ui` | CLI and Desktop GUI (egui) |

## Quick Start

### Prerequisites

- Rust 2021 edition or later
- llama.cpp (built automatically, or supply a pre-built binary)
- A GGUF model file (LFM2.5-1.2B recommended for development)

### Installation

```bash
# Clone the repository
git clone https://github.com/ynishi/swarm-engine.git
cd swarm-engine

# Build
cargo build --release
```

### Setting up llama-server with LFM2.5

SwarmEngine uses llama.cpp server as the primary LLM backend. **LFM2.5-1.2B** is the recommended model for development and testing due to its balance of speed and quality.

#### 1. Download the Model

```bash
# Using Hugging Face CLI (recommended)
pip install huggingface_hub
huggingface-cli download LiquidAI/LFM2.5-1.2B-Instruct-GGUF \
  LFM2.5-1.2B-Instruct-Q4_K_M.gguf

# Or download directly from Hugging Face:
# https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF
```

#### 2. Start llama-server

```bash
# Start with the downloaded model (using glob pattern for snapshot hash)
cargo run --package swarm-engine-ui -- llama start \
  -m ~/.cache/huggingface/hub/models--LiquidAI--LFM2.5-1.2B-Instruct-GGUF/snapshots/*/LFM2.5-1.2B-Instruct-Q4_K_M.gguf

# With custom options (GPU acceleration, parallel slots)
cargo run --package swarm-engine-ui -- llama start \
  -m ~/.cache/huggingface/hub/models--LiquidAI--LFM2.5-1.2B-Instruct-GGUF/snapshots/*/LFM2.5-1.2B-Instruct-Q4_K_M.gguf \
  --n-gpu-layers 99 \
  --parallel 4 \
  --ctx-size 4096
```

#### 3. Verify Server Status

```bash
# Check if server is running and healthy
cargo run --package swarm-engine-ui -- llama status

# View server logs
cargo run --package swarm-engine-ui -- llama logs -f

# Stop the server
cargo run --package swarm-engine-ui -- llama stop
```

#### Why LFM2.5?

| Model | Size | Speed | Quality | Use Case |
|-------|------|-------|---------|----------|
| **LFM2.5-1.2B** | 1.2B | Fast | Good | Development, testing (recommended) |
| Qwen2.5-Coder-3B | 3B | Medium | Better | Complex scenarios |
| Qwen2.5-Coder-7B | 7B | Slow | Best | Production quality testing |

### Running an Evaluation

```bash
# Run a troubleshooting scenario
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 5 -v

# With learning data collection
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 5 --learning
```

### CLI Commands

```bash
# Show help
cargo run --package swarm-engine-ui -- --help

# Initialize configuration
cargo run --package swarm-engine-ui -- init

# Show current configuration
cargo run --package swarm-engine-ui -- config

# Open scenarios directory
cargo run --package swarm-engine-ui -- open scenarios

# Launch Desktop GUI
cargo run --package swarm-engine-ui -- --gui
```

## Scenarios

Scenarios are defined in TOML format and describe the task, environment, actions, and success criteria:

```toml
[meta]
name = "Service Troubleshooting"
id = "user:troubleshooting:v2"
description = "Diagnose and fix a service outage"

[task]
goal = "Diagnose the failing service and restart it"

[llm]
provider = "llama-server"
model = "LFM2.5-1.2B"
endpoint = "http://localhost:8080"

[[actions.actions]]
name = "CheckStatus"
description = "Check the status of services"

[[actions.actions]]
name = "ReadLogs"
description = "Read logs for a specific service"

[app_config]
tick_duration_ms = 10
max_ticks = 150
```
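
As a rough sketch of how a scenario file maps onto Rust types, the following deserializes the schema above using `serde` and the `toml` crate. The struct names are hypothetical (the real crate's types may differ), and tables not modeled here, such as `[llm]`, are simply ignored by the deserializer:

```rust
use serde::Deserialize;

// Hypothetical mirror of the scenario schema above; field names follow the
// TOML keys, but the real crate's types may differ.
#[derive(Debug, Deserialize)]
struct Scenario {
    meta: Meta,
    task: Task,
    #[serde(default)]
    actions: Actions,
    app_config: AppConfig,
}

#[derive(Debug, Deserialize)]
struct Meta {
    name: String,
    id: String,
    description: String,
}

#[derive(Debug, Deserialize)]
struct Task {
    goal: String,
}

#[derive(Debug, Default, Deserialize)]
struct Actions {
    actions: Vec<ActionDef>, // each [[actions.actions]] entry
}

#[derive(Debug, Deserialize)]
struct ActionDef {
    name: String,
    description: String,
}

#[derive(Debug, Deserialize)]
struct AppConfig {
    tick_duration_ms: u64,
    max_ticks: u64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let text = std::fs::read_to_string("crates/swarm-engine-eval/scenarios/troubleshooting.toml")?;
    let scenario: Scenario = toml::from_str(&text)?;
    println!(
        "{}: {} actions, {} max ticks",
        scenario.meta.name,
        scenario.actions.actions.len(),
        scenario.app_config.max_ticks
    );
    Ok(())
}
```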

### Scenario Variants

Scenarios can define variants for different configurations:

```bash
# List available variants
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml --list-variants

# Run with a specific variant
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml --variant complex
```

## Learning System

SwarmEngine includes a comprehensive learning system with offline parameter optimization and LoRA fine-tuning support.

### Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                     Learning System                              │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                    Data Collection                           ││
│  │  Eval (--learning) → ActionEvents → Session Snapshots       ││
│  └─────────────────────────────────────────────────────────────┘│
│                              ↓                                   │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                   Offline Analysis                           ││
│  │  learn once → Stats Analysis → OptimalParamsModel           ││
│  │                             → RecommendedPaths               ││
│  └─────────────────────────────────────────────────────────────┘│
│                              ↓                                   │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                    Model Application                         ││
│  │  Next Eval → Load OfflineModel → Apply Parameters           ││
│  │           → LoRA Adapter (optional)                         ││
│  └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
```

### Quick Start

```bash
# 1. Collect data with --learning flag
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 30 --learning

# 2. Run offline learning
cargo run --package swarm-engine-ui -- learn once troubleshooting

# 3. Next eval run will automatically use the learned model
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 5 -v
# → "Offline model loaded: ucb1_c=X.XXX, strategy=..."
```

### Model Types

| Model | Purpose | Lifetime |
|-------|---------|----------|
| **ScoreModel** | Action selection scores (transitions, N-gram patterns) | 1 session |
| **OptimalParamsModel** | Parameter optimization (ucb1_c, thresholds) | Cross-session |
| **LoRA Adapter** | LLM fine-tuning for decision quality | Persistent |

### Offline Model Parameters

```json
{
  "parameters": {
    "ucb1_c": 1.414,         // UCB1 exploration constant
    "learning_weight": 0.3,   // Learning weight for selection
    "ngram_weight": 1.0       // N-gram pattern weight
  },
  "strategy_config": {
    "initial_strategy": "ucb1",
    "maturity_threshold": 5,
    "error_rate_threshold": 0.45
  },
  "recommended_paths": [...]   // Optimal action sequences
}
```
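
Assuming `offline_model.json` has exactly the shape shown above (comments aside), here is a hedged sketch of loading it with `serde_json`; the struct names are hypothetical, and `recommended_paths` is left unmodeled since its element format isn't shown:

```rust
use serde::Deserialize;

// Hypothetical mirror of offline_model.json; the crate's real
// OptimalParamsModel may differ. recommended_paths is omitted because its
// element format isn't shown, and serde ignores unknown fields by default.
#[derive(Debug, Deserialize)]
struct OfflineModel {
    parameters: Parameters,
    strategy_config: StrategyConfig,
}

#[derive(Debug, Deserialize)]
struct Parameters {
    ucb1_c: f64,
    learning_weight: f64,
    ngram_weight: f64,
}

#[derive(Debug, Deserialize)]
struct StrategyConfig {
    initial_strategy: String,
    maturity_threshold: u32,
    error_rate_threshold: f64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let home = std::env::var("HOME")?;
    let path = format!("{home}/.swarm-engine/learning/scenarios/troubleshooting/offline_model.json");
    let model: OfflineModel = serde_json::from_str(&std::fs::read_to_string(path)?)?;
    println!("ucb1_c = {}", model.parameters.ucb1_c);
    Ok(())
}
```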

### Learning Daemon

For continuous learning during long-running evaluations:

```bash
# Start daemon mode (monitors and learns continuously)
cargo run --package swarm-engine-ui -- learn daemon troubleshooting
```

The daemon:

- Watches for new session data
- Triggers learning based on configurable conditions
- Applies learned models via Blue-Green deployment
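
The watch-and-trigger idea can be sketched as a simple polling loop. This is an illustration only, not the daemon's implementation (its trigger conditions and Blue-Green deployment are more involved), and the learning hook is a hypothetical placeholder:

```rust
use std::{collections::HashSet, fs, path::Path, thread, time::Duration};

fn session_names(dir: &Path) -> HashSet<String> {
    fs::read_dir(dir)
        .into_iter()
        .flatten()
        .flatten()
        .map(|e| e.file_name().to_string_lossy().into_owned())
        .collect()
}

fn watch_sessions(dir: &Path) {
    // Prime with existing snapshots so only genuinely new sessions trigger a pass.
    let mut seen = session_names(dir);
    loop {
        thread::sleep(Duration::from_secs(5));
        for name in session_names(dir) {
            if seen.insert(name.clone()) {
                println!("new session {name}: triggering learning pass");
                // a run_offline_learning() hook would be invoked here (hypothetical)
            }
        }
    }
}

fn main() {
    let home = std::env::var("HOME").expect("HOME not set");
    let dir = format!("{home}/.swarm-engine/learning/scenarios/troubleshooting/sessions");
    watch_sessions(Path::new(&dir));
}
```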

### LoRA Training (Experimental)

Fine-tune the LLM for improved decision quality. LoRA training requires:

- Episode data collected from successful runs
- llama.cpp built with LoRA support
- Training triggers (count-, time-, or quality-based)

### Data Structure

```
~/.swarm-engine/learning/
├── global_stats.json           # Global statistics across scenarios
└── scenarios/
    └── troubleshooting/        # Per-scenario (learning_key based)
        ├── stats.json          # Accumulated statistics
        ├── offline_model.json  # Learned parameters
        ├── lora/               # LoRA adapters (if trained)
        │   └── v1/
        │       └── adapter.safetensors
        └── sessions/           # Session snapshots
            └── {timestamp}/
                ├── meta.json
                └── stats.json
```

### Selection Strategies

The learning system optimizes selection strategy parameters:

| Strategy | Description | When Used |
|----------|-------------|-----------|
| **UCB1** | Upper Confidence Bound | Early exploration |
| **Thompson** | Bayesian sampling | Probabilistic exploration |
| **Greedy** | Best known action | Exploitation after learning |
| **Adaptive** | Dynamic switching | Production (based on error rate) |
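
For reference, UCB1 scores each action as its mean reward plus an exploration bonus that shrinks with visit count; `c` below is the same `ucb1_c` constant the offline model tunes. A minimal sketch, not the crate's implementation:

```rust
/// UCB1 score: the action's mean reward plus an exploration bonus that
/// shrinks the more often the action has been tried.
fn ucb1(total_reward: f64, pulls: u64, total_pulls: u64, c: f64) -> f64 {
    if pulls == 0 {
        return f64::INFINITY; // untried actions are always explored first
    }
    let mean = total_reward / pulls as f64;
    mean + c * ((total_pulls as f64).ln() / pulls as f64).sqrt()
}

fn main() {
    // (total_reward, pulls) for three candidate actions.
    let stats = [(3.0, 5u64), (1.0, 1), (0.0, 0)];
    let total: u64 = stats.iter().map(|&(_, n)| n).sum();

    let scores: Vec<f64> = stats.iter().map(|&(r, n)| ucb1(r, n, total, 1.414)).collect();
    let best = scores
        .iter()
        .enumerate()
        .max_by(|(_, a), (_, b)| a.partial_cmp(b).unwrap())
        .map(|(i, _)| i);
    println!("best action index: {best:?}"); // Some(2): the untried action wins
}
```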

## LLM Providers

### llama-server (Recommended)

The llama.cpp server provides true batch processing via continuous batching:

```bash
cargo run --package swarm-engine-ui -- llama start \
  -m model.gguf \
  --parallel 4 \
  --ctx-size 4096 \
  --n-gpu-layers 99
```
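
The practical payoff of `--parallel 4` is that concurrent requests occupy separate server slots instead of queuing behind one another. Below is a minimal illustration (not the crate's `BatchInvoker`) that fires four requests at llama-server's OpenAI-compatible endpoint, assuming the default port 8080 and the `tokio`, `reqwest` (with its `json` feature), `futures`, and `serde_json` crates:

```rust
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let client = reqwest::Client::new();
    let prompts = [
        "Check service status.",
        "Read nginx logs.",
        "Restart the service.",
        "Summarize findings.",
    ];

    // Build one future per prompt; nothing is sent until they are awaited.
    let requests = prompts.iter().map(|p| {
        let client = client.clone();
        async move {
            client
                .post("http://localhost:8080/v1/chat/completions")
                .json(&json!({
                    "messages": [{ "role": "user", "content": p }],
                    "max_tokens": 64
                }))
                .send()
                .await?
                .json::<serde_json::Value>()
                .await
        }
    });

    // Fire all four at once; with --parallel 4 they run in separate slots.
    for reply in futures::future::join_all(requests).await {
        println!("{}", reply?["choices"][0]["message"]["content"]);
    }
    Ok(())
}
```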

### Ollama (Alternative)

Ollama can be used but does not support true batch processing:

```bash
ollama serve
```

**Note**: Ollama processes requests sequentially internally, so throughput measurements may not reflect true parallel performance.

## Configuration

### Global Configuration (`~/.swarm-engine/config.toml`)

```toml
[general]
default_project_type = "eval"

[eval]
default_runs = 30
target_tick_duration_ms = 10

[llm]
default_provider = "llama-server"
cache_enabled = true

[logging]
level = "info"
file_enabled = true
```

### Directory Structure

| Path | Purpose |
|------|---------|
| `~/.swarm-engine/` | System configuration, cache, logs |
| `~/swarm-engine/` | User data: scenarios, reports |
| `./swarm-engine/` | Project-local configuration |

## Development

### Build and Test

```bash
# Type check
cargo check

# Build
cargo build

# Run tests
cargo test

# Run with verbose logging
RUST_LOG=debug cargo run --package swarm-engine-ui -- eval ...
```

### Project Structure

```
swarm-engine/
├── crates/
│   ├── swarm-engine-core/      # Core runtime
│   │   ├── src/
│   │   │   ├── orchestrator/   # Main loop
│   │   │   ├── agent/          # Worker/Manager definitions
│   │   │   ├── exploration/    # Graph-based exploration
│   │   │   ├── learn/          # Offline learning
│   │   │   └── ...
│   ├── swarm-engine-llm/       # LLM integrations
│   ├── swarm-engine-eval/      # Evaluation framework
│   │   └── scenarios/          # Built-in scenarios
│   └── swarm-engine-ui/        # CLI and GUI
```

## Documentation

Detailed design documentation is available in the RustDoc comments of each crate:

```bash
# Generate and open documentation
cargo doc --open --no-deps
```

Key documentation locations:
- **swarm-engine-core**: Core concepts, tick lifecycle, two-tier memory model
- **swarm-engine-eval**: Evaluation framework, scenario format, metrics
- **swarm-engine-llm**: LLM integrations, batch processing, prompt building
- **swarm-engine-ui**: CLI commands, GUI features

## License

MIT License