# SwarmEngine
A high-throughput, low-latency agent swarm execution engine written in Rust.
SwarmEngine is designed for running multiple AI agents in parallel with tick-based synchronization, optimized for batch LLM inference and real-time exploration scenarios.
## Features
- **Tick-Driven Architecture**: Configurable tick cycles (default 10ms) with deterministic execution
- **Parallel Agent Execution**: Lock-free parallel worker execution using Rayon
- **Batch LLM Inference**: Optimized for batch processing with llama.cpp server, Ollama, and other LLM providers
- **Exploration Space**: Graph-based state exploration with UCB1, Thompson Sampling, and adaptive selection strategies
- **Offline Learning**: Accumulates session data and learns optimal parameters through offline training
- **Scenario-Based Evaluation**: TOML-based scenario definitions with variants support
## Performance
Measured on the troubleshooting scenario (exploration-based, no per-tick LLM calls):
| Metric | Result |
|---|---|
| Throughput | ~80 actions/sec |
| Tick latency (exploration) | 0.1–0.2 ms per action |
| Task completion | 5 actions in ~60 ms |
*Note: LLM-based decision making adds latency per call. The exploration-based mode uses graph traversal instead of per-tick LLM calls.*
## Architecture
```
┌──────────────────────────────────────────────────────────────────────┐
│ SwarmEngine │
│ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ Orchestrator │ │
│ │ │ │
│ │ Tick Loop: │ │
│ │ 1. Collect Async Results │ │
│ │ 2. Manager Phase (LLM Decision / Exploration) │ │
│ │ 3. Worker Execution (Parallel) │ │
│ │ 4. Merge Results │ │
│ │ 5. Tick Advance │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌────────────────────┐ │
│ │ SwarmState │ │ ExplorationSpace│ │ BatchInvoker │ │
│ │ ├─ SharedState │ │ ├─ GraphMap │ │ ├─ LlamaCppServer │ │
│ │ └─ WorkerStates│ │ └─ Operators │ │ └─ Ollama │ │
│ └─────────────────┘ └─────────────────┘ └────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
```
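
The tick loop in the diagram can be pictured as an ordinary Rust loop. The sketch below is purely illustrative: the type and method names (`Orchestrator`, `collect_async_results`, `manager_phase`, `merge_results`) are assumptions for exposition and do not mirror the actual `swarm-engine-core` API.

```rust
use std::time::{Duration, Instant};

use rayon::prelude::*;

// Hypothetical types for illustration; the real crate's definitions differ.
struct Worker;
struct WorkerResult;

impl Worker {
    fn execute(&self) -> WorkerResult {
        WorkerResult
    }
}

struct Orchestrator {
    tick: u64,
    tick_duration: Duration,
    workers: Vec<Worker>,
}

impl Orchestrator {
    fn run(&mut self, max_ticks: u64) {
        while self.tick < max_ticks {
            let started = Instant::now();

            self.collect_async_results(); // 1. harvest finished async/LLM results
            self.manager_phase();         // 2. decide next actions (LLM or exploration)

            // 3. execute workers in parallel via Rayon
            let results: Vec<WorkerResult> =
                self.workers.par_iter().map(|w| w.execute()).collect();

            self.merge_results(results);  // 4. fold worker results into shared state
            self.tick += 1;               // 5. advance the tick

            // Sleep out whatever is left of the tick budget (default 10 ms).
            if let Some(rest) = self.tick_duration.checked_sub(started.elapsed()) {
                std::thread::sleep(rest);
            }
        }
    }

    fn collect_async_results(&mut self) {}
    fn manager_phase(&mut self) {}
    fn merge_results(&mut self, _results: Vec<WorkerResult>) {}
}

fn main() {
    let mut orchestrator = Orchestrator {
        tick: 0,
        tick_duration: Duration::from_millis(10),
        workers: (0..4).map(|_| Worker).collect(),
    };
    orchestrator.run(5);
}
```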
## Crates
| Crate | Description |
|---|---|
| `swarm-engine-core` | Core runtime, orchestrator, state management, exploration, and learning |
| `swarm-engine-llm` | LLM integrations (llama.cpp server, Ollama, prompt building, batch processing) |
| `swarm-engine-eval` | Scenario-based evaluation framework with assertions and metrics |
| `swarm-engine-ui` | CLI and Desktop GUI (egui) |
## Quick Start
### Prerequisites
- A Rust toolchain supporting the 2021 edition or later
- llama.cpp (built automatically, or use a pre-built binary)
- A GGUF model file (LFM2.5-1.2B recommended for development)
### Installation
```bash
# Clone the repository
git clone https://github.com/ynishi/swarm-engine.git
cd swarm-engine
# Build
cargo build --release
```
### Setting up llama-server with LFM2.5
SwarmEngine uses llama.cpp server as the primary LLM backend. **LFM2.5-1.2B** is the recommended model for development and testing due to its balance of speed and quality.
#### 1. Download the Model
```bash
# Using Hugging Face CLI (recommended)
pip install huggingface_hub
huggingface-cli download LiquidAI/LFM2.5-1.2B-Instruct-GGUF \
LFM2.5-1.2B-Instruct-Q4_K_M.gguf
# Or download directly from Hugging Face:
# https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF
```
#### 2. Start llama-server
```bash
# Start with the downloaded model (using glob pattern for snapshot hash)
cargo run --package swarm-engine-ui -- llama start \
-m ~/.cache/huggingface/hub/models--LiquidAI--LFM2.5-1.2B-Instruct-GGUF/snapshots/*/LFM2.5-1.2B-Instruct-Q4_K_M.gguf
# With custom options (GPU acceleration, parallel slots)
cargo run --package swarm-engine-ui -- llama start \
-m ~/.cache/huggingface/hub/models--LiquidAI--LFM2.5-1.2B-Instruct-GGUF/snapshots/*/LFM2.5-1.2B-Instruct-Q4_K_M.gguf \
--n-gpu-layers 99 \
--parallel 4 \
--ctx-size 4096
```
#### 3. Verify Server Status
```bash
# Check if server is running and healthy
cargo run --package swarm-engine-ui -- llama status
# View server logs
cargo run --package swarm-engine-ui -- llama logs -f
# Stop the server
cargo run --package swarm-engine-ui -- llama stop
```
#### Why LFM2.5?
| Model | Size | Speed | Quality | Use Case |
|---|---|---|---|---|
| **LFM2.5-1.2B** | 1.2B | Fast | Good | Development, testing (recommended) |
| Qwen2.5-Coder-3B | 3B | Medium | Better | Complex scenarios |
| Qwen2.5-Coder-7B | 7B | Slow | Best | Production-quality testing |
### Running an Evaluation
```bash
# Run a troubleshooting scenario
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 5 -v
# With learning data collection
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 5 --learning
```
### CLI Commands
```bash
# Show help
cargo run --package swarm-engine-ui -- --help
# Initialize configuration
cargo run --package swarm-engine-ui -- init
# Show current configuration
cargo run --package swarm-engine-ui -- config
# Open scenarios directory
cargo run --package swarm-engine-ui -- open scenarios
# Launch Desktop GUI
cargo run --package swarm-engine-ui -- --gui
```
## Scenarios
Scenarios are defined in TOML format and describe the task, environment, actions, and success criteria:
```toml
[meta]
name = "Service Troubleshooting"
id = "user:troubleshooting:v2"
description = "Diagnose and fix a service outage"
[task]
goal = "Diagnose the failing service and restart it"
[llm]
provider = "llama-server"
model = "LFM2.5-1.2B"
endpoint = "http://localhost:8080"
[[actions.actions]]
name = "CheckStatus"
description = "Check the status of services"
[[actions.actions]]
name = "ReadLogs"
description = "Read logs for a specific service"
[app_config]
tick_duration_ms = 10
max_ticks = 150
```
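
For a rough idea of how such a file maps onto Rust types, here is a minimal `serde` sketch. The struct and field names are inferred from the TOML above, not taken from the real `swarm-engine-eval` schema, and the example assumes the `serde` (with `derive`) and `toml` crates as dependencies.

```rust
use serde::Deserialize;

// Hypothetical mirror of the scenario file shown above.
#[derive(Debug, Deserialize)]
struct Scenario {
    meta: Meta,
    task: Task,
    llm: LlmConfig,
    actions: Actions,
    app_config: AppConfig,
}

#[derive(Debug, Deserialize)]
struct Meta {
    name: String,
    id: String,
    description: String,
}

#[derive(Debug, Deserialize)]
struct Task {
    goal: String,
}

#[derive(Debug, Deserialize)]
struct LlmConfig {
    provider: String,
    model: String,
    endpoint: String,
}

// `[[actions.actions]]` is an array of tables nested under `[actions]`.
#[derive(Debug, Deserialize)]
struct Actions {
    actions: Vec<ActionDef>,
}

#[derive(Debug, Deserialize)]
struct ActionDef {
    name: String,
    description: String,
}

#[derive(Debug, Deserialize)]
struct AppConfig {
    tick_duration_ms: u64,
    max_ticks: u64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let text =
        std::fs::read_to_string("crates/swarm-engine-eval/scenarios/troubleshooting.toml")?;
    let scenario: Scenario = toml::from_str(&text)?;
    println!(
        "{}: {} actions, {} max ticks",
        scenario.meta.name,
        scenario.actions.actions.len(),
        scenario.app_config.max_ticks
    );
    Ok(())
}
```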
### Scenario Variants
Scenarios can define variants for different configurations:
```bash
# List available variants
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml --list-variants
# Run with a specific variant
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml --variant complex
```
## Learning System
SwarmEngine includes a comprehensive learning system with offline parameter optimization and LoRA fine-tuning support.
### Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ Learning System │
│ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Data Collection ││
│ │ Eval (--learning) → ActionEvents → Session Snapshots ││
│ └─────────────────────────────────────────────────────────────┘│
│ ↓ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Offline Analysis ││
│ │ learn once → Stats Analysis → OptimalParamsModel ││
│ │ → RecommendedPaths ││
│ └─────────────────────────────────────────────────────────────┘│
│ ↓ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Model Application ││
│ │ Next Eval → Load OfflineModel → Apply Parameters ││
│ │ → LoRA Adapter (optional) ││
│ └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
```
### Quick Start
```bash
# 1. Collect data with --learning flag
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 30 --learning
# 2. Run offline learning
cargo run --package swarm-engine-ui -- learn once troubleshooting
# 3. Next eval run will automatically use the learned model
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 5 -v
# → "Offline model loaded: ucb1_c=X.XXX, strategy=..."
```
### Model Types
| Model Type | Purpose | Scope |
|---|---|---|
| **ScoreModel** | Action selection scores (transitions, N-gram patterns) | 1 session |
| **OptimalParamsModel** | Parameter optimization (ucb1_c, thresholds) | Cross-session |
| **LoRA Adapter** | LLM fine-tuning for decision quality | Persistent |
### Offline Model Parameters
```json
{
"parameters": {
"ucb1_c": 1.414, // UCB1 exploration constant
"learning_weight": 0.3, // Learning weight for selection
"ngram_weight": 1.0 // N-gram pattern weight
},
"strategy_config": {
"initial_strategy": "ucb1",
"maturity_threshold": 5,
"error_rate_threshold": 0.45
},
"recommended_paths": [...] // Optimal action sequences
}
```
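
In Rust terms, the file above can be deserialized into a small struct with `serde_json`. This is a hedged sketch, assuming the on-disk `offline_model.json` is plain JSON (the `//` comments above are annotations only); the type names are hypothetical, not the engine's own.

```rust
use serde::Deserialize;

// Hypothetical mirror of offline_model.json; field names follow the example above.
#[derive(Debug, Deserialize)]
struct OfflineModel {
    parameters: Parameters,
    strategy_config: StrategyConfig,
    // Shape not shown above, so it is kept opaque here.
    #[serde(default)]
    recommended_paths: serde_json::Value,
}

#[derive(Debug, Deserialize)]
struct Parameters {
    ucb1_c: f64,
    learning_weight: f64,
    ngram_weight: f64,
}

#[derive(Debug, Deserialize)]
struct StrategyConfig {
    initial_strategy: String,
    maturity_threshold: u32,
    error_rate_threshold: f64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Pass the path to an offline_model.json, e.g. one under ~/.swarm-engine/learning/.
    let path = std::env::args()
        .nth(1)
        .expect("usage: inspect-model <offline_model.json>");
    let text = std::fs::read_to_string(path)?;
    let model: OfflineModel = serde_json::from_str(&text)?;
    println!(
        "ucb1_c = {}, initial strategy = {}",
        model.parameters.ucb1_c, model.strategy_config.initial_strategy
    );
    Ok(())
}
```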
### Learning Daemon
For continuous learning during long-running evaluations:
```bash
# Start daemon mode (monitors and learns continuously)
cargo run --package swarm-engine-ui -- learn daemon troubleshooting
# Daemon features:
# - Watches for new session data
# - Triggers learning based on configurable conditions
# - Applies learned models via Blue-Green deployment
```
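
The Blue-Green deployment mentioned above amounts to atomically swapping which learned model the running evaluation reads, without pausing it. Here is a minimal sketch of that pattern using `std::sync` primitives; it illustrates the idea only and is not the daemon's actual mechanism.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for a learned offline model.
#[derive(Debug)]
struct OfflineModel {
    ucb1_c: f64,
}

// The slot readers consult each tick; the daemon swaps its contents in place.
struct ModelSlot {
    active: RwLock<Arc<OfflineModel>>,
}

impl ModelSlot {
    fn new(initial: OfflineModel) -> Self {
        Self { active: RwLock::new(Arc::new(initial)) }
    }

    // Readers (the orchestrator) take a cheap Arc clone of the current "blue" model.
    fn current(&self) -> Arc<OfflineModel> {
        self.active.read().unwrap().clone()
    }

    // The daemon promotes a freshly trained "green" model without stopping readers.
    fn promote(&self, green: OfflineModel) {
        *self.active.write().unwrap() = Arc::new(green);
    }
}

fn main() {
    let slot = ModelSlot::new(OfflineModel { ucb1_c: 1.414 });
    println!("before: {:?}", slot.current());
    slot.promote(OfflineModel { ucb1_c: 1.2 });
    println!("after:  {:?}", slot.current());
}
```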
### LoRA Training (Experimental)
Fine-tune the LLM for improved decision quality. LoRA training requires:
- Episode data collected from successful runs
- llama.cpp with LoRA support
- Training triggers (count-, time-, or quality-based)
### Data Structure
```
~/.swarm-engine/learning/
├── global_stats.json # Global statistics across scenarios
└── scenarios/
└── troubleshooting/ # Per-scenario (learning_key based)
├── stats.json # Accumulated statistics
├── offline_model.json # Learned parameters
├── lora/ # LoRA adapters (if trained)
│ └── v1/
│ └── adapter.safetensors
└── sessions/ # Session snapshots
└── {timestamp}/
├── meta.json
└── stats.json
```
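
Everything in this tree is plain JSON on disk, so it can be inspected by hand or with a few lines of Rust. A small sketch that lists session snapshots for the `troubleshooting` scenario, assuming a Unix-style `HOME` and the paths shown above:

```rust
use std::path::PathBuf;

fn main() -> std::io::Result<()> {
    // Path assumed from the layout above; adjust the scenario name as needed.
    let home = std::env::var("HOME").expect("HOME not set");
    let sessions = PathBuf::from(home)
        .join(".swarm-engine/learning/scenarios/troubleshooting/sessions");

    for entry in std::fs::read_dir(&sessions)? {
        let dir = entry?.path();
        let has_meta = dir.join("meta.json").exists();
        println!("{} (meta.json present: {})", dir.display(), has_meta);
    }
    Ok(())
}
```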
### Selection Strategies
The learning system optimizes selection strategy parameters:
| Strategy | Approach | Best For |
|---|---|---|
| **UCB1** | Upper Confidence Bound | Early exploration |
| **Thompson** | Bayesian sampling | Probabilistic exploration |
| **Greedy** | Best known action | Exploitation after learning |
| **Adaptive** | Dynamic switching | Production (based on error rate) |
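
For reference, the score that the learned `ucb1_c` parameter scales is the standard UCB1 formula: mean reward plus an exploration bonus that shrinks as an action is visited more often. A textbook sketch, not `swarm-engine-core`'s implementation:

```rust
// Standard UCB1: exploit the observed mean, plus a bonus that favors rarely tried actions.
fn ucb1_score(mean_reward: f64, visits: u64, total_visits: u64, c: f64) -> f64 {
    if visits == 0 {
        return f64::INFINITY; // always try unvisited actions first
    }
    mean_reward + c * ((total_visits as f64).ln() / visits as f64).sqrt()
}

// Picks the index of the highest-scoring action given (mean reward, visit count) pairs.
fn select_action(stats: &[(f64, u64)], c: f64) -> usize {
    let total: u64 = stats.iter().map(|&(_, n)| n).sum();
    stats
        .iter()
        .enumerate()
        .map(|(i, &(mean, n))| (i, ucb1_score(mean, n, total.max(1), c)))
        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
        .map(|(i, _)| i)
        .expect("at least one action")
}

fn main() {
    // (mean reward, visit count) per action; c is the learned ucb1_c.
    let stats = [(0.8, 12), (0.5, 3), (0.0, 0)];
    println!("selected action index: {}", select_action(&stats, 1.414));
}
```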
## LLM Providers
### llama-server (Recommended)
llama.cpp server provides true batch processing with continuous batching:
```bash
cargo run --package swarm-engine-ui -- llama start \
-m model.gguf \
--parallel 4 \
--ctx-size 4096 \
--n-gpu-layers 99
```
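
To see continuous batching at work, fire several requests at the server's OpenAI-compatible endpoint concurrently; with `--parallel 4` they are served in shared batches instead of being queued one by one. A rough, standalone sketch (not part of SwarmEngine), assuming `tokio`, `reqwest` with the `json` feature, and `serde_json` as dependencies:

```rust
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::Client::new();
    let prompts = ["Check status?", "Read logs?", "Restart service?", "Summarize state"];

    // Spawn one task per prompt so all requests are in flight at the same time.
    let handles: Vec<_> = prompts
        .iter()
        .map(|p| {
            let client = client.clone();
            let body = json!({
                "model": "LFM2.5-1.2B",
                "messages": [{ "role": "user", "content": p }],
                "max_tokens": 64
            });
            tokio::spawn(async move {
                client
                    .post("http://localhost:8080/v1/chat/completions")
                    .json(&body)
                    .send()
                    .await?
                    .text()
                    .await
            })
        })
        .collect();

    for handle in handles {
        println!("{}", handle.await??);
    }
    Ok(())
}
```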
### Ollama (Alternative)
Ollama can be used but does not support true batch processing:
```bash
ollama serve
```
**Note**: Ollama internally processes requests sequentially, so throughput measurements may not reflect true parallel performance.
## Configuration
### Global Configuration (`~/.swarm-engine/config.toml`)
```toml
[general]
default_project_type = "eval"
[eval]
default_runs = 30
target_tick_duration_ms = 10
[llm]
default_provider = "llama-server"
cache_enabled = true
[logging]
level = "info"
file_enabled = true
```
### Directory Structure
| Path | Purpose |
|---|---|
| `~/.swarm-engine/` | System configuration, cache, logs |
| `~/swarm-engine/` | User data: scenarios, reports |
| `./swarm-engine/` | Project-local configuration |
## Development
### Build and Test
```bash
# Type check
cargo check
# Build
cargo build
# Run tests
cargo test
# Run with verbose logging
RUST_LOG=debug cargo run --package swarm-engine-ui -- eval ...
```
### Project Structure
```
swarm-engine/
├── crates/
│ ├── swarm-engine-core/ # Core runtime
│ │ ├── src/
│ │ │ ├── orchestrator/ # Main loop
│ │ │ ├── agent/ # Worker/Manager definitions
│ │ │ ├── exploration/ # Graph-based exploration
│ │ │ ├── learn/ # Offline learning
│ │ │ └── ...
│ ├── swarm-engine-llm/ # LLM integrations
│ ├── swarm-engine-eval/ # Evaluation framework
│ │ └── scenarios/ # Built-in scenarios
│ └── swarm-engine-ui/ # CLI and GUI
```
## Documentation
Detailed design documentation is available in the RustDoc comments of each crate:
```bash
# Generate and open documentation
cargo doc --open --no-deps
```
Key documentation locations:
- **swarm-engine-core**: Core concepts, tick lifecycle, two-tier memory model
- **swarm-engine-eval**: Evaluation framework, scenario format, metrics
- **swarm-engine-llm**: LLM integrations, batch processing, prompt building
- **swarm-engine-ui**: CLI commands, GUI features
## License
MIT License