# SwarmEngine
A high-throughput, low-latency agent swarm execution engine written in Rust.
SwarmEngine is designed for running multiple AI agents in parallel with tick-based synchronization, optimized for batch LLM inference and real-time exploration scenarios.
## Features
- **Tick-Driven Architecture**: Configurable tick cycles (default 10ms) with deterministic execution
- **Parallel Agent Execution**: Lock-free parallel worker execution using Rayon
- **Batch LLM Inference**: Optimized for batch processing with the llama.cpp server, Ollama, and other LLM providers
- **Exploration Space**: Graph-based state exploration with UCB1, Thompson Sampling, and adaptive selection strategies
- **Offline Learning**: Accumulates session data and learns optimal parameters through offline training
- **Scenario-Based Evaluation**: TOML-based scenario definitions with variant support
## Performance
Measured on the troubleshooting scenario (exploration-based, no per-tick LLM calls):
| Metric | Value |
|---|---|
| Throughput | ~80 actions/sec |
| Tick latency (exploration) | 0.1-0.2ms per action |
| Task completion | 5 actions in ~60ms |
Note: LLM-based decision making adds latency per call. The exploration-based mode uses graph traversal instead of per-tick LLM calls.
## Architecture

```text
┌────────────────────────────────────────────────────────────────────────┐
│                              SwarmEngine                               │
│                                                                        │
│  ┌────────────────────────────────────────────────────────────────┐    │
│  │                          Orchestrator                          │    │
│  │                                                                │    │
│  │  Tick Loop:                                                    │    │
│  │    1. Collect Async Results                                    │    │
│  │    2. Manager Phase (LLM Decision / Exploration)               │    │
│  │    3. Worker Execution (Parallel)                              │    │
│  │    4. Merge Results                                            │    │
│  │    5. Tick Advance                                             │    │
│  └────────────────────────────────────────────────────────────────┘    │
│                                                                        │
│  ┌─────────────────┐  ┌─────────────────┐  ┌────────────────────┐      │
│  │   SwarmState    │  │ ExplorationSpace│  │    BatchInvoker    │      │
│  │ ├─ SharedState  │  │ ├─ GraphMap     │  │ ├─ LlamaCppServer  │      │
│  │ └─ WorkerStates │  │ └─ Operators    │  │ └─ Ollama          │      │
│  └─────────────────┘  └─────────────────┘  └────────────────────┘      │
└────────────────────────────────────────────────────────────────────────┘
```
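The five tick phases map onto a single-threaded control loop with a data-parallel worker step. Below is a minimal sketch of that shape, assuming hypothetical types and method names (`Orchestrator`, `Assignment`, `Outcome` are illustrative, not the engine's actual API); it requires the `rayon` crate, and phase 1 (collecting async LLM results) is omitted for brevity:

```rust
use rayon::prelude::*;

// All names below are illustrative; they mirror the tick phases above,
// not the engine's real API.
struct Assignment { worker_id: usize, action: String }
struct Outcome { worker_id: usize, result: String }

struct Orchestrator {
    tick: u64,
    shared_log: Vec<Outcome>, // stand-in for SwarmState's SharedState
}

impl Orchestrator {
    // Phase 2: the manager decides what each worker does this tick
    // (LLM decision or exploration-based selection in the real engine).
    fn manager_phase(&self) -> Vec<Assignment> {
        (0..4)
            .map(|id| Assignment { worker_id: id, action: format!("act@{}", self.tick) })
            .collect()
    }

    fn run_tick(&mut self) {
        let assignments = self.manager_phase();

        // Phase 3: workers run in parallel via Rayon; each closure only
        // touches its own assignment, so no locks are needed.
        let outcomes: Vec<Outcome> = assignments
            .par_iter()
            .map(|a| Outcome { worker_id: a.worker_id, result: format!("done:{}", a.action) })
            .collect();

        // Phase 4: merge results back into shared state on the control thread.
        self.shared_log.extend(outcomes);

        // Phase 5: advance the tick.
        self.tick += 1;
    }
}

fn main() {
    let mut orch = Orchestrator { tick: 0, shared_log: Vec::new() };
    for _ in 0..3 {
        orch.run_tick();
    }
    println!("ticks={} outcomes={}", orch.tick, orch.shared_log.len());
}
```

Keeping the merge step single-threaded is what makes the worker phase lock-free: workers never write to shared state directly.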
## Crates

| Crate | Description |
|---|---|
| `swarm-engine-core` | Core runtime, orchestrator, state management, exploration, and learning |
| `swarm-engine-llm` | LLM integrations (llama.cpp server, Ollama, prompt building, batch processing) |
| `swarm-engine-eval` | Scenario-based evaluation framework with assertions and metrics |
| `swarm-engine-ui` | CLI and Desktop GUI (egui) |
## Quick Start

### Prerequisites
- Rust (2021 edition or later)
- llama.cpp (built automatically, or use a pre-built binary)
- A GGUF model file (LFM2.5-1.2B recommended for development)
### Installation
```bash
# Clone the repository (URL placeholder)
git clone <repository-url>
cd swarm-engine

# Build (standard Cargo workflow)
cargo build --release
```
### Setting up llama-server with LFM2.5
SwarmEngine uses llama.cpp server as the primary LLM backend. LFM2.5-1.2B is the recommended model for development and testing due to its balance of speed and quality.
#### 1. Download the Model
```bash
# Using Hugging Face CLI (recommended)
huggingface-cli download LiquidAI/LFM2.5-1.2B-Instruct-GGUF

# Or download directly from Hugging Face:
# https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF
```
#### 2. Start llama-server
```bash
# Start with the downloaded model (using a glob pattern for the snapshot hash
# in the Hugging Face cache; adjust the path to where your model lives)
llama-server \
  -m ~/.cache/huggingface/hub/models--LiquidAI--LFM2.5-1.2B-Instruct-GGUF/snapshots/*/*.gguf \
  --port 8080

# With custom options (GPU acceleration, parallel slots)
llama-server \
  -m ~/.cache/huggingface/hub/models--LiquidAI--LFM2.5-1.2B-Instruct-GGUF/snapshots/*/*.gguf \
  --port 8080 \
  --n-gpu-layers 99 \
  --parallel 4
```
#### 3. Verify Server Status
```bash
# Check if the server is running and healthy
curl http://localhost:8080/health

# View server logs
# (llama-server logs to stdout/stderr; redirect to a file at startup
#  if you need persistent logs)

# Stop the server (Ctrl+C in the foreground, or kill the process)
pkill llama-server
```
#### Why LFM2.5?
| Model | Size | Speed | Quality | Use Case |
|---|---|---|---|---|
| LFM2.5-1.2B | 1.2B | Fast | Good | Development, testing (recommended) |
| Qwen2.5-Coder-3B | 3B | Medium | Better | Complex scenarios |
| Qwen2.5-Coder-7B | 7B | Slow | Best | Production quality testing |
### Running an Evaluation
```bash
# Run a troubleshooting scenario
# (subcommand syntax below is illustrative; see --help for the exact CLI)
swarm-engine eval troubleshooting

# With learning data collection
swarm-engine eval troubleshooting --learning
```
### CLI Commands
```bash
# Subcommand names below are illustrative; see --help for the exact CLI.

# Show help
swarm-engine --help

# Initialize configuration
swarm-engine config init

# Show current configuration
swarm-engine config show

# Open scenarios directory
swarm-engine scenarios open

# Launch Desktop GUI
swarm-engine gui
```
## Scenarios
Scenarios are defined in TOML format and describe the task, environment, actions, and success criteria:
```toml
# Table and key names below are illustrative (the originals were lost);
# the values are from the original example.

[scenario]
name = "Service Troubleshooting"
learning_key = "user:troubleshooting:v2"
description = "Diagnose and fix a service outage"

[task]
goal = "Diagnose the failing service and restart it"

[llm]
provider = "llama-server"
model = "LFM2.5-1.2B"
endpoint = "http://localhost:8080"

[[actions]]
name = "CheckStatus"
description = "Check the status of services"

[[actions]]
name = "ReadLogs"
description = "Read logs for a specific service"

[limits]
max_workers = 10   # illustrative key name
max_ticks = 150    # illustrative key name
```
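For illustration, a scenario file of this shape can be deserialized with the `serde` and `toml` crates. A minimal sketch, assuming the illustrative field names from the example above rather than the real schema (unknown tables such as `[task]` and `[llm]` are skipped by serde's default behavior):

```rust
use serde::Deserialize;

// Field names mirror the illustrative TOML above, not the real schema.
#[derive(Debug, Deserialize)]
struct ScenarioFile {
    scenario: ScenarioMeta,
    actions: Vec<ActionDef>,
}

#[derive(Debug, Deserialize)]
struct ScenarioMeta {
    name: String,
    learning_key: String,
    description: String,
}

#[derive(Debug, Deserialize)]
struct ActionDef {
    name: String,
    description: String,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let text = std::fs::read_to_string("scenarios/troubleshooting.toml")?;
    let scenario: ScenarioFile = toml::from_str(&text)?;
    println!("{}: {} actions", scenario.scenario.name, scenario.actions.len());
    Ok(())
}
```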
### Scenario Variants
Scenarios can define variants for different configurations:
```bash
# List available variants (illustrative syntax)
swarm-engine eval troubleshooting --list-variants

# Run with a specific variant
swarm-engine eval troubleshooting --variant <variant-name>
```
## Learning System
SwarmEngine includes a comprehensive learning system with offline parameter optimization and LoRA fine-tuning support.
### Architecture

```text
┌─────────────────────────────────────────────────────────────────┐
│                         Learning System                         │
│                                                                 │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │                       Data Collection                       │ │
│ │  Eval (--learning) → ActionEvents → Session Snapshots       │ │
│ └─────────────────────────────────────────────────────────────┘ │
│                                ↓                                │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │                      Offline Analysis                       │ │
│ │  learn once → Stats Analysis → OptimalParamsModel           │ │
│ │               → RecommendedPaths                             │ │
│ └─────────────────────────────────────────────────────────────┘ │
│                                ↓                                │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │                      Model Application                      │ │
│ │  Next Eval → Load OfflineModel → Apply Parameters           │ │
│ │            → LoRA Adapter (optional)                        │ │
│ └─────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
```
### Quick Start
```bash
# 1. Collect data with the --learning flag
#    (subcommand syntax is illustrative; the --learning flag and the
#     "learn once" step come from the pipeline above)
swarm-engine eval troubleshooting --learning

# 2. Run offline learning
swarm-engine learn once

# 3. The next eval run will automatically use the learned model
swarm-engine eval troubleshooting
# → "Offline model loaded: ucb1_c=X.XXX, strategy=..."
```
### Model Types
| Model | Purpose | Lifetime |
|---|---|---|
| ScoreModel | Action selection scores (transitions, N-gram patterns) | 1 session |
| OptimalParamsModel | Parameter optimization (ucb1_c, thresholds) | Cross-session |
| LoRA Adapter | LLM fine-tuning for decision quality | Persistent |
### Offline Model Parameters
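The learned parameters are stored in `offline_model.json` (see Data Structure below). A hypothetical example; the field names are assumed from the parameters mentioned elsewhere in this README (`ucb1_c`, selection strategy, thresholds, recommended paths), not the actual file format:

```json
{
  "ucb1_c": 1.414,
  "strategy": "adaptive",
  "error_rate_threshold": 0.2,
  "recommended_paths": [["CheckStatus", "ReadLogs"]]
}
```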
### Learning Daemon
For continuous learning during long-running evaluations:
```bash
# Start daemon mode (monitors and learns continuously; illustrative syntax)
swarm-engine learn daemon

# Daemon features:
# - Watches for new session data
# - Triggers learning based on configurable conditions
# - Applies learned models via Blue-Green deployment
```
### LoRA Training (Experimental)
Fine-tune the LLM for improved decision quality. LoRA training requires:

- Episode data collected from successful runs
- llama.cpp with LoRA support
- Training triggers (count-, time-, or quality-based)
### Data Structure

```text
~/.swarm-engine/learning/
├── global_stats.json            # Global statistics across scenarios
└── scenarios/
    └── troubleshooting/         # Per-scenario (learning_key based)
        ├── stats.json           # Accumulated statistics
        ├── offline_model.json   # Learned parameters
        ├── lora/                # LoRA adapters (if trained)
        │   └── v1/
        │       └── adapter.safetensors
        └── sessions/            # Session snapshots
            └── {timestamp}/
                ├── meta.json
                └── stats.json
```
### Selection Strategies
The learning system optimizes selection strategy parameters:
| Strategy | Description | When Used |
|---|---|---|
| UCB1 | Upper Confidence Bound | Early exploration |
| Thompson | Bayesian sampling | Probabilistic exploration |
| Greedy | Best known action | Exploitation after learning |
| Adaptive | Dynamic switching | Production (based on error rate) |
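For reference, UCB1 scores each action by its mean reward plus a confidence bonus, `score(i) = mean_reward(i) + c * sqrt(ln(N) / n_i)`, where `c` corresponds to the learned `ucb1_c` parameter. A standalone sketch of the standard formula, not the engine's actual implementation:

```rust
// Standard UCB1 arm selection; `c` plays the role of the learned ucb1_c.
fn ucb1_select(visits: &[u32], total_reward: &[f64], c: f64) -> usize {
    let total: u32 = visits.iter().sum();
    let ln_total = f64::from(total.max(1)).ln();
    let score = |i: usize| -> f64 {
        if visits[i] == 0 {
            return f64::INFINITY; // always try unvisited actions first
        }
        let n = f64::from(visits[i]);
        total_reward[i] / n + c * (ln_total / n).sqrt()
    };
    (0..visits.len())
        .max_by(|&a, &b| score(a).partial_cmp(&score(b)).unwrap())
        .expect("at least one action")
}

fn main() {
    // Three actions: well-explored, promising, and untried.
    let visits = [10, 3, 0];
    let rewards = [6.0, 2.5, 0.0];
    println!("selected action: {}", ucb1_select(&visits, &rewards, 1.4));
}
```

Larger `c` favors exploration (the bonus term dominates); smaller `c` favors exploitation, which is why it is a natural target for offline optimization.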
## LLM Providers

### llama-server (Recommended)
llama.cpp server provides true batch processing with continuous batching. For example (the flags are standard llama-server options; the model path is a placeholder):
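```bash
# Multiple parallel slots share one decode loop, so concurrent agent
# requests are batched together (model path is a placeholder).
llama-server -m models/LFM2.5-1.2B-Instruct.gguf --port 8080 --parallel 8 --cont-batching
```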
### Ollama (Alternative)
Ollama can be used but does not support true batch processing. A minimal setup (the model tag is an example):
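```bash
# Pull a model and start the Ollama server
# (default endpoint: http://localhost:11434)
ollama pull qwen2.5-coder:3b
ollama serve
```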
Note: Ollama processes requests sequentially internally, so throughput measurements may not reflect true parallel performance.
## Configuration

### Global Configuration (`~/.swarm-engine/config.toml`)
```toml
# Table and key names below are illustrative (the originals were lost);
# the values are from the original example.

[general]
default_mode = "eval"

[orchestrator]
timeout_secs = 30
tick_interval_ms = 10

[llm]
provider = "llama-server"
batch = true

[logging]
level = "info"
file_output = true
```
## Directory Structure

| Path | Purpose |
|---|---|
| `~/.swarm-engine/` | System configuration, cache, logs |
| `~/swarm-engine/` | User data: scenarios, reports |
| `./swarm-engine/` | Project-local configuration |
## Development

### Build and Test
```bash
# Type check
cargo check

# Build
cargo build

# Run tests
cargo test

# Run with verbose logging (example invocation)
RUST_LOG=debug swarm-engine eval troubleshooting
```
### Project Structure

```text
swarm-engine/
├── crates/
│   ├── swarm-engine-core/       # Core runtime
│   │   ├── src/
│   │   │   ├── orchestrator/    # Main loop
│   │   │   ├── agent/           # Worker/Manager definitions
│   │   │   ├── exploration/     # Graph-based exploration
│   │   │   ├── learn/           # Offline learning
│   │   │   └── ...
│   ├── swarm-engine-llm/        # LLM integrations
│   ├── swarm-engine-eval/       # Evaluation framework
│   │   └── scenarios/           # Built-in scenarios
│   └── swarm-engine-ui/         # CLI and GUI
```
## Documentation
Detailed design documentation is available in the RustDoc comments of each crate:
```bash
# Generate and open documentation
cargo doc --open
```
Key documentation locations:
- `swarm-engine-core`: Core concepts, tick lifecycle, two-tier memory model
- `swarm-engine-eval`: Evaluation framework, scenario format, metrics
- `swarm-engine-llm`: LLM integrations, batch processing, prompt building
- `swarm-engine-ui`: CLI commands, GUI features
## License
MIT License