# SwarmEngine

A high-throughput, low-latency agent swarm execution engine written in Rust.

SwarmEngine is designed for running multiple AI agents in parallel with tick-based synchronization, optimized for batch LLM inference and real-time exploration scenarios.

## Features

- **Tick-Driven Architecture**: Configurable tick cycles (default 10ms) with deterministic execution
- **Parallel Agent Execution**: Lock-free parallel worker execution using Rayon
- **Batch LLM Inference**: Optimized for batch processing with llama.cpp server, Ollama, and other LLM providers
- **Exploration Space**: Graph-based state exploration with UCB1, Thompson Sampling, and adaptive selection strategies
- **Offline Learning**: Accumulates session data and learns optimal parameters through offline training
- **Scenario-Based Evaluation**: TOML-based scenario definitions with variants support

## Performance

Measured on the troubleshooting scenario (exploration-based, no per-tick LLM calls):

| Metric | Value |
|--------|-------|
| Throughput | ~80 actions/sec |
| Tick latency (exploration) | 0.1-0.2ms per action |
| Task completion | 5 actions in ~60ms |

*Note: LLM-based decision making adds per-call latency on top of these figures. The exploration-based mode uses graph traversal instead of per-tick LLM calls.*

## Architecture

```
┌──────────────────────────────────────────────────────────────────────┐
│                            SwarmEngine                               │
│                                                                      │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                         Orchestrator                           │  │
│  │                                                                │  │
│  │   Tick Loop:                                                   │  │
│  │   1. Collect Async Results                                     │  │
│  │   2. Manager Phase (LLM Decision / Exploration)                │  │
│  │   3. Worker Execution (Parallel)                               │  │
│  │   4. Merge Results                                             │  │
│  │   5. Tick Advance                                              │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                                                                      │
│  ┌─────────────────┐  ┌─────────────────┐  ┌────────────────────┐   │
│  │   SwarmState    │  │ ExplorationSpace│  │   BatchInvoker     │   │
│  │  ├─ SharedState │  │  ├─ GraphMap    │  │  ├─ LlamaCppServer │   │
│  │  └─ WorkerStates│  │  └─ Operators   │  │  └─ Ollama         │   │
│  └─────────────────┘  └─────────────────┘  └────────────────────┘   │
└──────────────────────────────────────────────────────────────────────┘
```
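
To make the tick loop concrete, here is a minimal sketch of the five phases in plain Rust. The type and method names (`Orchestrator`, `collect_async_results`, and so on) are illustrative only, not the actual swarm-engine-core API:

```rust
use std::time::{Duration, Instant};

// Illustrative skeleton only; not the real swarm-engine-core types.
struct Orchestrator {
    tick: u64,
    tick_duration: Duration,
}

impl Orchestrator {
    fn run(&mut self, max_ticks: u64) {
        while self.tick < max_ticks {
            let start = Instant::now();

            self.collect_async_results(); // 1. drain finished async (LLM) calls
            self.manager_phase();         // 2. LLM decision or graph exploration
            self.execute_workers();       // 3. run workers in parallel (e.g. via Rayon)
            self.merge_results();         // 4. fold worker output into shared state
            self.tick += 1;               // 5. advance the tick

            // Sleep off the rest of the tick budget so ticks stay evenly paced.
            if let Some(remaining) = self.tick_duration.checked_sub(start.elapsed()) {
                std::thread::sleep(remaining);
            }
        }
    }

    fn collect_async_results(&mut self) {}
    fn manager_phase(&mut self) {}
    fn execute_workers(&mut self) {}
    fn merge_results(&mut self) {}
}

fn main() {
    let mut orch = Orchestrator { tick: 0, tick_duration: Duration::from_millis(10) };
    orch.run(150);
}
```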

## Crates

| Crate | Description |
|-------|-------------|
| `swarm-engine-core` | Core runtime, orchestrator, state management, exploration, and learning |
| `swarm-engine-llm` | LLM integrations (llama.cpp server, Ollama, prompt building, batch processing) |
| `swarm-engine-eval` | Scenario-based evaluation framework with assertions and metrics |
| `swarm-engine-ui` | CLI and Desktop GUI (egui) |

## Quick Start

### Prerequisites

- Rust 2021 edition or later
- llama.cpp (built automatically, or supply a pre-built binary)
- A GGUF model file (LFM2.5-1.2B recommended for development)

### Installation

```bash
# Clone the repository
git clone https://github.com/ynishi/swarm-engine.git
cd swarm-engine

# Build
cargo build --release
```

### Setting up llama-server with LFM2.5

SwarmEngine uses llama.cpp server as the primary LLM backend. **LFM2.5-1.2B** is the recommended model for development and testing due to its balance of speed and quality.

#### 1. Download the Model

```bash
# Using Hugging Face CLI (recommended)
pip install huggingface_hub
huggingface-cli download LiquidAI/LFM2.5-1.2B-Instruct-GGUF \
  LFM2.5-1.2B-Instruct-Q4_K_M.gguf

# Or download directly from Hugging Face:
# https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF
```

#### 2. Start llama-server

```bash
# Start with the downloaded model (using glob pattern for snapshot hash)
cargo run --package swarm-engine-ui -- llama start \
  -m ~/.cache/huggingface/hub/models--LiquidAI--LFM2.5-1.2B-Instruct-GGUF/snapshots/*/LFM2.5-1.2B-Instruct-Q4_K_M.gguf

# With custom options (GPU acceleration, parallel slots)
cargo run --package swarm-engine-ui -- llama start \
  -m ~/.cache/huggingface/hub/models--LiquidAI--LFM2.5-1.2B-Instruct-GGUF/snapshots/*/LFM2.5-1.2B-Instruct-Q4_K_M.gguf \
  --n-gpu-layers 99 \
  --parallel 4 \
  --ctx-size 4096
```

#### 3. Verify Server Status

```bash
# Check if server is running and healthy
cargo run --package swarm-engine-ui -- llama status

# View server logs
cargo run --package swarm-engine-ui -- llama logs -f

# Stop the server
cargo run --package swarm-engine-ui -- llama stop
```

#### Why LFM2.5?

| Model | Size | Speed | Quality | Use Case |
|-------|------|-------|---------|----------|
| **LFM2.5-1.2B** | 1.2B | Fast | Good | Development, testing (recommended) |
| Qwen2.5-Coder-3B | 3B | Medium | Better | Complex scenarios |
| Qwen2.5-Coder-7B | 7B | Slow | Best | Production quality testing |

### Running an Evaluation

```bash
# Run a troubleshooting scenario
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 5 -v

# With learning data collection
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 5 --learning
```

### CLI Commands

```bash
# Show help
cargo run --package swarm-engine-ui -- --help

# Initialize configuration
cargo run --package swarm-engine-ui -- init

# Show current configuration
cargo run --package swarm-engine-ui -- config

# Open scenarios directory
cargo run --package swarm-engine-ui -- open scenarios

# Launch Desktop GUI
cargo run --package swarm-engine-ui -- --gui
```

## Scenarios

Scenarios are defined in TOML format and describe the task, environment, actions, and success criteria:

```toml
[meta]
name = "Service Troubleshooting"
id = "user:troubleshooting:v2"
description = "Diagnose and fix a service outage"

[task]
goal = "Diagnose the failing service and restart it"

[llm]
provider = "llama-server"
model = "LFM2.5-1.2B"
endpoint = "http://localhost:8080"

[[actions.actions]]
name = "CheckStatus"
description = "Check the status of services"

[[actions.actions]]
name = "ReadLogs"
description = "Read logs for a specific service"

[app_config]
tick_duration_ms = 10
max_ticks = 150
```
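
As a rough sketch of how a scenario file maps onto Rust types, the following deserializes the schema above using `serde` and the `toml` crate. The struct names are hypothetical (the real crate's types may differ), and tables not modeled here, such as `[llm]`, are simply ignored by the deserializer:

```rust
use serde::Deserialize;

// Hypothetical mirror of the scenario schema above; field names follow the
// TOML keys, but the real crate's types may differ.
#[derive(Debug, Deserialize)]
struct Scenario {
    meta: Meta,
    task: Task,
    #[serde(default)]
    actions: Actions,
    app_config: AppConfig,
}

#[derive(Debug, Deserialize)]
struct Meta {
    name: String,
    id: String,
    description: String,
}

#[derive(Debug, Deserialize)]
struct Task {
    goal: String,
}

#[derive(Debug, Default, Deserialize)]
struct Actions {
    actions: Vec<ActionDef>, // each [[actions.actions]] entry
}

#[derive(Debug, Deserialize)]
struct ActionDef {
    name: String,
    description: String,
}

#[derive(Debug, Deserialize)]
struct AppConfig {
    tick_duration_ms: u64,
    max_ticks: u64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let text = std::fs::read_to_string("crates/swarm-engine-eval/scenarios/troubleshooting.toml")?;
    let scenario: Scenario = toml::from_str(&text)?;
    println!(
        "{}: {} actions, {} max ticks",
        scenario.meta.name,
        scenario.actions.actions.len(),
        scenario.app_config.max_ticks
    );
    Ok(())
}
```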

### Scenario Variants

Scenarios can define variants for different configurations:

```bash
# List available variants
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml --list-variants

# Run with a specific variant
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml --variant complex
```

## Learning System

SwarmEngine includes a comprehensive learning system with offline parameter optimization and LoRA fine-tuning support.

### Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                     Learning System                              │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                    Data Collection                           ││
│  │  Eval (--learning) → ActionEvents → Session Snapshots       ││
│  └─────────────────────────────────────────────────────────────┘│
│                              ↓                                   │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                   Offline Analysis                           ││
│  │  learn once → Stats Analysis → OptimalParamsModel           ││
│  │                             → RecommendedPaths               ││
│  └─────────────────────────────────────────────────────────────┘│
│                              ↓                                   │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                    Model Application                         ││
│  │  Next Eval → Load OfflineModel → Apply Parameters           ││
│  │           → LoRA Adapter (optional)                         ││
│  └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
```

### Quick Start

```bash
# 1. Collect data with --learning flag
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 30 --learning

# 2. Run offline learning
cargo run --package swarm-engine-ui -- learn once troubleshooting

# 3. Next eval run will automatically use the learned model
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 5 -v
# → "Offline model loaded: ucb1_c=X.XXX, strategy=..."
```

### Model Types

| Model | Purpose | Lifetime |
|-------|---------|----------|
| **ScoreModel** | Action selection scores (transitions, N-gram patterns) | 1 session |
| **OptimalParamsModel** | Parameter optimization (ucb1_c, thresholds) | Cross-session |
| **LoRA Adapter** | LLM fine-tuning for decision quality | Persistent |

### Offline Model Parameters

```json
{
  "parameters": {
    "ucb1_c": 1.414,         // UCB1 exploration constant
    "learning_weight": 0.3,   // Learning weight for selection
    "ngram_weight": 1.0       // N-gram pattern weight
  },
  "strategy_config": {
    "initial_strategy": "ucb1",
    "maturity_threshold": 5,
    "error_rate_threshold": 0.45
  },
  "recommended_paths": [...]   // Optimal action sequences
}
```
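
Assuming `offline_model.json` has exactly the shape shown above (comments aside), here is a hedged sketch of loading it with `serde_json`; the struct names are hypothetical, and `recommended_paths` is left unmodeled since its element format isn't shown:

```rust
use serde::Deserialize;

// Hypothetical mirror of offline_model.json; the crate's real
// OptimalParamsModel may differ. recommended_paths is omitted because its
// element format isn't shown, and serde ignores unknown fields by default.
#[derive(Debug, Deserialize)]
struct OfflineModel {
    parameters: Parameters,
    strategy_config: StrategyConfig,
}

#[derive(Debug, Deserialize)]
struct Parameters {
    ucb1_c: f64,
    learning_weight: f64,
    ngram_weight: f64,
}

#[derive(Debug, Deserialize)]
struct StrategyConfig {
    initial_strategy: String,
    maturity_threshold: u32,
    error_rate_threshold: f64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let home = std::env::var("HOME")?;
    let path = format!("{home}/.swarm-engine/learning/scenarios/troubleshooting/offline_model.json");
    let model: OfflineModel = serde_json::from_str(&std::fs::read_to_string(path)?)?;
    println!("ucb1_c = {}", model.parameters.ucb1_c);
    Ok(())
}
```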

### Learning Daemon

For continuous learning during long-running evaluations:

```bash
# Start daemon mode (monitors and learns continuously)
cargo run --package swarm-engine-ui -- learn daemon troubleshooting
```

The daemon:

- Watches for new session data
- Triggers learning based on configurable conditions
- Applies learned models via Blue-Green deployment
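
The watch-and-trigger idea can be sketched as a simple polling loop. This is an illustration only, not the daemon's implementation (its trigger conditions and Blue-Green deployment are more involved), and the learning hook is a hypothetical placeholder:

```rust
use std::{collections::HashSet, fs, path::Path, thread, time::Duration};

fn session_names(dir: &Path) -> HashSet<String> {
    fs::read_dir(dir)
        .into_iter()
        .flatten()
        .flatten()
        .map(|e| e.file_name().to_string_lossy().into_owned())
        .collect()
}

fn watch_sessions(dir: &Path) {
    // Prime with existing snapshots so only genuinely new sessions trigger a pass.
    let mut seen = session_names(dir);
    loop {
        thread::sleep(Duration::from_secs(5));
        for name in session_names(dir) {
            if seen.insert(name.clone()) {
                println!("new session {name}: triggering learning pass");
                // a run_offline_learning() hook would be invoked here (hypothetical)
            }
        }
    }
}

fn main() {
    let home = std::env::var("HOME").expect("HOME not set");
    let dir = format!("{home}/.swarm-engine/learning/scenarios/troubleshooting/sessions");
    watch_sessions(Path::new(&dir));
}
```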

### LoRA Training (Experimental)

Fine-tune the LLM for improved decision quality. LoRA training requires:

- Episode data collected from successful runs
- llama.cpp built with LoRA support
- Training triggers (count-, time-, or quality-based)

### Data Structure

```
~/.swarm-engine/learning/
├── global_stats.json           # Global statistics across scenarios
└── scenarios/
    └── troubleshooting/        # Per-scenario (learning_key based)
        ├── stats.json          # Accumulated statistics
        ├── offline_model.json  # Learned parameters
        ├── lora/               # LoRA adapters (if trained)
        │   └── v1/
        │       └── adapter.safetensors
        └── sessions/           # Session snapshots
            └── {timestamp}/
                ├── meta.json
                └── stats.json
```

### Selection Strategies

The learning system optimizes selection strategy parameters:

| Strategy | Description | When Used |
|----------|-------------|-----------|
| **UCB1** | Upper Confidence Bound | Early exploration |
| **Thompson** | Bayesian sampling | Probabilistic exploration |
| **Greedy** | Best known action | Exploitation after learning |
| **Adaptive** | Dynamic switching | Production (based on error rate) |
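
For reference, UCB1 scores each action as its mean reward plus an exploration bonus that shrinks with visit count; `c` below is the same `ucb1_c` constant the offline model tunes. A minimal sketch, not the crate's implementation:

```rust
/// UCB1 score: the action's mean reward plus an exploration bonus that
/// shrinks the more often the action has been tried.
fn ucb1(total_reward: f64, pulls: u64, total_pulls: u64, c: f64) -> f64 {
    if pulls == 0 {
        return f64::INFINITY; // untried actions are always explored first
    }
    let mean = total_reward / pulls as f64;
    mean + c * ((total_pulls as f64).ln() / pulls as f64).sqrt()
}

fn main() {
    // (total_reward, pulls) for three candidate actions.
    let stats = [(3.0, 5u64), (1.0, 1), (0.0, 0)];
    let total: u64 = stats.iter().map(|&(_, n)| n).sum();

    let scores: Vec<f64> = stats.iter().map(|&(r, n)| ucb1(r, n, total, 1.414)).collect();
    let best = scores
        .iter()
        .enumerate()
        .max_by(|(_, a), (_, b)| a.partial_cmp(b).unwrap())
        .map(|(i, _)| i);
    println!("best action index: {best:?}"); // Some(2): the untried action wins
}
```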

## LLM Providers

### llama-server (Recommended)

The llama.cpp server provides true batch processing via continuous batching:

```bash
cargo run --package swarm-engine-ui -- llama start \
  -m model.gguf \
  --parallel 4 \
  --ctx-size 4096 \
  --n-gpu-layers 99
```
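
The practical payoff of `--parallel 4` is that concurrent requests occupy separate server slots instead of queuing behind one another. Below is a minimal illustration (not the crate's `BatchInvoker`) that fires four requests at llama-server's OpenAI-compatible endpoint, assuming the default port 8080 and the `tokio`, `reqwest` (with its `json` feature), `futures`, and `serde_json` crates:

```rust
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let client = reqwest::Client::new();
    let prompts = [
        "Check service status.",
        "Read nginx logs.",
        "Restart the service.",
        "Summarize findings.",
    ];

    // Build one future per prompt; nothing is sent until they are awaited.
    let requests = prompts.iter().map(|p| {
        let client = client.clone();
        async move {
            client
                .post("http://localhost:8080/v1/chat/completions")
                .json(&json!({
                    "messages": [{ "role": "user", "content": p }],
                    "max_tokens": 64
                }))
                .send()
                .await?
                .json::<serde_json::Value>()
                .await
        }
    });

    // Fire all four at once; with --parallel 4 they run in separate slots.
    for reply in futures::future::join_all(requests).await {
        println!("{}", reply?["choices"][0]["message"]["content"]);
    }
    Ok(())
}
```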

### Ollama (Alternative)

Ollama can be used but does not support true batch processing:

```bash
ollama serve
```

**Note**: Ollama processes requests sequentially internally, so throughput measurements may not reflect true parallel performance.

## Configuration

### Global Configuration (`~/.swarm-engine/config.toml`)

```toml
[general]
default_project_type = "eval"

[eval]
default_runs = 30
target_tick_duration_ms = 10

[llm]
default_provider = "llama-server"
cache_enabled = true

[logging]
level = "info"
file_enabled = true
```

### Directory Structure

| Path | Purpose |
|------|---------|
| `~/.swarm-engine/` | System configuration, cache, logs |
| `~/swarm-engine/` | User data: scenarios, reports |
| `./swarm-engine/` | Project-local configuration |

## Development

### Build and Test

```bash
# Type check
cargo check

# Build
cargo build

# Run tests
cargo test

# Run with verbose logging
RUST_LOG=debug cargo run --package swarm-engine-ui -- eval ...
```

### Project Structure

```
swarm-engine/
├── crates/
│   ├── swarm-engine-core/      # Core runtime
│   │   ├── src/
│   │   │   ├── orchestrator/   # Main loop
│   │   │   ├── agent/          # Worker/Manager definitions
│   │   │   ├── exploration/    # Graph-based exploration
│   │   │   ├── learn/          # Offline learning
│   │   │   └── ...
│   ├── swarm-engine-llm/       # LLM integrations
│   ├── swarm-engine-eval/      # Evaluation framework
│   │   └── scenarios/          # Built-in scenarios
│   └── swarm-engine-ui/        # CLI and GUI
```

## Documentation

Detailed design documentation is available in the RustDoc comments of each crate:

```bash
# Generate and open documentation
cargo doc --open --no-deps
```

Key documentation locations:
- **swarm-engine-core**: Core concepts, tick lifecycle, two-tier memory model
- **swarm-engine-eval**: Evaluation framework, scenario format, metrics
- **swarm-engine-llm**: LLM integrations, batch processing, prompt building
- **swarm-engine-ui**: CLI commands, GUI features

## License

MIT License