🔥 Helios Engine - LLM Agent Framework

Helios Engine is a powerful and flexible Rust framework for building LLM-powered agents with tool support, streaming chat capabilities, and easy configuration management. Create intelligent agents that can interact with users, call tools, and maintain conversation context - with both online and offline local model support.

Features

🆕 Forest of Agents: Multi-agent collaboration system where agents can communicate, delegate tasks, and share context
Agent System: Create multiple agents with different personalities and capabilities
Tool Registry: Extensible tool system for adding custom functionality
Chat Management: Built-in conversation history and session management
Session Memory: Track agent state and metadata across conversations
Extensive Tool Suite: 16+ built-in tools including web scraping, JSON parsing, timestamp operations, file I/O, shell commands, HTTP requests, system info, and text processing
File Management Tools: Built-in tools for searching, reading, writing, editing, and listing files
Web & API Tools: Web scraping, HTTP requests, and JSON manipulation capabilities
System Integration: Shell command execution, system information retrieval, and timestamp operations
Text Processing: Advanced text search, replace, formatting, and analysis tools
🆕 RAG System: Retrieval-Augmented Generation with vector stores (InMemory and Qdrant)
Streaming Support: True real-time response streaming for both remote and local models with immediate token delivery
Local Model Support: Run local models offline using llama.cpp with HuggingFace integration (optional local feature)
LLM Support: Compatible with OpenAI API, any OpenAI-compatible API, and local models
HTTP Server & API: Expose OpenAI-compatible API endpoints with full parameter support (temperature, max_tokens, stop) for agents and LLM clients
Async/Await: Built on Tokio for high-performance async operations
Type-Safe: Leverages Rust's type system for safe and reliable code
Extensible: Easy to add custom tools and extend functionality
Thinking Tags: Automatic detection and display of model reasoning process
Dual Mode Support: Auto, online (remote API), and offline (local) modes
Clean Output: Suppresses verbose debugging in offline mode for clean user experience
CLI & Library: Use as both a command-line tool and a Rust library crate
🆕 Feature Flags: Optional local feature for offline model support - build only what you need!

Installation
Feature Flags
Quick Start
CLI Usage
Configuration
Local Inference Setup
Architecture
Built-in Tools
Creating Custom Tools
API Documentation
Project Structure
Examples
Contributing
License

Installation

Helios Engine can be used both as a command-line tool and as a library crate in your Rust projects.

As a CLI Tool (Recommended for Quick Start)

Install globally using Cargo (once published):

# Install without local model support (lighter, faster install)
cargo install helios-engine

# Install with local model support (enables offline mode with llama-cpp-2)
cargo install helios-engine --features local

Then use anywhere:

# Initialize configuration
helios-engine init

# Start interactive chat (default command)
helios-engine
# or explicitly
helios-engine chat

# Ask a quick question
helios-engine ask "What is Rust?"

# Get help
helios-engine --help

#  NEW: Use offline mode with local models (no internet required)
helios-engine --mode offline chat

# Use online mode (forces remote API usage)
helios-engine --mode online chat

# Auto mode (uses local if configured, otherwise remote)
helios-engine --mode auto chat

# Verbose logging for debugging
helios-engine --verbose chat

# Custom system prompt
helios-engine chat --system-prompt "You are a Python expert"

# One-off question with custom config
helios-engine --config /path/to/config.toml ask "Calculate 15 * 7"

# NEW: Serve OpenAI-compatible API endpoints
helios-engine serve --port 8000 --host 127.0.0.1

# Serve on all interfaces
helios-engine serve --host 0.0.0.0

As a Library Crate

Add Helios-Engine to your Cargo.toml:

[dependencies]
# Without local model support (lighter dependency)
helios-engine = "0.3.3"
tokio = { version = "1.35", features = ["full"] }

# OR with local model support for offline inference
helios-engine = { version = "0.3.3", features = ["local"] }
tokio = { version = "1.35", features = ["full"] }

Or use a local path during development:

[dependencies]
helios-engine = { path = "../helios" }
tokio = { version = "1.35", features = ["full"] }

Build from Source

git clone https://github.com/Ammar-Alnagar/Helios-Engine.git
cd Helios-Engine

# Build without local model support
cargo build --release

# OR build with local model support
cargo build --release --features local

# Install locally (without local support)
cargo install --path .

# OR install with local model support
cargo install --path . --features local

🚩 Feature Flags

Helios Engine supports optional feature flags to control which dependencies are included in your build. This allows you to create lighter builds when you don't need certain functionality.

Available Features

`local` - Local Model Support

Enables offline inference using local models via llama-cpp-2. When disabled, the engine only supports remote API calls, resulting in:

Faster compilation times - No need to build llama-cpp-2 and its dependencies
Smaller binary size - Excludes large native libraries
Simpler dependencies - Reduces the dependency tree significantly

Enables:

LocalLLMProvider - Run models locally using llama.cpp
LocalConfig - Configuration for local model setup
Offline mode (--mode offline) in the CLI
HuggingFace model downloading and caching

When to use:

✅ Use --features local if you need offline inference or want to run models locally
❌ Skip it if you only use remote APIs (OpenAI, Azure, etc.) for faster builds

Example:

# Without local support (lightweight, remote API only)
cargo install helios-engine
cargo build --release

# With local support (includes llama-cpp-2 for offline inference)
cargo install helios-engine --features local
cargo build --release --features local

In Cargo.toml:

# Remote API only
[dependencies]
helios-engine = "0.3.3"

# With local model support
[dependencies]
helios-engine = { version = "0.3.3", features = ["local"] }

Quick Start

Using as a Library Crate

The simplest way to use Helios Engine is to call LLM models directly:

use helios_engine::{LLMClient, ChatMessage, llm::LLMProviderType};
use helios_engine::config::LLMConfig;

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    // Configure the LLM
    let llm_config = LLMConfig {
        model_name: "gpt-3.5-turbo".to_string(),
        base_url: "https://api.openai.com/v1".to_string(),
        api_key: std::env::var("OPENAI_API_KEY").unwrap(), // export api key not .env file
        temperature: 0.7,
        max_tokens: 2048,
    };

    // Create client with remote provider type
    let client = LLMClient::new(LLMProviderType::Remote(llm_config)).await?;

    // Make a call
    let messages = vec![
        ChatMessage::system("You are a helpful assistant."),
        ChatMessage::user("What is the capital of France?"),
    ];

    let response = client.chat(messages, None).await?;
    println!("Response: {}", response.content);

    Ok(())
}

For detailed examples of using Helios Engine as a crate, see Using as a Crate Guide

Using Offline Mode with Local Models

Run models locally without internet connection:

use helios_engine::{LLMClient, ChatMessage, llm::LLMProviderType};
use helios_engine::config::LocalConfig;

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    // Configure local model
    let local_config = LocalConfig {
        huggingface_repo: "unsloth/Qwen3-0.6B-GGUF".to_string(),
        model_file: "Qwen3-0.6B-Q4_K_M.gguf".to_string(),
        temperature: 0.7,
        max_tokens: 2048,
        context_size: 8192
    };

    // Create client with local provider
    let client = LLMClient::new(LLMProviderType::Local(local_config)).await?;

    let messages = vec![
        ChatMessage::system("You are a helpful AI assistant."),
        ChatMessage::user("What is Rust programming?"),
    ];

    let response = client.chat(messages, None).await?;
    println!("Response: {}", response.content);

    Ok(())
}

Note: First run downloads the model. Subsequent runs use the cached model.

Using with Agent System

For more advanced use cases with tools and persistent conversation:

1. Configure Your LLM

Create a config.toml file (supports both remote and local):

[llm]
model_name = "gpt-3.5-turbo"
base_url = "https://api.openai.com/v1"
api_key = "your-api-key-here"
temperature = 0.7
max_tokens = 2048

# Optional: Add local configuration for offline mode
[local]
huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
model_file = "Qwen3-0.6B-Q4_K_M.gguf"
temperature = 0.7
max_tokens = 2048

2. Create Your First Agent

use helios_engine::{Agent, Config, CalculatorTool};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    // Load configuration
    let config = Config::from_file("config.toml")?;

    // Create an agent with tools
    let mut agent = Agent::builder("HeliosAgent")
        .config(config)
        .system_prompt("You are a helpful AI assistant.")
        .tool(Box::new(CalculatorTool))
        .build()
        .await?;

    // Chat with the agent
    let response = agent.chat("What is 15 * 7?").await?;
    println!("Agent: {}", response);

    Ok(())
}

3. Run the Interactive Demo

cargo run

🆕 Forest of Agents

Create a collaborative multi-agent system where agents can communicate, delegate tasks, and share context:

use helios_engine::{Agent, Config, ForestBuilder};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let config = Config::from_file("config.toml")?;

    // Create a forest with specialized agents
    let mut forest = ForestBuilder::new()
        .config(config)
        .agent(
            "coordinator".to_string(),
            Agent::builder("coordinator")
                .system_prompt("You coordinate team projects and delegate tasks.")
        )
        .agent(
            "researcher".to_string(),
            Agent::builder("researcher")
                .system_prompt("You research and analyze information.")
        )
        .agent(
            "writer".to_string(),
            Agent::builder("writer")
                .system_prompt("You create content and documentation.")
        )
        .build()
        .await?;

    // Execute collaborative tasks
    let result = forest
        .execute_collaborative_task(
            &"coordinator".to_string(),
            "Create a guide on sustainable practices".to_string(),
            vec!["researcher".to_string(), "writer".to_string()],
        )
        .await?;

    println!("Collaborative result: {}", result);

    // Direct inter-agent communication
    forest
        .send_message(
            &"coordinator".to_string(),
            Some(&"researcher".to_string()),
            "Please research the latest findings.".to_string(),
        )
        .await?;

    Ok(())
}

Features:

Multi-agent collaboration on complex tasks
Inter-agent communication (direct messages and broadcasts)
Task delegation between agents
Shared context and memory
Specialized agent roles working together

🆕 RAG System

Use Retrieval-Augmented Generation to provide context-aware responses:

use helios_engine::{Agent, Config, RAGTool, InMemoryVectorStore, OpenAIEmbeddings};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let config = Config::from_file("config.toml")?;

    // Create a RAG system
    let embeddings = OpenAIEmbeddings::new(
        "https://api.openai.com/v1/embeddings".to_string(),
        std::env::var("OPENAI_API_KEY").unwrap(),
    );

    let vector_store = InMemoryVectorStore::new(embeddings);
    let rag_tool = RAGTool::new(vector_store);

    // Create an agent with RAG capabilities
    let mut agent = Agent::builder("RAGAgent")
        .config(config)
        .system_prompt("You have access to a knowledge base. Use the rag_tool to retrieve relevant information.")
        .tool(Box::new(rag_tool))
        .build()
        .await?;

    // Add documents to the knowledge base
    agent.chat("Add this document about Rust: 'Rust is a systems programming language...'")
        .await?;

    // Query with semantic search
    let response = agent.chat("What is Rust programming?").await?;
    println!("Response: {}", response);

    Ok(())
}

Features:

Vector-based semantic search for document retrieval
Multiple vector store backends (InMemory, Qdrant)
Automatic document chunking and embedding
Context-aware responses with relevant information
Easy integration with existing agents

Serve API

Expose your agents and LLM configurations as fully OpenAI-compatible HTTP API endpoints with real-time streaming and parameter control:

Serve an LLM Client (Direct API Access)

use helios_engine::{Config, serve};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let config = Config::from_file("config.toml")?;
    serve::start_server(config, "127.0.0.1:8000").await?;
    Ok(())
}

Serve an Agent with Tools

use helios_engine::{Agent, Config, CalculatorTool, serve};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    // Initialize tracing
    tracing_subscriber::fmt().init();

    let config = Config::from_file("config.toml")?;

    // Create an agent with tools
    let agent = Agent::builder("API Agent")
        .config(config)
        .system_prompt("You are a helpful AI assistant with access to a calculator tool.")
        .tool(Box::new(CalculatorTool))
        .max_iterations(5)
        .build()
        .await?;

    // Start the server
    println!("Starting server on http://127.0.0.1:8000");
    println!("Try: curl http://127.0.0.1:8000/v1/chat/completions \\");
    println!("  -H 'Content-Type: application/json' \\");
    println!("  -d '{{\"model\": \"local-model\", \"messages\": [{{\"role\": \"user\", \"content\": \"What is 15 * 7?\"}}]}}'");

    serve::start_server_with_agent(agent, "local-model".to_string(), "127.0.0.1:8000").await?;

    Ok(())
}

API Endpoints

The server exposes OpenAI-compatible endpoints:

POST /v1/chat/completions - Chat completions (with streaming support)
GET /v1/models - List available models
GET /health - Health check

Custom Endpoints

You can define additional custom endpoints alongside the OpenAI-compatible API. Custom endpoints allow you to expose static JSON responses for monitoring, configuration, or integration purposes.

Create a custom endpoints configuration file (custom_endpoints.toml):

[[endpoints]]
method = "GET"
path = "/api/version"
response = { version = "0.3.3", service = "Helios Engine" }
status_code = 200

[[endpoints]]
method = "GET"
path = "/api/status"
response = { status = "operational", uptime = "unknown" }
status_code = 200

[[endpoints]]
method = "POST"
path = "/api/echo"
response = { message = "Echo endpoint", note = "Static response" }
status_code = 200

Use custom endpoints programmatically:

use helios_engine::{serve, CustomEndpointsConfig, CustomEndpoint};

let custom_endpoints = CustomEndpointsConfig {
    endpoints: vec![
        CustomEndpoint {
            method: "GET".to_string(),
            path: "/api/config".to_string(),
            response: serde_json::json!({
                "model": "configurable",
                "features": ["chat", "tools", "streaming"]
            }),
            status_code: 200,
        }
    ]
};

serve::start_server_with_custom_endpoints(config, "127.0.0.1:8000", Some(custom_endpoints)).await?;

Or serve an agent with custom endpoints:

serve::start_server_with_agent_and_custom_endpoints(
    agent,
    "model-name".to_string(),
    "127.0.0.1:8000",
    Some(custom_endpoints)
).await?;

Example API Usage

The API supports full OpenAI-compatible parameters for fine-grained control over generation:

# Basic non-streaming request
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'

# Advanced request with generation parameters
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write a short poem about Rust programming"}
    ],
    "temperature": 0.8,
    "max_tokens": 150,
    "stop": ["\n\n"],
    "stream": false
  }'

# Real-time streaming request with parameters
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model",
    "messages": [{"role": "user", "content": "Tell me a creative story"}],
    "temperature": 0.9,
    "max_tokens": 500,
    "stream": true
  }'

Supported Parameters:

temperature (0.0-2.0): Controls randomness (lower = more deterministic)
max_tokens: Maximum tokens to generate
stop: Array of strings that stop generation when encountered
stream: Enable real-time token streaming for immediate responses

Note: When parameters are not specified, the server uses configuration defaults. Agents maintain conversation context across requests for natural multi-turn conversations.

CLI Usage

Helios Engine provides a powerful command-line interface with multiple modes and options:

Interactive Chat Mode

Start an interactive chat session:

# Default chat session
helios-engine

# With custom system prompt
helios-engine chat --system-prompt "You are a helpful coding assistant"

# With custom max iterations for tool calls
helios-engine chat --max-iterations 10

# With verbose logging for debugging
helios-engine --verbose chat

One-off Questions

Ask a single question without interactive mode:

# Ask a single question
helios-engine ask "What is the capital of France?"

# Ask with custom config file
helios-engine --config /path/to/config.toml ask "Calculate 123 * 456"

Configuration Management

Initialize and manage configuration:

# Create a new configuration file
helios-engine init

# Create config in custom location
helios-engine init --output ~/.helios/config.toml

HTTP Server (Serve Command)

Serve OpenAI-compatible API endpoints:

# Start server with default settings (port 8000, localhost)
helios-engine serve

# Serve on custom port and host
helios-engine serve --port 3000 --host 127.0.0.1

# Serve on all interfaces (accessible from other machines)
helios-engine serve --host 0.0.0.0

# Serve with custom endpoints from a configuration file
helios-engine serve --custom-endpoints custom_endpoints.toml

# Serve with verbose logging
helios-engine --verbose serve

The serve command exposes the following endpoints:

POST /v1/chat/completions - Chat completions with real-time streaming and full parameter support (temperature, max_tokens, stop)
GET /v1/models - List available models
GET /health - Health check endpoint
Custom endpoints (when --custom-endpoints is specified)

Mode Selection

Choose between different operation modes:

# Auto mode (uses local if configured, otherwise remote API)
helios-engine --mode auto chat

# Online mode (forces remote API usage)
helios-engine --mode online chat

# Offline mode (uses local models only)
helios-engine --mode offline chat

Interactive Commands

During an interactive session, use these commands:

exit or quit - Exit the chat session
clear - Clear conversation history
history - Show conversation history
help - Show help message

Configuration

Helios Engine uses TOML for configuration. You can configure either remote API access or local model inference with the dual LLMProviderType system.

Remote API Configuration (Default)

[llm]
# The model name (e.g., gpt-3.5-turbo, gpt-4, claude-3, etc.)
model_name = "gpt-3.5-turbo"

# Base URL for the API (OpenAI or compatible)
base_url = "https://api.openai.com/v1"

# Your API key
api_key = "your-api-key-here"

# Temperature for response generation (0.0 - 2.0)
temperature = 0.7

# Maximum tokens in response
max_tokens = 2048

Local Model Configuration (Offline Mode with llama.cpp)

[llm]
# Remote config still needed for auto mode fallback
model_name = "gpt-3.5-turbo"
base_url = "https://api.openai.com/v1"
api_key = "your-api-key-here"
temperature = 0.7
max_tokens = 2048

# Local model configuration for offline mode
[local]
# HuggingFace repository and model file
huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
model_file = "Qwen3-0.6B-Q4_K_M.gguf"

# Local model settings
temperature = 0.7
max_tokens = 2048

Auto Mode Configuration (Remote + Local)

For maximum flexibility, configure both remote and local models to enable auto mode:

[llm]
model_name = "gpt-3.5-turbo"
base_url = "https://api.openai.com/v1"
api_key = "your-api-key-here"
temperature = 0.7
max_tokens = 2048

# Local model as fallback
[local]
huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
model_file = "Qwen3-0.6B-Q4_K_M.gguf"
temperature = 0.7
max_tokens = 2048

Supported LLM Providers

Helios Engine supports both remote APIs and local model inference:

Remote APIs (Online Mode)

Helios Engine works with any OpenAI-compatible API:

OpenAI: https://api.openai.com/v1
Azure OpenAI: https://your-resource.openai.azure.com/openai/deployments/your-deployment
Local Models (LM Studio): http://localhost:1234/v1
Ollama with OpenAI compatibility: http://localhost:11434/v1
Any OpenAI-compatible API

Local Models (Offline Mode)

Run models locally using llama.cpp without internet connection:

GGUF Models: Compatible with all GGUF format models from HuggingFace
Automatic Download: Models are downloaded automatically from HuggingFace
GPU Acceleration: Uses GPU if available (via llama.cpp)
Clean Output: Suppresses verbose debugging for clean user experience
Popular Models: Works with Qwen, Llama, Mistral, and other GGUF models

Supported Model Sources:

HuggingFace Hub repositories
Local GGUF files
Automatic model caching

Local Inference Setup

Helios Engine supports running large language models locally using llama.cpp through the LLMProviderType system, providing privacy, offline capability, and no API costs.

Prerequisites

HuggingFace Account: Sign up at huggingface.co (free)

HuggingFace CLI: Install the CLI tool:

pip install huggingface_hub
huggingface-cli login  # Login with your token

Setting Up Local Models

Find a GGUF Model: Browse HuggingFace Models for compatible models

Update Configuration: Add local model config to your config.toml:

[local]
huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
model_file = "Qwen3-0.6B-Q4_K_M.gguf"
temperature = 0.7
max_tokens = 2048

Run in Offline Mode:

# First run downloads the model
helios-engine --mode offline ask "Hello world"

# Subsequent runs use cached model
helios-engine --mode offline chat

Recommended Models

Model	Size	Use Case	Repository
Qwen3-0.6B	~400MB	Fast, good quality	`unsloth/Qwen3-0.6B-GGUF`
Llama-3.2-1B	~700MB	Balanced performance	`unsloth/Llama-3.2-1B-Instruct-GGUF`
Mistral-7B	~4GB	High quality	`TheBloke/Mistral-7B-Instruct-v0.1-GGUF`

Performance & Features

GPU Acceleration: Models automatically use GPU if available via llama.cpp's n_gpu_layers parameter
Model Caching: Downloaded models are cached locally (~/.cache/huggingface)
Memory Usage: Larger models need more RAM/VRAM
First Run: Initial model download may take time depending on connection
Clean Output Mode: Suppresses verbose debugging from llama.cpp for clean user experience

Streaming Support with Local Models

Local models now support real-time token-by-token streaming just like remote models! The LLMClient automatically handles streaming for both remote and local models through the same unified API, providing a consistent experience.

Architecture

For detailed architecture documentation including system design, component interactions, and execution flows, see docs/ARCHITECTURE.md.

Quick System Overview

Helios Engine follows a modular architecture with clear separation of concerns:

Agent: Orchestrates conversations and tool execution
LLM Client: Handles communication with language models
Tool Registry: Manages and executes tools
Chat Session: Maintains conversation history
Configuration: Manages settings and preferences

Built-in Tools

Helios Engine includes 16+ built-in tools for common tasks. All tools follow the same pattern and can be easily added to agents.

Core Tools

`CalculatorTool`

Performs basic arithmetic operations.

Parameters:

expression (string, required): Mathematical expression to evaluate

Example:

agent.tool(Box::new(CalculatorTool));

`EchoTool`

Echoes back a message.

Parameters:

message (string, required): Message to echo

Example:

agent.tool(Box::new(EchoTool));

File Management Tools

`FileSearchTool`

Search for files by name pattern or content within files.

Parameters:

path (string, optional): Directory path to search in (default: current directory)
pattern (string, optional): File name pattern with wildcards (e.g., *.rs)
content (string, optional): Text content to search for within files
max_results (number, optional): Maximum number of results (default: 50)

`FileReadTool`

Read the contents of a file with optional line range selection.

Parameters:

path (string, required): File path to read
start_line (number, optional): Starting line number (1-indexed)
end_line (number, optional): Ending line number (1-indexed)

`FileWriteTool`

Write content to a file (creates new or overwrites existing).

Parameters:

path (string, required): File path to write to
content (string, required): Content to write

`FileEditTool`

Edit a file by replacing specific text (find and replace).

Parameters:

path (string, required): File path to edit
find (string, required): Text to find
replace (string, required): Replacement text

`FileIOTool`

Unified file operations: read, write, append, delete, copy, move, exists, size.

Parameters:

operation (string, required): Operation type
path (string, optional): File path for operations
src_path (string, optional): Source path for copy/move
dst_path (string, optional): Destination path for copy/move
content (string, optional): Content for write/append
recursive (boolean, optional): Allow recursive directory deletion (default: false for safety)

`FileListTool`

List directory contents with detailed metadata.

Parameters:

path (string, optional): Directory path to list
show_hidden (boolean, optional): Show hidden files
recursive (boolean, optional): List recursively
max_depth (number, optional): Maximum recursion depth

Web & API Tools

`WebScraperTool`

Fetch and extract content from web URLs.

Parameters:

url (string, required): URL to scrape
extract_text (boolean, optional): Extract readable text from HTML
timeout_seconds (number, optional): Request timeout

`HttpRequestTool`

Make HTTP requests with various methods.

Parameters:

method (string, required): HTTP method (GET, POST, PUT, DELETE, etc.)
url (string, required): Request URL
headers (object, optional): Request headers
body (string, optional): Request body
timeout_seconds (number, optional): Request timeout

`JsonParserTool`

Parse, validate, format, and manipulate JSON data.

Operations:

parse - Parse and validate JSON
stringify - Format JSON with optional indentation
get_value - Extract values by JSON path
set_value - Modify JSON values
validate - Check JSON validity

System & Utility Tools

`ShellCommandTool`

Execute shell commands safely with security restrictions.

Parameters:

command (string, required): Shell command to execute
timeout_seconds (number, optional): Command timeout

`SystemInfoTool`

Retrieve system information (OS, CPU, memory, disk, network).

Parameters:

category (string, optional): Info category (all, os, cpu, memory, disk, network)

`TimestampTool`

Work with timestamps and date/time operations.

Operations:

now - Current time
format - Format timestamps
parse - Parse timestamp strings
add/subtract - Time arithmetic
diff - Time difference calculation

`TextProcessorTool`

Process and manipulate text with various operations.

Operations:

search - Regex-based text search
replace - Find and replace with regex
split/join - Text splitting and joining
count - Character, word, and line counts
uppercase/lowercase - Case conversion
trim - Whitespace removal
lines/words - Text formatting

Data Storage Tools

`MemoryDBTool`

In-memory key-value database for caching data during conversations.

Operations:

set - Store key-value pairs
get - Retrieve values
delete - Remove entries
list - Show all stored data
clear - Remove all data
exists - Check key existence

`QdrantRAGTool`

RAG (Retrieval-Augmented Generation) tool with Qdrant vector database.

Operations:

add_document - Store and embed documents
search - Semantic search
delete - Remove documents
clear - Clear collection

Creating Custom Tools

Implement the Tool trait to create custom tools:

use async_trait::async_trait;
use helios_engine::{Tool, ToolParameter, ToolResult};
use serde_json::Value;
use std::collections::HashMap;

struct WeatherTool;

#[async_trait]
impl Tool for WeatherTool {
    fn name(&self) -> &str {
        "get_weather"
    }

    fn description(&self) -> &str {
        "Get the current weather for a location"
    }

    fn parameters(&self) -> HashMap<String, ToolParameter> {
        let mut params = HashMap::new();
        params.insert(
            "location".to_string(),
            ToolParameter {
                param_type: "string".to_string(),
                description: "City name".to_string(),
                required: Some(true),
            },
        );
        params
    }

    async fn execute(&self, args: Value) -> helios_engine::Result<ToolResult> {
        let location = args["location"].as_str().unwrap_or("Unknown");

        // Your weather API logic here
        let weather = format!("Weather in {}: Sunny, 72°F", location);

        Ok(ToolResult::success(weather))
    }
}

// Use your custom tool
#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let config = Config::from_file("config.toml")?;

    let mut agent = Agent::builder("WeatherAgent")
        .config(config)
        .tool(Box::new(WeatherTool))
        .build()
        .await?;

    let response = agent.chat("What's the weather in Tokyo?").await?;
    println!("{}", response);

    Ok(())
}

API Documentation

Core Types

`Agent`

The main agent struct that manages conversation and tool execution.

Methods:

builder(name) - Create a new agent builder
chat(message) - Send a message and get a response
register_tool(tool) - Add a tool to the agent
clear_history() - Clear conversation history
set_system_prompt(prompt) - Set the system prompt
set_max_iterations(max) - Set maximum tool call iterations
set_memory(key, value) - Set a memory value for the agent
get_memory(key) - Get a memory value
remove_memory(key) - Remove a memory value
clear_memory() - Clear all agent memory (preserves session metadata)
get_session_summary() - Get a summary of the current session
increment_counter(key) - Increment a counter in memory
increment_tasks_completed() - Increment the tasks_completed counter

`Config`

Configuration management for LLM settings.

Methods:

from_file(path) - Load config from TOML file
default() - Create default configuration
save(path) - Save config to file

`LLMClient`

Client for interacting with LLM providers (remote or local).

Methods:

new(provider_type) - Create client with LLMProviderType (Remote or Local)
chat(messages, tools) - Send messages and get response
chat_stream(messages, tools, callback) - Send messages and stream response with callback function
generate(request) - Low-level generation method

`LLMProviderType`

Enumeration for different LLM provider types.

Variants:

Remote(LLMConfig) - For remote API providers (OpenAI, Azure, etc.)
Local(LocalConfig) - For local llama.cpp models

`ToolRegistry`

Manages and executes tools.

Methods:

new() - Create empty registry
register(tool) - Register a new tool
execute(name, args) - Execute a tool by name
get_definitions() - Get all tool definitions
list_tools() - List registered tool names

`ChatSession`

Manages conversation history and session metadata.

Methods:

new() - Create new session
with_system_prompt(prompt) - Set system prompt
add_message(message) - Add message to history
add_user_message(content) - Add a user message
add_assistant_message(content) - Add an assistant message
get_messages() - Get all messages
clear() - Clear all messages
set_metadata(key, value) - Set session metadata
get_metadata(key) - Get session metadata
remove_metadata(key) - Remove session metadata
get_summary() - Get a summary of the session

Built-in Tools

For detailed documentation of all 16+ built-in tools including usage examples, see the Built-in Tools section above.

Legacy Tool Documentation

`CalculatorTool`

Performs basic arithmetic operations.

Parameters:

expression (string, required): Mathematical expression to evaluate

Example:

agent.tool(Box::new(CalculatorTool));

`EchoTool`

Echoes back a message.

Parameters:

message (string, required): Message to echo

Example:

agent.tool(Box::new(EchoTool));

`FileSearchTool`

Search for files by name pattern or search for content within files.

Parameters:

path (string, optional): Directory path to search in (default: current directory)
pattern (string, optional): File name pattern with wildcards (e.g., *.rs)
content (string, optional): Text content to search for within files
max_results (number, optional): Maximum number of results (default: 50)

Example:

agent.tool(Box::new(FileSearchTool));

`FileReadTool`

Read the contents of a file with optional line range selection.

Parameters:

path (string, required): File path to read
start_line (number, optional): Starting line number (1-indexed)
end_line (number, optional): Ending line number (1-indexed)

Example:

agent.tool(Box::new(FileReadTool));

`FileWriteTool`

Write content to a file (creates new or overwrites existing).

Parameters:

path (string, required): File path to write to
content (string, required): Content to write

Example:

agent.tool(Box::new(FileWriteTool));

`FileEditTool`

Edit a file by replacing specific text (find and replace).

Parameters:

path (string, required): File path to edit
find (string, required): Text to find
replace (string, required): Replacement text

Example:

agent.tool(Box::new(FileEditTool));

`MemoryDBTool`

In-memory key-value database for caching data during conversations.

Parameters:

operation (string, required): Operation to perform: set, get, delete, list, clear, exists
key (string, optional): Key for set, get, delete, exists operations
value (string, optional): Value for set operation

Supported Operations:

set - Store a key-value pair
get - Retrieve a value by key
delete - Remove a key-value pair
list - List all stored items
clear - Clear all data
exists - Check if a key exists

Example:

agent.tool(Box::new(MemoryDBTool::new()));

Usage in conversation:

// Agent can now cache data
agent.chat("Store my name as 'Alice' in the database").await?;
agent.chat("What's my name?").await?; // Agent retrieves from DB

`QdrantRAGTool`

RAG (Retrieval-Augmented Generation) tool with Qdrant vector database for semantic search and document retrieval.

Parameters:

operation (string, required): Operation: add_document, search, delete, clear
text (string, optional): Document text or search query
doc_id (string, optional): Document ID for delete operation
limit (number, optional): Number of search results (default: 5)
metadata (object, optional): Additional metadata for documents

Supported Operations:

add_document - Embed and store a document
search - Semantic search with vector similarity
delete - Remove a document by ID
clear - Clear all documents from collection

Example:

let rag_tool = QdrantRAGTool::new(
    "http://localhost:6333",                    // Qdrant URL
    "my_collection",                             // Collection name
    "https://api.openai.com/v1/embeddings",     // Embedding API
    std::env::var("OPENAI_API_KEY").unwrap(),   // API key
);

agent.tool(Box::new(rag_tool));

Prerequisites:

Qdrant running: docker run -p 6333:6333 qdrant/qdrant
OpenAI API key for embeddings

`WebScraperTool`

Scrape web content from URLs with automatic text extraction and cleaning.

Parameters:

url (string, required): URL to scrape
max_length (number, optional): Maximum content length (default: 10000)

Example:

agent.tool(Box::new(WebScraperTool));

`JsonParserTool`

Parse, validate, stringify, and extract values from JSON data.

Parameters:

operation (string, required): Operation: parse, stringify, get_value, validate
json (string, optional): JSON string for parse/stringify operations
path (string, optional): JSON path for get_value operation (e.g., "$.key" or "$.array[0]")

Supported Operations:

parse - Parse and validate JSON string
stringify - Convert JSON to formatted string
get_value - Extract value using JSON path
validate - Validate JSON structure

Example:

agent.tool(Box::new(JsonParserTool));

`TimestampTool`

Work with timestamps, perform date/time operations and formatting.

Parameters:

operation (string, required): Operation: now, format, add, diff
timestamp (number, optional): Unix timestamp for format/add operations
format (string, optional): Date format string (default: "%Y-%m-%d %H:%M:%S")
amount (number, optional): Time amount to add/subtract
unit (string, optional): Time unit: seconds, minutes, hours, days, weeks

Supported Operations:

now - Get current timestamp
format - Format timestamp to string
add - Add time to timestamp
diff - Calculate difference between timestamps

Example:

agent.tool(Box::new(TimestampTool));

`ShellCommandTool`

Execute shell commands safely with timeout and output capture.

Parameters:

command (string, required): Shell command to execute
timeout (number, optional): Timeout in seconds (default: 30)

Example:

agent.tool(Box::new(ShellCommandTool));

`HttpRequestTool`

Make HTTP requests with full support for methods, headers, and body.

Parameters:

method (string, required): HTTP method: GET, POST, PUT, DELETE, etc.
url (string, required): Request URL
headers (object, optional): HTTP headers as key-value pairs
body (string, optional): Request body for POST/PUT requests

Example:

agent.tool(Box::new(HttpRequestTool));

`FileListTool`

List directory contents with filtering and detailed information.

Parameters:

path (string, optional): Directory path (default: current directory)
pattern (string, optional): File name pattern with wildcards
recursive (boolean, optional): Include subdirectories (default: false)
max_results (number, optional): Maximum number of results (default: 100)

Example:

agent.tool(Box::new(FileListTool));

`SystemInfoTool`

Retrieve system information including CPU, memory, disk, and OS details.

Parameters:

None required

Example:

agent.tool(Box::new(SystemInfoTool));

`TextProcessorTool`

Process and analyze text with various operations like counting, trimming, and searching.

Parameters:

operation (string, required): Operation: count, trim, uppercase, lowercase, replace, search
text (string, required): Input text
find (string, optional): Text to find (for replace/search operations)
replace (string, optional): Replacement text (for replace operation)
case_sensitive (boolean, optional): Case sensitivity for search (default: true)

Supported Operations:

count - Count characters, words, lines
trim - Remove whitespace
uppercase/lowercase - Change case
replace - Find and replace text
search - Search for text patterns

Example:

agent.tool(Box::new(TextProcessorTool));

`FileIOTool`

Perform file I/O operations including reading, writing, copying, and moving files.

Parameters:

operation (string, required): Operation: read, write, copy, move, delete, exists
path (string, required): File path
content (string, optional): Content for write operation
destination (string, optional): Destination path for copy/move operations

Supported Operations:

read - Read file content
write - Write content to file
copy - Copy file to new location
move - Move file to new location
delete - Delete file
exists - Check if file exists

Example:

agent.tool(Box::new(FileIOTool));

Project Structure

helios/
├── Cargo.toml              # Project configuration
├── README.md               # This file
├── config.example.toml     # Example configuration
├── .gitignore             # Git ignore rules
│
├── src/
│   ├── lib.rs             # Library entry point
│   ├── main.rs            # Binary entry point (interactive demo)
│   ├── agent.rs           # Agent implementation
│   ├── llm.rs             # LLM client and provider
│   ├── tools.rs           # Tool system and built-in tools
│   ├── chat.rs            # Chat message and session types
│   ├── config.rs          # Configuration management
│   ├── serve.rs           # HTTP server for OpenAI-compatible API
│   └── error.rs           # Error types
│
├── docs/
│   ├── API.md                    # API reference
│   ├── QUICKSTART.md             # Quick start guide
│   ├── TUTORIAL.md               # Detailed tutorial
│   └── USING_AS_CRATE.md         # Using Helios as a library
│
└── examples/
    ├── basic_chat.rs             # Simple chat example
    ├── agent_with_tools.rs       # Tool usage example
    ├── agent_with_file_tools.rs  # File management tools example
    ├── agent_with_memory_db.rs   # Memory database tool example
    ├── agent_with_rag.rs         # Agent with RAG capabilities
    ├── custom_tool.rs            # Custom tool implementation
    ├── multiple_agents.rs        # Multiple agents example
    ├── forest_of_agents.rs       # Multi-agent collaboration system
    ├── send_message_tool_demo.rs # SendMessageTool functionality demo
    ├── direct_llm_usage.rs       # Direct LLM client usage
    ├── streaming_chat.rs         # Streaming responses example
    ├── local_streaming.rs        # Local model streaming example
    ├── rag_in_memory.rs          # RAG with in-memory vector store
    ├── rag_advanced.rs           # RAG with Qdrant vector store
    ├── rag_qdrant_comparison.rs  # Compare RAG implementations
    ├── serve_agent.rs            # Serve agent via HTTP API
    ├── serve_with_custom_endpoints.rs # Serve with custom endpoints
    └── complete_demo.rs          # Complete feature demonstration

Module Overview

helios-engine/
│
├──  agent           - Agent system and builder pattern
├──  chat            - Chat messages and session management
├──  config          - TOML configuration loading/saving
├──  error           - Error types and Result alias
├──  forest          - Forest of Agents - multi-agent collaboration system
├──  llm             - LLM client and API communication
├──  rag             - RAG (Retrieval-Augmented Generation) system
├──  rag_tool        - RAG tool implementation for agents
├──  serve           - HTTP server for OpenAI-compatible API
└──  tools           - Tool registry and implementations

Examples

For comprehensive examples demonstrating various Helios Engine features, see the examples/ directory.

The examples include:

Basic chat and agent usage
Tool integration examples
File management demonstrations
API serving examples
Streaming and advanced features

See examples/README.md for detailed documentation and usage instructions.

Testing

Run tests:

cargo test

Run with logging:

RUST_LOG=debug cargo run

🔍 Advanced Features

Custom LLM Providers

Implement the LLMProvider trait for custom backends:

use async_trait::async_trait;
use helios_engine::{LLMProvider, LLMRequest, LLMResponse};

struct CustomProvider;

#[async_trait]
impl LLMProvider for CustomProvider {
    async fn generate(&self, request: LLMRequest) -> helios_engine::Result<LLMResponse> {
        // Your custom implementation
        todo!()
    }
}

Tool Chaining

Agents automatically chain tool calls:

// The agent can use multiple tools in sequence
let response = agent.chat(
    "Calculate 10 * 5, then echo the result"
).await?;

Thinking Tags Display

Helios Engine automatically detects and displays thinking tags from LLM responses:

The CLI displays thinking tags with visual indicators: 💭 [Thinking...]
Streaming responses show thinking tags in real-time
Supports both <thinking> and <think> tag formats
In offline mode, thinking tags are processed and removed from final output

Conversation Context

Maintain conversation history:

let mut agent = Agent::builder("Assistant")
    .config(config)
    .system_prompt("You are a helpful assistant.")
    .build()
    .await?;

let response1 = agent.chat("My name is Alice").await?;
let response2 = agent.chat("What is my name?").await?; // Agent remembers: "Alice"

println!("{response1}");

println!("{response2");

Clean Output Mode

In offline mode, Helios Engine suppresses all verbose debugging output from llama.cpp:

No model loading messages
No layer information display
No verbose internal operations
Clean, user-focused experience during local inference

Session Memory & Metadata

Track agent state and conversation metadata across interactions:

// Set agent memory (namespaced under "agent:" prefix)
agent.set_memory("user_preference", "concise");
agent.set_memory("tasks_completed", "0");

// Get memory values
if let Some(pref) = agent.get_memory("user_preference") {
    println!("User prefers: {}", pref);
}

// Increment counters
agent.increment_tasks_completed();
agent.increment_counter("files_processed");

// Get session summary
println!("{}", agent.get_session_summary());

// Clear only agent memory (preserves general session metadata)
agent.clear_memory();

Session metadata in ChatSession:

let mut session = ChatSession::new();

// Set general session metadata
session.set_metadata("session_id", "abc123");
session.set_metadata("start_time", chrono::Utc::now().to_rfc3339());

// Retrieve metadata
if let Some(id) = session.get_metadata("session_id") {
    println!("Session ID: {}", id);
}

// Get session summary
println!("{}", session.get_summary());

File Management Tools

Built-in tools for file operations:

use helios_engine::{Agent, Config, FileSearchTool, FileReadTool, FileWriteTool, FileEditTool};

let mut agent = Agent::builder("FileAgent")
    .config(config)
    .tool(Box::new(FileSearchTool))    // Search files by name or content
    .tool(Box::new(FileReadTool))      // Read file contents
    .tool(Box::new(FileWriteTool))     // Write/create files
    .tool(Box::new(FileEditTool))      // Find and replace in files
    .build()
    .await?;

// Agent can now search, read, write, and edit files
let response = agent.chat("Find all .rs files and show me main.rs").await?;
println!("{response}");

In-Memory Database Tool

Cache and retrieve data during agent conversations:

use helios_engine::{Agent, Config, MemoryDBTool};

let mut agent = Agent::builder("DataAgent")
    .config(config)
    .system_prompt("You can store and retrieve data using the memory_db tool.")
    .tool(Box::new(MemoryDBTool::new()))
    .build()
    .await?;

// Store data
agent.chat("Remember that my favorite color is blue").await?;

// Agent automatically uses the database to remember
agent.chat("What's my favorite color?").await?;
// Response: "Your favorite color is blue"

// Cache expensive computations
agent.chat("Calculate 12345 * 67890 and save it as 'result'").await?;
agent.chat("What was the result I asked you to calculate?").await?;

// List all cached data
let response = agent.chat("Show me everything you've stored").await?;
println!("{response}");

Shared Database Between Agents:

use std::sync::{Arc, Mutex};
use std::collections::HashMap;

// Create a shared database
let shared_db = Arc::new(Mutex::new(HashMap::new()));

// Multiple agents sharing the same database
let mut agent1 = Agent::builder("Agent1")
    .config(config.clone())
    .tool(Box::new(MemoryDBTool::with_shared_db(shared_db.clone())))
    .build()
    .await?;

let mut agent2 = Agent::builder("Agent2")
    .config(config)
    .tool(Box::new(MemoryDBTool::with_shared_db(shared_db.clone())))
    .build()
    .await?;

// Data stored by agent1 is accessible to agent2
agent1.chat("Store 'project_status' as 'in_progress'").await?;
agent2.chat("What is the project status?").await?; // Gets "in_progress"

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Development Setup

Clone the repository:

git clone https://github.com/Ammar-Alnagar/Helios-Engine.git
cd Helios-Engine

Build the project:

cargo build

Run tests:

cargo test

Format code:

cargo fmt

Check for issues:

cargo clippy

License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with ❤️ in Rust

⚠️ ☠️ HERE BE DRAGONS ☠️ ⚠️

🔥 ABANDON ALL HOPE, YE WHO ENTER HERE 🔥

Greetings, Foolish Mortal

What lies before you is not code—it is a CURSE.

A labyrinth of logic so twisted, so arcane, that it defies comprehension itself.

⚡ What Holds This Monstrosity Together

🩹 Duct tape (metaphorical and spiritual)
🙏 Prayers whispered at 3 AM
📚 Stack Overflow answers from 2009
😱 Pure, unfiltered desperation
😭 The tears of junior developers
🎲 Luck (mostly luck)

📜 The Legend

Once, two beings understood this code:

⚡ God and Me ⚡

Now... I have forgotten.

Only God remains.

And I'm not sure He's still watching.

helios-engine 0.3.4

🔥 Helios Engine - LLM Agent Framework

Features

Table of Contents

Installation

As a CLI Tool (Recommended for Quick Start)

As a Library Crate

Build from Source

🚩 Feature Flags

Available Features

local - Local Model Support

Quick Start

Using as a Library Crate

Using Offline Mode with Local Models

Using with Agent System

1. Configure Your LLM

2. Create Your First Agent

3. Run the Interactive Demo

🆕 Forest of Agents

🆕 RAG System

Serve API

Serve an LLM Client (Direct API Access)

Serve an Agent with Tools

API Endpoints

Custom Endpoints

Example API Usage

CLI Usage

Interactive Chat Mode

One-off Questions

Configuration Management

HTTP Server (Serve Command)

Mode Selection

Interactive Commands

Configuration

Remote API Configuration (Default)

Local Model Configuration (Offline Mode with llama.cpp)

Auto Mode Configuration (Remote + Local)

Supported LLM Providers

Remote APIs (Online Mode)

Local Models (Offline Mode)

Local Inference Setup

Prerequisites

Setting Up Local Models

Recommended Models

Performance & Features

Streaming Support with Local Models

Architecture

Quick System Overview

Built-in Tools

Core Tools

CalculatorTool

EchoTool

File Management Tools

FileSearchTool

FileReadTool

FileWriteTool

FileEditTool

FileIOTool

FileListTool

Web & API Tools

WebScraperTool

HttpRequestTool

JsonParserTool

System & Utility Tools

ShellCommandTool

SystemInfoTool

TimestampTool

TextProcessorTool

Data Storage Tools

MemoryDBTool

QdrantRAGTool

Creating Custom Tools

API Documentation

Core Types

Agent

Config

LLMClient

LLMProviderType

ToolRegistry

ChatSession

`local` - Local Model Support

`CalculatorTool`

`EchoTool`

`FileSearchTool`

`FileReadTool`

`FileWriteTool`

`FileEditTool`

`FileIOTool`

`FileListTool`

`WebScraperTool`

`HttpRequestTool`

`JsonParserTool`

`ShellCommandTool`

`SystemInfoTool`

`TimestampTool`

`TextProcessorTool`

`MemoryDBTool`

`QdrantRAGTool`

`Agent`

`Config`

`LLMClient`

`LLMProviderType`

`ToolRegistry`

`ChatSession`

`CalculatorTool`

`EchoTool`

`FileSearchTool`

`FileReadTool`

`FileWriteTool`

`FileEditTool`

`MemoryDBTool`

`QdrantRAGTool`

`WebScraperTool`

`JsonParserTool`

`TimestampTool`

`ShellCommandTool`

`HttpRequestTool`

`FileListTool`

`SystemInfoTool`

`TextProcessorTool`

`FileIOTool`