πŸ”₯ Helios Engine - LLM Agent Framework

Helios Engine is a powerful and flexible Rust framework for building LLM-powered agents with tool support, streaming chat capabilities, and easy configuration management. Create intelligent agents that can interact with users, call tools, and maintain conversation context, with support for both online and offline local models.

Features

  • πŸ†• Forest of Agents: Multi-agent collaboration system where agents can communicate, delegate tasks, and share context
  • Agent System: Create multiple agents with different personalities and capabilities
  • Tool Registry: Extensible tool system for adding custom functionality
  • Chat Management: Built-in conversation history and session management
  • Session Memory: Track agent state and metadata across conversations
  • Extensive Tool Suite: 16+ built-in tools including web scraping, JSON parsing, timestamp operations, file I/O, shell commands, HTTP requests, system info, and text processing
  • File Management Tools: Built-in tools for searching, reading, writing, editing, and listing files
  • Web & API Tools: Web scraping, HTTP requests, and JSON manipulation capabilities
  • System Integration: Shell command execution, system information retrieval, and timestamp operations
  • Text Processing: Advanced text search, replace, formatting, and analysis tools
  • πŸ†• RAG System: Retrieval-Augmented Generation with vector stores (InMemory and Qdrant)
  • Streaming Support: True real-time response streaming for both remote and local models with immediate token delivery
  • Local Model Support: Run local models offline using llama.cpp with HuggingFace integration (optional local feature)
  • LLM Support: Compatible with OpenAI API, any OpenAI-compatible API, and local models
  • HTTP Server & API: Expose OpenAI-compatible API endpoints with full parameter support (temperature, max_tokens, stop) for agents and LLM clients
  • Async/Await: Built on Tokio for high-performance async operations
  • Type-Safe: Leverages Rust's type system for safe and reliable code
  • Extensible: Easy to add custom tools and extend functionality
  • Thinking Tags: Automatic detection and display of model reasoning process
  • Dual Mode Support: Auto, online (remote API), and offline (local) modes
  • Clean Output: Suppresses verbose debugging in offline mode for a clean user experience
  • CLI & Library: Use as both a command-line tool and a Rust library crate
  • πŸ†• Feature Flags: Optional local feature for offline model support - build only what you need!

Installation

Helios Engine can be used both as a command-line tool and as a library crate in your Rust projects.

As a CLI Tool (Recommended for Quick Start)

Install globally using Cargo:

# Install without local model support (lighter, faster install)
cargo install helios-engine

# Install with local model support (enables offline mode with llama-cpp-2)
cargo install helios-engine --features local

Then use anywhere:

# Initialize configuration
helios-engine init

# Start interactive chat (default command)
helios-engine
# or explicitly
helios-engine chat

# Ask a quick question
helios-engine ask "What is Rust?"

# Get help
helios-engine --help

# NEW: Use offline mode with local models (no internet required)
helios-engine --mode offline chat

# Use online mode (forces remote API usage)
helios-engine --mode online chat

# Auto mode (uses local if configured, otherwise remote)
helios-engine --mode auto chat

# Verbose logging for debugging
helios-engine --verbose chat

# Custom system prompt
helios-engine chat --system-prompt "You are a Python expert"

# One-off question with custom config
helios-engine --config /path/to/config.toml ask "Calculate 15 * 7"

# NEW: Serve OpenAI-compatible API endpoints
helios-engine serve --port 8000 --host 127.0.0.1

# Serve on all interfaces
helios-engine serve --host 0.0.0.0

As a Library Crate

Add Helios-Engine to your Cargo.toml:

[dependencies]
# Without local model support (lighter dependency)
helios-engine = "0.3.3"
tokio = { version = "1.35", features = ["full"] }

# OR with local model support for offline inference
helios-engine = { version = "0.3.3", features = ["local"] }
tokio = { version = "1.35", features = ["full"] }

Or use a local path during development:

[dependencies]
helios-engine = { path = "../helios" }
tokio = { version = "1.35", features = ["full"] }

Build from Source

git clone https://github.com/Ammar-Alnagar/Helios-Engine.git
cd Helios-Engine

# Build without local model support
cargo build --release

# OR build with local model support
cargo build --release --features local

# Install locally (without local support)
cargo install --path .

# OR install with local model support
cargo install --path . --features local

🚩 Feature Flags

Helios Engine supports optional feature flags to control which dependencies are included in your build. This allows you to create lighter builds when you don't need certain functionality.

Available Features

local - Local Model Support

Enables offline inference using local models via llama-cpp-2. When disabled, the engine only supports remote API calls, resulting in:

  • Faster compilation times - No need to build llama-cpp-2 and its dependencies
  • Smaller binary size - Excludes large native libraries
  • Simpler dependencies - Reduces the dependency tree significantly

When enabled, the local feature provides:

  • LocalLLMProvider - Run models locally using llama.cpp
  • LocalConfig - Configuration for local model setup
  • Offline mode (--mode offline) in the CLI
  • HuggingFace model downloading and caching

When to use:

  • βœ… Use --features local if you need offline inference or want to run models locally
  • ❌ Skip it if you only use remote APIs (OpenAI, Azure, etc.) for faster builds

Example:

# Without local support (lightweight, remote API only)
cargo install helios-engine
cargo build --release

# With local support (includes llama-cpp-2 for offline inference)
cargo install helios-engine --features local
cargo build --release --features local

In Cargo.toml:

# Remote API only
[dependencies]
helios-engine = "0.3.3"

# With local model support
[dependencies]
helios-engine = { version = "0.3.3", features = ["local"] }

Quick Start

Using as a Library Crate

The simplest way to use Helios Engine is to call an LLM directly:

use helios_engine::{LLMClient, ChatMessage, llm::LLMProviderType};
use helios_engine::config::LLMConfig;

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    // Configure the LLM
    let llm_config = LLMConfig {
        model_name: "gpt-3.5-turbo".to_string(),
        base_url: "https://api.openai.com/v1".to_string(),
        api_key: std::env::var("OPENAI_API_KEY").expect("OPENAI_API_KEY must be set"), // read the key from an exported environment variable, not a .env file
        temperature: 0.7,
        max_tokens: 2048,
    };

    // Create client with remote provider type
    let client = LLMClient::new(LLMProviderType::Remote(llm_config)).await?;

    // Make a call
    let messages = vec![
        ChatMessage::system("You are a helpful assistant."),
        ChatMessage::user("What is the capital of France?"),
    ];

    let response = client.chat(messages, None).await?;
    println!("Response: {}", response.content);

    Ok(())
}

For detailed examples of using Helios Engine as a crate, see the Using as a Crate guide (docs/USING_AS_CRATE.md).

Using Offline Mode with Local Models

Run models locally without internet connection:

use helios_engine::{LLMClient, ChatMessage, llm::LLMProviderType};
use helios_engine::config::LocalConfig;

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    // Configure local model
    let local_config = LocalConfig {
        huggingface_repo: "unsloth/Qwen3-0.6B-GGUF".to_string(),
        model_file: "Qwen3-0.6B-Q4_K_M.gguf".to_string(),
        temperature: 0.7,
        max_tokens: 2048,
        context_size: 8192,
    };

    // Create client with local provider
    let client = LLMClient::new(LLMProviderType::Local(local_config)).await?;

    let messages = vec![
        ChatMessage::system("You are a helpful AI assistant."),
        ChatMessage::user("What is Rust programming?"),
    ];

    let response = client.chat(messages, None).await?;
    println!("Response: {}", response.content);

    Ok(())
}

Note: First run downloads the model. Subsequent runs use the cached model.

Using with Agent System

For more advanced use cases with tools and persistent conversation:

1. Configure Your LLM

Create a config.toml file (supports both remote and local):

[llm]
model_name = "gpt-3.5-turbo"
base_url = "https://api.openai.com/v1"
api_key = "your-api-key-here"
temperature = 0.7
max_tokens = 2048

# Optional: Add local configuration for offline mode
[local]
huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
model_file = "Qwen3-0.6B-Q4_K_M.gguf"
temperature = 0.7
max_tokens = 2048

2. Create Your First Agent

use helios_engine::{Agent, Config, CalculatorTool};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    // Load configuration
    let config = Config::from_file("config.toml")?;

    // Create an agent with tools
    let mut agent = Agent::builder("HeliosAgent")
        .config(config)
        .system_prompt("You are a helpful AI assistant.")
        .tool(Box::new(CalculatorTool))
        .build()
        .await?;

    // Chat with the agent
    let response = agent.chat("What is 15 * 7?").await?;
    println!("Agent: {}", response);

    Ok(())
}

3. Run the Interactive Demo

cargo run

πŸ†• Forest of Agents

Create a collaborative multi-agent system where agents can communicate, delegate tasks, and share context:

use helios_engine::{Agent, Config, ForestBuilder};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let config = Config::from_file("config.toml")?;

    // Create a forest with specialized agents
    let mut forest = ForestBuilder::new()
        .config(config)
        .agent(
            "coordinator".to_string(),
            Agent::builder("coordinator")
                .system_prompt("You coordinate team projects and delegate tasks.")
        )
        .agent(
            "researcher".to_string(),
            Agent::builder("researcher")
                .system_prompt("You research and analyze information.")
        )
        .agent(
            "writer".to_string(),
            Agent::builder("writer")
                .system_prompt("You create content and documentation.")
        )
        .build()
        .await?;

    // Execute collaborative tasks
    let result = forest
        .execute_collaborative_task(
            &"coordinator".to_string(),
            "Create a guide on sustainable practices".to_string(),
            vec!["researcher".to_string(), "writer".to_string()],
        )
        .await?;

    println!("Collaborative result: {}", result);

    // Direct inter-agent communication
    forest
        .send_message(
            &"coordinator".to_string(),
            Some(&"researcher".to_string()),
            "Please research the latest findings.".to_string(),
        )
        .await?;

    Ok(())
}

Features:

  • Multi-agent collaboration on complex tasks
  • Inter-agent communication (direct messages and broadcasts)
  • Task delegation between agents
  • Shared context and memory
  • Specialized agent roles working together

πŸ†• RAG System

Use Retrieval-Augmented Generation to provide context-aware responses:

use helios_engine::{Agent, Config, RAGTool, InMemoryVectorStore, OpenAIEmbeddings};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let config = Config::from_file("config.toml")?;

    // Create a RAG system
    let embeddings = OpenAIEmbeddings::new(
        "https://api.openai.com/v1/embeddings".to_string(),
        std::env::var("OPENAI_API_KEY").unwrap(),
    );

    let vector_store = InMemoryVectorStore::new(embeddings);
    let rag_tool = RAGTool::new(vector_store);

    // Create an agent with RAG capabilities
    let mut agent = Agent::builder("RAGAgent")
        .config(config)
        .system_prompt("You have access to a knowledge base. Use the rag_tool to retrieve relevant information.")
        .tool(Box::new(rag_tool))
        .build()
        .await?;

    // Add documents to the knowledge base
    agent.chat("Add this document about Rust: 'Rust is a systems programming language...'")
        .await?;

    // Query with semantic search
    let response = agent.chat("What is Rust programming?").await?;
    println!("Response: {}", response);

    Ok(())
}

Features:

  • Vector-based semantic search for document retrieval
  • Multiple vector store backends (InMemory, Qdrant)
  • Automatic document chunking and embedding
  • Context-aware responses with relevant information
  • Easy integration with existing agents

Serve API

Expose your agents and LLM configurations as fully OpenAI-compatible HTTP API endpoints with real-time streaming and parameter control:

Serve an LLM Client (Direct API Access)

use helios_engine::{Config, serve};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let config = Config::from_file("config.toml")?;
    serve::start_server(config, "127.0.0.1:8000").await?;
    Ok(())
}

Serve an Agent with Tools

use helios_engine::{Agent, Config, CalculatorTool, serve};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    // Initialize tracing
    tracing_subscriber::fmt().init();

    let config = Config::from_file("config.toml")?;

    // Create an agent with tools
    let agent = Agent::builder("API Agent")
        .config(config)
        .system_prompt("You are a helpful AI assistant with access to a calculator tool.")
        .tool(Box::new(CalculatorTool))
        .max_iterations(5)
        .build()
        .await?;

    // Start the server
    println!("Starting server on http://127.0.0.1:8000");
    println!("Try: curl http://127.0.0.1:8000/v1/chat/completions \\");
    println!("  -H 'Content-Type: application/json' \\");
    println!("  -d '{{\"model\": \"local-model\", \"messages\": [{{\"role\": \"user\", \"content\": \"What is 15 * 7?\"}}]}}'");

    serve::start_server_with_agent(agent, "local-model".to_string(), "127.0.0.1:8000").await?;

    Ok(())
}

API Endpoints

The server exposes OpenAI-compatible endpoints:

  • POST /v1/chat/completions - Chat completions (with streaming support)
  • GET /v1/models - List available models
  • GET /health - Health check

Custom Endpoints

You can define additional custom endpoints alongside the OpenAI-compatible API. Custom endpoints allow you to expose static JSON responses for monitoring, configuration, or integration purposes.

Create a custom endpoints configuration file (custom_endpoints.toml):

[[endpoints]]
method = "GET"
path = "/api/version"
response = { version = "0.3.3", service = "Helios Engine" }
status_code = 200

[[endpoints]]
method = "GET"
path = "/api/status"
response = { status = "operational", uptime = "unknown" }
status_code = 200

[[endpoints]]
method = "POST"
path = "/api/echo"
response = { message = "Echo endpoint", note = "Static response" }
status_code = 200

Use custom endpoints programmatically:

use helios_engine::{serve, CustomEndpointsConfig, CustomEndpoint};

let custom_endpoints = CustomEndpointsConfig {
    endpoints: vec![
        CustomEndpoint {
            method: "GET".to_string(),
            path: "/api/config".to_string(),
            response: serde_json::json!({
                "model": "configurable",
                "features": ["chat", "tools", "streaming"]
            }),
            status_code: 200,
        }
    ]
};

serve::start_server_with_custom_endpoints(config, "127.0.0.1:8000", Some(custom_endpoints)).await?;

Or serve an agent with custom endpoints:

serve::start_server_with_agent_and_custom_endpoints(
    agent,
    "model-name".to_string(),
    "127.0.0.1:8000",
    Some(custom_endpoints)
).await?;

Example API Usage

The API supports full OpenAI-compatible parameters for fine-grained control over generation:

# Basic non-streaming request
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'

# Advanced request with generation parameters
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write a short poem about Rust programming"}
    ],
    "temperature": 0.8,
    "max_tokens": 150,
    "stop": ["\n\n"],
    "stream": false
  }'

# Real-time streaming request with parameters
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model",
    "messages": [{"role": "user", "content": "Tell me a creative story"}],
    "temperature": 0.9,
    "max_tokens": 500,
    "stream": true
  }'

Supported Parameters:

  • temperature (0.0-2.0): Controls randomness (lower = more deterministic)
  • max_tokens: Maximum tokens to generate
  • stop: Array of strings that stop generation when encountered
  • stream: Enable real-time token streaming for immediate responses

Note: When parameters are not specified, the server uses configuration defaults. Agents maintain conversation context across requests for natural multi-turn conversations.

CLI Usage

Helios Engine provides a powerful command-line interface with multiple modes and options:

Interactive Chat Mode

Start an interactive chat session:

# Default chat session
helios-engine

# With custom system prompt
helios-engine chat --system-prompt "You are a helpful coding assistant"

# With custom max iterations for tool calls
helios-engine chat --max-iterations 10

# With verbose logging for debugging
helios-engine --verbose chat

One-off Questions

Ask a single question without interactive mode:

# Ask a single question
helios-engine ask "What is the capital of France?"

# Ask with custom config file
helios-engine --config /path/to/config.toml ask "Calculate 123 * 456"

Configuration Management

Initialize and manage configuration:

# Create a new configuration file
helios-engine init

# Create config in custom location
helios-engine init --output ~/.helios/config.toml

HTTP Server (Serve Command)

Serve OpenAI-compatible API endpoints:

# Start server with default settings (port 8000, localhost)
helios-engine serve

# Serve on custom port and host
helios-engine serve --port 3000 --host 127.0.0.1

# Serve on all interfaces (accessible from other machines)
helios-engine serve --host 0.0.0.0

# Serve with custom endpoints from a configuration file
helios-engine serve --custom-endpoints custom_endpoints.toml

# Serve with verbose logging
helios-engine --verbose serve

The serve command exposes the following endpoints:

  • POST /v1/chat/completions - Chat completions with real-time streaming and full parameter support (temperature, max_tokens, stop)
  • GET /v1/models - List available models
  • GET /health - Health check endpoint
  • Custom endpoints (when --custom-endpoints is specified)

Mode Selection

Choose between different operation modes:

# Auto mode (uses local if configured, otherwise remote API)
helios-engine --mode auto chat

# Online mode (forces remote API usage)
helios-engine --mode online chat

# Offline mode (uses local models only)
helios-engine --mode offline chat

Interactive Commands

During an interactive session, use these commands:

  • exit or quit - Exit the chat session
  • clear - Clear conversation history
  • history - Show conversation history
  • help - Show help message

Configuration

Helios Engine uses TOML for configuration. You can configure either remote API access or local model inference with the dual LLMProviderType system.

Remote API Configuration (Default)

[llm]
# The model name (e.g., gpt-3.5-turbo, gpt-4, claude-3, etc.)
model_name = "gpt-3.5-turbo"

# Base URL for the API (OpenAI or compatible)
base_url = "https://api.openai.com/v1"

# Your API key
api_key = "your-api-key-here"

# Temperature for response generation (0.0 - 2.0)
temperature = 0.7

# Maximum tokens in response
max_tokens = 2048

Local Model Configuration (Offline Mode with llama.cpp)

[llm]
# Remote config still needed for auto mode fallback
model_name = "gpt-3.5-turbo"
base_url = "https://api.openai.com/v1"
api_key = "your-api-key-here"
temperature = 0.7
max_tokens = 2048

# Local model configuration for offline mode
[local]
# HuggingFace repository and model file
huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
model_file = "Qwen3-0.6B-Q4_K_M.gguf"

# Local model settings
temperature = 0.7
max_tokens = 2048

Auto Mode Configuration (Remote + Local)

For maximum flexibility, configure both remote and local models to enable auto mode:

[llm]
model_name = "gpt-3.5-turbo"
base_url = "https://api.openai.com/v1"
api_key = "your-api-key-here"
temperature = 0.7
max_tokens = 2048

# Local model as fallback
[local]
huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
model_file = "Qwen3-0.6B-Q4_K_M.gguf"
temperature = 0.7
max_tokens = 2048

Supported LLM Providers

Helios Engine supports both remote APIs and local model inference:

Remote APIs (Online Mode)

Helios Engine works with any OpenAI-compatible API:

  • OpenAI: https://api.openai.com/v1
  • Azure OpenAI: https://your-resource.openai.azure.com/openai/deployments/your-deployment
  • Local Models (LM Studio): http://localhost:1234/v1
  • Ollama with OpenAI compatibility: http://localhost:11434/v1
  • Any OpenAI-compatible API

Local Models (Offline Mode)

Run models locally using llama.cpp without internet connection:

  • GGUF Models: Compatible with all GGUF format models from HuggingFace
  • Automatic Download: Models are downloaded automatically from HuggingFace
  • GPU Acceleration: Uses GPU if available (via llama.cpp)
  • Clean Output: Suppresses verbose debugging for a clean user experience
  • Popular Models: Works with Qwen, Llama, Mistral, and other GGUF models

Supported Model Sources:

  • HuggingFace Hub repositories
  • Local GGUF files
  • Automatic model caching

Local Inference Setup

Helios Engine supports running large language models locally using llama.cpp through the LLMProviderType system, providing privacy, offline capability, and no API costs.

Prerequisites

  • HuggingFace Account: Sign up at huggingface.co (free)
  • HuggingFace CLI: Install the CLI tool:
    pip install huggingface_hub
    huggingface-cli login  # Login with your token
    

Setting Up Local Models

  1. Find a GGUF Model: Browse HuggingFace Models for compatible models

  2. Update Configuration: Add local model config to your config.toml:

    [local]
    huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
    model_file = "Qwen3-0.6B-Q4_K_M.gguf"
    temperature = 0.7
    max_tokens = 2048
    
  3. Run in Offline Mode:

    # First run downloads the model
    helios-engine --mode offline ask "Hello world"
    
    # Subsequent runs use cached model
    helios-engine --mode offline chat
    

Recommended Models

Model          Size     Use Case              Repository
Qwen3-0.6B     ~400MB   Fast, good quality    unsloth/Qwen3-0.6B-GGUF
Llama-3.2-1B   ~700MB   Balanced performance  unsloth/Llama-3.2-1B-Instruct-GGUF
Mistral-7B     ~4GB     High quality          TheBloke/Mistral-7B-Instruct-v0.1-GGUF

Performance & Features

  • GPU Acceleration: Models automatically use GPU if available via llama.cpp's n_gpu_layers parameter
  • Model Caching: Downloaded models are cached locally (~/.cache/huggingface)
  • Memory Usage: Larger models need more RAM/VRAM
  • First Run: Initial model download may take time depending on connection
  • Clean Output Mode: Suppresses verbose debugging from llama.cpp for a clean user experience

Streaming Support with Local Models

Local models now support real-time token-by-token streaming just like remote models! The LLMClient automatically handles streaming for both remote and local models through the same unified API, providing a consistent experience.
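
As a minimal sketch, here is streaming through the unified API. The chat_stream(messages, tools, callback) method is documented in the API section below; the exact callback signature is an assumption here (each generated token passed to the closure as a string slice):

use helios_engine::{LLMClient, ChatMessage, llm::LLMProviderType};
use helios_engine::config::LLMConfig;
use std::io::Write;

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let llm_config = LLMConfig {
        model_name: "gpt-3.5-turbo".to_string(),
        base_url: "https://api.openai.com/v1".to_string(),
        api_key: std::env::var("OPENAI_API_KEY").expect("OPENAI_API_KEY must be set"),
        temperature: 0.7,
        max_tokens: 2048,
    };

    // The same call works with LLMProviderType::Local(...) when the `local` feature is enabled.
    let client = LLMClient::new(LLMProviderType::Remote(llm_config)).await?;

    let messages = vec![ChatMessage::user("Tell me a short story.")];

    // Print each token as soon as it arrives.
    client
        .chat_stream(messages, None, |token| {
            print!("{}", token);
            std::io::stdout().flush().ok();
        })
        .await?;

    Ok(())
}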

Architecture

For detailed architecture documentation including system design, component interactions, and execution flows, see docs/ARCHITECTURE.md.

Quick System Overview

Helios Engine follows a modular architecture with clear separation of concerns:

  • Agent: Orchestrates conversations and tool execution
  • LLM Client: Handles communication with language models
  • Tool Registry: Manages and executes tools
  • Chat Session: Maintains conversation history
  • Configuration: Manages settings and preferences

Built-in Tools

Helios Engine includes 16+ built-in tools for common tasks. All tools follow the same pattern and can be easily added to agents.

Core Tools

CalculatorTool

Performs basic arithmetic operations.

Parameters:

  • expression (string, required): Mathematical expression to evaluate

Example:

agent.tool(Box::new(CalculatorTool));

EchoTool

Echoes back a message.

Parameters:

  • message (string, required): Message to echo

Example:

agent.tool(Box::new(EchoTool));

File Management Tools

FileSearchTool

Search for files by name pattern or content within files.

Parameters:

  • path (string, optional): Directory path to search in (default: current directory)
  • pattern (string, optional): File name pattern with wildcards (e.g., *.rs)
  • content (string, optional): Text content to search for within files
  • max_results (number, optional): Maximum number of results (default: 50)

FileReadTool

Read the contents of a file with optional line range selection.

Parameters:

  • path (string, required): File path to read
  • start_line (number, optional): Starting line number (1-indexed)
  • end_line (number, optional): Ending line number (1-indexed)

FileWriteTool

Write content to a file (creates new or overwrites existing).

Parameters:

  • path (string, required): File path to write to
  • content (string, required): Content to write

FileEditTool

Edit a file by replacing specific text (find and replace).

Parameters:

  • path (string, required): File path to edit
  • find (string, required): Text to find
  • replace (string, required): Replacement text

FileIOTool

Unified file operations: read, write, append, delete, copy, move, exists, size.

Parameters:

  • operation (string, required): Operation type
  • path (string, optional): File path for operations
  • src_path (string, optional): Source path for copy/move
  • dst_path (string, optional): Destination path for copy/move
  • content (string, optional): Content for write/append
  • recursive (boolean, optional): Allow recursive directory deletion (default: false for safety)

FileListTool

List directory contents with detailed metadata.

Parameters:

  • path (string, optional): Directory path to list
  • show_hidden (boolean, optional): Show hidden files
  • recursive (boolean, optional): List recursively
  • max_depth (number, optional): Maximum recursion depth

Web & API Tools

WebScraperTool

Fetch and extract content from web URLs.

Parameters:

  • url (string, required): URL to scrape
  • extract_text (boolean, optional): Extract readable text from HTML
  • timeout_seconds (number, optional): Request timeout

HttpRequestTool

Make HTTP requests with various methods.

Parameters:

  • method (string, required): HTTP method (GET, POST, PUT, DELETE, etc.)
  • url (string, required): Request URL
  • headers (object, optional): Request headers
  • body (string, optional): Request body
  • timeout_seconds (number, optional): Request timeout

JsonParserTool

Parse, validate, format, and manipulate JSON data.

Operations:

  • parse - Parse and validate JSON
  • stringify - Format JSON with optional indentation
  • get_value - Extract values by JSON path
  • set_value - Modify JSON values
  • validate - Check JSON validity

System & Utility Tools

ShellCommandTool

Execute shell commands safely with security restrictions.

Parameters:

  • command (string, required): Shell command to execute
  • timeout_seconds (number, optional): Command timeout

SystemInfoTool

Retrieve system information (OS, CPU, memory, disk, network).

Parameters:

  • category (string, optional): Info category (all, os, cpu, memory, disk, network)

TimestampTool

Work with timestamps and date/time operations.

Operations:

  • now - Current time
  • format - Format timestamps
  • parse - Parse timestamp strings
  • add/subtract - Time arithmetic
  • diff - Time difference calculation

TextProcessorTool

Process and manipulate text with various operations.

Operations:

  • search - Regex-based text search
  • replace - Find and replace with regex
  • split/join - Text splitting and joining
  • count - Character, word, and line counts
  • uppercase/lowercase - Case conversion
  • trim - Whitespace removal
  • lines/words - Text formatting

Data Storage Tools

MemoryDBTool

In-memory key-value database for caching data during conversations.

Operations:

  • set - Store key-value pairs
  • get - Retrieve values
  • delete - Remove entries
  • list - Show all stored data
  • clear - Remove all data
  • exists - Check key existence

QdrantRAGTool

RAG (Retrieval-Augmented Generation) tool with Qdrant vector database.

Operations:

  • add_document - Store and embed documents
  • search - Semantic search
  • delete - Remove documents
  • clear - Clear collection

Creating Custom Tools

Implement the Tool trait to create custom tools:

use async_trait::async_trait;
use helios_engine::{Agent, Config, Tool, ToolParameter, ToolResult};
use serde_json::Value;
use std::collections::HashMap;

struct WeatherTool;

#[async_trait]
impl Tool for WeatherTool {
    fn name(&self) -> &str {
        "get_weather"
    }

    fn description(&self) -> &str {
        "Get the current weather for a location"
    }

    fn parameters(&self) -> HashMap<String, ToolParameter> {
        let mut params = HashMap::new();
        params.insert(
            "location".to_string(),
            ToolParameter {
                param_type: "string".to_string(),
                description: "City name".to_string(),
                required: Some(true),
            },
        );
        params
    }

    async fn execute(&self, args: Value) -> helios_engine::Result<ToolResult> {
        let location = args["location"].as_str().unwrap_or("Unknown");

        // Your weather API logic here
        let weather = format!("Weather in {}: Sunny, 72Β°F", location);

        Ok(ToolResult::success(weather))
    }
}

// Use your custom tool
#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let config = Config::from_file("config.toml")?;

    let mut agent = Agent::builder("WeatherAgent")
        .config(config)
        .tool(Box::new(WeatherTool))
        .build()
        .await?;

    let response = agent.chat("What's the weather in Tokyo?").await?;
    println!("{}", response);

    Ok(())
}

API Documentation

Core Types

Agent

The main agent struct that manages conversation and tool execution.

Methods:

  • builder(name) - Create a new agent builder
  • chat(message) - Send a message and get a response
  • register_tool(tool) - Add a tool to the agent
  • clear_history() - Clear conversation history
  • set_system_prompt(prompt) - Set the system prompt
  • set_max_iterations(max) - Set maximum tool call iterations
  • set_memory(key, value) - Set a memory value for the agent
  • get_memory(key) - Get a memory value
  • remove_memory(key) - Remove a memory value
  • clear_memory() - Clear all agent memory (preserves session metadata)
  • get_session_summary() - Get a summary of the current session
  • increment_counter(key) - Increment a counter in memory
  • increment_tasks_completed() - Increment the tasks_completed counter
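
A minimal sketch using several of the methods above on an already-built agent (method names follow the list above; exact signatures are assumptions):

use helios_engine::{Agent, CalculatorTool, Config};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let config = Config::from_file("config.toml")?;
    let mut agent = Agent::builder("Demo").config(config).build().await?;

    // Adjust the agent after construction.
    agent.set_system_prompt("You are a terse assistant.");
    agent.register_tool(Box::new(CalculatorTool));
    agent.set_max_iterations(3);

    let reply = agent.chat("What is 2 + 2?").await?;
    println!("{}", reply);

    // Reset the conversation when starting a new topic.
    agent.clear_history();
    Ok(())
}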

Config

Configuration management for LLM settings.

Methods:

  • from_file(path) - Load config from TOML file
  • default() - Create default configuration
  • save(path) - Save config to file
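
For example, a minimal sketch of loading, tweaking, and persisting a configuration (the llm.temperature field path is an assumption mirroring the [llm] TOML section shown earlier):

use helios_engine::Config;

fn main() -> helios_engine::Result<()> {
    // Load from TOML, falling back to the built-in defaults if the file is missing.
    let mut config = Config::from_file("config.toml").unwrap_or_else(|_| Config::default());

    // Assumed field layout, mirroring the [llm] section of the TOML file.
    config.llm.temperature = 0.5;

    // Persist the updated settings back to disk.
    config.save("config.toml")?;

    Ok(())
}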

LLMClient

Client for interacting with LLM providers (remote or local).

Methods:

  • new(provider_type) - Create client with LLMProviderType (Remote or Local)
  • chat(messages, tools) - Send messages and get response
  • chat_stream(messages, tools, callback) - Send messages and stream response with callback function
  • generate(request) - Low-level generation method

LLMProviderType

Enumeration for different LLM provider types.

Variants:

  • Remote(LLMConfig) - For remote API providers (OpenAI, Azure, etc.)
  • Local(LocalConfig) - For local llama.cpp models

ToolRegistry

Manages and executes tools.

Methods:

  • new() - Create empty registry
  • register(tool) - Register a new tool
  • execute(name, args) - Execute a tool by name
  • get_definitions() - Get all tool definitions
  • list_tools() - List registered tool names
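
A short sketch of standalone registry usage. The tool name "calculator" and the argument shape are assumptions (actual names come from each tool's name() method), and register is assumed to accept a boxed tool as the agent builder does:

use helios_engine::{CalculatorTool, ToolRegistry};
use serde_json::json;

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let mut registry = ToolRegistry::new();
    registry.register(Box::new(CalculatorTool));

    println!("Registered tools: {:?}", registry.list_tools());

    // Execute a tool by the name it reports via `name()`.
    let result = registry
        .execute("calculator", json!({ "expression": "15 * 7" }))
        .await?;
    println!("Result: {:?}", result); // assumes ToolResult derives Debug

    Ok(())
}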

ChatSession

Manages conversation history and session metadata.

Methods:

  • new() - Create new session
  • with_system_prompt(prompt) - Set system prompt
  • add_message(message) - Add message to history
  • add_user_message(content) - Add a user message
  • add_assistant_message(content) - Add an assistant message
  • get_messages() - Get all messages
  • clear() - Clear all messages
  • set_metadata(key, value) - Set session metadata
  • get_metadata(key) - Get session metadata
  • remove_metadata(key) - Remove session metadata
  • get_summary() - Get a summary of the session
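
A small sketch of direct session usage (assumes with_system_prompt is chainable, builder-style, and that ChatMessage derives Debug):

use helios_engine::ChatSession;

fn main() {
    let mut session = ChatSession::new().with_system_prompt("You are concise.");

    session.add_user_message("Hello!");
    session.add_assistant_message("Hi! How can I help?");

    // Inspect the accumulated history.
    for message in session.get_messages() {
        println!("{:?}", message);
    }

    println!("{}", session.get_summary());
}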

Built-in Tools

For detailed documentation of all 16+ built-in tools including usage examples, see the Built-in Tools section above.

Legacy Tool Documentation

CalculatorTool

Performs basic arithmetic operations.

Parameters:

  • expression (string, required): Mathematical expression to evaluate

Example:

agent.tool(Box::new(CalculatorTool));

EchoTool

Echoes back a message.

Parameters:

  • message (string, required): Message to echo

Example:

agent.tool(Box::new(EchoTool));

FileSearchTool

Search for files by name pattern or search for content within files.

Parameters:

  • path (string, optional): Directory path to search in (default: current directory)
  • pattern (string, optional): File name pattern with wildcards (e.g., *.rs)
  • content (string, optional): Text content to search for within files
  • max_results (number, optional): Maximum number of results (default: 50)

Example:

agent.tool(Box::new(FileSearchTool));

FileReadTool

Read the contents of a file with optional line range selection.

Parameters:

  • path (string, required): File path to read
  • start_line (number, optional): Starting line number (1-indexed)
  • end_line (number, optional): Ending line number (1-indexed)

Example:

agent.tool(Box::new(FileReadTool));

FileWriteTool

Write content to a file (creates new or overwrites existing).

Parameters:

  • path (string, required): File path to write to
  • content (string, required): Content to write

Example:

agent.tool(Box::new(FileWriteTool));

FileEditTool

Edit a file by replacing specific text (find and replace).

Parameters:

  • path (string, required): File path to edit
  • find (string, required): Text to find
  • replace (string, required): Replacement text

Example:

agent.tool(Box::new(FileEditTool));

MemoryDBTool

In-memory key-value database for caching data during conversations.

Parameters:

  • operation (string, required): Operation to perform: set, get, delete, list, clear, exists
  • key (string, optional): Key for set, get, delete, exists operations
  • value (string, optional): Value for set operation

Supported Operations:

  • set - Store a key-value pair
  • get - Retrieve a value by key
  • delete - Remove a key-value pair
  • list - List all stored items
  • clear - Clear all data
  • exists - Check if a key exists

Example:

agent.tool(Box::new(MemoryDBTool::new()));

Usage in conversation:

// Agent can now cache data
agent.chat("Store my name as 'Alice' in the database").await?;
agent.chat("What's my name?").await?; // Agent retrieves from DB

QdrantRAGTool

RAG (Retrieval-Augmented Generation) tool with Qdrant vector database for semantic search and document retrieval.

Parameters:

  • operation (string, required): Operation: add_document, search, delete, clear
  • text (string, optional): Document text or search query
  • doc_id (string, optional): Document ID for delete operation
  • limit (number, optional): Number of search results (default: 5)
  • metadata (object, optional): Additional metadata for documents

Supported Operations:

  • add_document - Embed and store a document
  • search - Semantic search with vector similarity
  • delete - Remove a document by ID
  • clear - Clear all documents from collection

Example:

let rag_tool = QdrantRAGTool::new(
    "http://localhost:6333",                    // Qdrant URL
    "my_collection",                             // Collection name
    "https://api.openai.com/v1/embeddings",     // Embedding API
    std::env::var("OPENAI_API_KEY").unwrap(),   // API key
);

agent.tool(Box::new(rag_tool));

Prerequisites:

  • Qdrant running: docker run -p 6333:6333 qdrant/qdrant
  • OpenAI API key for embeddings

WebScraperTool

Scrape web content from URLs with automatic text extraction and cleaning.

Parameters:

  • url (string, required): URL to scrape
  • max_length (number, optional): Maximum content length (default: 10000)

Example:

agent.tool(Box::new(WebScraperTool));

JsonParserTool

Parse, validate, stringify, and extract values from JSON data.

Parameters:

  • operation (string, required): Operation: parse, stringify, get_value, validate
  • json (string, optional): JSON string for parse/stringify operations
  • path (string, optional): JSON path for get_value operation (e.g., "$.key" or "$.array[0]")

Supported Operations:

  • parse - Parse and validate JSON string
  • stringify - Convert JSON to formatted string
  • get_value - Extract value using JSON path
  • validate - Validate JSON structure

Example:

agent.tool(Box::new(JsonParserTool));

TimestampTool

Work with timestamps, perform date/time operations and formatting.

Parameters:

  • operation (string, required): Operation: now, format, add, diff
  • timestamp (number, optional): Unix timestamp for format/add operations
  • format (string, optional): Date format string (default: "%Y-%m-%d %H:%M:%S")
  • amount (number, optional): Time amount to add/subtract
  • unit (string, optional): Time unit: seconds, minutes, hours, days, weeks

Supported Operations:

  • now - Get current timestamp
  • format - Format timestamp to string
  • add - Add time to timestamp
  • diff - Calculate difference between timestamps

Example:

agent.tool(Box::new(TimestampTool));

ShellCommandTool

Execute shell commands safely with timeout and output capture.

Parameters:

  • command (string, required): Shell command to execute
  • timeout (number, optional): Timeout in seconds (default: 30)

Example:

agent.tool(Box::new(ShellCommandTool));

HttpRequestTool

Make HTTP requests with full support for methods, headers, and body.

Parameters:

  • method (string, required): HTTP method: GET, POST, PUT, DELETE, etc.
  • url (string, required): Request URL
  • headers (object, optional): HTTP headers as key-value pairs
  • body (string, optional): Request body for POST/PUT requests

Example:

agent.tool(Box::new(HttpRequestTool));

FileListTool

List directory contents with filtering and detailed information.

Parameters:

  • path (string, optional): Directory path (default: current directory)
  • pattern (string, optional): File name pattern with wildcards
  • recursive (boolean, optional): Include subdirectories (default: false)
  • max_results (number, optional): Maximum number of results (default: 100)

Example:

agent.tool(Box::new(FileListTool));

SystemInfoTool

Retrieve system information including CPU, memory, disk, and OS details.

Parameters:

  • None required

Example:

agent.tool(Box::new(SystemInfoTool));

TextProcessorTool

Process and analyze text with various operations like counting, trimming, and searching.

Parameters:

  • operation (string, required): Operation: count, trim, uppercase, lowercase, replace, search
  • text (string, required): Input text
  • find (string, optional): Text to find (for replace/search operations)
  • replace (string, optional): Replacement text (for replace operation)
  • case_sensitive (boolean, optional): Case sensitivity for search (default: true)

Supported Operations:

  • count - Count characters, words, lines
  • trim - Remove whitespace
  • uppercase/lowercase - Change case
  • replace - Find and replace text
  • search - Search for text patterns

Example:

agent.tool(Box::new(TextProcessorTool));

FileIOTool

Perform file I/O operations including reading, writing, copying, and moving files.

Parameters:

  • operation (string, required): Operation: read, write, copy, move, delete, exists
  • path (string, required): File path
  • content (string, optional): Content for write operation
  • destination (string, optional): Destination path for copy/move operations

Supported Operations:

  • read - Read file content
  • write - Write content to file
  • copy - Copy file to new location
  • move - Move file to new location
  • delete - Delete file
  • exists - Check if file exists

Example:

agent.tool(Box::new(FileIOTool));

Project Structure

helios/
β”œβ”€β”€ Cargo.toml              # Project configuration
β”œβ”€β”€ README.md               # This file
β”œβ”€β”€ config.example.toml     # Example configuration
β”œβ”€β”€ .gitignore             # Git ignore rules
β”‚
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ lib.rs             # Library entry point
β”‚   β”œβ”€β”€ main.rs            # Binary entry point (interactive demo)
β”‚   β”œβ”€β”€ agent.rs           # Agent implementation
β”‚   β”œβ”€β”€ llm.rs             # LLM client and provider
β”‚   β”œβ”€β”€ tools.rs           # Tool system and built-in tools
β”‚   β”œβ”€β”€ chat.rs            # Chat message and session types
β”‚   β”œβ”€β”€ config.rs          # Configuration management
β”‚   β”œβ”€β”€ serve.rs           # HTTP server for OpenAI-compatible API
β”‚   └── error.rs           # Error types
β”‚
β”œβ”€β”€ docs/
β”‚   β”œβ”€β”€ API.md                    # API reference
β”‚   β”œβ”€β”€ QUICKSTART.md             # Quick start guide
β”‚   β”œβ”€β”€ TUTORIAL.md               # Detailed tutorial
β”‚   └── USING_AS_CRATE.md         # Using Helios as a library
β”‚
└── examples/
    β”œβ”€β”€ basic_chat.rs             # Simple chat example
    β”œβ”€β”€ agent_with_tools.rs       # Tool usage example
    β”œβ”€β”€ agent_with_file_tools.rs  # File management tools example
    β”œβ”€β”€ agent_with_memory_db.rs   # Memory database tool example
    β”œβ”€β”€ agent_with_rag.rs         # Agent with RAG capabilities
    β”œβ”€β”€ custom_tool.rs            # Custom tool implementation
    β”œβ”€β”€ multiple_agents.rs        # Multiple agents example
    β”œβ”€β”€ forest_of_agents.rs       # Multi-agent collaboration system
    β”œβ”€β”€ send_message_tool_demo.rs # SendMessageTool functionality demo
    β”œβ”€β”€ direct_llm_usage.rs       # Direct LLM client usage
    β”œβ”€β”€ streaming_chat.rs         # Streaming responses example
    β”œβ”€β”€ local_streaming.rs        # Local model streaming example
    β”œβ”€β”€ rag_in_memory.rs          # RAG with in-memory vector store
    β”œβ”€β”€ rag_advanced.rs           # RAG with Qdrant vector store
    β”œβ”€β”€ rag_qdrant_comparison.rs  # Compare RAG implementations
    β”œβ”€β”€ serve_agent.rs            # Serve agent via HTTP API
    β”œβ”€β”€ serve_with_custom_endpoints.rs # Serve with custom endpoints
    └── complete_demo.rs          # Complete feature demonstration

Module Overview

helios-engine/
β”‚
β”œβ”€β”€ agent           - Agent system and builder pattern
β”œβ”€β”€ chat            - Chat messages and session management
β”œβ”€β”€ config          - TOML configuration loading/saving
β”œβ”€β”€ error           - Error types and Result alias
β”œβ”€β”€ forest          - Forest of Agents: multi-agent collaboration system
β”œβ”€β”€ llm             - LLM client and API communication
β”œβ”€β”€ rag             - RAG (Retrieval-Augmented Generation) system
β”œβ”€β”€ rag_tool        - RAG tool implementation for agents
β”œβ”€β”€ serve           - HTTP server for OpenAI-compatible API
└── tools           - Tool registry and implementations

Examples

For comprehensive examples demonstrating various Helios Engine features, see the examples/ directory.

The examples include:

  • Basic chat and agent usage
  • Tool integration examples
  • File management demonstrations
  • API serving examples
  • Streaming and advanced features

See examples/README.md for detailed documentation and usage instructions.

Testing

Run tests:

cargo test

Run with logging:

RUST_LOG=debug cargo run

πŸ” Advanced Features

Custom LLM Providers

Implement the LLMProvider trait for custom backends:

use async_trait::async_trait;
use helios_engine::{LLMProvider, LLMRequest, LLMResponse};

struct CustomProvider;

#[async_trait]
impl LLMProvider for CustomProvider {
    async fn generate(&self, request: LLMRequest) -> helios_engine::Result<LLMResponse> {
        // Your custom implementation
        todo!()
    }
}

Tool Chaining

Agents automatically chain tool calls:

// The agent can use multiple tools in sequence
let response = agent.chat(
    "Calculate 10 * 5, then echo the result"
).await?;

Thinking Tags Display

Helios Engine automatically detects and displays thinking tags from LLM responses:

  • The CLI displays thinking tags with visual indicators: πŸ’­ [Thinking...]
  • Streaming responses show thinking tags in real-time
  • Supports both <thinking> and <think> tag formats
  • In offline mode, thinking tags are processed and removed from final output

Conversation Context

Maintain conversation history:

let mut agent = Agent::builder("Assistant")
    .config(config)
    .system_prompt("You are a helpful assistant.")
    .build()
    .await?;

let response1 = agent.chat("My name is Alice").await?;
let response2 = agent.chat("What is my name?").await?; // Agent remembers: "Alice"

println!("{response1}");

println!("{response2");

Clean Output Mode

In offline mode, Helios Engine suppresses all verbose debugging output from llama.cpp:

  • No model loading messages
  • No layer information display
  • No verbose internal operations
  • Clean, user-focused experience during local inference

Session Memory & Metadata

Track agent state and conversation metadata across interactions:

// Set agent memory (namespaced under "agent:" prefix)
agent.set_memory("user_preference", "concise");
agent.set_memory("tasks_completed", "0");

// Get memory values
if let Some(pref) = agent.get_memory("user_preference") {
    println!("User prefers: {}", pref);
}

// Increment counters
agent.increment_tasks_completed();
agent.increment_counter("files_processed");

// Get session summary
println!("{}", agent.get_session_summary());

// Clear only agent memory (preserves general session metadata)
agent.clear_memory();

Session metadata in ChatSession:

let mut session = ChatSession::new();

// Set general session metadata
session.set_metadata("session_id", "abc123");
session.set_metadata("start_time", chrono::Utc::now().to_rfc3339());

// Retrieve metadata
if let Some(id) = session.get_metadata("session_id") {
    println!("Session ID: {}", id);
}

// Get session summary
println!("{}", session.get_summary());

File Management Tools

Built-in tools for file operations:

use helios_engine::{Agent, Config, FileSearchTool, FileReadTool, FileWriteTool, FileEditTool};

let mut agent = Agent::builder("FileAgent")
    .config(config)
    .tool(Box::new(FileSearchTool))    // Search files by name or content
    .tool(Box::new(FileReadTool))      // Read file contents
    .tool(Box::new(FileWriteTool))     // Write/create files
    .tool(Box::new(FileEditTool))      // Find and replace in files
    .build()
    .await?;

// Agent can now search, read, write, and edit files
let response = agent.chat("Find all .rs files and show me main.rs").await?;
println!("{response}");

In-Memory Database Tool

Cache and retrieve data during agent conversations:

use helios_engine::{Agent, Config, MemoryDBTool};

let mut agent = Agent::builder("DataAgent")
    .config(config)
    .system_prompt("You can store and retrieve data using the memory_db tool.")
    .tool(Box::new(MemoryDBTool::new()))
    .build()
    .await?;

// Store data
agent.chat("Remember that my favorite color is blue").await?;

// Agent automatically uses the database to remember
agent.chat("What's my favorite color?").await?;
// Response: "Your favorite color is blue"

// Cache expensive computations
agent.chat("Calculate 12345 * 67890 and save it as 'result'").await?;
agent.chat("What was the result I asked you to calculate?").await?;

// List all cached data
let response = agent.chat("Show me everything you've stored").await?;
println!("{response}");

Shared Database Between Agents:

use std::sync::{Arc, Mutex};
use std::collections::HashMap;

// Create a shared database
let shared_db = Arc::new(Mutex::new(HashMap::new()));

// Multiple agents sharing the same database
let mut agent1 = Agent::builder("Agent1")
    .config(config.clone())
    .tool(Box::new(MemoryDBTool::with_shared_db(shared_db.clone())))
    .build()
    .await?;

let mut agent2 = Agent::builder("Agent2")
    .config(config)
    .tool(Box::new(MemoryDBTool::with_shared_db(shared_db.clone())))
    .build()
    .await?;

// Data stored by agent1 is accessible to agent2
agent1.chat("Store 'project_status' as 'in_progress'").await?;
agent2.chat("What is the project status?").await?; // Gets "in_progress"

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Development Setup

  1. Clone the repository:
git clone https://github.com/Ammar-Alnagar/Helios-Engine.git
cd Helios-Engine
  2. Build the project:
cargo build
  3. Run tests:
cargo test
  4. Format code:
cargo fmt
  5. Check for issues:
cargo clippy

License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with ❀️ in Rust

⚠️ ☠️ HERE BE DRAGONS ☠️ ⚠️


πŸ”₯ ABANDON ALL HOPE, YE WHO ENTER HERE πŸ”₯


Greetings, Foolish Mortal

What lies before you is not codeβ€”it is a CURSE.

A labyrinth of logic so twisted, so arcane, that it defies comprehension itself.


⚑ What Holds This Monstrosity Together

  • 🩹 Duct tape (metaphorical and spiritual)
  • πŸ™ Prayers whispered at 3 AM
  • πŸ“š Stack Overflow answers from 2009
  • 😱 Pure, unfiltered desperation
  • 😭 The tears of junior developers
  • 🎲 Luck (mostly luck)

πŸ“œ The Legend

Once, two beings understood this code:

⚑ God and Me ⚑

Now... I have forgotten.

Only God remains.

And I'm not sure He's still watching.