# 🔥 Helios Engine - LLM Agent Framework
Helios Engine is a powerful and flexible Rust framework for building LLM-powered agents with tool support, streaming chat capabilities, and easy configuration management. Create intelligent agents that can interact with users, call tools, and maintain conversation context - with both online and offline local model support.
## Features
- Forest of Agents: Multi-agent collaboration system where agents can communicate, delegate tasks, and share context
- Agent System: Create multiple agents with different personalities and capabilities
- Tool Registry: Extensible tool system for adding custom functionality
- Chat Management: Built-in conversation history and session management
- Session Memory: Track agent state and metadata across conversations
- Extensive Tool Suite: 16+ built-in tools including web scraping, JSON parsing, timestamp operations, file I/O, shell commands, HTTP requests, system info, and text processing
- File Management Tools: Built-in tools for searching, reading, writing, editing, and listing files
- Web & API Tools: Web scraping, HTTP requests, and JSON manipulation capabilities
- System Integration: Shell command execution, system information retrieval, and timestamp operations
- Text Processing: Advanced text search, replace, formatting, and analysis tools
- RAG System: Retrieval-Augmented Generation with vector stores (InMemory and Qdrant)
- Streaming Support: True real-time response streaming for both remote and local models with immediate token delivery
- Local Model Support: Run local models offline using llama.cpp with HuggingFace integration (optional `local` feature)
- LLM Support: Compatible with OpenAI API, any OpenAI-compatible API, and local models
- HTTP Server & API: Expose OpenAI-compatible API endpoints with full parameter support (temperature, max_tokens, stop) for agents and LLM clients
- Async/Await: Built on Tokio for high-performance async operations
- Type-Safe: Leverages Rust's type system for safe and reliable code
- Extensible: Easy to add custom tools and extend functionality
- Thinking Tags: Automatic detection and display of model reasoning process
- Dual Mode Support: Auto, online (remote API), and offline (local) modes
- Clean Output: Suppresses verbose debugging in offline mode for clean user experience
- CLI & Library: Use as both a command-line tool and a Rust library crate
- Feature Flags: Optional `local` feature for offline model support; build only what you need!
## Table of Contents
- Installation
- Feature Flags
- Quick Start
- CLI Usage
- Configuration
- Local Inference Setup
- Architecture
- Built-in Tools
- Creating Custom Tools
- API Documentation
- Project Structure
- Examples
- Contributing
- License
## Installation
Helios Engine can be used both as a command-line tool and as a library crate in your Rust projects.
### As a CLI Tool (Recommended for Quick Start)

Install globally using Cargo (once published):

```bash
# Install without local model support (lighter, faster install)
cargo install helios-engine

# Install with local model support (enables offline mode with llama-cpp-2)
cargo install helios-engine --features local
```
Then use anywhere:

```bash
# NOTE: subcommand and flag names below are reconstructed from the docs in
# this README; run `helios-engine --help` for the exact CLI syntax.

# Initialize configuration
helios-engine init

# Start interactive chat (default command)
helios-engine
# or explicitly
helios-engine chat

# Ask a quick question
helios-engine ask "What is Rust?"

# Get help
helios-engine --help

# NEW: Use offline mode with local models (no internet required)
helios-engine chat --mode offline

# Use online mode (forces remote API usage)
helios-engine chat --mode online

# Auto mode (uses local if configured, otherwise remote)
helios-engine chat --mode auto

# Verbose logging for debugging
helios-engine chat --verbose

# Custom system prompt
helios-engine chat --system-prompt "You are a helpful coding assistant."

# One-off question with custom config
helios-engine ask "Explain ownership" --config ./config.toml

# NEW: Serve OpenAI-compatible API endpoints
helios-engine serve

# Serve on all interfaces
helios-engine serve --host 0.0.0.0
```
### As a Library Crate

Add Helios Engine to your `Cargo.toml`:

```toml
[dependencies]
# Without local model support (lighter dependency)
helios-engine = "0.3.3"
tokio = { version = "1.35", features = ["full"] }

# OR with local model support for offline inference:
# helios-engine = { version = "0.3.3", features = ["local"] }
```

Or use a local path during development:

```toml
[dependencies]
helios-engine = { path = "../helios" }
tokio = { version = "1.35", features = ["full"] }
```
### Build from Source

```bash
# Build without local model support
cargo build --release

# OR build with local model support
cargo build --release --features local

# Install locally (without local support)
cargo install --path .

# OR install with local model support
cargo install --path . --features local
```
## 🚩 Feature Flags
Helios Engine supports optional feature flags to control which dependencies are included in your build. This allows you to create lighter builds when you don't need certain functionality.
### Available Features

#### `local` - Local Model Support
Enables offline inference using local models via llama-cpp-2. When disabled, the engine only supports remote API calls, resulting in:
- Faster compilation times - No need to build llama-cpp-2 and its dependencies
- Smaller binary size - Excludes large native libraries
- Simpler dependencies - Reduces the dependency tree significantly
Enables:

- `LocalLLMProvider` - Run models locally using llama.cpp
- `LocalConfig` - Configuration for local model setup
- Offline mode (`--mode offline`) in the CLI
- HuggingFace model downloading and caching
When to use:

- ✅ Use `--features local` if you need offline inference or want to run models locally
- ❌ Skip it if you only use remote APIs (OpenAI, Azure, etc.) for faster builds
Example:

```bash
# Without local support (lightweight, remote API only)
cargo build --release

# With local support (includes llama-cpp-2 for offline inference)
cargo build --release --features local
```
In `Cargo.toml`:

```toml
# Remote API only
[dependencies]
helios-engine = "0.3.3"

# With local model support (instead):
# helios-engine = { version = "0.3.3", features = ["local"] }
```
## Quick Start

### Using as a Library Crate
The simplest way to use Helios Engine is to call LLM models directly:
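The following is a minimal sketch of direct client usage, built from the types documented in the API section (`Config`, `LLMClient`, `LLMProviderType`, `ChatSession`). The `config.llm` field and the `content` field on the response are assumptions, so treat this as illustrative rather than the crate's confirmed API:

```rust
use helios_engine::{ChatSession, Config, LLMClient, LLMProviderType};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load config.toml and build a client for the remote [llm] provider.
    // (`config.llm` is an assumed field name.)
    let config = Config::from_file("config.toml")?;
    let client = LLMClient::new(LLMProviderType::Remote(config.llm))?;

    // Build a short conversation and send it with no tools.
    let mut session = ChatSession::new();
    session.add_user_message("What is the capital of France?");
    let response = client.chat(session.get_messages(), None).await?;
    println!("{}", response.content); // `content` field is an assumption

    Ok(())
}
```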
For detailed examples of using Helios Engine as a crate, see Using as a Crate Guide
### Using Offline Mode with Local Models
Run models locally without internet connection:
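A sketch of local usage, assuming `LocalConfig` exposes fields matching the `[local]` config keys shown below (`huggingface_repo`, `model_file`) plus a `Default` impl; check the crate docs for the real field names. Requires building with `--features local`:

```rust
use helios_engine::{ChatSession, LLMClient, LLMProviderType, LocalConfig};

// Field names mirror the [local] config keys and are assumptions.
let local = LocalConfig {
    huggingface_repo: "unsloth/Qwen3-0.6B-GGUF".to_string(),
    model_file: "Qwen3-0.6B-Q4_K_M.gguf".to_string(),
    ..Default::default()
};

let client = LLMClient::new(LLMProviderType::Local(local))?;
let mut session = ChatSession::new();
session.add_user_message("Summarize the Rust ownership model in one sentence.");
let response = client.chat(session.get_messages(), None).await?;
```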
Note: First run downloads the model. Subsequent runs use the cached model.
### Using with Agent System

For more advanced use cases with tools and persistent conversation:

#### 1. Configure Your LLM

Create a `config.toml` file (supports both remote and local):
```toml
# (key names below are reconstructed; confirm against config.example.toml)
[llm]
model_name = "gpt-3.5-turbo"
base_url = "https://api.openai.com/v1"
api_key = "your-api-key-here"
temperature = 0.7
max_tokens = 2048

# Optional: Add local configuration for offline mode
[local]
huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
model_file = "Qwen3-0.6B-Q4_K_M.gguf"
temperature = 0.7
max_tokens = 2048
```
#### 2. Create Your First Agent
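A sketch using the builder API documented in the API section (`Agent::builder`, `.config`, `.system_prompt`, `.tool`, `.build`); the argument shapes (e.g. `Box::new(...)` for tools) are assumptions:

```rust
use helios_engine::{Agent, CalculatorTool, Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = Config::from_file("config.toml")?;

    // Build an agent with a system prompt and one tool.
    let mut agent = Agent::builder("assistant")
        .config(config)
        .system_prompt("You are a helpful assistant.")
        .tool(Box::new(CalculatorTool))
        .build()
        .await?;

    let response = agent.chat("What is 25 * 4?").await?;
    println!("{}", response);
    Ok(())
}
```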
#### 3. Run the Interactive Demo
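The demo binary lives at `src/main.rs`, so from a checkout of the repository a plain Cargo run should launch it (assumed invocation):

```bash
cargo run --release
```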
## Forest of Agents
Create a collaborative multi-agent system where agents can communicate, delegate tasks, and share context:
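The snippet below only sketches the setup using the documented `Agent` builder; the forest coordination API itself is not documented in this README, so the final step is left as a comment. See `examples/forest_of_agents.rs` for the real usage:

```rust
use helios_engine::{Agent, Config};

let config = Config::from_file("config.toml")?;

// Two specialized agents that will collaborate.
let researcher = Agent::builder("researcher")
    .config(config.clone()) // assumes Config: Clone
    .system_prompt("You research topics and report findings.")
    .build()
    .await?;

let writer = Agent::builder("writer")
    .config(config)
    .system_prompt("You turn research notes into polished prose.")
    .build()
    .await?;

// A forest (see the `forest` module) wires agents together so they can
// message each other, broadcast, and delegate tasks.
```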
Features:
- Multi-agent collaboration on complex tasks
- Inter-agent communication (direct messages and broadcasts)
- Task delegation between agents
- Shared context and memory
- Specialized agent roles working together
## RAG System
Use Retrieval-Augmented Generation to provide context-aware responses:
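A sketch wiring the Qdrant-backed RAG tool (documented under Built-in Tools) into an agent; the `QdrantRAGTool::new` arguments are assumptions — see `examples/rag_in_memory.rs` and `examples/rag_advanced.rs` for working setups:

```rust
use helios_engine::{Agent, Config, QdrantRAGTool};

// Constructor arguments (Qdrant URL, collection name) are assumptions.
let rag_tool = QdrantRAGTool::new("http://localhost:6333", "documents").await?;

let mut agent = Agent::builder("rag-assistant")
    .config(Config::from_file("config.toml")?)
    .tool(Box::new(rag_tool))
    .build()
    .await?;

// The agent can now add_document / search for context-aware answers.
let response = agent.chat("What do the stored docs say about pricing?").await?;
```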
Features:
- Vector-based semantic search for document retrieval
- Multiple vector store backends (InMemory, Qdrant)
- Automatic document chunking and embedding
- Context-aware responses with relevant information
- Easy integration with existing agents
## Serve API

Expose your agents and LLM configurations as fully OpenAI-compatible HTTP API endpoints with real-time streaming and parameter control.

### Serve an LLM Client (Direct API Access)
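A sketch, assuming a base `start_server` entry point that mirrors the documented `start_server_with_custom_endpoints` (the host/port argument shapes are assumptions):

```rust
use helios_engine::{Config, LLMClient, LLMProviderType};

let config = Config::from_file("config.toml")?;
let client = LLMClient::new(LLMProviderType::Remote(config.llm))?;

// Assumed signature; mirrors start_server_with_custom_endpoints below.
helios_engine::serve::start_server(client, "127.0.0.1", 8000).await?;
```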
### Serve an Agent with Tools
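Likewise for agents, assuming a `start_server_with_agent` counterpart to the documented `start_server_with_agent_and_custom_endpoints`:

```rust
use helios_engine::{Agent, CalculatorTool, Config};

let agent = Agent::builder("api-agent")
    .config(Config::from_file("config.toml")?)
    .tool(Box::new(CalculatorTool))
    .build()
    .await?;

// Assumed signature; the custom-endpoints variant is shown below.
helios_engine::serve::start_server_with_agent(agent, "127.0.0.1", 8000).await?;
```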
### API Endpoints

The server exposes OpenAI-compatible endpoints:

- `POST /v1/chat/completions` - Chat completions (with streaming support)
- `GET /v1/models` - List available models
- `GET /health` - Health check
### Custom Endpoints
You can define additional custom endpoints alongside the OpenAI-compatible API. Custom endpoints allow you to expose static JSON responses for monitoring, configuration, or integration purposes.
Create a custom endpoints configuration file (`custom_endpoints.toml`):

```toml
# (table and key names below are reconstructed; confirm against the serve docs)
[[endpoints]]
method = "GET"
path = "/api/version"
response = { version = "0.3.3", name = "Helios Engine" }
status = 200

[[endpoints]]
method = "GET"
path = "/api/status"
response = { status = "operational", uptime = "unknown" }
status = 200

[[endpoints]]
method = "POST"
path = "/api/echo"
response = { message = "Echo endpoint", note = "Static response" }
status = 200
```
Use custom endpoints programmatically:
```rust
use helios_engine::serve::{start_server_with_custom_endpoints, CustomEndpointsConfig};

// Build the endpoints config (construction details are assumptions; the
// original populated a CustomEndpointsConfig struct literal).
let custom_endpoints = CustomEndpointsConfig { ..Default::default() };

start_server_with_custom_endpoints(client, "127.0.0.1", 8000, custom_endpoints).await?;
```
Or serve an agent with custom endpoints:
```rust
// (argument shapes are assumptions)
start_server_with_agent_and_custom_endpoints(agent, "127.0.0.1", 8000, custom_endpoints).await?;
```
### Example API Usage
The API supports full OpenAI-compatible parameters for fine-grained control over generation:
```bash
# Basic non-streaming request
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]}'

# Advanced request with generation parameters
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Write a haiku"}], "temperature": 0.2, "max_tokens": 64, "stop": ["\n\n"]}'

# Real-time streaming request with parameters
curl -N http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}], "stream": true, "temperature": 0.7}'
```
Supported Parameters:

- `temperature` (0.0-2.0): Controls randomness (lower = more deterministic)
- `max_tokens`: Maximum tokens to generate
- `stop`: Array of strings that stop generation when encountered
- `stream`: Enable real-time token streaming for immediate responses
Note: When parameters are not specified, the server uses configuration defaults. Agents maintain conversation context across requests for natural multi-turn conversations.
## CLI Usage
Helios Engine provides a powerful command-line interface with multiple modes and options:
### Interactive Chat Mode

Start an interactive chat session:

```bash
# (flag names are reconstructed; run `helios-engine --help` for exact syntax)

# Default chat session
helios-engine chat

# With custom system prompt
helios-engine chat --system-prompt "You are a Rust expert."

# With custom max iterations for tool calls
helios-engine chat --max-iterations 10

# With verbose logging for debugging
helios-engine chat --verbose
```
### One-off Questions

Ask a single question without interactive mode:

```bash
# Ask a single question
helios-engine ask "What is the capital of France?"

# Ask with custom config file
helios-engine ask "Hello" --config /path/to/config.toml
```
### Configuration Management

Initialize and manage configuration:

```bash
# Create a new configuration file
helios-engine init

# Create config in custom location
helios-engine init --config /path/to/config.toml
```
### HTTP Server (Serve Command)

Serve OpenAI-compatible API endpoints:

```bash
# Start server with default settings (port 8000, localhost)
helios-engine serve

# Serve on custom port and host
helios-engine serve --host 127.0.0.1 --port 3000

# Serve on all interfaces (accessible from other machines)
helios-engine serve --host 0.0.0.0

# Serve with custom endpoints from a configuration file
helios-engine serve --custom-endpoints custom_endpoints.toml

# Serve with verbose logging
helios-engine serve --verbose
```
The serve command exposes the following endpoints:
- `POST /v1/chat/completions` - Chat completions with real-time streaming and full parameter support (temperature, max_tokens, stop)
- `GET /v1/models` - List available models
- `GET /health` - Health check endpoint
- Custom endpoints (when `--custom-endpoints` is specified)
### Mode Selection

Choose between different operation modes:

```bash
# Auto mode (uses local if configured, otherwise remote API)
helios-engine chat --mode auto

# Online mode (forces remote API usage)
helios-engine chat --mode online

# Offline mode (uses local models only)
helios-engine chat --mode offline
```
### Interactive Commands

During an interactive session, use these commands:

- `exit` or `quit` - Exit the chat session
- `clear` - Clear conversation history
- `history` - Show conversation history
- `help` - Show help message
## Configuration

Helios Engine uses TOML for configuration. You can configure either remote API access or local model inference; the `LLMProviderType` enum selects between the two at runtime.
### Remote API Configuration (Default)

```toml
[llm]
# The model name (e.g., gpt-3.5-turbo, gpt-4, claude-3, etc.)
model_name = "gpt-3.5-turbo"
# Base URL for the API (OpenAI or compatible)
base_url = "https://api.openai.com/v1"
# Your API key
api_key = "your-api-key-here"
# Temperature for response generation (0.0 - 2.0)
temperature = 0.7
# Maximum tokens in response
max_tokens = 2048
```
### Local Model Configuration (Offline Mode with llama.cpp)

```toml
[llm]
# Remote config still needed for auto mode fallback
model_name = "gpt-3.5-turbo"
base_url = "https://api.openai.com/v1"
api_key = "your-api-key-here"
temperature = 0.7
max_tokens = 2048

# Local model configuration for offline mode
[local]
# HuggingFace repository and model file
huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
model_file = "Qwen3-0.6B-Q4_K_M.gguf"
# Local model settings
temperature = 0.7
max_tokens = 2048
```
### Auto Mode Configuration (Remote + Local)

For maximum flexibility, configure both remote and local models to enable auto mode:

```toml
[llm]
model_name = "gpt-3.5-turbo"
base_url = "https://api.openai.com/v1"
api_key = "your-api-key-here"
temperature = 0.7
max_tokens = 2048

# Local model as fallback
[local]
huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
model_file = "Qwen3-0.6B-Q4_K_M.gguf"
temperature = 0.7
max_tokens = 2048
```
### Supported LLM Providers

Helios Engine supports both remote APIs and local model inference.

#### Remote APIs (Online Mode)

Helios Engine works with any OpenAI-compatible API:

- OpenAI: `https://api.openai.com/v1`
- Azure OpenAI: `https://your-resource.openai.azure.com/openai/deployments/your-deployment`
- Local models (LM Studio): `http://localhost:1234/v1`
- Ollama with OpenAI compatibility: `http://localhost:11434/v1`
- Any other OpenAI-compatible API
#### Local Models (Offline Mode)
Run models locally using llama.cpp without internet connection:
- GGUF Models: Compatible with all GGUF format models from HuggingFace
- Automatic Download: Models are downloaded automatically from HuggingFace
- GPU Acceleration: Uses GPU if available (via llama.cpp)
- Clean Output: Suppresses verbose debugging for clean user experience
- Popular Models: Works with Qwen, Llama, Mistral, and other GGUF models
Supported Model Sources:
- HuggingFace Hub repositories
- Local GGUF files
- Automatic model caching
## Local Inference Setup
Helios Engine supports running large language models locally using llama.cpp through the LLMProviderType system, providing privacy, offline capability, and no API costs.
### Prerequisites

- HuggingFace Account: Sign up at huggingface.co (free)
- HuggingFace CLI: Install the CLI tool:

```bash
pip install -U "huggingface_hub[cli]"
```
### Setting Up Local Models

1. Find a GGUF model: browse HuggingFace Models for compatible models.

2. Update configuration: add a local model section to your `config.toml`:

   ```toml
   [local]
   huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
   model_file = "Qwen3-0.6B-Q4_K_M.gguf"
   temperature = 0.7
   max_tokens = 2048
   ```

3. Run in offline mode:

   ```bash
   # First run downloads the model
   helios-engine chat --mode offline
   # Subsequent runs use the cached model
   ```
### Recommended Models
| Model | Size | Use Case | Repository |
|---|---|---|---|
| Qwen3-0.6B | ~400MB | Fast, good quality | unsloth/Qwen3-0.6B-GGUF |
| Llama-3.2-1B | ~700MB | Balanced performance | unsloth/Llama-3.2-1B-Instruct-GGUF |
| Mistral-7B | ~4GB | High quality | TheBloke/Mistral-7B-Instruct-v0.1-GGUF |
### Performance & Features
- GPU Acceleration: Models automatically use GPU if available via llama.cpp's n_gpu_layers parameter
- Model Caching: Downloaded models are cached locally (~/.cache/huggingface)
- Memory Usage: Larger models need more RAM/VRAM
- First Run: Initial model download may take time depending on connection
- Clean Output Mode: Suppresses verbose debugging from llama.cpp for clean user experience
### Streaming Support with Local Models
Local models now support real-time token-by-token streaming just like remote models! The LLMClient automatically handles streaming for both remote and local models through the same unified API, providing a consistent experience.
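A sketch of the unified streaming call, using the `chat_stream(messages, tools, callback)` method listed in the API section; the callback's exact argument type is an assumption:

```rust
// Stream tokens as they arrive; works for both remote and local providers.
let response = client
    .chat_stream(session.get_messages(), None, |token| {
        // The callback receives each new token chunk (type assumed to be &str).
        print!("{}", token);
    })
    .await?;
```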
## Architecture

For detailed architecture documentation including system design, component interactions, and execution flows, see docs/ARCHITECTURE.md.

### Quick System Overview
Helios Engine follows a modular architecture with clear separation of concerns:
- Agent: Orchestrates conversations and tool execution
- LLM Client: Handles communication with language models
- Tool Registry: Manages and executes tools
- Chat Session: Maintains conversation history
- Configuration: Manages settings and preferences
## Built-in Tools

Helios Engine includes 16+ built-in tools for common tasks. All tools follow the same pattern and can be easily added to agents.

### Core Tools

#### CalculatorTool

Performs basic arithmetic operations.

Parameters:

- `expression` (string, required): Mathematical expression to evaluate

Example:

```rust
// Registration via the documented register_tool method; the Box wrapper and
// unit-struct constructor are assumptions used throughout these examples.
agent.register_tool(Box::new(CalculatorTool));
```

#### EchoTool

Echoes back a message.

Parameters:

- `message` (string, required): Message to echo

Example:

```rust
agent.register_tool(Box::new(EchoTool));
```
### File Management Tools

#### FileSearchTool

Search for files by name pattern or content within files.

Parameters:

- `path` (string, optional): Directory path to search in (default: current directory)
- `pattern` (string, optional): File name pattern with wildcards (e.g., `*.rs`)
- `content` (string, optional): Text content to search for within files
- `max_results` (number, optional): Maximum number of results (default: 50)

#### FileReadTool

Read the contents of a file with optional line range selection.

Parameters:

- `path` (string, required): File path to read
- `start_line` (number, optional): Starting line number (1-indexed)
- `end_line` (number, optional): Ending line number (1-indexed)

#### FileWriteTool

Write content to a file (creates new or overwrites existing).

Parameters:

- `path` (string, required): File path to write to
- `content` (string, required): Content to write

#### FileEditTool

Edit a file by replacing specific text (find and replace).

Parameters:

- `path` (string, required): File path to edit
- `find` (string, required): Text to find
- `replace` (string, required): Replacement text

#### FileIOTool

Unified file operations: read, write, append, delete, copy, move, exists, size.

Parameters:

- `operation` (string, required): Operation type
- `path` (string, optional): File path for operations
- `src_path` (string, optional): Source path for copy/move
- `dst_path` (string, optional): Destination path for copy/move
- `content` (string, optional): Content for write/append
- `recursive` (boolean, optional): Allow recursive directory deletion (default: false for safety)

#### FileListTool

List directory contents with detailed metadata.

Parameters:

- `path` (string, optional): Directory path to list
- `show_hidden` (boolean, optional): Show hidden files
- `recursive` (boolean, optional): List recursively
- `max_depth` (number, optional): Maximum recursion depth
### Web & API Tools

#### WebScraperTool

Fetch and extract content from web URLs.

Parameters:

- `url` (string, required): URL to scrape
- `extract_text` (boolean, optional): Extract readable text from HTML
- `timeout_seconds` (number, optional): Request timeout

#### HttpRequestTool

Make HTTP requests with various methods.

Parameters:

- `method` (string, required): HTTP method (GET, POST, PUT, DELETE, etc.)
- `url` (string, required): Request URL
- `headers` (object, optional): Request headers
- `body` (string, optional): Request body
- `timeout_seconds` (number, optional): Request timeout

#### JsonParserTool

Parse, validate, format, and manipulate JSON data.

Operations:

- `parse` - Parse and validate JSON
- `stringify` - Format JSON with optional indentation
- `get_value` - Extract values by JSON path
- `set_value` - Modify JSON values
- `validate` - Check JSON validity
### System & Utility Tools

#### ShellCommandTool

Execute shell commands safely with security restrictions.

Parameters:

- `command` (string, required): Shell command to execute
- `timeout_seconds` (number, optional): Command timeout

#### SystemInfoTool

Retrieve system information (OS, CPU, memory, disk, network).

Parameters:

- `category` (string, optional): Info category (all, os, cpu, memory, disk, network)

#### TimestampTool

Work with timestamps and date/time operations.

Operations:

- `now` - Current time
- `format` - Format timestamps
- `parse` - Parse timestamp strings
- `add`/`subtract` - Time arithmetic
- `diff` - Time difference calculation

#### TextProcessorTool

Process and manipulate text with various operations.

Operations:

- `search` - Regex-based text search
- `replace` - Find and replace with regex
- `split`/`join` - Text splitting and joining
- `count` - Character, word, and line counts
- `uppercase`/`lowercase` - Case conversion
- `trim` - Whitespace removal
- `lines`/`words` - Text formatting
### Data Storage Tools

#### MemoryDBTool

In-memory key-value database for caching data during conversations.

Operations:

- `set` - Store key-value pairs
- `get` - Retrieve values
- `delete` - Remove entries
- `list` - Show all stored data
- `clear` - Remove all data
- `exists` - Check key existence

#### QdrantRAGTool

RAG (Retrieval-Augmented Generation) tool with Qdrant vector database.

Operations:

- `add_document` - Store and embed documents
- `search` - Semantic search
- `delete` - Remove documents
- `clear` - Clear collection
## Creating Custom Tools

Implement the `Tool` trait to create custom tools:
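The original example was lost to formatting, so the sketch below is illustrative: it uses a stand-in trait with name/description metadata plus an async `execute`, matching how the `ToolRegistry` is documented to consume tools, but the real `helios_engine::Tool` trait's method names and signatures may differ — see `examples/custom_tool.rs` for the canonical version:

```rust
use async_trait::async_trait;
use serde_json::{json, Value};
use std::collections::HashMap;

struct WeatherTool;

// Stand-in trait for illustration; the real helios_engine::Tool trait
// may name these methods differently.
#[async_trait]
trait ToolLike {
    fn name(&self) -> &str;
    fn description(&self) -> &str;
    async fn execute(&self, args: HashMap<String, Value>) -> Result<Value, String>;
}

#[async_trait]
impl ToolLike for WeatherTool {
    fn name(&self) -> &str {
        "get_weather"
    }

    fn description(&self) -> &str {
        "Get the current weather for a city"
    }

    async fn execute(&self, args: HashMap<String, Value>) -> Result<Value, String> {
        let city = args
            .get("city")
            .and_then(Value::as_str)
            .ok_or("missing required parameter: city")?;
        // A real tool would call a weather API here.
        Ok(json!({ "city": city, "forecast": "sunny", "temp_c": 21 }))
    }
}
```

Once implemented against the real trait, register the tool like any built-in, e.g. `agent.register_tool(Box::new(WeatherTool));`.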
## API Documentation

### Core Types

#### Agent

The main agent struct that manages conversation and tool execution.

Methods:

- `builder(name)` - Create a new agent builder
- `chat(message)` - Send a message and get a response
- `register_tool(tool)` - Add a tool to the agent
- `clear_history()` - Clear conversation history
- `set_system_prompt(prompt)` - Set the system prompt
- `set_max_iterations(max)` - Set maximum tool call iterations
- `set_memory(key, value)` - Set a memory value for the agent
- `get_memory(key)` - Get a memory value
- `remove_memory(key)` - Remove a memory value
- `clear_memory()` - Clear all agent memory (preserves session metadata)
- `get_session_summary()` - Get a summary of the current session
- `increment_counter(key)` - Increment a counter in memory
- `increment_tasks_completed()` - Increment the tasks_completed counter
#### Config

Configuration management for LLM settings.

Methods:

- `from_file(path)` - Load config from TOML file
- `default()` - Create default configuration
- `save(path)` - Save config to file
#### LLMClient

Client for interacting with LLM providers (remote or local).

Methods:

- `new(provider_type)` - Create client with LLMProviderType (Remote or Local)
- `chat(messages, tools)` - Send messages and get response
- `chat_stream(messages, tools, callback)` - Send messages and stream the response via a callback function
- `generate(request)` - Low-level generation method
#### LLMProviderType

Enumeration for different LLM provider types.

Variants:

- `Remote(LLMConfig)` - For remote API providers (OpenAI, Azure, etc.)
- `Local(LocalConfig)` - For local llama.cpp models
#### ToolRegistry

Manages and executes tools.

Methods:

- `new()` - Create empty registry
- `register(tool)` - Register a new tool
- `execute(name, args)` - Execute a tool by name
- `get_definitions()` - Get all tool definitions
- `list_tools()` - List registered tool names
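A small sketch of direct registry use built from the methods above; the registered tool name and the argument shape are assumptions:

```rust
use helios_engine::{CalculatorTool, ToolRegistry};
use serde_json::json;

let mut registry = ToolRegistry::new();
registry.register(Box::new(CalculatorTool));

// Tool name and argument schema are assumptions for illustration.
let result = registry
    .execute("calculator", json!({ "expression": "2 + 2" }))
    .await?;
```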
#### ChatSession

Manages conversation history and session metadata.

Methods:

- `new()` - Create new session
- `with_system_prompt(prompt)` - Set system prompt
- `add_message(message)` - Add message to history
- `add_user_message(content)` - Add a user message
- `add_assistant_message(content)` - Add an assistant message
- `get_messages()` - Get all messages
- `clear()` - Clear all messages
- `set_metadata(key, value)` - Set session metadata
- `get_metadata(key)` - Get session metadata
- `remove_metadata(key)` - Remove session metadata
- `get_summary()` - Get a summary of the session
### Built-in Tools

For detailed documentation of all 16+ built-in tools including usage examples, see the Built-in Tools section above.

### Legacy Tool Documentation

#### CalculatorTool

Performs basic arithmetic operations.

Parameters:

- `expression` (string, required): Mathematical expression to evaluate

Example:

```rust
agent.register_tool(Box::new(CalculatorTool));
```

#### EchoTool

Echoes back a message.

Parameters:

- `message` (string, required): Message to echo

Example:

```rust
agent.register_tool(Box::new(EchoTool));
```
#### FileSearchTool

Search for files by name pattern or search for content within files.

Parameters:

- `path` (string, optional): Directory path to search in (default: current directory)
- `pattern` (string, optional): File name pattern with wildcards (e.g., `*.rs`)
- `content` (string, optional): Text content to search for within files
- `max_results` (number, optional): Maximum number of results (default: 50)

Example:

```rust
agent.register_tool(Box::new(FileSearchTool));
```

#### FileReadTool

Read the contents of a file with optional line range selection.

Parameters:

- `path` (string, required): File path to read
- `start_line` (number, optional): Starting line number (1-indexed)
- `end_line` (number, optional): Ending line number (1-indexed)

Example:

```rust
agent.register_tool(Box::new(FileReadTool));
```

#### FileWriteTool

Write content to a file (creates new or overwrites existing).

Parameters:

- `path` (string, required): File path to write to
- `content` (string, required): Content to write

Example:

```rust
agent.register_tool(Box::new(FileWriteTool));
```

#### FileEditTool

Edit a file by replacing specific text (find and replace).

Parameters:

- `path` (string, required): File path to edit
- `find` (string, required): Text to find
- `replace` (string, required): Replacement text

Example:

```rust
agent.register_tool(Box::new(FileEditTool));
```
#### MemoryDBTool

In-memory key-value database for caching data during conversations.

Parameters:

- `operation` (string, required): Operation to perform: `set`, `get`, `delete`, `list`, `clear`, `exists`
- `key` (string, optional): Key for set, get, delete, exists operations
- `value` (string, optional): Value for set operation

Supported Operations:

- `set` - Store a key-value pair
- `get` - Retrieve a value by key
- `delete` - Remove a key-value pair
- `list` - List all stored items
- `clear` - Clear all data
- `exists` - Check if a key exists

Example:

```rust
agent.register_tool(Box::new(MemoryDBTool::new())); // constructor is an assumption
```

Usage in conversation:

```rust
// Agent can now cache data
agent.chat("Remember that my favorite color is blue").await?;
agent.chat("What is my favorite color?").await?; // Agent retrieves from DB
```
#### QdrantRAGTool

RAG (Retrieval-Augmented Generation) tool with Qdrant vector database for semantic search and document retrieval.

Parameters:

- `operation` (string, required): Operation: `add_document`, `search`, `delete`, `clear`
- `text` (string, optional): Document text or search query
- `doc_id` (string, optional): Document ID for delete operation
- `limit` (number, optional): Number of search results (default: 5)
- `metadata` (object, optional): Additional metadata for documents

Supported Operations:

- `add_document` - Embed and store a document
- `search` - Semantic search with vector similarity
- `delete` - Remove a document by ID
- `clear` - Clear all documents from collection

Example:

```rust
// Constructor arguments are assumptions; see examples/rag_advanced.rs.
let rag_tool = QdrantRAGTool::new("http://localhost:6333", "documents").await?;
agent.register_tool(Box::new(rag_tool));
```

Prerequisites:

- Qdrant running: `docker run -p 6333:6333 qdrant/qdrant`
- OpenAI API key for embeddings
#### WebScraperTool

Scrape web content from URLs with automatic text extraction and cleaning.

Parameters:

- `url` (string, required): URL to scrape
- `max_length` (number, optional): Maximum content length (default: 10000)

Example:

```rust
agent.register_tool(Box::new(WebScraperTool));
```

#### JsonParserTool

Parse, validate, stringify, and extract values from JSON data.

Parameters:

- `operation` (string, required): Operation: `parse`, `stringify`, `get_value`, `validate`
- `json` (string, optional): JSON string for parse/stringify operations
- `path` (string, optional): JSON path for get_value operation (e.g., "$.key" or "$.array[0]")

Supported Operations:

- `parse` - Parse and validate JSON string
- `stringify` - Convert JSON to formatted string
- `get_value` - Extract value using JSON path
- `validate` - Validate JSON structure

Example:

```rust
agent.register_tool(Box::new(JsonParserTool));
```
#### TimestampTool

Work with timestamps, perform date/time operations and formatting.

Parameters:

- `operation` (string, required): Operation: `now`, `format`, `add`, `diff`
- `timestamp` (number, optional): Unix timestamp for format/add operations
- `format` (string, optional): Date format string (default: "%Y-%m-%d %H:%M:%S")
- `amount` (number, optional): Time amount to add/subtract
- `unit` (string, optional): Time unit: `seconds`, `minutes`, `hours`, `days`, `weeks`

Supported Operations:

- `now` - Get current timestamp
- `format` - Format timestamp to string
- `add` - Add time to timestamp
- `diff` - Calculate difference between timestamps

Example:

```rust
agent.register_tool(Box::new(TimestampTool));
```

#### ShellCommandTool

Execute shell commands safely with timeout and output capture.

Parameters:

- `command` (string, required): Shell command to execute
- `timeout` (number, optional): Timeout in seconds (default: 30)

Example:

```rust
agent.register_tool(Box::new(ShellCommandTool));
```
#### HttpRequestTool

Make HTTP requests with full support for methods, headers, and body.

Parameters:

- `method` (string, required): HTTP method: `GET`, `POST`, `PUT`, `DELETE`, etc.
- `url` (string, required): Request URL
- `headers` (object, optional): HTTP headers as key-value pairs
- `body` (string, optional): Request body for POST/PUT requests

Example:

```rust
agent.register_tool(Box::new(HttpRequestTool));
```

#### FileListTool

List directory contents with filtering and detailed information.

Parameters:

- `path` (string, optional): Directory path (default: current directory)
- `pattern` (string, optional): File name pattern with wildcards
- `recursive` (boolean, optional): Include subdirectories (default: false)
- `max_results` (number, optional): Maximum number of results (default: 100)

Example:

```rust
agent.register_tool(Box::new(FileListTool));
```
#### SystemInfoTool

Retrieve system information including CPU, memory, disk, and OS details.

Parameters:

- None required

Example:

```rust
agent.register_tool(Box::new(SystemInfoTool));
```

#### TextProcessorTool

Process and analyze text with various operations like counting, trimming, and searching.

Parameters:

- `operation` (string, required): Operation: `count`, `trim`, `uppercase`, `lowercase`, `replace`, `search`
- `text` (string, required): Input text
- `find` (string, optional): Text to find (for replace/search operations)
- `replace` (string, optional): Replacement text (for replace operation)
- `case_sensitive` (boolean, optional): Case sensitivity for search (default: true)

Supported Operations:

- `count` - Count characters, words, lines
- `trim` - Remove whitespace
- `uppercase`/`lowercase` - Change case
- `replace` - Find and replace text
- `search` - Search for text patterns

Example:

```rust
agent.register_tool(Box::new(TextProcessorTool));
```
#### FileIOTool

Perform file I/O operations including reading, writing, copying, and moving files.

Parameters:

- `operation` (string, required): Operation: `read`, `write`, `copy`, `move`, `delete`, `exists`
- `path` (string, required): File path
- `content` (string, optional): Content for write operation
- `destination` (string, optional): Destination path for copy/move operations

Supported Operations:

- `read` - Read file content
- `write` - Write content to file
- `copy` - Copy file to new location
- `move` - Move file to new location
- `delete` - Delete file
- `exists` - Check if file exists

Example:

```rust
agent.register_tool(Box::new(FileIOTool));
```
## Project Structure

```text
helios/
├── Cargo.toml              # Project configuration
├── README.md               # This file
├── config.example.toml     # Example configuration
├── .gitignore              # Git ignore rules
│
├── src/
│   ├── lib.rs              # Library entry point
│   ├── main.rs             # Binary entry point (interactive demo)
│   ├── agent.rs            # Agent implementation
│   ├── llm.rs              # LLM client and provider
│   ├── tools.rs            # Tool system and built-in tools
│   ├── chat.rs             # Chat message and session types
│   ├── config.rs           # Configuration management
│   ├── serve.rs            # HTTP server for OpenAI-compatible API
│   └── error.rs            # Error types
│
├── docs/
│   ├── API.md              # API reference
│   ├── QUICKSTART.md       # Quick start guide
│   ├── TUTORIAL.md         # Detailed tutorial
│   └── USING_AS_CRATE.md   # Using Helios as a library
│
└── examples/
    ├── basic_chat.rs                   # Simple chat example
    ├── agent_with_tools.rs             # Tool usage example
    ├── agent_with_file_tools.rs        # File management tools example
    ├── agent_with_memory_db.rs         # Memory database tool example
    ├── agent_with_rag.rs               # Agent with RAG capabilities
    ├── custom_tool.rs                  # Custom tool implementation
    ├── multiple_agents.rs              # Multiple agents example
    ├── forest_of_agents.rs             # Multi-agent collaboration system
    ├── send_message_tool_demo.rs       # SendMessageTool functionality demo
    ├── direct_llm_usage.rs             # Direct LLM client usage
    ├── streaming_chat.rs               # Streaming responses example
    ├── local_streaming.rs              # Local model streaming example
    ├── rag_in_memory.rs                # RAG with in-memory vector store
    ├── rag_advanced.rs                 # RAG with Qdrant vector store
    ├── rag_qdrant_comparison.rs        # Compare RAG implementations
    ├── serve_agent.rs                  # Serve agent via HTTP API
    ├── serve_with_custom_endpoints.rs  # Serve with custom endpoints
    └── complete_demo.rs                # Complete feature demonstration
```
### Module Overview

```text
helios-engine/
│
├── agent     - Agent system and builder pattern
├── chat      - Chat messages and session management
├── config    - TOML configuration loading/saving
├── error     - Error types and Result alias
├── forest    - Forest of Agents multi-agent collaboration system
├── llm       - LLM client and API communication
├── rag       - RAG (Retrieval-Augmented Generation) system
├── rag_tool  - RAG tool implementation for agents
├── serve     - HTTP server for OpenAI-compatible API
└── tools     - Tool registry and implementations
```
## Examples
For comprehensive examples demonstrating various Helios Engine features, see the examples/ directory.
The examples include:
- Basic chat and agent usage
- Tool integration examples
- File management demonstrations
- API serving examples
- Streaming and advanced features
See examples/README.md for detailed documentation and usage instructions.
## Testing

Run tests:

```bash
cargo test
```

Run with logging:

```bash
RUST_LOG=debug cargo test
```
## Advanced Features

### Custom LLM Providers

Implement the `LLMProvider` trait for custom backends:
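The original snippet was lost to formatting; the sketch below uses a stand-in trait to show the general shape of a custom backend, since the real `LLMProvider` trait's method names are not documented in this README:

```rust
use async_trait::async_trait;

// Stand-in trait for illustration; the real helios_engine::LLMProvider
// trait's methods may differ.
#[async_trait]
trait LLMProviderLike {
    async fn generate(&self, prompt: &str) -> Result<String, String>;
}

struct MyCustomBackend;

#[async_trait]
impl LLMProviderLike for MyCustomBackend {
    async fn generate(&self, prompt: &str) -> Result<String, String> {
        // Call your own model or service here.
        Ok(format!("echo: {prompt}"))
    }
}
```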
### Tool Chaining

Agents automatically chain tool calls:

```rust
// The agent can use multiple tools in sequence (prompt is illustrative)
let response = agent
    .chat("Read data.txt, sum the numbers in it, and save the result to sum.txt")
    .await?;
```
### Thinking Tags Display

Helios Engine automatically detects and displays thinking tags from LLM responses:

- The CLI displays thinking tags with visual indicators: `[Thinking...]`
- Streaming responses show thinking tags in real-time
- Supports both `<thinking>` and `<think>` tag formats
- In offline mode, thinking tags are processed and removed from the final output
### Conversation Context

Maintain conversation history:

```rust
// Prompts are illustrative; the "Alice" exchange mirrors the original example.
let mut agent = Agent::builder("assistant")
    .config(config)
    .system_prompt("You are a helpful assistant.")
    .build()
    .await?;

let response1 = agent.chat("Hi! My name is Alice.").await?;
let response2 = agent.chat("What is my name?").await?; // Agent remembers: "Alice"

println!("{}", response1);
println!("{}", response2);
```
### Clean Output Mode
In offline mode, Helios Engine suppresses all verbose debugging output from llama.cpp:
- No model loading messages
- No layer information display
- No verbose internal operations
- Clean, user-focused experience during local inference
### Session Memory & Metadata

Track agent state and conversation metadata across interactions:

```rust
use serde_json::json;

// (value type is assumed to be serde_json::Value; methods are documented above)

// Set agent memory (namespaced under an "agent:" prefix)
agent.set_memory("user_name", json!("Alice"));
agent.set_memory("project", json!("helios-demo"));

// Get memory values
if let Some(name) = agent.get_memory("user_name") {
    println!("User: {}", name);
}

// Increment counters
agent.increment_tasks_completed();
agent.increment_counter("api_calls");

// Get session summary
println!("{}", agent.get_session_summary());

// Clear only agent memory (preserves general session metadata)
agent.clear_memory();
```
Session metadata in `ChatSession`:

```rust
use helios_engine::ChatSession;
use serde_json::json;

let mut session = ChatSession::new();

// Set general session metadata (value type is an assumption)
session.set_metadata("topic", json!("rust-help"));
session.set_metadata("priority", json!("high"));

// Retrieve metadata
if let Some(topic) = session.get_metadata("topic") {
    println!("Topic: {}", topic);
}

// Get session summary
println!("{}", session.get_summary());
```
### File Management Tools

Built-in tools for file operations:

```rust
use helios_engine::{Agent, Config, FileEditTool, FileReadTool, FileSearchTool, FileWriteTool};

// (Box-wrapped unit structs are assumptions; the prompt is illustrative)
let mut agent = Agent::builder("file-assistant")
    .config(Config::from_file("config.toml")?)
    .tool(Box::new(FileSearchTool)) // Search files by name or content
    .tool(Box::new(FileReadTool))   // Read file contents
    .tool(Box::new(FileWriteTool))  // Write/create files
    .tool(Box::new(FileEditTool))   // Find and replace in files
    .build()
    .await?;

// Agent can now search, read, write, and edit files
let response = agent.chat("Find every TODO comment under src/ and list the files").await?;
println!("{}", response);
```
### In-Memory Database Tool

Cache and retrieve data during agent conversations:

```rust
use helios_engine::{Agent, Config, MemoryDBTool};

// (MemoryDBTool::new and the prompts are illustrative)
let mut agent = Agent::builder("memory-assistant")
    .config(Config::from_file("config.toml")?)
    .system_prompt("Use the memory database to store and recall facts.")
    .tool(Box::new(MemoryDBTool::new()))
    .build()
    .await?;

// Store data
agent.chat("Remember that my favorite color is blue").await?;

// Agent automatically uses the database to remember
agent.chat("What is my favorite color?").await?;
// Response: "Your favorite color is blue"

// Cache expensive computations
agent.chat("Compute 1234 * 5678 and cache it as 'product'").await?;
agent.chat("What value did we cache as 'product'?").await?;

// List all cached data
let response = agent.chat("List everything stored in the database").await?;
println!("{}", response);
```
Shared Database Between Agents:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use helios_engine::{Agent, Config, MemoryDBTool};

// Create a shared database (the Arc<Mutex<HashMap>> shape and the
// with_store constructor are assumptions; see examples/agent_with_memory_db.rs)
let shared_db = Arc::new(Mutex::new(HashMap::new()));

// Multiple agents sharing the same database
let mut agent1 = Agent::builder("agent1")
    .config(config.clone())
    .tool(Box::new(MemoryDBTool::with_store(shared_db.clone())))
    .build()
    .await?;

let mut agent2 = Agent::builder("agent2")
    .config(config)
    .tool(Box::new(MemoryDBTool::with_store(shared_db)))
    .build()
    .await?;

// Data stored by agent1 is accessible to agent2
agent1.chat("Set task_status to in_progress").await?;
agent2.chat("What is task_status?").await?; // Gets "in_progress"
```
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

### Development Setup

- Clone the repository
- Build the project: `cargo build`
- Run tests: `cargo test`
- Format code: `cargo fmt`
- Check for issues: `cargo clippy`
## License
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ in Rust
⚠️ ⚠️ HERE BE DRAGONS ⚠️ ⚠️

🔥 ABANDON ALL HOPE, YE WHO ENTER HERE 🔥
Greetings, Foolish Mortal
What lies before you is not code: it is a CURSE.
A labyrinth of logic so twisted, so arcane, that it defies comprehension itself.
⚡ What Holds This Monstrosity Together

- 🩹 Duct tape (metaphorical and spiritual)
- 🙏 Prayers whispered at 3 AM
- Stack Overflow answers from 2009
- 😱 Pure, unfiltered desperation
- The tears of junior developers
- 🎲 Luck (mostly luck)
The Legend
Once, two beings understood this code:
⚡ God and Me ⚡
Now... I have forgotten.
Only God remains.
And I'm not sure He's still watching.