# 🚀 Helios Engine - LLM Agent Framework
Helios Engine is a powerful and flexible Rust framework for building LLM-powered agents with tool support, chat capabilities, and easy configuration management. Create intelligent agents that can interact with users, call tools, and maintain conversation context.
## ✨ Features
- 🤖 **Agent System**: Create multiple agents with different personalities and capabilities
- 🛠️ **Tool Registry**: Extensible tool system for adding custom functionality
- 💬 **Chat Management**: Built-in conversation history and session management
- ⚡ **Streaming Support**: Real-time response streaming with thinking tag detection
- ⚙️ **Configuration**: TOML-based configuration for LLM settings
- 🌐 **LLM Support**: Compatible with the OpenAI API, any OpenAI-compatible API, and local models via llama.cpp
- 🚀 **Async/Await**: Built on Tokio for high-performance async operations
- 🎯 **Type-Safe**: Leverages Rust's type system for safe and reliable code
- 📦 **Extensible**: Easy to add custom tools and extend functionality
- 🎭 **Thinking Tags**: Automatic detection and display of the model's reasoning process
- 🔌 **Offline Mode**: Run local models without an internet connection
- 🔇 **Clean Output**: Suppresses verbose debug output in offline mode for a clean user experience
## 📋 Table of Contents
- Installation
- Quick Start
- Configuration
- Local Inference Setup
- Architecture
- Usage Examples
- Creating Custom Tools
- API Documentation
- Project Structure
- Examples
- Contributing
- License
## 🔧 Installation
Helios Engine can be used both as a command-line tool and as a library crate in your Rust projects.
### As a CLI Tool (Recommended for Quick Start)
Install globally using Cargo (once published):

```bash
cargo install helios-engine
```
Then use anywhere (the subcommand and flag names below are illustrative; run the binary with `--help` for the actual CLI):

```bash
# Initialize configuration
helios-engine init

# Start interactive chat
helios-engine chat

# Ask a quick question
helios-engine ask "What is Rust?"

# Get help
helios-engine --help

# 🆕 NEW: Use offline mode with local models (no internet required)
helios-engine chat --offline

# Use online mode (default - uses remote APIs)
helios-engine chat --online

# Auto mode (uses local if configured, otherwise remote)
helios-engine chat --auto
```
### As a Library Crate
Add `helios-engine` to your `Cargo.toml`:

```toml
[dependencies]
helios-engine = "0.1.7"
tokio = { version = "1.35", features = ["full"] }
```
Or use a local path during development:

```toml
[dependencies]
helios-engine = { path = "../helios" }
tokio = { version = "1.35", features = ["full"] }
```
### Build from Source

```bash
git clone <repository-url>   # replace with the actual repository URL
cd helios
cargo build --release

# Install locally
cargo install --path .
```
## 🚀 Quick Start
### Using as a Library Crate
The simplest way to use Helios Engine is to call LLM models directly.
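A minimal sketch, assuming `LLMConfig` field names that mirror the configuration file, an `LLMClient::new` constructor, a `ChatMessage::user` helper, and a public `content` field; check the crate docs for the exact signatures:

```rust
use helios_engine::{ChatMessage, LLMClient, LLMConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Field names mirror the [llm] section of config.toml (assumed).
    let config = LLMConfig {
        model: "gpt-3.5-turbo".to_string(),
        base_url: "https://api.openai.com/v1".to_string(),
        api_key: "your-api-key-here".to_string(),
        temperature: 0.7,
        max_tokens: 2048,
    };

    // One-shot call with no tools attached.
    let client = LLMClient::new(config);
    let response = client.chat(&[ChatMessage::user("Hello!")], &[]).await?;
    println!("{}", response.content);
    Ok(())
}
```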
📖 For detailed examples of using Helios Engine as a crate, see the [Using as a Crate Guide](docs/USING_AS_CRATE.md).
### Using Offline Mode with Local Models
Run models locally without an internet connection.
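A sketch of the same flow against a local model, assuming `LocalConfig` fields that mirror the `[local]` config section and a hypothetical `LLMClient::local` constructor:

```rust
use helios_engine::{ChatMessage, LLMClient, LocalConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Point at a GGUF model on HuggingFace (field names assumed).
    let config = LocalConfig {
        huggingface_repo: "unsloth/Qwen3-0.6B-GGUF".to_string(),
        model_file: "Qwen3-0.6B-Q4_K_M.gguf".to_string(),
        temperature: 0.7,
        max_tokens: 2048,
    };

    // The model is downloaded on first use, then served from the local cache.
    let client = LLMClient::local(config)?;
    let response = client.chat(&[ChatMessage::user("Hello, offline world!")], &[]).await?;
    println!("{}", response.content);
    Ok(())
}
```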
Note: First run downloads the model. Subsequent runs use the cached model.
### Using with Agent System
For more advanced use cases with tools and persistent conversation:
#### 1. Configure Your LLM
Create a `config.toml` file (supports both remote and local):

```toml
[llm]
model = "gpt-3.5-turbo"
base_url = "https://api.openai.com/v1"
api_key = "your-api-key-here"
temperature = 0.7
max_tokens = 2048

# Optional: Add local configuration for offline mode
[local]
huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
model_file = "Qwen3-0.6B-Q4_K_M.gguf"
temperature = 0.7
max_tokens = 2048
```

See `config.example.toml` in the repository for a complete reference.
#### 2. Create Your First Agent
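A sketch of the agent flow; `Agent::builder` and `chat` are documented in the API section below, while the `config`, `system_prompt`, and `build` builder methods are assumptions:

```rust
use helios_engine::{Agent, Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the TOML configuration created in step 1.
    let config = Config::from_file("config.toml")?;

    // Build an agent with a name and a system prompt.
    let mut agent = Agent::builder("Assistant")
        .config(config)
        .system_prompt("You are a helpful assistant.")
        .build()?;

    // Chat; conversation history is kept inside the agent.
    let response = agent.chat("Hello! What can you do?").await?;
    println!("{}", response.content);
    Ok(())
}
```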
#### 3. Run the Interactive Demo

```bash
cargo run
```
## ⚙️ Configuration
Helios Engine uses TOML for configuration. You can configure either remote API access or local model inference.
### Remote API Configuration (Default)
```toml
[llm]
# The model name (e.g., gpt-3.5-turbo, gpt-4, claude-3, etc.)
model = "gpt-3.5-turbo"

# Base URL for the API (OpenAI or compatible)
base_url = "https://api.openai.com/v1"

# Your API key
api_key = "sk-..."

# Temperature for response generation (0.0 - 2.0)
temperature = 0.7

# Maximum tokens in response
max_tokens = 2048
```
### Local Model Configuration (Offline Mode)
```toml
[llm]
# Remote config still needed for auto mode fallback
model = "gpt-3.5-turbo"
base_url = "https://api.openai.com/v1"
api_key = "sk-..."
temperature = 0.7
max_tokens = 2048

# Local model configuration
[local]
# HuggingFace repository and model file
huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
model_file = "Qwen3-0.6B-Q4_K_M.gguf"

# Local model settings
temperature = 0.7
max_tokens = 2048
```
### Supported LLM Providers
Helios Engine supports both remote APIs and local model inference:
#### Remote APIs (Online Mode)
Helios Engine works with any OpenAI-compatible API:
- **OpenAI**: `https://api.openai.com/v1`
- **Azure OpenAI**: `https://your-resource.openai.azure.com/openai/deployments/your-deployment`
- **Local models (LM Studio)**: `http://localhost:1234/v1`
- **Ollama with OpenAI compatibility**: `http://localhost:11434/v1`
- Any other OpenAI-compatible API
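For example, pointing the `[llm]` section at a local Ollama server only changes the endpoint and model name (key names as reconstructed above; Ollama typically ignores the API key, so any placeholder value should do):

```toml
[llm]
model = "llama3.2"
base_url = "http://localhost:11434/v1"
api_key = "ollama"  # placeholder; Ollama does not check it
temperature = 0.7
max_tokens = 2048
```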
#### Local Models (Offline Mode)
Run models locally using llama.cpp without internet connection:
- **GGUF Models**: Compatible with all GGUF-format models from HuggingFace
- **Automatic Download**: Models are downloaded automatically from HuggingFace
- **GPU Acceleration**: Uses the GPU if available (via llama.cpp)
- **Clean Output**: Suppresses verbose debug output for a clean user experience
- **Popular Models**: Works with Qwen, Llama, Mistral, and other GGUF models
Supported Model Sources:
- HuggingFace Hub repositories
- Local GGUF files
- Automatic model caching
## 🏠 Local Inference Setup
Helios Engine supports running large language models locally using llama.cpp, providing privacy, offline capability, and no API costs.
### Prerequisites
- **HuggingFace Account**: Sign up at [huggingface.co](https://huggingface.co) (free)
- **HuggingFace CLI**: Install the CLI tool:

  ```bash
  pip install -U "huggingface_hub[cli]"
  ```
### Setting Up Local Models
1. **Find a GGUF Model**: Browse [HuggingFace models](https://huggingface.co/models?library=gguf) for compatible models.

2. **Update Configuration**: Add the local model config to your `config.toml`:

   ```toml
   [local]
   huggingface_repo = "unsloth/Qwen3-0.6B-GGUF"
   model_file = "Qwen3-0.6B-Q4_K_M.gguf"
   temperature = 0.7
   max_tokens = 2048
   ```

3. **Run in Offline Mode** (subcommand shown is illustrative):

   ```bash
   # First run downloads the model
   # Subsequent runs use the cached model
   helios-engine chat --offline
   ```
### Recommended Models
| Model | Size | Use Case | Repository |
|---|---|---|---|
| Qwen3-0.6B | ~400MB | Fast, good quality | unsloth/Qwen3-0.6B-GGUF |
| Llama-3.2-1B | ~700MB | Balanced performance | unsloth/Llama-3.2-1B-Instruct-GGUF |
| Mistral-7B | ~4GB | High quality | TheBloke/Mistral-7B-Instruct-v0.1-GGUF |
### Performance Tips
- **GPU Acceleration**: Models automatically use the GPU if available
- **Model Caching**: Downloaded models are cached locally (`~/.cache/huggingface`)
- **Memory Usage**: Larger models need more RAM/VRAM
- **First Run**: The initial model download may take time depending on your connection
### Clean Output Mode
In offline mode, Helios Engine suppresses all debugging output from llama.cpp, providing a clean chat experience without verbose loading messages or layer information.
## 🏗️ Architecture
### System Overview
```mermaid
graph TB
    User[User] -->|Input| Agent[Agent]
    Agent -->|Messages| LLM[LLM Client]
    Agent -->|Tool Calls| Registry[Tool Registry]
    Registry -->|Execute| Tools[Tools]
    Tools -->|Results| Agent
    LLM -->|Response| Agent
    Agent -->|Output| User
    Config[Config TOML] -->|Load| Agent

    style Agent fill:#4CAF50
    style LLM fill:#2196F3
    style Registry fill:#FF9800
    style Tools fill:#9C27B0
```
### Component Architecture
```mermaid
classDiagram
    class Agent {
        +name: String
        +llm_client: LLMClient
        +tool_registry: ToolRegistry
        +chat_session: ChatSession
        +chat(message) ChatMessage
        +register_tool(tool) void
        +clear_history() void
    }
    class LLMClient {
        +config: LLMConfig
        +chat(messages, tools) ChatMessage
        +generate(request) LLMResponse
    }
    class ToolRegistry {
        +tools: HashMap
        +register(tool) void
        +execute(name, args) ToolResult
        +get_definitions() Vec
    }
    class Tool {
        <<interface>>
        +name() String
        +description() String
        +parameters() HashMap
        +execute(args) ToolResult
    }
    class ChatSession {
        +messages: Vec
        +system_prompt: Option
        +add_message(msg) void
        +clear() void
    }
    class Config {
        +llm: LLMConfig
        +from_file(path) Config
        +save(path) void
    }

    Agent --> LLMClient
    Agent --> ToolRegistry
    Agent --> ChatSession
    Agent --> Config
    ToolRegistry --> Tool
    Tool <|-- CalculatorTool
    Tool <|-- EchoTool
    Tool <|-- CustomTool
```
### Agent Execution Flow
```mermaid
sequenceDiagram
    participant User
    participant Agent
    participant LLM
    participant ToolRegistry
    participant Tool

    User->>Agent: Send Message
    Agent->>Agent: Add to Chat History

    loop Until No Tool Calls
        Agent->>LLM: Send Messages + Tool Definitions
        LLM->>Agent: Response (with/without tool calls)

        alt Has Tool Calls
            Agent->>ToolRegistry: Execute Tool
            ToolRegistry->>Tool: Call with Arguments
            Tool->>ToolRegistry: Return Result
            ToolRegistry->>Agent: Tool Result
            Agent->>Agent: Add Tool Result to History
        else No Tool Calls
            Agent->>User: Return Final Response
        end
    end
```
### Tool Execution Pipeline
```mermaid
flowchart LR
    A[User Request] --> B{LLM Decision}
    B -->|Need Tool| C[Get Tool Definition]
    C --> D[Parse Arguments]
    D --> E[Execute Tool]
    E --> F[Format Result]
    F --> G[Add to Context]
    G --> B
    B -->|No Tool Needed| H[Return Response]
    H --> I[User]

    style B fill:#FFD700
    style E fill:#4CAF50
    style H fill:#2196F3
```
## 📚 Usage Examples
### Basic Chat
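A minimal conversational session, under the same builder assumptions as the Quick Start sketch:

```rust
use helios_engine::{Agent, Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = Config::from_file("config.toml")?;
    let mut agent = Agent::builder("ChatBot").config(config).build()?;

    // Two turns in the same session; history carries over between calls.
    println!("{}", agent.chat("Tell me a joke about Rust.").await?.content);
    println!("{}", agent.chat("Explain why that's funny.").await?.content);

    // Start fresh when the conversation is done.
    agent.clear_history();
    Ok(())
}
```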
### Agent with Built-in Tools
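A sketch that registers the built-in tools from the API section; the `Box::new` registration style is an assumption:

```rust
use helios_engine::{Agent, CalculatorTool, Config, EchoTool};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = Config::from_file("config.toml")?;
    let mut agent = Agent::builder("MathBot").config(config).build()?;

    // Register built-in tools; the agent decides when to call them.
    agent.register_tool(Box::new(CalculatorTool));
    agent.register_tool(Box::new(EchoTool));

    // The agent calls the calculator, then answers in natural language.
    let reply = agent.chat("What is 15 * 23?").await?;
    println!("{}", reply.content);
    Ok(())
}
```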
### Multiple Agents
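A sketch of two agents with different personalities sharing one configuration; it additionally assumes `Config` implements `Clone`:

```rust
use helios_engine::{Agent, Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = Config::from_file("config.toml")?;

    // Each agent keeps its own system prompt and conversation history.
    let mut poet = Agent::builder("Poet")
        .config(config.clone())
        .system_prompt("You answer only in rhyming couplets.")
        .build()?;
    let mut critic = Agent::builder("Critic")
        .config(config)
        .system_prompt("You critique writing tersely.")
        .build()?;

    let poem = poet.chat("Write two lines about the borrow checker.").await?;
    let review = critic.chat(&format!("Critique this poem: {}", poem.content)).await?;
    println!("{}\n---\n{}", poem.content, review.content);
    Ok(())
}
```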
## 🛠️ Creating Custom Tools
Implement the `Tool` trait to create custom tools.
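A sketch of a custom tool modeled on the `Tool` interface from the class diagram above; the exact method signatures, the `#[async_trait]` attribute, and the `ToolResult::success` constructor are assumptions:

```rust
use async_trait::async_trait;
use helios_engine::{Tool, ToolResult};
use serde_json::{json, Value};
use std::collections::HashMap;

/// A toy tool that reverses a string (hypothetical example).
struct ReverseTool;

#[async_trait]
impl Tool for ReverseTool {
    fn name(&self) -> String {
        "reverse".to_string()
    }

    fn description(&self) -> String {
        "Reverses the characters of the given text".to_string()
    }

    fn parameters(&self) -> HashMap<String, Value> {
        // A JSON-Schema-style description of each argument for the LLM.
        let mut params = HashMap::new();
        params.insert(
            "text".to_string(),
            json!({ "type": "string", "description": "Text to reverse" }),
        );
        params
    }

    async fn execute(&self, args: HashMap<String, Value>) -> ToolResult {
        let text = args.get("text").and_then(Value::as_str).unwrap_or_default();
        let reversed: String = text.chars().rev().collect();
        ToolResult::success(reversed)
    }
}
```

Then register it on an agent like any built-in tool: `agent.register_tool(Box::new(ReverseTool));`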
## 📖 API Documentation
### Core Types
#### Agent
The main agent struct that manages conversation and tool execution.
Methods:
- `builder(name)` - Create a new agent builder
- `chat(message)` - Send a message and get a response
- `register_tool(tool)` - Add a tool to the agent
- `clear_history()` - Clear conversation history
- `set_system_prompt(prompt)` - Set the system prompt
- `set_max_iterations(max)` - Set maximum tool call iterations
#### Config
Configuration management for LLM settings.
Methods:
- `from_file(path)` - Load config from a TOML file
- `default()` - Create default configuration
- `save(path)` - Save config to file
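A round-trip sketch of the three methods; the `llm.model` field path is an assumption:

```rust
use helios_engine::Config;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Start from defaults, adjust, persist, and reload.
    let mut config = Config::default();
    config.llm.model = "gpt-4".to_string();
    config.save("config.toml")?;

    let loaded = Config::from_file("config.toml")?;
    println!("{}", loaded.llm.model);
    Ok(())
}
```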
#### ToolRegistry
Manages and executes tools.
Methods:
- `new()` - Create an empty registry
- `register(tool)` - Register a new tool
- `execute(name, args)` - Execute a tool by name
- `get_definitions()` - Get all tool definitions
- `list_tools()` - List registered tool names
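A sketch of driving the registry directly, outside an agent; the argument type of `execute` and the `"calculator"` tool name are assumptions:

```rust
use helios_engine::{CalculatorTool, ToolRegistry};
use serde_json::{json, Value};
use std::collections::HashMap;

#[tokio::main]
async fn main() {
    let mut registry = ToolRegistry::new();
    registry.register(Box::new(CalculatorTool));
    println!("{:?}", registry.list_tools());

    // Execute a tool by name with JSON arguments.
    let mut args: HashMap<String, Value> = HashMap::new();
    args.insert("expression".to_string(), json!("2 + 2"));
    let _result = registry.execute("calculator", args).await;
}
```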
#### ChatSession
Manages conversation history.
Methods:
- `new()` - Create a new session
- `with_system_prompt(prompt)` - Set the system prompt
- `add_message(message)` - Add a message to the history
- `clear()` - Clear all messages
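A history-management sketch, assuming a `ChatMessage::user` constructor and a public `messages` field:

```rust
use helios_engine::{ChatMessage, ChatSession};

fn main() {
    let mut session = ChatSession::new()
        .with_system_prompt("You are a helpful assistant.");

    session.add_message(ChatMessage::user("Hello!"));
    println!("{} message(s)", session.messages.len());

    // Wipe the history but keep the session.
    session.clear();
}
```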
### Built-in Tools
#### CalculatorTool
Performs basic arithmetic operations.
Parameters:

- `expression` (string, required): Mathematical expression

Example (the prompt is illustrative; the agent decides when to invoke the tool):

```rust
agent.register_tool(Box::new(CalculatorTool));
let reply = agent.chat("What is 15 * 23?").await?;
```
#### EchoTool
Echoes back a message.
Parameters:

- `message` (string, required): Message to echo

Example (the prompt is illustrative):

```rust
agent.register_tool(Box::new(EchoTool));
let reply = agent.chat("Echo back: hello world").await?;
```
## 📁 Project Structure
```
helios/
├── Cargo.toml              # Project configuration
├── README.md               # This file
├── config.example.toml     # Example configuration
├── .gitignore              # Git ignore rules
│
├── src/
│   ├── lib.rs              # Library entry point
│   ├── main.rs             # Binary entry point (interactive demo)
│   ├── agent.rs            # Agent implementation
│   ├── llm.rs              # LLM client and provider
│   ├── tools.rs            # Tool system and built-in tools
│   ├── chat.rs             # Chat message and session types
│   ├── config.rs           # Configuration management
│   └── error.rs            # Error types
│
├── docs/
│   ├── API.md              # API reference
│   ├── QUICKSTART.md       # Quick start guide
│   ├── TUTORIAL.md         # Detailed tutorial
│   └── USING_AS_CRATE.md   # Using Helios as a library
│
└── examples/
    ├── basic_chat.rs       # Simple chat example
    ├── agent_with_tools.rs # Tool usage example
    ├── custom_tool.rs      # Custom tool implementation
    ├── multiple_agents.rs  # Multiple agents example
    └── direct_llm_usage.rs # Direct LLM client usage
```
### Module Overview
```
helios-engine/
│
├── 📦 agent   - Agent system and builder pattern
├── 🤖 llm     - LLM client and API communication
├── 🛠️ tools   - Tool registry and implementations
├── 💬 chat    - Chat messages and session management
├── ⚙️ config  - TOML configuration loading/saving
└── ❌ error   - Error types and Result alias
```
## 🎯 Examples
Run the included examples:
```bash
# Basic chat
cargo run --example basic_chat

# Agent with tools
cargo run --example agent_with_tools

# Custom tool
cargo run --example custom_tool

# Multiple agents
cargo run --example multiple_agents
```
## 🧪 Testing

Run tests:

```bash
cargo test
```

Run with logging:

```bash
RUST_LOG=debug cargo test
```
## 🚀 Advanced Features
### Custom LLM Providers
Implement the `LLMProvider` trait for custom backends.
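A sketch of a custom backend; the trait's method name and the `LLMRequest`, `LLMResponse::from_text`, and `Result` types are assumptions inferred from `LLMClient::generate(request) -> LLMResponse` in the API summary and the crate's `Result` alias:

```rust
use async_trait::async_trait;
use helios_engine::{LLMProvider, LLMRequest, LLMResponse};

/// A provider that always answers the same thing (hypothetical example).
struct StaticProvider;

#[async_trait]
impl LLMProvider for StaticProvider {
    // Mirrors LLMClient::generate(request) -> LLMResponse (assumed signature).
    async fn generate(&self, _request: LLMRequest) -> helios_engine::Result<LLMResponse> {
        // A real provider would call its backend here.
        Ok(LLMResponse::from_text("Hello from a custom backend!"))
    }
}
```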
### Tool Chaining
Agents automatically chain tool calls:
```rust
// The agent can use multiple tools in sequence (prompt is illustrative).
let response = agent.chat("Calculate 15 * 23, then echo the result").await?;
```
### Conversation Context
Maintain conversation history:
```rust
// Construction shown is illustrative; see the Quick Start section.
let mut agent = Agent::builder("Assistant").config(config).build()?;
agent.chat("Hi, my name is Alice.").await?;
agent.chat("What's my name?").await?; // Agent remembers: "Alice"
```
## 🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
### Development Setup

- Clone the repository:

  ```bash
  git clone <repository-url>   # replace with the actual repository URL
  cd helios
  ```

- Build the project:

  ```bash
  cargo build
  ```

- Run tests:

  ```bash
  cargo test
  ```

- Format code:

  ```bash
  cargo fmt
  ```

- Check for issues:

  ```bash
  cargo clippy
  ```
## 📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ in Rust