Crate orion_core

orion-core: Agent harness for local LLM inference.

Provides the agent loop, context pipeline, tool execution, and event system for building AI chat interfaces on top of local model backends (llama.cpp, MLX, etc.).

§Architecture

User prompt
  → Agent.prompt()
    → Context pipeline (prune pairs + template format)
      → LlmBackend.generate() (streaming tokens)
        → Tool execution loop (parse calls → run tools → feed results back)
          → AgentEvent stream → UI

The crate is backend-agnostic. Implement backend::LlmBackend for your inference engine and the agent handles the rest.
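
To make the flow in the diagram concrete, here is a minimal, standalone sketch of a generate → parse → run tool → feed back loop. It does not use orion_core: the Backend trait, the ToolFn alias, the `CALL <tool> <arg>` wire format, and run_loop are all hypothetical names invented for illustration, and the crate's real agent loop may be structured quite differently.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an inference backend (NOT orion_core's LlmBackend).
trait Backend {
    fn generate(&self, prompt: &str, on_token: &mut dyn FnMut(&str)) -> String;
}

/// Hypothetical tool signature: argument string in, result string out.
type ToolFn = fn(&str) -> String;

/// Assumed wire format for this sketch only: `CALL <tool> <arg>` requests a tool;
/// any other reply is treated as the final answer.
fn parse_call(reply: &str) -> Option<(String, String)> {
    let rest = reply.strip_prefix("CALL ")?;
    let (name, arg) = rest.split_once(' ')?;
    Some((name.to_string(), arg.to_string()))
}

/// Generate, run any requested tool, append the result to the prompt, repeat.
fn run_loop(
    backend: &dyn Backend,
    tools: &HashMap<&str, ToolFn>,
    user_prompt: &str,
    max_rounds: usize,
    on_token: &mut dyn FnMut(&str),
) -> String {
    let mut prompt = format!("user: {user_prompt}\n");
    let mut reply = String::new();
    for _ in 0..max_rounds {
        reply = backend.generate(&prompt, on_token);
        let (name, arg) = match parse_call(&reply) {
            Some(call) => call,
            None => return reply, // no tool requested: this is the final answer
        };
        let result = match tools.get(name.as_str()) {
            Some(tool) => tool(&arg),
            None => format!("error: unknown tool {name}"),
        };
        // Feed the tool result back so the next round can use it.
        prompt.push_str(&format!("assistant: {reply}\ntool: {result}\n"));
    }
    reply // round limit reached: return the last reply as-is
}

/// Scripted backend: asks for a tool first, then answers from the tool result.
struct Scripted;
impl Backend for Scripted {
    fn generate(&self, prompt: &str, on_token: &mut dyn FnMut(&str)) -> String {
        let reply = match prompt.lines().rev().find_map(|l| l.strip_prefix("tool: ")) {
            Some(result) => format!("Result: {result}"),
            None => "CALL upper hello".to_string(),
        };
        on_token(&reply);
        reply
    }
}

fn upper(arg: &str) -> String {
    arg.to_uppercase()
}

/// Runs the loop once and returns the final answer plus everything streamed.
fn demo() -> (String, Vec<String>) {
    let mut tools: HashMap<&str, ToolFn> = HashMap::new();
    tools.insert("upper", upper as ToolFn);
    let mut streamed = Vec::new();
    let answer = run_loop(&Scripted, &tools, "shout hello", 4, &mut |t: &str| {
        streamed.push(t.to_string())
    });
    (answer, streamed)
}
```

The round limit is a design choice of this sketch: it guards against a model that keeps requesting tools forever. Whether orion_core applies a similar cap is not stated here.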

§Example

Implement LlmBackend for your engine, then drive the agent. The mock backend below streams a canned reply so the whole loop runs end to end (a complete version lives in examples/mock_backend.rs):

use std::sync::Arc;
use std::sync::atomic::AtomicBool;
use orion_core::{
    Agent, AgentConfig, AgentEvent, CoreResult, GenerationResult,
    InferenceParams, LlmBackend, TokenCallback,
};
use tokio::sync::mpsc;

struct MockBackend;
impl LlmBackend for MockBackend {
    fn generate(
        &self,
        _prompt: &str,
        _params: &InferenceParams,
        _abort: Arc<AtomicBool>,
        mut on_token: TokenCallback,
    ) -> CoreResult<GenerationResult> {
        on_token("Hi!", 1, 10.0);
        Ok(GenerationResult {
            text: "Hi!".into(),
            tokens_generated: 1,
            prompt_tokens: 0,
            tokens_per_sec: 10.0,
            time_to_first_token_ms: 1.0,
            generation_time_ms: 1.0,
        })
    }
    fn tokenize_count(&self, text: &str) -> CoreResult<u32> {
        Ok(text.split_whitespace().count() as u32)
    }
    fn is_ready(&self) -> bool { true }
}

let rt = tokio::runtime::Runtime::new().unwrap();
rt.block_on(async {
    let mut agent = Agent::new(AgentConfig::default());
    let backend: Arc<dyn LlmBackend> = Arc::new(MockBackend);
    let (tx, mut rx) = mpsc::unbounded_channel::<AgentEvent>();

    // Consume events concurrently while the agent generates.
    let consumer = tokio::spawn(async move {
        let mut reply = String::new();
        while let Some(event) = rx.recv().await {
            if let AgentEvent::MessageDelta { delta, .. } = event {
                reply.push_str(&delta);
            }
        }
        reply
    });

    agent.prompt("Hello", backend, tx).await.unwrap();
    assert_eq!(consumer.await.unwrap(), "Hi!");
});
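
The mock above ignores its _abort argument. Assuming that flag is meant as a cooperative cancellation signal (the parameter name suggests so, but this page does not spell out the contract), a real backend would probably check it between tokens. The standalone sketch below shows that pattern with the standard library only; generate_with_abort and demo_abort are hypothetical helpers, not part of orion_core.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Emits `pieces` one at a time and stops early once `abort` is set.
/// Returns the text produced so far and the number of pieces emitted.
/// Hypothetical helper for illustration; not an orion_core function.
fn generate_with_abort(
    pieces: &[&str],
    abort: &Arc<AtomicBool>,
    mut on_token: impl FnMut(&str),
) -> (String, u32) {
    let mut text = String::new();
    let mut emitted = 0;
    for &piece in pieces {
        // Check the flag before each token so cancellation takes effect promptly.
        if abort.load(Ordering::Relaxed) {
            break;
        }
        on_token(piece);
        text.push_str(piece);
        emitted += 1;
    }
    (text, emitted)
}

/// Simulates a consumer that requests cancellation after seeing two tokens.
fn demo_abort() -> (String, u32) {
    let abort = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&abort);
    let mut seen = 0;
    generate_with_abort(&["Hel", "lo", ", ", "world"], &abort, |_piece| {
        seen += 1;
        if seen == 2 {
            flag.store(true, Ordering::Relaxed);
        }
    })
}
```

In this sketch the partial text is returned rather than discarded, so a caller could still display what was generated before the abort. That is an assumption about desirable behavior, not a description of what the crate does.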

Re-exports§

pub use agent::Agent;
pub use agent::AgentConfig;
pub use backend::GenerationResult;
pub use backend::InferenceParams;
pub use backend::LlmBackend;
pub use backend::TokenCallback;
pub use context::plan_prune;
pub use context::ContextConfig;
pub use context::PreparedContext;
pub use context::PrunePlan;
pub use context::PruneStrategy;
pub use error::CoreError;
pub use error::CoreResult;
pub use events::AgentEvent;
pub use messages::Message;
pub use messages::Role;
pub use messages::ToolCall;
pub use messages::ToolResult;
pub use template::detect_template;
pub use template::template_from_name;
pub use template::AlpacaTemplate;
pub use template::ChatMLTemplate;
pub use template::ChatTemplate;
pub use template::CommandRTemplate;
pub use template::DeepSeekTemplate;
pub use template::GemmaTemplate;
pub use template::Llama2Template;
pub use template::Llama3Template;
pub use template::MistralTemplate;
pub use template::Phi3Template;
pub use template::VicunaTemplate;
pub use tools::parse_tool_calls;
pub use tools::ParsedToolCall;
pub use tools::ToolSchema;
pub use tools::Tool;
pub use tools::ToolOutput;
pub use tools::ToolUpdateCallback;

Modules§

agent
The Agent orchestrator and its configuration.
backend
The LlmBackend trait and inference parameter/result types.
context
Context-window management: pruning, token budgeting, and prompt formatting.
error
Error and result types for the crate.
events
The AgentEvent stream emitted while the agent runs.
messages
Conversation data types: Message, Role, and tool call/result records.
template
Chat prompt templates for the supported model families.
tools
The Tool trait (behind the tools feature), tool schemas, and tool-call parsing.
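
As a rough illustration of the pruning and token budgeting that the context module describes, the standalone sketch below drops the oldest user/assistant pairs until the rest fits a budget. Everything in it is an assumption: Turn, count_tokens, and plan_drop are hypothetical names, whitespace splitting stands in for a real tokenizer (as in the mock backend's tokenize_count), and the crate's own plan_prune, PrunePlan, and PruneStrategy may work differently.

```rust
/// Hypothetical message pair: one user turn and the assistant reply to it.
#[derive(Debug, Clone, PartialEq)]
struct Turn {
    user: String,
    assistant: String,
}

fn turn(user: &str, assistant: &str) -> Turn {
    Turn { user: user.to_string(), assistant: assistant.to_string() }
}

/// Crude token estimate: whitespace-separated words. A real backend would
/// use its own tokenizer here.
fn count_tokens(text: &str) -> usize {
    text.split_whitespace().count()
}

fn turn_cost(t: &Turn) -> usize {
    count_tokens(&t.user) + count_tokens(&t.assistant)
}

/// Returns how many of the oldest turns to drop so the remainder fits `budget`.
/// The most recent turn is always kept, even if it alone exceeds the budget.
fn plan_drop(turns: &[Turn], budget: usize) -> usize {
    let mut total: usize = turns.iter().map(turn_cost).sum();
    let mut dropped = 0;
    while total > budget && dropped + 1 < turns.len() {
        total -= turn_cost(&turns[dropped]);
        dropped += 1;
    }
    dropped
}

/// Three turns costing 5, 3, and 3 tokens, oldest first.
fn sample_history() -> Vec<Turn> {
    vec![turn("a b c", "d e"), turn("f g", "h"), turn("i", "j k")]
}
```

Dropping whole pairs rather than single messages keeps each remaining assistant reply next to the prompt that produced it, which is one plausible reading of "prune pairs" in the architecture diagram.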