
# Custom LLM Provider

Openheim's `LlmClient` trait abstracts over any chat-completion backend. Implement it to connect to a provider that isn't built in — a self-hosted model, a private API, a research endpoint, or a mock for testing.

---

## The `LlmClient` trait

```rust
#[async_trait]
pub trait LlmClient: Send + Sync {
    /// Send a chat request and return the first choice from the provider.
    async fn send(&self, messages: &[Message], tools: &[Tool]) -> Result<Choice>;
}
```

`messages` is the full conversation history (system, user, assistant, and tool-result turns). `tools` is the list of currently registered tools in JSON-schema format. Return the model's next `Choice`: either a text response or a set of tool calls.

---

## Key types

```rust
// Input
pub struct Message {
    pub role: Role,                        // User | Assistant | System | Tool
    pub content: Option<String>,
    pub tool_calls: Option<Vec<ToolCall>>, // set by the model when it calls tools
    pub tool_call_id: Option<String>,      // set on Role::Tool messages
    pub tool_name: Option<String>,
}

pub struct Tool {
    pub tool_type: String,                 // always "function"
    pub function: FunctionDefinition,
}

pub struct FunctionDefinition {
    pub name: String,
    pub description: String,
    pub parameters: serde_json::Value,    // JSON Schema object
}

// Output
pub struct Choice {
    pub message: Message,
    pub finish_reason: Option<String>,    // "stop" | "tool_calls" | "length" | …
}
```

The agent loop treats `finish_reason == "stop"` as the signal to end the conversation. Any other finish reason with no tool calls also ends the loop (with a warning). If `message.tool_calls` is set, the loop executes them and continues.
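
The decision described above can be sketched as a small pure function. This is not openheim's actual loop code: the `NextStep` enum and `next_step` function are hypothetical names, the types are trimmed stand-ins for `openheim::core::models`, and two details are assumptions that the text does not settle, namely that a non-empty tool-call list takes priority over `finish_reason`, and that a missing `finish_reason` is handled like any other non-`"stop"` value.

```rust
// Stand-in types for illustration only; the real `Message` and `Choice`
// live in `openheim::core::models` and carry more fields.
struct ToolCall; // hypothetical placeholder, fields omitted

struct Message {
    tool_calls: Option<Vec<ToolCall>>,
}

struct Choice {
    message: Message,
    finish_reason: Option<String>,
}

/// What the loop does after inspecting one `Choice` (hypothetical enum).
#[derive(Debug, PartialEq)]
enum NextStep {
    /// Execute the requested tools, then call the model again.
    RunTools,
    /// Normal end of the conversation.
    Finish,
    /// End the loop and log a warning that names the unexpected reason.
    FinishWithWarning(String),
}

fn next_step(choice: &Choice) -> NextStep {
    // Assumption: a non-empty tool-call list wins over `finish_reason`.
    let has_tool_calls = choice
        .message
        .tool_calls
        .as_ref()
        .map_or(false, |calls| !calls.is_empty());
    if has_tool_calls {
        return NextStep::RunTools;
    }
    match choice.finish_reason.as_deref() {
        Some("stop") => NextStep::Finish,
        Some(other) => NextStep::FinishWithWarning(other.to_string()),
        // Assumption: a missing reason is treated as unexpected.
        None => NextStep::FinishWithWarning("<none>".to_string()),
    }
}
```

For a custom provider the practical consequence is that `finish_reason` and `tool_calls` should be filled in consistently: a response with `finish_reason: Some("length")` and no tool calls stops the loop just as `"stop"` does, only with a warning.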

---

## Step-by-step example

The following implements a provider that speaks a hypothetical OpenAI-compatible API with a custom base URL and auth scheme.

### 1. Define the client struct

```rust
use async_trait::async_trait;
use openheim::core::models::{Choice, Message, Role, Tool, ToolCall, FunctionCall};
use openheim::error::{Error, Result};
use openheim::llm::LlmClient;
use reqwest::Client;
use serde::{Deserialize, Serialize};

pub struct MyCustomProvider {
    client: Client,
    base_url: String,
    api_key: String,
    model: String,
}

impl MyCustomProvider {
    pub fn new(base_url: impl Into<String>, api_key: impl Into<String>, model: impl Into<String>) -> Self {
        Self {
            client: Client::new(),
            base_url: base_url.into(),
            api_key: api_key.into(),
            model: model.into(),
        }
    }
}
```

### 2. Define request/response shapes

Map openheim's types to what the remote API expects. Most chat-completion APIs follow the OpenAI schema closely. If yours does, use `OpenAiCompatibleClient` instead of writing this by hand.

```rust
#[derive(Serialize)]
struct ApiRequest {
    model: String,
    messages: Vec<ApiMessage>,
    #[serde(skip_serializing_if = "Vec::is_empty")]
    tools: Vec<serde_json::Value>,
}

#[derive(Serialize)]
struct ApiMessage {
    role: String,
    content: Option<String>,
}

#[derive(Deserialize)]
struct ApiResponse {
    choices: Vec<ApiChoice>,
}

#[derive(Deserialize)]
struct ApiChoice {
    message: ApiResponseMessage,
    finish_reason: Option<String>,
}

#[derive(Deserialize)]
struct ApiResponseMessage {
    content: Option<String>,
    // tool_calls: Option<Vec<…>>,  add if the API supports function calling
}
```

### 3. Implement `send`

```rust
#[async_trait]
impl LlmClient for MyCustomProvider {
    async fn send(&self, messages: &[Message], tools: &[Tool]) -> Result<Choice> {
        let api_messages: Vec<ApiMessage> = messages
            .iter()
            .map(|m| ApiMessage {
                role: match m.role {
                    Role::User => "user".into(),
                    Role::Assistant => "assistant".into(),
                    Role::System => "system".into(),
                    Role::Tool => "tool".into(),
                },
                content: m.content.clone(),
            })
            .collect();

        let api_tools: Vec<serde_json::Value> = tools
            .iter()
            .map(|t| serde_json::json!({
                "type": t.tool_type,
                "function": {
                    "name": t.function.name,
                    "description": t.function.description,
                    "parameters": t.function.parameters,
                }
            }))
            .collect();

        let body = ApiRequest {
            model: self.model.clone(),
            messages: api_messages,
            tools: api_tools,
        };

        let response = self.client
            .post(format!("{}/v1/chat/completions", self.base_url))
            .bearer_auth(&self.api_key)
            .json(&body)
            .send()
            .await
            .map_err(|e| Error::HttpError(e.to_string()))?;

        if !response.status().is_success() {
            let status = response.status();
            let text = response.text().await.unwrap_or_default();
            return Err(Error::ApiError(format!("HTTP {status}: {text}")));
        }

        let api_resp: ApiResponse = response
            .json()
            .await
            .map_err(|e| Error::ParseError(format!("failed to parse response: {e}")))?;

        let choice = api_resp
            .choices
            .into_iter()
            .next()
            .ok_or_else(|| Error::ApiError("empty choices array".into()))?;

        Ok(Choice {
            message: Message {
                role: Role::Assistant,
                content: choice.message.content,
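                // This sketch never parses tool calls, so the agent loop will
                // not run any tools. Map the API's tool calls here if it has them.
                tool_calls: None,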
                tool_call_id: None,
                tool_name: None,
            },
            finish_reason: choice.finish_reason,
        })
    }
}
```
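
Two details of the `send` implementation above are easy to trip over: a `base_url` configured with a trailing slash produces `//v1/chat/completions`, and an error body copied verbatim into `Error::ApiError` can be very large. The helpers below are one possible way to handle both. Their names (`chat_completions_url`, `api_error_message`) and the `max_chars` parameter are hypothetical and not part of openheim, and the status is taken as a plain `u16` to keep the sketch free of dependencies.

```rust
/// Join the base URL and the chat-completions path without producing `//`.
fn chat_completions_url(base_url: &str) -> String {
    format!("{}/v1/chat/completions", base_url.trim_end_matches('/'))
}

/// Build an error message that includes at most `max_chars` characters of
/// the response body. Counting `char`s avoids cutting a UTF-8 sequence.
fn api_error_message(status: u16, body: &str, max_chars: usize) -> String {
    let mut snippet: String = body.chars().take(max_chars).collect();
    if body.chars().count() > max_chars {
        snippet.push_str("...");
    }
    format!("HTTP {status}: {snippet}")
}
```

Inside `send`, these would stand in for the inline `format!` calls, for example `.post(chat_completions_url(&self.base_url))`.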

### 4. Wrap with `RetryClient` (optional but recommended)

`RetryClient` wraps any `LlmClient` and retries on transient errors (rate limits, 5xx, network timeouts) with exponential backoff:

```rust
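use openheim::llm::{LlmClient, RetryClient};
use std::sync::Arc;

let base_provider = MyCustomProvider::new(
    "https://api.myprovider.com",
    // Fail early with a readable message if the key is missing.
    std::env::var("MY_PROVIDER_KEY").expect("MY_PROVIDER_KEY must be set"),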
    "my-model-v1",
);

let llm: Arc<dyn LlmClient> = Arc::new(RetryClient::new(Arc::new(base_provider)));
```
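
To make "exponential backoff" concrete, here is a minimal sketch of how such a delay schedule and transient-error check are commonly computed. It does not show `RetryClient`'s internals: the function names, the base and maximum delays, and the exact set of status codes treated as transient are all assumptions for illustration.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`,
/// capped at `max`. Overflow saturates to `max` instead of panicking.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Assumption: rate limiting (429) and server errors (5xx) are worth
/// retrying; other client errors such as 401 are not.
fn is_transient(status: u16) -> bool {
    status == 429 || (500..=599).contains(&status)
}
```

With a 500 ms base and a 30 s cap, the schedule is 0.5 s, 1 s, 2 s, 4 s, and so on until it reaches the cap. Production retry logic usually adds random jitter on top, which is omitted here.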

### 5. Use with the agent loop

Pass the custom client directly to `run_agent_with_history`:

```rust
use openheim::core::agent::run_agent_with_history;
use openheim::core::models::Message;
use openheim::config::{AgentConfig, load_config};
use openheim::llm::RetryClient;
use openheim::tools::SystemToolExecutor;
use std::sync::Arc;

#[tokio::main]
async fn main() -> openheim::Result<()> {
    let llm: Arc<dyn openheim::llm::LlmClient> = Arc::new(
        RetryClient::new(Arc::new(MyCustomProvider::new(
            "https://api.myprovider.com",
            std::env::var("MY_PROVIDER_KEY").expect("MY_PROVIDER_KEY must be set"),
            "my-model-v1",
        )))
    );

    let mut executor = SystemToolExecutor::new();
    executor.register_builtins();
    let executor = Arc::new(executor);

    let app_config = load_config()?;
    let agent_config = app_config.resolve(None)?;

    let mut messages = vec![Message::user("Hello!".into())];

    let result = run_agent_with_history(
        llm,
        executor,
        &agent_config,
        &mut messages,
        None,
    )
    .await?;

    println!("{}", result.final_response);
    Ok(())
}
```

---

## Testing a custom provider

Use a mock `LlmClient` to test prompt logic without making real API calls. The agent's test suite in `src/core/agent.rs` shows the pattern:

```rust
use std::sync::{Arc, Mutex};
use async_trait::async_trait;
use openheim::core::models::{Choice, Message, Role, Tool};
use openheim::error::{Error, Result};
use openheim::llm::LlmClient;

struct MockLlm {
    responses: Mutex<Vec<Choice>>,
}

impl MockLlm {
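    fn with_responses(mut responses: Vec<Choice>) -> Arc<Self> {
        // `send` pops from the back of the vector, so reverse once here to
        // serve responses in the order the test listed them.
        responses.reverse();
        Arc::new(Self { responses: Mutex::new(responses) })
    }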
}

#[async_trait]
impl LlmClient for MockLlm {
    async fn send(&self, _messages: &[Message], _tools: &[Tool]) -> Result<Choice> {
        self.responses
            .lock()
            .unwrap()
            .pop()
            .ok_or_else(|| Error::ApiError("no more mock responses".into()))
    }
}
```
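
When a test scripts several responses, the order in which the mock serves them matters. The sketch below shows one way to make that order explicit with a `VecDeque`, independent of openheim's types. The `Script` struct and its `next_response` method are hypothetical names used only for illustration.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hands out scripted items in the order they were supplied.
struct Script<T> {
    remaining: Mutex<VecDeque<T>>,
}

impl<T> Script<T> {
    fn new(items: Vec<T>) -> Self {
        // `Vec<T>` converts into `VecDeque<T>` without reordering.
        Self { remaining: Mutex::new(items.into()) }
    }

    /// Returns the next scripted item, or an error once the script is used up.
    fn next_response(&self) -> Result<T, String> {
        self.remaining
            .lock()
            .unwrap()
            .pop_front() // front of the queue = first item supplied
            .ok_or_else(|| "no more mock responses".to_string())
    }
}
```

A mock `LlmClient` could hold a `Script<Choice>` and return `next_response()` from `send`, mapping the `String` error into the crate's error type.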

---

## Already covered by `OpenAiCompatibleClient`

If the target API speaks the OpenAI chat-completions format, use the built-in `OpenAiCompatibleClient` and set a custom `api_base` in `config.toml` rather than implementing `LlmClient` from scratch:

```toml
[providers.my-provider]
api_base = "https://api.myprovider.com/v1"
default_model = "my-model-v1"
models = ["my-model-v1"]
env_var = "MY_PROVIDER_KEY"
```

This covers Ollama, vLLM, LM Studio, Mistral, and any other OpenAI-compatible endpoint.