
Crate ai_assistant_core


§ai_assistant_core

Simple, ergonomic Rust client and server for local LLMs.

Connect to Ollama, LM Studio, or any OpenAI-compatible server in a few lines of code. List models, chat, stream responses, and serve your local model as a remotely accessible OpenAI-compatible API.

§Quick Start — Client

use ai_assistant_core::{ollama, Message};

#[tokio::main]
async fn main() -> Result<(), ai_assistant_core::Error> {
    let provider = ollama();

    // List available models
    let models = provider.models().await?;
    println!("Models: {:?}", models.iter().map(|m| &m.name).collect::<Vec<_>>());

    // Simple chat
    let reply = provider.chat("llama3.2:1b", "What is Rust?").await?;
    println!("{reply}");

    // Chat with message history
    let messages = vec![
        Message::system("You are a helpful assistant."),
        Message::user("Explain ownership in Rust in 2 sentences."),
    ];
    let reply = provider.send("llama3.2:1b", &messages).await?;
    println!("{reply}");

    Ok(())
}

§Streaming

use ai_assistant_core::ollama;
use futures::StreamExt;

let provider = ollama();
let mut stream = provider.chat_stream("llama3.2:1b", "Tell me a joke").await?;
while let Some(chunk) = stream.next().await {
    print!("{}", chunk?);
}

§Providers

use ai_assistant_core::{ollama, ollama_at, lm_studio, openai_compat};

let o = ollama();                                          // localhost:11434
let o2 = ollama_at("http://192.168.1.50:11434");           // remote Ollama
let lm = lm_studio();                                      // localhost:1234
let custom = openai_compat("http://localhost:8080/v1");    // any OpenAI-compatible

§Auto-detection

use ai_assistant_core::detect;

let providers = detect(&[]).await;
for p in &providers {
    println!("{} at {} ({} models)", p.name, p.url, p.model_count);
}
if let Some(p) = providers.first() {
    let reply = p.provider.chat(&p.models[0], "Hello!").await?;
    println!("{reply}");
}

§Serve Your Model (feature serve)

Expose your local LLM as an OpenAI-compatible API:

use ai_assistant_core::{ollama, serve};

#[tokio::main]
async fn main() -> Result<(), ai_assistant_core::Error> {
    let provider = ollama();
    serve::quick(provider).await?; // serves on :8090
    Ok(())
}

With more control:

use ai_assistant_core::{ollama, ProviderServiceBuilder};

ProviderServiceBuilder::new(ollama())
    .port(9090)
    .token("my_secret")
    .nat()                    // STUN + UPnP for remote access
    .start().await?;

§Binary: ai_serve

cargo install ai_assistant_core --bin ai_serve --features serve
ai_serve                            # auto-detect + serve
ai_serve --nat --token secret       # with remote access + auth

§Need more?

For advanced features (RAG, multi-agent, security, distributed clusters, MCP, autonomous agents, and more), check out the full ai_assistant suite.

Structs§

DetectedProvider
A provider that was automatically detected on the local machine.
Message
A chat message with a role and content.
ModelInfo
Information about an available model.
Provider
An LLM provider that can list models, chat, and stream responses.

Enums§

Error
Errors returned by ai_assistant_core.
Role
Role in a conversation.

Functions§

detect
Scan the local machine for running LLM providers.
lm_studio
Create an LM Studio provider pointing to http://localhost:1234/v1.
ollama
Create an Ollama provider pointing to http://localhost:11434.
ollama_at
Create an Ollama provider at a custom URL.
openai_compat
Create a provider for any OpenAI-compatible API.