alchemy-llm 0.2.0

Unified LLM API abstraction layer supporting 10+ providers through a consistent streaming interface

Alchemy

Crates.io · Documentation · License: MIT

A unified LLM API abstraction layer in Rust focused on a consistent streaming interface across the providers that are implemented today.

Warning: This project is in early development (pre-1.0). APIs may change without notice. Not yet recommended for production use.

Current status: the crate ships first-class streaming implementations for Anthropic-style Messages, OpenAI-compatible Completions, MiniMax Completions, and z.ai GLM Completions. Additional provider identities exist in the type system, but several Rust runtime ports are still in progress.

Alchemy-rs

Heavily inspired by and ported from: pi-mono/packages/ai

Current Provider Status

Implemented today

  • Anthropic-style Messages
    • Anthropic
    • Kimi (Moonshot AI, coding endpoint)
  • OpenAI-compatible Completions
    • OpenAI
    • OpenRouter
    • Featherless
    • Other compatible endpoints can also be used by manually constructing a Model<OpenAICompletions>
  • MiniMax (Global)
  • MiniMax CN
  • z.ai (GLM)

Still being ported

These provider identities are present in the crate surface, but should be treated as in-progress until dedicated runtime implementations land:

  • Google (Gemini / Vertex)
  • AWS Bedrock
  • xAI (Grok)
  • Groq
  • Cerebras
  • Mistral

Features

  • Streaming-first - Implemented providers use async streams
  • Type-safe - Leverages Rust's type system
  • Provider-agnostic - Switch providers without code changes
  • Tool calling - Function/tool support across implemented streaming paths
  • Message transformation - Cross-provider message compatibility primitives

Installation

cargo add alchemy-llm

Or add to your Cargo.toml:

[dependencies]
alchemy-llm = "0.1"

Quick Start

use alchemy_llm::stream;
use alchemy_llm::types::{
    AssistantMessageEvent, Context, InputType, KnownProvider, Message, Model, ModelCost,
    OpenAICompletions, Provider, UserContent, UserMessage,
};
use futures::StreamExt;

#[tokio::main]
async fn main() -> alchemy_llm::Result<()> {
    let model = Model::<OpenAICompletions> {
        id: "gpt-4o-mini".to_string(),
        name: "GPT-4o Mini".to_string(),
        api: OpenAICompletions,
        provider: Provider::Known(KnownProvider::OpenAI),
        base_url: "https://api.openai.com/v1".to_string(),
        reasoning: false,
        input: vec![InputType::Text],
        cost: ModelCost {
            input: 0.0,
            output: 0.0,
            cache_read: 0.0,
            cache_write: 0.0,
        },
        context_window: 128_000,
        max_tokens: 16_384,
        headers: None,
        compat: None,
    };

    let context = Context {
        messages: vec![Message::User(UserMessage {
            content: UserContent::Text("Hello!".to_string()),
            timestamp: 0,
        })],
        system_prompt: None,
        tools: None,
    };

    let mut stream = stream(&model, &context, None)?;

    while let Some(event) = stream.next().await {
        if let AssistantMessageEvent::TextDelta { delta, .. } = event {
            print!("{}", delta);
        }
    }

    Ok(())
}

Featherless Quick Example

Featherless is available as a first-class provider identity while reusing the shared OpenAI-compatible runtime underneath. The public API stays the same: build a Model<OpenAICompletions>, then call stream(...) or complete(...).

use alchemy_llm::{featherless_model, stream};
use alchemy_llm::types::{AssistantMessageEvent, Context, Message, UserContent, UserMessage};
use futures::StreamExt;

#[tokio::main]
async fn main() -> alchemy_llm::Result<()> {
    let model = featherless_model("moonshotai/Kimi-K2.5");
    let context = Context {
        system_prompt: None,
        messages: vec![Message::User(UserMessage {
            content: UserContent::Text("Hello from Featherless".to_string()),
            timestamp: 0,
        })],
        tools: None,
    };

    let mut stream = stream(&model, &context, None)?;

    while let Some(event) = stream.next().await {
        if let AssistantMessageEvent::TextDelta { delta, .. } = event {
            print!("{}", delta);
        }
    }

    Ok(())
}

Set FEATHERLESS_API_KEY in your environment.
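One way to set the key and fail fast if it is missing (the key value below is a placeholder for illustration, not a real key):

```shell
# Placeholder value shown for illustration; use your real Featherless API key.
export FEATHERLESS_API_KEY="fw-example-key"
# Abort with an error before launching your binary if the variable is unset:
echo "${FEATHERLESS_API_KEY:?FEATHERLESS_API_KEY is not set}"
```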

The helper returns a default Model<OpenAICompletions> with:

  • provider: KnownProvider::Featherless
  • base URL: https://api.featherless.ai/v1/chat/completions
  • default context window: 128_000
  • default max output tokens: 16_384

Because Featherless exposes a dynamic catalog, you should treat those limits as safe defaults. If you fetch exact model metadata from GET /v1/models, override the returned Model fields before calling stream(...) or complete(...).
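A sketch of such an override, using the field names from the Model struct shown in Quick Start. The limit values below are illustrative, not real Featherless metadata; in practice, read them from GET /v1/models:

```rust
use alchemy_llm::featherless_model;

fn main() {
    let mut model = featherless_model("moonshotai/Kimi-K2.5");
    // Illustrative limits; replace with values fetched from GET /v1/models.
    model.context_window = 32_768;
    model.max_tokens = 8_192;
    // ...then call stream(...) or complete(...) with the adjusted model.
}
```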

Latest Release

  • Crate: alchemy-llm on crates.io
  • Docs: docs.rs/alchemy-llm
  • Current version: 0.1.9
  • Release notes: CHANGELOG.md
  • Highlights:
    • Added first-class Kimi provider integration on the shared Anthropic-style Messages path
    • Added kimi_k2_0711_preview() model helper and KIMI_API_KEY environment lookup support
    • Added provider architecture and Kimi docs covering replay fidelity and shared runtime behavior
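Based on those highlights, using the Kimi helper presumably mirrors the Featherless flow. The import path below is an assumption by analogy with featherless_model; treat this as a sketch, not confirmed API:

```rust
// Import path assumed by analogy with featherless_model; verify against docs.rs.
use alchemy_llm::kimi_k2_0711_preview;

fn main() {
    // Per the release notes, KIMI_API_KEY is looked up from the environment.
    let model = kimi_k2_0711_preview();
    // The model targets the shared Anthropic-style Messages path; pass it to
    // stream(...) as in the Quick Start examples.
    let _ = model;
}
```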

Setup

  1. Clone the repository

    git clone https://github.com/alchemiststudiosDOTai/alchemy-rs.git
    cd alchemy-rs
    
  2. Configure API keys

    cp .env.example .env
    # Edit .env and add your API keys
    
  3. Build the project

    cargo build
    
  4. Run tests

    cargo test
    

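For step 2, a hypothetical .env sketch. Only KIMI_API_KEY and FEATHERLESS_API_KEY are named elsewhere in this README; the other key names follow common conventions and may differ from what .env.example actually lists:

```shell
# Keys named in this README:
KIMI_API_KEY=...
FEATHERLESS_API_KEY=...
# Conventional names for other providers (verify against .env.example):
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
```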
Notes on examples

Public example binaries are still being rebuilt. For now, the most accurate usage references are:

  • the Quick Start snippets in this README
  • provider-specific docs under docs/providers/
  • unit and integration-style tests under src/providers/, src/stream/, and related modules


Development

See AGENTS.md for detailed development guidelines, architecture, and quality gates.

Quality Checks

Pre-commit hooks automatically run:

  • cargo fmt - Code formatting
  • cargo clippy - Linting with complexity checks
  • cargo check - Compilation

Run all quality checks:

make quality-full     # All checks including complexity, duplicates, and ast-rules
make quality-quick    # Fast checks (fmt, clippy, check)
make complexity       # Cyclomatic complexity analysis
make duplicates       # Duplicate code detection
make ast-rules        # Ast-grep architecture boundary checks

Or run individually:

cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo check --all-targets --all-features
make ast-rules

Tools used:

  • Clippy - Cognitive complexity warnings (threshold: 20)
  • polydup - Duplicate code detection (install: cargo install polydup-cli)
  • ast-grep (sg) - Architecture boundary checks (make ast-rules)

License

MIT