Crate cognate_llm

§Cognate

Type-safe LLM framework for Rust: multi-provider support, compile-time validation, streaming, and tool calling, all with zero-cost abstractions.

§Overview

Cognate provides production-grade abstractions for building LLM-powered applications in Rust. Unlike fragmented HTTP clients or Python-style libraries, Cognate offers:

  • Unified multi-provider interface (OpenAI, Anthropic, Groq, Ollama, custom)
  • Type-safe tool calling via #[derive(Tool)]
  • Compile-time prompt validation via #[derive(Prompt)]
  • Production middleware (retry, rate-limit, tracing)
  • Streaming-first design with proper error handling
  • Axum web server integration out of the box
  • RAG pipeline support with pluggable vector stores

§Why Cognate?

§Feature Comparison

| Feature | Cognate | async-openai | rig |
|---|---|---|---|
| Multi-provider | OpenAI, Anthropic, Groq, Ollama | OpenAI only | OpenAI, Anthropic |
| Type-safe tools | #[derive(Tool)] with validation | Manual JSON | Runtime definition |
| Compile-time validation | Prompts checked at build time | No | No |
| Axum integration | Built-in extractors + middleware | No | No |
| Middleware system | Retry, rate-limit, tracing, observability | Basic retry | Limited |
| RAG support | Vector search + memory traits | No | No |

§Performance

Cognate is designed for production workloads where latency and throughput matter.

| Metric | Cognate | async-openai (Rust) | Python LangChain |
|---|---|---|---|
| P50 latency | <1ms (overhead) | <1ms (overhead) | 45ms |
| P99 latency | <5ms (overhead) | <5ms (overhead) | 150ms |
| Requests/sec | 2500+ | 2800+ | 200-400 |
| Memory (RSS) | 12-15 MB | 12-15 MB | 120-150 MB |
| Compile time | 8-12s (clean) | 6-8s (clean) | N/A |

See BENCHMARKS.md for detailed metrics and reproducible measurements.

§Installation

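Add the Cognate crates to your Cargo.toml. The snippets in this document also import serde, schemars, futures, async-trait, and tracing, so add those crates as well if you copy the corresponding examples: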

[dependencies]
cognate-core = "0.1"
cognate-providers = "0.1"
cognate-tools = "0.1"
cognate-prompts = "0.1"
tokio = { version = "1.0", features = ["full"] }

§Quick Start

§Basic Chat

use cognate_core::{Provider, Request, Message};
use cognate_providers::OpenAiProvider;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let provider = OpenAiProvider::new(
        std::env::var("OPENAI_API_KEY")?
    )?;
    
    let request = Request::new()
        .with_model("gpt-4o-mini")
        .with_messages(vec![
            Message::user("Explain Rust type safety in one sentence"),
        ]);

    let response = provider.complete(request).await?;
    println!("{}", response.content());
    
    Ok(())
}

§Type-Safe Tool Calling

use cognate_tools::Tool;
use cognate_core::{Provider, Request, Message};
use serde::{Deserialize, Serialize};
use schemars::JsonSchema;

#[derive(Tool, Serialize, Deserialize, JsonSchema)]
#[tool(description = "Add two numbers")]
struct Calculator {
    /// First number
    a: i32,
    /// Second number
    b: i32,
}

impl Calculator {
    async fn run(&self) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
        Ok(format!("{} + {} = {}", self.a, self.b, self.a + self.b))
    }
}

// Use in request
let request = Request::new()
    .with_model("gpt-4o")
    .with_messages(vec![
        Message::user("What is 15 + 23?"),
    ])
    .with_tool(Calculator);
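
To make the idea of tool dispatch concrete, here is a minimal, standard-library-only sketch of what happens after a model asks for a tool by name: the call is looked up in a registry and executed. Everything in it (`ToolRegistry`, `ToolCall`, `ToolFn`, the fixed two-integer arguments) is a hypothetical stand-in chosen for illustration; it is not the cognate-tools API, which works with derived schemas and async execution.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a tool implementation: a plain function over
/// two integers. The real `Tool` trait in cognate-tools may look different.
type ToolFn = fn(i32, i32) -> String;

/// A tool call as a model might request it (simplified to fixed arguments).
struct ToolCall {
    name: String,
    a: i32,
    b: i32,
}

/// Maps tool names to their implementations.
struct ToolRegistry {
    tools: HashMap<String, ToolFn>,
}

impl ToolRegistry {
    fn new() -> Self {
        ToolRegistry { tools: HashMap::new() }
    }

    fn register(&mut self, name: &str, f: ToolFn) {
        self.tools.insert(name.to_string(), f);
    }

    /// Looks up the requested tool and runs it, or reports an unknown tool.
    fn dispatch(&self, call: &ToolCall) -> Result<String, String> {
        match self.tools.get(&call.name).copied() {
            Some(f) => Ok(f(call.a, call.b)),
            None => Err(format!("unknown tool: {}", call.name)),
        }
    }
}

/// Same arithmetic as the `Calculator` example above.
fn add(a: i32, b: i32) -> String {
    format!("{} + {} = {}", a, b, a + b)
}
```

Registering `add` under a name such as "calculator" and dispatching a call with `a = 15`, `b = 23` returns the formatted sum; an unregistered name returns an error instead of panicking.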

§Streaming Responses

use cognate_core::Provider;
use futures::StreamExt;

let provider = /* ... */;

let mut stream = provider.stream(request).await?;

while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    print!("{}", chunk.text);
}
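
Streaming APIs such as OpenAI's deliver chunks as Server-Sent Events, which is presumably what the crate's sse module parses. The sketch below shows the core of that format under simplifying assumptions: it only looks at single-line `data:` fields and treats the `[DONE]` sentinel used by OpenAI-style streams as end of stream. The names `SseLine`, `parse_sse_line`, and `collect_payloads` are made up for this example and are not taken from Cognate.

```rust
/// Result of inspecting one line of an SSE stream.
#[derive(Debug, PartialEq)]
enum SseLine<'a> {
    /// A `data:` field carrying a payload.
    Data(&'a str),
    /// The `[DONE]` sentinel that OpenAI-style streams send last.
    Done,
    /// Blank lines, comments, and fields this sketch does not handle.
    Ignored,
}

/// Classifies a single line. Multi-line events are not reassembled here.
fn parse_sse_line(line: &str) -> SseLine<'_> {
    let line = line.trim_end_matches(|c| c == '\r' || c == '\n');
    match line.strip_prefix("data:") {
        Some(rest) => {
            // The SSE format allows one optional space after the colon.
            let payload = rest.strip_prefix(' ').unwrap_or(rest);
            if payload == "[DONE]" {
                SseLine::Done
            } else {
                SseLine::Data(payload)
            }
        }
        None => SseLine::Ignored,
    }
}

/// Collects every payload up to (not including) the `[DONE]` sentinel.
fn collect_payloads(body: &str) -> Vec<&str> {
    let mut out = Vec::new();
    for line in body.lines() {
        match parse_sse_line(line) {
            SseLine::Data(payload) => out.push(payload),
            SseLine::Done => break,
            SseLine::Ignored => {}
        }
    }
    out
}
```

A real parser also has to join consecutive `data:` lines of one event and handle bytes arriving split across network reads, which this sketch leaves out.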

§Architecture

Cognate is organized into specialized crates:

  • cognate-core: Provider trait, request/response types, middleware system
  • cognate-providers: OpenAI, Anthropic, retry, fallback implementations
  • cognate-tools: Tool dispatch, automatic execution loop, #[derive(Tool)]
  • cognate-prompts: Template system, compile-time validation, #[derive(Prompt)]
  • cognate-rag: Vector search traits, memory abstraction, embedding utilities
  • cognate-axum: Axum extractors, middleware layers, web integration
  • cognate-cli: CLI tools for development and testing
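
The middleware system in cognate-core is described elsewhere in these docs as Tower-inspired, with a `Layer` being a factory that wraps a `Provider` to produce a new `Provider`. The following sketch illustrates that wrapping pattern with simplified, synchronous stand-ins. The trait shapes, `Echo`, `Counting`, and `CountingLayer` are assumptions made for illustration; the real traits are async and will differ in detail.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical, synchronous stand-in for the `Provider` trait.
trait Provider {
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Hypothetical stand-in for `Layer`: wraps one provider in another.
trait Layer<P: Provider> {
    type Wrapped: Provider;
    fn layer(&self, inner: P) -> Self::Wrapped;
}

/// Innermost provider: echoes the prompt back.
struct Echo;

impl Provider for Echo {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {}", prompt))
    }
}

/// Middleware that counts calls before delegating to the inner provider.
struct Counting<P> {
    inner: P,
    calls: Rc<Cell<u32>>,
}

impl<P: Provider> Provider for Counting<P> {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        self.calls.set(self.calls.get() + 1);
        self.inner.complete(prompt)
    }
}

/// The factory that produces `Counting` wrappers sharing one counter.
struct CountingLayer {
    calls: Rc<Cell<u32>>,
}

impl<P: Provider> Layer<P> for CountingLayer {
    type Wrapped = Counting<P>;

    fn layer(&self, inner: P) -> Counting<P> {
        Counting { inner, calls: Rc::clone(&self.calls) }
    }
}
```

Because the wrapped value is itself a provider, layers of this kind can be stacked: a retry layer around a rate-limit layer around the actual client, each unaware of the others.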

§Examples

Examples live in the respective crate directories.

Run an example:

cargo run --example simple_chat -p cognate-providers

§Configuration

§OpenAI Provider

use cognate_providers::{OpenAiProvider, RetryConfig};
use std::time::Duration;

let provider = OpenAiProvider::new(api_key)?
    .with_timeout(Duration::from_secs(30))
    .with_retry(RetryConfig {
        max_retries: 3,
        initial_backoff: Duration::from_millis(100),
        max_backoff: Duration::from_secs(10),
    });
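
For intuition about how the three `RetryConfig` fields interact, the sketch below computes a delay schedule from them. It assumes plain doubling per attempt with a hard cap and no jitter; Cognate's actual retry logic is not shown in this document and may differ. `backoff_delay` and `retry_schedule` are hypothetical helper names.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): the initial backoff
/// doubled `attempt` times, capped at `max_backoff`.
/// Assumption: plain doubling, no jitter.
fn backoff_delay(initial_backoff: Duration, max_backoff: Duration, attempt: u32) -> Duration {
    // 2^attempt, saturating so that large attempt numbers cannot overflow.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    initial_backoff
        .checked_mul(factor)
        .unwrap_or(max_backoff)
        .min(max_backoff)
}

/// The full list of delays for `max_retries` retries.
fn retry_schedule(
    initial_backoff: Duration,
    max_backoff: Duration,
    max_retries: u32,
) -> Vec<Duration> {
    (0..max_retries)
        .map(|attempt| backoff_delay(initial_backoff, max_backoff, attempt))
        .collect()
}
```

With the values from the configuration above (3 retries, 100 ms initial, 10 s maximum) this yields waits of 100 ms, 200 ms, and 400 ms; the cap only comes into play from the eighth retry onward.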

§Custom Providers

Implement the Provider trait:

use cognate_core::{Provider, Request, Response};
use async_trait::async_trait;

struct MyProvider;

#[async_trait]
impl Provider for MyProvider {
    async fn complete(&self, req: Request) -> cognate_core::Result<Response> {
        // Your implementation
        todo!()
    }
}
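
Because every provider implements the same trait, providers can be composed. The item listing describes `FallbackProvider` as falling back to a secondary provider on retryable errors; the sketch below shows that pattern with simplified, synchronous stand-ins. The `Error` variants, the `is_retryable` rule, and the test providers are assumptions for illustration and not Cognate's definitions.

```rust
/// Hypothetical error type; which errors count as retryable is an assumption.
#[derive(Debug, PartialEq)]
enum Error {
    RateLimited,
    InvalidRequest(String),
}

impl Error {
    fn is_retryable(&self) -> bool {
        matches!(self, Error::RateLimited)
    }
}

/// Hypothetical, synchronous stand-in for the `Provider` trait.
trait Provider {
    fn complete(&self, prompt: &str) -> Result<String, Error>;
}

/// Tries `primary` first and uses `secondary` only for retryable failures.
struct Fallback<A, B> {
    primary: A,
    secondary: B,
}

impl<A: Provider, B: Provider> Provider for Fallback<A, B> {
    fn complete(&self, prompt: &str) -> Result<String, Error> {
        match self.primary.complete(prompt) {
            Err(e) if e.is_retryable() => self.secondary.complete(prompt),
            // Successes and non-retryable errors pass through unchanged.
            other => other,
        }
    }
}

/// Test provider that always reports a rate limit.
struct AlwaysRateLimited;

impl Provider for AlwaysRateLimited {
    fn complete(&self, _prompt: &str) -> Result<String, Error> {
        Err(Error::RateLimited)
    }
}

/// Test provider that always rejects the request.
struct Rejecting;

impl Provider for Rejecting {
    fn complete(&self, _prompt: &str) -> Result<String, Error> {
        Err(Error::InvalidRequest("bad request".to_string()))
    }
}

/// Test provider that returns a fixed answer.
struct Canned(&'static str);

impl Provider for Canned {
    fn complete(&self, _prompt: &str) -> Result<String, Error> {
        Ok(self.0.to_string())
    }
}
```

Passing non-retryable errors through matters: a malformed request would fail on the secondary provider as well, so retrying it elsewhere only adds latency and cost.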

§Production Considerations

§Observability

Cognate integrates with standard Rust tracing:

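use tracing::{span, Instrument, Level};

let span = span!(Level::INFO, "llm_request", model = "gpt-4o");

// Attach the span to the future instead of holding an `enter()` guard
// across the `.await`, which can produce incorrect traces in async code.
let response = provider.complete(request).instrument(span).await?;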

§Error Handling

Comprehensive error types:

use cognate_core::{Error, Result};

match provider.complete(request).await {
    Ok(response) => println!("{}", response.content()),
    Err(Error::RateLimited { retry_after }) => {
        println!("Rate limited, retry in {:?}", retry_after);
    }
    Err(e) => eprintln!("Error: {}", e),
}

§Testing

Cognate includes a mock provider for testing:

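// `Provider` brings `complete` into scope; `Response` is used to build the canned reply.
use cognate_core::{MockProvider, Provider, Response};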

let mock = MockProvider::new()
    .queue_response(Response::text("Hello, world!"));

let response = mock.complete(request).await?;
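
The idea behind a queue-based mock is simple enough to sketch with the standard library alone: canned responses are stored in order and handed out one per call. `QueueMock` below is a hypothetical, synchronous illustration of that idea, not the implementation of `MockProvider`, and its behavior on an empty queue (returning an error) is an assumption.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;

/// Hypothetical mock that replays queued responses in FIFO order.
struct QueueMock {
    responses: RefCell<VecDeque<String>>,
}

impl QueueMock {
    fn new() -> Self {
        QueueMock { responses: RefCell::new(VecDeque::new()) }
    }

    /// Builder-style method so calls can be chained, as in the example above.
    fn queue_response(self, text: &str) -> Self {
        self.responses.borrow_mut().push_back(text.to_string());
        self
    }

    /// Returns the next queued response and ignores the prompt.
    /// Assumption: an exhausted queue is reported as an error.
    fn complete(&self, _prompt: &str) -> Result<String, String> {
        self.responses
            .borrow_mut()
            .pop_front()
            .ok_or_else(|| "no queued responses left".to_string())
    }
}
```

Interior mutability (`RefCell`) lets `complete` take `&self`, matching the shape of the provider call, while still consuming the queue. A mock shared across async tasks would need a `Mutex` instead.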

§Status

Cognate is in active development (v0.1.0). The API is stable and suitable for production use.

  • All 9 crates compile cleanly
  • 17 unit tests + 7 doc tests passing
  • Compatible with Rust 1.70 and newer
  • Production middleware included
  • Streaming support verified

§License

Dual-licensed under MIT and Apache-2.0.

Choose whichever license works best for your project.

§Contributing

Contributions are welcome. Please read CONTRIBUTING.md first.

Development setup:

git clone https://github.com/YOUR_ORG/cognate
cd cognate
cargo test --workspace
cargo fmt
cargo clippy --workspace

§Support

  • Documentation: https://docs.rs/cognate-core
  • Examples: See examples/
  • Issues: https://github.com/YOUR_ORG/cognate/issues

§Roadmap

  • Vector store integrations (Qdrant, Pinecone, Weaviate)
  • Additional providers (Groq, Ollama embedded)
  • Streaming cost estimation
  • Advanced caching layer
  • Web dashboard for monitoring

§Cognate

A modular, extensible LLM framework for Rust with multi-provider support, type-safe tools, and RAG capabilities.

§Quick Start

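use cognate::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // `OpenAiProvider::new` is fallible in the Basic Chat example, so propagate the error here too.
    let client = cognate::providers::OpenAiProvider::new("sk-...".to_string())?;
    // Use the client...
    Ok(())
}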

§Features

  • providers - OpenAI and Anthropic provider support (default)
  • tools - Type-safe tool calling with derive macros (default)
  • prompts - Compile-time validated prompt templates (default)
  • rag - Retrieval-Augmented Generation support
  • axum - Axum web framework integration
  • full - All features
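
The rag feature centers on searching embedded documents in a vector store. As a rough illustration of what an in-memory store does, the sketch below ranks documents by cosine similarity to a query embedding. `Doc`, `MemoryStore`, and the choice of cosine similarity are assumptions for this example; they are not the definitions of Cognate's `Document`, `MemoryVectorStore`, or `VectorStore`.

```rust
/// Hypothetical document: an id plus its embedding vector.
struct Doc {
    id: String,
    embedding: Vec<f32>,
}

/// Cosine similarity of two vectors; 0.0 if either has zero length.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Hypothetical in-memory store doing a linear scan over all documents.
struct MemoryStore {
    docs: Vec<Doc>,
}

impl MemoryStore {
    fn new() -> Self {
        MemoryStore { docs: Vec::new() }
    }

    fn add(&mut self, id: &str, embedding: Vec<f32>) {
        self.docs.push(Doc { id: id.to_string(), embedding });
    }

    /// Returns up to `top_k` (id, score) pairs, best match first.
    fn search(&self, query: &[f32], top_k: usize) -> Vec<(&str, f32)> {
        let mut scored: Vec<(&str, f32)> = self
            .docs
            .iter()
            .map(|d| (d.id.as_str(), cosine_similarity(query, &d.embedding)))
            .collect();
        // Sort by descending score; `total_cmp` gives a total order on f32.
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(top_k);
        scored
    }
}
```

A linear scan is fine for tests and small corpora. Dedicated vector databases replace it with approximate nearest-neighbor indexes, which is why a pluggable store trait is useful.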

Re-exports§

pub use cognate_axum;
pub use cognate_tools_derive;
pub use cognate_prompts_derive;

Modules§

anthropic
Anthropic provider implementation.
error
Error types for Cognate.
middleware
Tower-inspired middleware system for Cognate providers.
openai
OpenAI provider implementation.
prelude
Prelude module for convenient imports
prompts
Prompt templating
providers
Provider implementations
ratelimit
Token bucket rate limiting implementation
retry
Retry logic with exponential backoff
sse
Server-Sent Events (SSE) streaming parser
tools
Tool calling and execution
types
Re-exports of core types for ergonomic imports.
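
The ratelimit module listed above is described as a token bucket. For readers unfamiliar with the technique, here is a minimal sketch: the bucket holds up to `capacity` tokens, refills at a steady rate, and each request spends one token. Time is passed in explicitly so the behavior is deterministic. The `TokenBucket` type and its fields are assumptions for illustration and are not taken from Cognate's source.

```rust
use std::time::Duration;

/// Hypothetical token bucket. `now` values are durations measured from
/// any fixed starting point (for example, program start).
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last: Duration,
}

impl TokenBucket {
    /// Starts full, so an initial burst of `capacity` requests is allowed.
    fn new(capacity: u32, refill_per_sec: f64) -> Self {
        TokenBucket {
            capacity: capacity as f64,
            tokens: capacity as f64,
            refill_per_sec,
            last: Duration::from_secs(0),
        }
    }

    /// Refills according to elapsed time, then tries to spend one token.
    fn try_acquire(&mut self, now: Duration) -> bool {
        let elapsed = now.saturating_sub(self.last).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        // Never move the clock backwards if `now` is older than `last`.
        self.last = self.last.max(now);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

The capacity bounds burst size and the refill rate bounds sustained throughput, which maps well onto provider limits expressed as requests per minute.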

Structs§

AnthropicProvider
Provider client for the Anthropic API (Claude models).
Document
A document stored in a vector store.
FallbackProvider
A provider that falls back to a secondary provider on retryable errors.
MemoryVectorStore
A simple in-memory VectorStore.
Message
A single message in a conversation.
OpenAiProvider
Provider client for the OpenAI API.
RagPipeline
A high-level RAG pipeline combining an embedding provider with a vector store.
Request
A completion request sent to a provider.
Response
A completed response from a provider.
RetryConfig
Configuration for exponential-backoff retry logic.
ToolExecutor
Executes a request with automatic tool-call dispatch.

Enums§

Error
The main error type for all Cognate operations.

Traits§

Layer
A factory that wraps a Provider to produce a new Provider.
Prompt
A type-safe, compile-time validated prompt template.
Provider
Core trait for all LLM providers.
Tool
The core trait for all callable tools.
VectorStore
A persistent or in-memory store of embedded documents.

Derive Macros§

DerivePromptMacro
Derive the cognate_prompts::Prompt trait for a struct.
DeriveToolMacro
Derive the cognate_tools::Tool trait for a struct.
Prompt
Derive the cognate_prompts::Prompt trait for a struct.
Tool
Derive the cognate_tools::Tool trait for a struct.