Cognee LLM

LLM abstraction layer for Cognee with support for structured output generation.

Features

Async-first: All operations are async, supporting both API calls and local inference
Structured outputs: Generate type-safe structured data (e.g., knowledge graphs) from text
JSON Schema generation: Automatic schema generation from Rust types using schemars
Provider-agnostic: Trait-based design supports OpenAI, Anthropic, Ollama, local models, etc.
Configuration: Flexible configuration with sensible defaults

Usage

OpenAI Adapter

use cognee_llm::{Llm, OpenAIAdapter, GenerationOptions};
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, JsonSchema)]
struct KnowledgeGraph {
    nodes: Vec<Node>,
    edges: Vec<Edge>,
}

// Create OpenAI adapter
let llm = OpenAIAdapter::new(
    "gpt-4",
    "sk-...",  // Your API key
    None,      // Use default OpenAI base URL
)?;

// Generate structured output
let graph: KnowledgeGraph = llm.create_structured_output(
    "Alice told Bob to bring documents.",
    "Extract a knowledge graph with nodes and edges.",
    Some(GenerationOptions {
        temperature: Some(0.0),
        max_tokens: Some(1000),
        ..Default::default()
    }),
).await?;

Custom Base URL (for OpenAI-compatible APIs)

// Use with Ollama, LocalAI, or other OpenAI-compatible services
let llm = OpenAIAdapter::new(
    "llama3.2:3b",
    "not-needed",  // Some services don't require API key
    Some("http://localhost:11435/v1".to_string()),
)?;

Note: The adapter automatically detects API capabilities:

OpenAI/Azure: Uses function calling for structured outputs (more reliable)
Ollama/LocalAI: Automatically falls back to JSON mode with example-based prompts
No configuration needed - it just works with both!

Basic Trait Definition

use cognee_llm::{Llm, Message, GenerationOptions};
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, JsonSchema)]
struct KnowledgeGraph {
    nodes: Vec<Node>,
    edges: Vec<Edge>,
}

// Implement the Llm trait for your provider
let llm: Box<dyn Llm> = ...;

// Generate structured output
let graph: KnowledgeGraph = llm.create_structured_output(
    "Alice told Bob to bring documents.",
    "Extract a knowledge graph with nodes and edges.",
    None,
).await?;

JSON Schema Generation

The crate automatically generates JSON schemas from your Rust types to guide the LLM:

use cognee_llm::schema::{generate_json_schema, build_schema_prompt};
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, JsonSchema)]
struct Person {
    name: String,
    age: u32,
    email: Option<String>,
}

// Generate schema as JSON value
let schema = generate_json_schema::<Person>();

// Or build a complete prompt with schema embedded
let prompt = build_schema_prompt::<Person>(
    "Extract the person's information from the text."
);

With Retry Logic

use cognee_llm::{Llm, LlmError};
use cognee_utils::retry::{retry_with_backoff, RetryConfig, RetryDecision};

let retry_config = RetryConfig::new(3, 100, 5000);

let graph: KnowledgeGraph = retry_with_backoff(
    retry_config,
    || llm.create_structured_output(
        "Alice told Bob to bring the documents.",
        "Extract entities and relationships.",
        None,
    ),
    |error| match error {
        LlmError::NetworkError(_) | LlmError::RateLimitExceeded(_) => RetryDecision::Retry,
        LlmError::ContentPolicyViolation(_) => RetryDecision::Abort,
        _ => RetryDecision::Retry,
    },
).await?;

Architecture

The crate provides:

Llm trait: Core async trait with structured output generation
OpenAI adapter: Production-ready implementation using OpenAI's function calling API
JSON Schema generation: schemars-based schema generation from Rust types
Schema utilities: Helper functions to generate schemas and build prompts
Configuration types: LlmConfig, LlmProvider, GenerationOptions
Type-safe responses: Generic over T: Serialize + DeserializeOwned + JsonSchema
Comprehensive errors: LlmError covers API, network, serialization, rate limit errors

Implementation Details

OpenAI Adapter

The OpenAIAdapter uses a dual-strategy approach for structured outputs:

Primary (Function Calling):

Schema Generation: Automatically generates JSON schema from your Rust type using schemars
Function Definition: Creates an OpenAI function with the schema as parameters
Forced Execution: Sets function_call: {name: "extract_structured_data"} to force the model to use the function
Validation: Parses and validates the function call arguments into your type

Fallback (JSON Mode):

Automatic Detection: If function calling isn't supported, automatically switches to JSON mode
Example Generation: Creates example JSON from the schema (clearer than full schema for LLMs)
Response Format: Sets response_format: {"type": "json_object"} for JSON-only responses
Content Parsing: Parses the JSON from the response content

This dual approach provides:

Universal compatibility: Works with OpenAI, Azure OpenAI, Ollama, LocalAI, and others
High reliability: Function calling for best results, JSON mode for broad compatibility
Type safety: Compile-time guarantees about response structure
Zero configuration: Automatic detection and fallback

Adding New Adapters

To add support for other providers:

Create a new module in src/adapters/
Implement the Llm trait
Use generate_json_schema::<T>() to get the schema
Adapt the schema to the provider's format (function calling, JSON mode, etc.)
Parse the response into type T

See src/adapters/openai.rs as a reference implementation.

Testing

Unit Tests

Run the unit tests:

cargo test --package cognee-llm --lib

Integration Tests

The crate includes integration tests that exercise a real OpenAI-compatible endpoint (OpenAI, or a local Ollama instance via its OpenAI-compatible API):

# Set environment variables
export OPENAI_URL="http://localhost:11435/v1"
export OPENAI_TOKEN="not-needed"
export OPENAI_MODEL="llama3.2:3b"

# Run integration tests
cargo test --package cognee-llm --test integration_openai -- --nocapture

Tests will automatically skip if environment variables are not set.

Adapters

Implemented:

OpenAIAdapter — OpenAI-compatible APIs (also works with Ollama/vLLM/LocalAI)

On-device LiteRT inference (Android) ships in the closed companion crate cognee-llm-litert.

Planned:

Anthropic adapter (Claude with tool use)
Streaming support: real-time token streaming for all adapters

cognee-llm 0.1.1

Cognee LLM

Features

Usage

OpenAI Adapter

Custom Base URL (for OpenAI-compatible APIs)

Basic Trait Definition

JSON Schema Generation

With Retry Logic

Architecture

Implementation Details

OpenAI Adapter

Adding New Adapters

Testing

Unit Tests

Integration Tests

Adapters