# Cognee LLM
LLM abstraction layer for Cognee with support for structured output generation.
## Features
- **Async-first**: All operations are async, supporting both API calls and local inference
- **Structured outputs**: Generate type-safe structured data (e.g., knowledge graphs) from text
- **JSON Schema generation**: Automatic schema generation from Rust types using `schemars`
- **Provider-agnostic**: Trait-based design leaves room for OpenAI, Anthropic, Ollama, local models, etc. (see [Adapters](#adapters) for what is implemented today)
- **Configuration**: Flexible configuration with sensible defaults
## Usage
### OpenAI Adapter
```rust
use cognee_llm::{Llm, OpenAIAdapter, GenerationOptions};
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

// `Node` and `Edge` are your own types (definitions omitted here);
// they need the same derives as `KnowledgeGraph`.
#[derive(Serialize, Deserialize, JsonSchema)]
struct KnowledgeGraph {
    nodes: Vec<Node>,
    edges: Vec<Edge>,
}

// Create OpenAI adapter
let llm = OpenAIAdapter::new(
    "gpt-4",
    "sk-...", // Your API key
    None,     // Use default OpenAI base URL
)?;

// Generate structured output
let graph: KnowledgeGraph = llm.create_structured_output(
    "Alice told Bob to bring documents.",
    "Extract a knowledge graph with nodes and edges.",
    Some(GenerationOptions {
        temperature: Some(0.0),
        max_tokens: Some(1000),
        ..Default::default()
    }),
).await?;
```
### Custom Base URL (for OpenAI-compatible APIs)
```rust
// Use with Ollama, LocalAI, or other OpenAI-compatible services
let llm = OpenAIAdapter::new(
    "llama3.2:3b",
    "not-needed", // Some services don't require an API key
    Some("http://localhost:11435/v1".to_string()),
)?;
```
**Note:** The adapter automatically detects API capabilities:
- **OpenAI/Azure**: Uses function calling for structured outputs (more reliable)
- **Ollama/LocalAI**: Automatically falls back to JSON mode with example-based prompts
- No configuration is needed in either case; the same adapter works with both kinds of endpoint
### Basic Trait Usage
```rust
use cognee_llm::{Llm, Message, GenerationOptions};
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, JsonSchema)]
struct KnowledgeGraph {
    nodes: Vec<Node>,
    edges: Vec<Edge>,
}

// Any type that implements the `Llm` trait for your provider works here
let llm: Box<dyn Llm> = ...;

// Generate structured output
let graph: KnowledgeGraph = llm.create_structured_output(
    "Alice told Bob to bring documents.",
    "Extract a knowledge graph with nodes and edges.",
    None,
).await?;
```
### JSON Schema Generation
The crate automatically generates JSON schemas from your Rust types to guide the LLM:
```rust
use cognee_llm::schema::{generate_json_schema, build_schema_prompt};
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, JsonSchema)]
struct Person {
    name: String,
    age: u32,
    email: Option<String>,
}

// Generate schema as JSON value
let schema = generate_json_schema::<Person>();

// Or build a complete prompt with schema embedded
let prompt = build_schema_prompt::<Person>(
    "Extract the person's information from the text."
);
```
### With Retry Logic
```rust
use cognee_llm::{Llm, LlmError};
use cognee_utils::retry::{retry_with_backoff, RetryConfig, RetryDecision};

let retry_config = RetryConfig::new(3, 100, 5000);

let graph: KnowledgeGraph = retry_with_backoff(
    retry_config,
    || llm.create_structured_output(
        "Alice told Bob to bring the documents.",
        "Extract entities and relationships.",
        None,
    ),
    |error| match error {
        LlmError::NetworkError(_) | LlmError::RateLimitExceeded(_) => RetryDecision::Retry,
        LlmError::ContentPolicyViolation(_) => RetryDecision::Abort,
        _ => RetryDecision::Retry,
    },
).await?;
```
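
The meaning of the three arguments to `RetryConfig::new(3, 100, 5000)` is not spelled out here. Assuming they are the maximum number of attempts, the initial delay in milliseconds, and the maximum delay in milliseconds, a typical exponential backoff loop looks roughly like the sketch below. It is synchronous and uses only the standard library; `backoff_delay_ms`, `retry_sync`, and `Decision` are hypothetical names for illustration, not the `cognee_utils` API.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): doubles each time, capped at `max_ms`.
fn backoff_delay_ms(attempt: u32, initial_ms: u64, max_ms: u64) -> u64 {
    // 2^attempt, saturating instead of overflowing for very large attempt counts.
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    initial_ms.saturating_mul(factor).min(max_ms)
}

/// Hypothetical decision type mirroring the `RetryDecision` used above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Decision {
    Retry,
    Abort,
}

/// Calls `op` until it succeeds, `classify` says `Abort`, or `max_attempts` is reached.
/// `op` always runs at least once.
fn retry_sync<T, E>(
    max_attempts: u32,
    initial_ms: u64,
    max_ms: u64,
    mut op: impl FnMut() -> Result<T, E>,
    classify: impl Fn(&E) -> Decision,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) => {
                attempt += 1;
                if attempt >= max_attempts || classify(&err) == Decision::Abort {
                    return Err(err);
                }
                // Wait 100 ms, 200 ms, 400 ms, ... for (initial_ms = 100), up to max_ms.
                let delay = backoff_delay_ms(attempt - 1, initial_ms, max_ms);
                sleep(Duration::from_millis(delay));
            }
        }
    }
}
```

Under these assumptions, a configuration of `(3, 100, 5000)` would make at most three calls, waiting 100 ms and then 200 ms between them.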
## Architecture
The crate provides:
- **`Llm` trait**: Core async trait with structured output generation
- **OpenAI adapter**: Production-ready implementation using OpenAI's function calling API, with a JSON-mode fallback
- **JSON Schema generation**: `schemars`-based schema generation from Rust types
- **Schema utilities**: Helper functions to generate schemas and build prompts
- **Configuration types**: `LlmConfig`, `LlmProvider`, `GenerationOptions`
- **Type-safe responses**: Generic over `T: Serialize + DeserializeOwned + JsonSchema`
- **Comprehensive errors**: `LlmError` covers API, network, serialization, and rate-limit errors
## Implementation Details
### OpenAI Adapter
The `OpenAIAdapter` uses a dual-strategy approach for structured outputs:
**Primary (Function Calling):**
1. **Schema Generation**: Automatically generates JSON schema from your Rust type using `schemars`
2. **Function Definition**: Creates an OpenAI function with the schema as parameters
3. **Forced Execution**: Sets `function_call: {name: "extract_structured_data"}` to force the model to use the function
4. **Validation**: Parses and validates the function call arguments into your type
**Fallback (JSON Mode):**
1. **Automatic Detection**: If function calling isn't supported, automatically switches to JSON mode
2. **Example Generation**: Creates example JSON from the schema (clearer than full schema for LLMs)
3. **Response Format**: Sets `response_format: {"type": "json_object"}` for JSON-only responses
4. **Content Parsing**: Parses the JSON from the response content
This dual approach provides:
- **Universal compatibility**: Works with OpenAI, Azure OpenAI, Ollama, LocalAI, and others
- **High reliability**: Function calling for best results, JSON mode for broad compatibility
- **Type safety**: Responses are deserialized and validated into your Rust type, so downstream code works with a statically typed value
- **Zero configuration**: Automatic detection and fallback
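
The control flow of the dual strategy can be sketched as a small dispatch function. This is an illustration only: how the real adapter detects that function calling is unsupported is not documented here, so the `CallError::Unsupported` variant, the `Strategy` enum, and `with_fallback` are assumptions made for the example.

```rust
/// The two request strategies described above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Strategy {
    FunctionCalling,
    JsonMode,
}

/// Hypothetical error type; the real crate reports errors through `LlmError`.
#[derive(Debug, PartialEq)]
enum CallError {
    /// The endpoint signalled that it does not support function calling.
    Unsupported,
    /// Any other failure (network, rate limit, invalid response, ...).
    Other(String),
}

/// Tries function calling first and switches to JSON mode only when the
/// endpoint reports that function calling is unsupported.
/// Returns the value together with the strategy that produced it.
fn with_fallback<T>(
    mut call: impl FnMut(Strategy) -> Result<T, CallError>,
) -> Result<(T, Strategy), CallError> {
    match call(Strategy::FunctionCalling) {
        Ok(value) => Ok((value, Strategy::FunctionCalling)),
        Err(CallError::Unsupported) => {
            call(Strategy::JsonMode).map(|value| (value, Strategy::JsonMode))
        }
        // Other errors are not a capability problem, so they are passed through.
        Err(other) => Err(other),
    }
}
```

Keeping the fallback limited to the "unsupported" case means that genuine failures such as rate limits surface to the caller instead of triggering a second request.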
### Adding New Adapters
To add support for other providers:
1. Create a new module in `src/adapters/`
2. Implement the `Llm` trait
3. Use `generate_json_schema::<T>()` to get the schema
4. Adapt the schema to the provider's format (function calling, JSON mode, etc.)
5. Parse the response into type `T`
See `src/adapters/openai.rs` as a reference implementation.
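
Steps 3 to 5 can be sketched without any provider SDK. The example below is a simplified, synchronous stand-in and not the crate's actual `Llm` trait: `StructuredLlm`, `send`, and `CannedAdapter` are hypothetical names, the schema is passed in as a plain string rather than produced by `generate_json_schema::<T>()`, and `FromStr` stands in for `serde` deserialization so that the sketch compiles with the standard library alone.

```rust
use std::fmt::Display;
use std::str::FromStr;

/// Hypothetical, synchronous stand-in for an adapter trait.
trait StructuredLlm {
    /// Provider-specific part: send one prompt and return the raw response text.
    fn send(&self, prompt: &str) -> Result<String, String>;

    /// Shared part: embed the schema in the prompt, call the provider,
    /// then parse the response into `T`.
    fn create_structured_output<T>(
        &self,
        text: &str,
        system_prompt: &str,
        schema: &str,
    ) -> Result<T, String>
    where
        T: FromStr,
        T::Err: Display,
    {
        // Step 4: adapt the schema to the provider's format (here: plain prompt text).
        let prompt = format!("{system_prompt}\n\nSchema:\n{schema}\n\nText:\n{text}");
        let raw = self.send(&prompt)?;
        // Step 5: parse the response into the requested type.
        raw.trim()
            .parse::<T>()
            .map_err(|e| format!("invalid response: {e}"))
    }
}

/// Toy adapter that always returns the same response, useful for tests.
struct CannedAdapter {
    response: String,
}

impl StructuredLlm for CannedAdapter {
    fn send(&self, _prompt: &str) -> Result<String, String> {
        Ok(self.response.clone())
    }
}
```

A real adapter would replace `send` with an HTTP call to the provider and use `serde_json` to deserialize the response, as `src/adapters/openai.rs` does for OpenAI-compatible endpoints.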
## Testing
### Unit Tests
Run the unit tests:
```bash
cargo test --package cognee-llm --lib
```
### Integration Tests
The crate includes integration tests that exercise a real OpenAI-compatible
endpoint (OpenAI, or a local Ollama instance via its OpenAI-compatible API):
```bash
# Set environment variables
export OPENAI_URL="http://localhost:11435/v1"
export OPENAI_TOKEN="not-needed"
export OPENAI_MODEL="llama3.2:3b"
# Run integration tests
cargo test --package cognee-llm --test integration_openai -- --nocapture
```
Tests are skipped automatically if these environment variables are not set.
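
One common way to implement this kind of skip is to read the variables up front and return early when any of them is missing. The sketch below is an assumption about the pattern, not a copy of the crate's test code; `EndpointConfig`, `endpoint_from`, and `endpoint_from_env` are hypothetical names, and treating empty values as missing is a choice made for the example.

```rust
use std::env;

/// Connection settings for an OpenAI-compatible endpoint.
#[derive(Debug, PartialEq)]
struct EndpointConfig {
    url: String,
    token: String,
    model: String,
}

/// Builds the config from any lookup function; returns `None` when a
/// variable is missing or empty. Taking a closure keeps this testable
/// without touching the process environment.
fn endpoint_from(lookup: impl Fn(&str) -> Option<String>) -> Option<EndpointConfig> {
    let get = |name: &str| lookup(name).filter(|value| !value.is_empty());
    Some(EndpointConfig {
        url: get("OPENAI_URL")?,
        token: get("OPENAI_TOKEN")?,
        model: get("OPENAI_MODEL")?,
    })
}

/// Reads the same three variables from the real environment.
fn endpoint_from_env() -> Option<EndpointConfig> {
    endpoint_from(|name| env::var(name).ok())
}
```

An integration test written this way would begin with `let Some(cfg) = endpoint_from_env() else { return; };`, so it passes trivially on machines where no endpoint is configured.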
## Adapters
Implemented:
- **`OpenAIAdapter`** — OpenAI-compatible APIs (also works with Ollama/vLLM/LocalAI)
On-device LiteRT inference (Android) ships in the closed companion crate
`cognee-llm-litert`.
Planned:
- Anthropic adapter (Claude with tool use)
- Streaming support: real-time token streaming for all adapters