instructors 1.3.3

Type-safe structured output extraction from LLMs. The Rust instructor.

Define a Rust struct → instructors generates the JSON Schema → LLM returns valid JSON → you get a typed value. With automatic validation and retry.

Highlights

  • 6 providers — OpenAI, Anthropic, Gemini, DeepSeek, Together, any OpenAI/Anthropic/Gemini-compatible API
  • Validation + retry — invalid output is fed back to the LLM with error details for automatic correction
  • JSON auto-repair — fixes trailing commas, single quotes, markdown fences before retry, saving tokens and latency
  • Provider fallback — chain multiple providers for automatic failover
  • Streaming — partial JSON tokens as they arrive (OpenAI, Anthropic, Gemini)
  • Batch + concurrent — process hundreds of prompts with configurable concurrency via Semaphore
  • Vision — extract structured data from images (URL or base64)
  • Cost tracking — per-request token count and cost estimation via tiktoken

Why instructors?

| Aspect | Raw API + serde | instructors |
| --- | --- | --- |
| Schema enforcement | Manual JSON Schema writing | Auto-generated from #[derive(JsonSchema)] |
| Parse failures | Crash or silent data loss | Automatic retry with error feedback to LLM |
| Malformed JSON | Application error | Auto-repaired before deserialization |
| Multiple providers | Rewrite per provider | One interface, swap with one line |
| Validation | Manual if/else after parsing | .validate() with LLM-aware retry |
| Cost tracking | Count tokens yourself | Built-in via tiktoken |

Quick Start

use instructors::prelude::*;

#[derive(Debug, Deserialize, JsonSchema)]
struct Contact { name: String, email: Option<String> }

let client = Client::openai("sk-...");
let contact: Contact = client
    .extract("Contact John Doe at john@example.com")
    .await?.value;

Installation

[dependencies]
instructors = "1"

Providers

| Provider | Constructor | Mechanism |
| --- | --- | --- |
| OpenAI | Client::openai(key) | response_format strict JSON Schema |
| Anthropic | Client::anthropic(key) | tool_use with forced tool choice |
| OpenAI-compatible | Client::openai_compatible(key, url) | Same as OpenAI (DeepSeek, Together, etc.) |
| Anthropic-compatible | Client::anthropic_compatible(key, url) | Same as Anthropic |
| Google Gemini | Client::gemini(key) | response_schema structured JSON |
| Gemini-compatible | Client::gemini_compatible(key, url) | Same as Gemini |

// OpenAI
let client = Client::openai("sk-...");

// Anthropic
let client = Client::anthropic("sk-ant-...");

// DeepSeek, Together, or any OpenAI-compatible API
let client = Client::openai_compatible("sk-...", "https://api.deepseek.com/v1");

// Anthropic-compatible proxy
let client = Client::anthropic_compatible("sk-...", "https://proxy.example.com/v1");

// Google Gemini
let client = Client::gemini("AIza...");

// Gemini-compatible proxy
let client = Client::gemini_compatible("AIza...", "https://proxy.example.com/v1beta");

Streaming

Stream partial JSON tokens as they arrive:

let result = client.extract::<Contact>("...")
    .on_stream(|chunk| {
        print!("{chunk}");  // partial JSON fragments
    })
    .await?;

All three providers (OpenAI, Anthropic, Gemini) support streaming. The final result is assembled from all chunks and deserialized as usual.
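
The assembly step can be sketched roughly as follows. `run_stream` is a hypothetical stand-in for the provider stream loop, not part of the instructors API; it only shows how each fragment fires the callback and is accumulated for the final deserialization:

```rust
// Sketch: how streamed fragments could be assembled into the final JSON.
// `run_stream` is a stand-in for the provider stream, not the instructors API.
fn run_stream(chunks: &[&str], mut on_chunk: impl FnMut(&str)) -> String {
    let mut full = String::new();
    for chunk in chunks {
        on_chunk(chunk);      // fires as each fragment arrives
        full.push_str(chunk); // accumulate for the final parse
    }
    full // assembled JSON, ready for serde_json-style deserialization
}

fn main() {
    let chunks = ["{\"name\":", "\"John Doe\"", "}"];
    let full = run_stream(&chunks, |c| print!("{c}"));
    assert_eq!(full, "{\"name\":\"John Doe\"}");
}
```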

Image Input

Extract structured data from images using vision-capable models:

use instructors::ImageInput;

// from URL
let result = client.extract::<Description>("Describe this image")
    .image(ImageInput::Url("https://example.com/photo.jpg".into()))
    .model("gpt-4o")
    .await?;

// from base64
let result = client.extract::<Description>("Describe this image")
    .image(ImageInput::Base64 {
        media_type: "image/png".into(),
        data: base64_string,
    })
    .await?;

// multiple images
let result = client.extract::<Comparison>("Compare these images")
    .images(vec![
        ImageInput::Url("https://example.com/a.jpg".into()),
        ImageInput::Url("https://example.com/b.jpg".into()),
    ])
    .await?;

Provider Fallback

Chain multiple providers for automatic failover:

let client = Client::openai("sk-...")
    .with_fallback(Client::anthropic("sk-ant-..."))
    .with_fallback(Client::openai_compatible("sk-...", "https://api.deepseek.com/v1"));

// tries OpenAI first → Anthropic on failure → DeepSeek as last resort
let result = client.extract::<Contact>("...").await?;

Each fallback is tried in order after the primary provider exhausts its retries.
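
The failover order reduces to a first-success-wins loop, sketched here with a hypothetical `Provider` function type standing in for the actual client:

```rust
// Sketch of fallback ordering; `Provider` is a hypothetical stand-in,
// not the instructors client type.
type Provider = fn(&str) -> Result<String, String>;

fn extract_with_fallback(providers: &[Provider], prompt: &str) -> Result<String, String> {
    let mut last_err = String::from("no providers configured");
    for provider in providers {
        match provider(prompt) {
            Ok(v) => return Ok(v),  // first success wins
            Err(e) => last_err = e, // remember the error, try the next provider
        }
    }
    Err(last_err)
}

fn main() {
    let primary: Provider = |_| Err("rate limited".into());
    let fallback: Provider = |p| Ok(format!("extracted: {p}"));
    let result = extract_with_fallback(&[primary, fallback], "hello");
    assert_eq!(result.unwrap(), "extracted: hello");
}
```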

Validation

Validate extracted data with automatic retry — invalid results are fed back to the LLM with error details.

Closure-based

let user: User = client.extract("...")
    .validate(|u: &User| {
        if u.age > 150 { Err("age must be <= 150".into()) } else { Ok(()) }
    })
    .await?.value;

Trait-based

use instructors::prelude::*;

#[derive(Debug, Deserialize, JsonSchema)]
struct Email { address: String }

impl Validate for Email {
    fn validate(&self) -> Result<(), ValidationError> {
        if self.address.contains('@') { Ok(()) }
        else { Err("invalid email".into()) }
    }
}

let email: Email = client.extract("...").validated().await?.value;
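
The retry loop behind both styles can be sketched roughly like this, with a hypothetical `call_llm` closure standing in for the provider request; the key point is that the validation error is appended to the next attempt's prompt:

```rust
// Sketch of the validate-and-retry loop. `call_llm` is a hypothetical
// stand-in that receives the prompt plus accumulated error feedback.
fn extract_validated(
    mut call_llm: impl FnMut(&str) -> String,
    validate: impl Fn(&str) -> Result<(), String>,
    prompt: &str,
    max_retries: u32,
) -> Result<String, String> {
    let mut feedback = String::new();
    for _ in 0..=max_retries {
        let output = call_llm(&format!("{prompt}{feedback}"));
        match validate(&output) {
            Ok(()) => return Ok(output),
            // send the validation error back so the model can correct itself
            Err(e) => feedback = format!("\nPrevious attempt failed: {e}"),
        }
    }
    Err("max retries exhausted".into())
}

fn main() {
    let mut attempt = 0;
    let result = extract_validated(
        |_| { attempt += 1; if attempt == 1 { "bad".into() } else { "a@b.com".into() } },
        |s| if s.contains('@') { Ok(()) } else { Err("invalid email".into()) },
        "extract an email",
        3,
    );
    assert_eq!(result.unwrap(), "a@b.com");
}
```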

List Extraction

Extract multiple items from text with extract_many:

#[derive(Debug, Deserialize, JsonSchema)]
struct Entity {
    name: String,
    entity_type: String,
}

let entities: Vec<Entity> = client
    .extract_many("Apple CEO Tim Cook met Google CEO Sundar Pichai")
    .await?.value;

Batch Processing

Process multiple prompts concurrently with configurable concurrency:

let prompts = vec!["review 1".into(), "review 2".into(), "review 3".into()];

let results = client
    .extract_batch::<Review>(prompts)
    .concurrency(5)
    .validate(|r: &Review| { /* ... */ Ok(()) })
    .run()
    .await;

// each result is independent — partial failures don't affect others
for result in results {
    match result {
        Ok(r) => println!("{:?}", r.value),
        Err(e) => eprintln!("failed: {e}"),
    }
}
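
The concurrency cap works like a counting semaphore: at most N extractions hold a permit at once, and each result is collected independently. A minimal std-only sketch (the crate itself is async; this threaded version is illustrative only):

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// Minimal counting semaphore sketching the concurrency cap described above.
struct Semaphore { permits: Mutex<usize>, cv: Condvar }

impl Semaphore {
    fn new(n: usize) -> Self { Self { permits: Mutex::new(n), cv: Condvar::new() } }
    fn acquire(&self) {
        let mut p = self.permits.lock().unwrap();
        while *p == 0 { p = self.cv.wait(p).unwrap(); } // block until a permit frees up
        *p -= 1;
    }
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.cv.notify_one();
    }
}

fn main() {
    let sem = Arc::new(Semaphore::new(2)); // at most 2 prompts in flight
    let handles: Vec<_> = (0..5).map(|i| {
        let sem = Arc::clone(&sem);
        thread::spawn(move || {
            sem.acquire();
            let result = format!("processed prompt {i}"); // stand-in for one extraction
            sem.release();
            result
        })
    }).collect();
    let results: Vec<String> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    assert_eq!(results.len(), 5); // every prompt completes independently
}
```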

Multi-turn Conversations

Pass message history for context-aware extraction:

use instructors::Message;

let result = client.extract::<Summary>("summarize the above")
    .messages(vec![
        Message::user("Here is a long document..."),
        Message::assistant("I see the document."),
    ])
    .await?;

Classification

Enums work naturally for classification tasks:

#[derive(Debug, Deserialize, JsonSchema)]
enum Sentiment { Positive, Negative, Neutral }

let sentiment: Sentiment = client
    .extract("This product is amazing!")
    .await?.value;

Nested Types

Complex nested structures with vectors, options, and enums:

#[derive(Debug, Deserialize, JsonSchema)]
struct Paper {
    title: String,
    authors: Vec<Author>,
    keywords: Vec<String>,
}

#[derive(Debug, Deserialize, JsonSchema)]
struct Author {
    name: String,
    affiliation: Option<String>,
}

let paper: Paper = client.extract(&pdf_text).model("gpt-4o").await?.value;

Retry & Timeout

Enable exponential backoff on HTTP 429/503 errors and set an overall request timeout:

use std::time::Duration;
use instructors::BackoffConfig;

let client = Client::openai("sk-...")
    .with_retry_backoff(BackoffConfig::default())  // 500ms base, 30s cap, 3 retries
    .with_timeout(Duration::from_secs(120));        // overall timeout

// per-request override
let result = client.extract::<Contact>("...")
    .retry_backoff(BackoffConfig {
        base_delay: Duration::from_millis(200),
        max_delay: Duration::from_secs(10),
        jitter: true,
        max_http_retries: 5,
    })
    .timeout(Duration::from_secs(30))
    .await?;

Without backoff configured, HTTP 429/503 errors fail immediately (default behavior unchanged).
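
The delay schedule implied by the defaults above (500ms base, doubling per attempt, 30s cap) can be sketched as follows; the crate's exact formula may differ, and jitter is omitted here for determinism:

```rust
use std::time::Duration;

// Sketch of an exponential-backoff schedule with a cap:
// delay = min(base * 2^attempt, cap). Jitter omitted for determinism.
fn backoff_delay(base: Duration, cap: Duration, attempt: u32) -> Duration {
    let delay = base.saturating_mul(2u32.saturating_pow(attempt));
    delay.min(cap)
}

fn main() {
    let base = Duration::from_millis(500);
    let cap = Duration::from_secs(30);
    // first attempts with the documented defaults: 500ms, 1s, 2s, 4s, ...
    assert_eq!(backoff_delay(base, cap, 0), Duration::from_millis(500));
    assert_eq!(backoff_delay(base, cap, 2), Duration::from_secs(2));
    // a late attempt hits the 30s cap
    assert_eq!(backoff_delay(base, cap, 10), Duration::from_secs(30));
}
```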

Configuration

let result: MyStruct = client
    .extract("input text")
    .model("gpt-4o-mini")            // override model
    .system("You are an expert...")   // custom system prompt
    .temperature(0.0)                 // deterministic output
    .max_tokens(2048)                 // limit output tokens
    .max_retries(3)                   // retry on parse/validation failure
    .context("extra context...")      // append to prompt
    .retry_backoff(BackoffConfig::default()) // HTTP 429/503 backoff
    .timeout(Duration::from_secs(30))        // overall timeout
    .await?
    .value;

Client Defaults

Set defaults once, override per-request:

let client = Client::openai("sk-...")
    .with_model("gpt-4o-mini")
    .with_temperature(0.0)
    .with_max_retries(3)
    .with_system("Extract data precisely.");

// all extractions use the defaults above
let a: TypeA = client.extract("...").await?.value;
let b: TypeB = client.extract("...").await?.value;

// override for a specific request
let c: TypeC = client.extract("...").model("gpt-4o").await?.value;

Cost Tracking

Built-in token counting and cost estimation via tiktoken:

let result = client.extract::<Contact>("...").await?;

println!("input:  {} tokens", result.usage.input_tokens);
println!("output: {} tokens", result.usage.output_tokens);
println!("cost:   ${:.6}", result.usage.cost.unwrap_or(0.0));
println!("retries: {}", result.usage.retries);

Disable with default-features = false:

[dependencies]
instructors = { version = "1", default-features = false }

JSON Repair

When an LLM returns malformed JSON — trailing commas, single quotes, unquoted keys, markdown code fences, etc. — instructors automatically attempts to repair the output before deserialization. If repair succeeds, the fixed JSON is parsed directly without burning a retry. This saves both tokens and latency, especially with smaller or open-source models that are more likely to produce slightly broken output.

Repair is transparent: you don't need to configure anything. It runs on every response before serde_json parsing, and falls back to the normal retry path if the output can't be fixed.
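
A minimal sketch of two of the repairs mentioned above, stripping markdown fences and dropping trailing commas; the crate's actual repair pass covers more cases (single quotes, unquoted keys) and this is illustrative only:

```rust
// Sketch: strip a markdown fence and remove trailing commas before parsing.
fn repair_json(raw: &str) -> String {
    let mut s = raw.trim();
    // strip a ```json ... ``` fence if present
    if let Some(rest) = s.strip_prefix("```json") { s = rest; }
    else if let Some(rest) = s.strip_prefix("```") { s = rest; }
    if let Some(rest) = s.strip_suffix("```") { s = rest; }
    let s = s.trim();
    // drop commas that appear directly before a closing brace/bracket
    let mut out = String::with_capacity(s.len());
    let mut it = s.chars();
    while let Some(c) = it.next() {
        if c == ',' {
            // look ahead (ignoring whitespace) at what follows the comma
            let next = it.clone().find(|ch| !ch.is_whitespace());
            if matches!(next, Some('}') | Some(']')) {
                continue; // trailing comma before a closer: drop it
            }
        }
        out.push(c);
    }
    out
}

fn main() {
    let raw = "```json\n{\"name\": \"John\",}\n```";
    assert_eq!(repair_json(raw), "{\"name\": \"John\"}");
}
```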

Lifecycle Hooks

Observe requests and responses:

let result = client.extract::<Contact>("...")
    .on_request(|model, prompt| {
        println!("[req] model={model}, prompt_len={}", prompt.len());
    })
    .on_response(|usage| {
        println!("[res] tokens={}, cost={:?}", usage.total_tokens, usage.cost);
    })
    .await?;

How It Works

  1. #[derive(JsonSchema)] generates a JSON Schema from your Rust type (via schemars)
  2. Schema is cached per type (thread-local, zero lock contention)
  3. The schema is transformed for the target provider:
    • OpenAI: wrapped in response_format with strict mode (additionalProperties: false, all fields required)
    • Anthropic: wrapped as a tool with input_schema, forced via tool_choice
    • Gemini: passed as response_schema with response_mime_type: "application/json"
  4. LLM is constrained to produce valid JSON matching the schema
  5. Response JSON is automatically repaired if malformed (trailing commas, single quotes, unquoted keys, markdown fences)
  6. Response is deserialized with serde_json::from_str::<T>()
  7. If Validate trait or .validate() closure is present, validation runs
  8. On parse/validation failure, error feedback is sent back and the request is retried
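
For step 3, the OpenAI wrapping follows OpenAI's documented response_format payload shape. The sketch below builds it with format! to stay dependency-free; the crate itself works with typed JSON values rather than strings:

```rust
// Sketch of step 3 for OpenAI: wrapping a generated schema in the strict
// response_format payload ({"type":"json_schema", ...}).
fn openai_response_format(name: &str, schema_json: &str) -> String {
    format!(
        r#"{{"type":"json_schema","json_schema":{{"name":"{name}","strict":true,"schema":{schema_json}}}}}"#
    )
}

fn main() {
    // a schema as schemars might emit it for the Contact example above
    let schema = r#"{"type":"object","properties":{"name":{"type":"string"}},"required":["name"],"additionalProperties":false}"#;
    let payload = openai_response_format("Contact", schema);
    assert!(payload.contains("\"strict\":true"));
    assert!(payload.contains("\"additionalProperties\":false"));
}
```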

Ecosystem

instructors is part of airs (AI in Rust Series):

| Crate | Description |
| --- | --- |
| tiktoken | High-performance BPE tokenizer for all mainstream LLMs |
| embedrs | Unified embedding — cloud APIs + local inference, one interface |
| chunkedrs | AI-native text chunking — recursive, markdown, semantic |

License

MIT