# Secretary
Secretary is a Rust library that transforms natural language into structured data using large language models (LLMs). With its powerful derive macro system, you can extract structured information from unstructured text with minimal boilerplate code.
## Features

- **Unified Task Trait**: Single trait combining data extraction, schema definition, and system prompt generation via `#[derive(Task)]`
- **Schema-Based Extraction**: Define your data structure using Rust structs with field-level instructions
- **Context-Aware Conversations**: Maintain conversation state for multi-turn interactions
- **Declarative Field Instructions**: Use `#[task(instruction = "...")]` attributes to guide extraction
- **Async Support**: Built-in async/await support for concurrent processing
- **Extensible LLM Support**: Currently supports the OpenAI API, with more providers planned
- **Type Safety**: Leverage Rust's type system for reliable data extraction
- **Simplified API**: Consolidated traits reduce boilerplate and complexity
## Quick Start

### Basic Example
```rust
// NOTE: the module paths below are assumed; check the secretary crate docs
// for the exact paths.
use secretary::Task;
use secretary::OpenAILLM;
use secretary::GenerateData;
use serde::{Deserialize, Serialize};

// Define your data structure with extraction instructions
```
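The struct referenced by the comment above might look like the following. This is a minimal sketch: the struct name, field names, and the types of the two required bookkeeping fields are illustrative assumptions, not the crate's exact API.

```rust
#[derive(Task, Serialize, Deserialize, Debug, Default)]
struct PersonTask {
    #[task(instruction = "Extract the person's full name")]
    name: String,

    #[task(instruction = "Extract the person's age as a number")]
    age: u8,

    #[task(instruction = "Extract the email address, or null if absent")]
    email: Option<String>,

    // Required bookkeeping fields, excluded from the extracted JSON
    // (field types are assumptions; see How It Works below).
    #[serde(skip)]
    context: Vec<String>,

    #[serde(skip)]
    additional_instructions: Vec<String>,
}
```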
## How It Works

- **Define Your Schema**: Create a Rust struct with `#[derive(Task)]` and field-level instructions
- **Add Required Fields**: Include `context` and `additional_instructions` fields (marked with `#[serde(skip)]`)
- **Annotate Fields**: Use `#[task(instruction = "...")]` to guide the LLM on how to extract each field
- **Automatic Implementation**: The derive macro implements all necessary traits (data model, system prompt generation, context management)
- **Create Task Instance**: Initialize with `YourStruct::new(additional_instructions)`
- **Process Text**: Send natural language input to an LLM through the Secretary API
- **Get Structured Data**: Receive JSON that can be parsed back into your struct (an end-to-end sketch follows below)
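Putting these steps together, a hedged end-to-end sketch. The constructor arguments and exact return types are assumptions; the method names follow the API Reference table below.

```rust
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Schema definition and trait implementation are handled by the
    // derive macro on PersonTask (see the Quick Start above).

    // Create the task instance, passing additional instructions.
    let task = PersonTask::new(vec!["Prefer ISO 8601 dates.".to_string()]);

    // Send natural language input through the Secretary API.
    let llm = OpenAILLM::new("https://api.openai.com/v1", "sk-...", "gpt-4o-mini")?;
    let json = llm.generate_data(&task, "Alice Smith is 34; reach her at alice@example.com.")?;

    // Parse the returned JSON back into the struct.
    let person = PersonTask::from_json(&json)?;
    println!("{:?}", person);
    Ok(())
}
```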
## Field Instructions

The `#[task(instruction = "...")]` attribute tells the LLM how to extract each field:
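For example (the struct and field names are illustrative; the instruction strings are free-form guidance):

```rust
#[derive(Task, Serialize, Deserialize, Debug, Default)]
struct ProductTask {
    #[task(instruction = "Extract the product name exactly as it appears")]
    product_name: String,

    #[task(instruction = "Extract the price as a plain number, without currency symbols")]
    price: f64,

    #[task(instruction = "List each feature mentioned, one string per feature")]
    features: Vec<String>,

    #[serde(skip)]
    context: Vec<String>,

    #[serde(skip)]
    additional_instructions: Vec<String>,
}
```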
## Advanced Features

### Async Processing

Secretary provides full async support for concurrent processing:
```rust
// NOTE: module path assumed; see crate docs.
use secretary::AsyncGenerateData;
use tokio;
```
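Continuing from the imports above, a sketch that fans two extractions out concurrently with `tokio::join!`. The `async_generate_data()` name follows the API Reference table; its exact signature is an assumption, and tokio's `macros` and runtime features are required.

```rust
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let llm = OpenAILLM::new("https://api.openai.com/v1", "sk-...", "gpt-4o-mini")?;
    let task = PersonTask::new(vec![]);

    // Both requests are in flight at the same time.
    let (a, b) = tokio::join!(
        llm.async_generate_data(&task, "Bob is 41."),
        llm.async_generate_data(&task, "Carol is 29.")
    );
    println!("{}\n{}", a?, b?);
    Ok(())
}
```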
### Context-Aware Conversations

Maintain conversation state for multi-turn interactions:

```rust
use secretary::Role; // module path assumed; see crate docs
```
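A sketch of a multi-turn exchange. The `push()` signature and the `Role` variants are assumptions based on the API Reference table below.

```rust
let mut task = PersonTask::new(vec![]);

// Record earlier turns so later extractions can resolve references.
task.push(Role::User, "My name is Alice Smith.");
task.push(Role::Assistant, r#"{"name": "Alice Smith"}"#);

// "She" is resolved against the stored context.
let json = llm.generate_data_with_context(&task, "She just turned 35.")?;
```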
### System Prompt Generation

The derive macro automatically generates a comprehensive system prompt:

```rust
// Assumes the PersonTask struct from the Quick Start; new() takes the
// additional instructions, per How It Works above.
let task = PersonTask::new(vec![]);
let prompt = task.get_system_prompt();
println!("{}", prompt);

// Output includes:
// - JSON structure specification
// - Field-specific instructions
// - Additional instructions
// - Formatting guidelines
```
## Examples

The `examples/` directory contains practical demonstrations:

### Basic Usage

- `sync.rs` - Basic person information extraction using the synchronous API
- `async.rs` - Async product information extraction with comprehensive testing

Run the examples with:

```bash
# Commands assume the standard cargo examples layout.

# Basic synchronous example
cargo run --example sync

# Async example with comprehensive testing
cargo run --example async

# To test against a real API, set the environment variables described below.
```
## Environment Setup
For production use with OpenAI:
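For example, something like the following (variable names are placeholders; they just need to match whatever your code reads):

```bash
export OPENAI_API_BASE="https://api.openai.com/v1"
export OPENAI_API_KEY="sk-..."
export OPENAI_MODEL="gpt-4o-mini"
```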
In your code:
```rust
use std::env::var;

// Variable names are illustrative and must match what you exported above;
// the OpenAILLM::new() signature is likewise an assumption.
let api_base = var("OPENAI_API_BASE")
    .expect("OPENAI_API_BASE must be set");
let api_key = var("OPENAI_API_KEY")
    .expect("OPENAI_API_KEY must be set");
let model = var("OPENAI_MODEL")
    .expect("OPENAI_MODEL must be set");

let llm = OpenAILLM::new(&api_base, &api_key, &model)?;
```
## API Reference

### Core Traits

| Trait | Purpose | Key Methods |
|---|---|---|
| `Task` | Main trait for data extraction tasks | `new()`, `get_system_prompt()`, `push()` |
| `GenerateData` | Synchronous LLM interaction | `generate_data()`, `generate_data_with_context()` |
| `AsyncGenerateData` | Asynchronous LLM interaction | `async_generate_data()`, `async_generate_data_with_context()` |
| `IsLLM` | LLM provider abstraction | `access_client()`, `access_model()` |
| `ToJSON` / `FromJSON` | Serialization utilities | `to_json()`, `from_json()` |
### Derive Macro Attributes

- `#[derive(Task)]` - Implements the `Task` trait automatically
- `#[task(instruction = "...")]` - Provides field-specific extraction instructions
- `#[serde(skip)]` - Required for the `context` and `additional_instructions` fields
## Troubleshooting

### Common Issues

#### "Failed to execute function" Error
- Check your API key and endpoint configuration
- Verify network connectivity
- Ensure the model name is correct
#### Serialization Errors

- Ensure all data fields implement `Serialize` and `Deserialize`
- Check that field types match the expected JSON structure
- Verify that optional fields are properly handled
#### Context Management Issues

- Remember to include the required fields: `context` and `additional_instructions`
- Mark these fields with `#[serde(skip)]`
- Use the `push()` method to add messages to the context
### Performance Tips
- Use async methods for concurrent processing
- Batch multiple requests when possible (see the sketch after this list)
- Consider caching LLM responses for repeated queries
- Use specific field instructions to improve extraction accuracy
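For the batching tip, a sketch using `futures::future::join_all` to fan a batch of inputs out concurrently; it assumes the async API from Advanced Features and must run inside an async context.

```rust
use futures::future::join_all;

// Inside an async fn, with `llm` and `task` set up as in the async example.
let inputs = ["Alice is 34.", "Bob is 41.", "Carol is 29."];

// All requests run concurrently; results come back in input order.
let results = join_all(
    inputs.iter().map(|text| llm.async_generate_data(&task, text))
).await;

for result in results {
    println!("{}", result?);
}
```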
## Roadmap
- Support for additional LLM providers (Anthropic, Azure OpenAI, etc.)
- Enhanced error handling and validation
- Performance optimizations and caching
- Integration with more serialization formats
- Advanced prompt engineering features
- Streaming response support
## Contributing
Contributions are welcome!
## License
This project is licensed under the MIT License - see the LICENSE file for details.