GroqAI - Rust Client SDK Library
A type-safe, async Rust SDK for the Groq API. It covers the chat, audio, file, batch, and model endpoints, with structured error types, automatic retries, and rate limiting built in.
Features
- 🚀 High Performance - Built for speed with async/await support and efficient HTTP transport
- 💬 Chat Completions - Support for both streaming and non-streaming conversations with advanced message types
- 🎵 Audio Processing - Transcription and translation using Whisper models with file and URL support
- 📁 File Management - Complete file lifecycle management (upload, list, retrieve, delete)
- 🔄 Batch Processing - Efficient bulk operations for large-scale tasks with status monitoring
- 🤖 Model Information - Retrieve available models and their detailed capabilities
- 🎯 Fine-tuning - Custom model training support with supervised learning
- 🛡️ Enterprise Error Handling - Comprehensive error types, automatic retries, and graceful degradation
- 📊 Smart Rate Limiting - Built-in rate limiting with exponential backoff and retry-after header support
- 🔧 Flexible Configuration - Customizable timeouts, proxies, base URLs, and transport settings
- 🔒 Type Safety - Strongly typed API with compile-time guarantees
- 🌐 Proxy Support - Full HTTP/HTTPS proxy support for enterprise environments
- 📝 Rich Message Types - Support for text, images, and multi-part messages
- 🔄 Conversation Management - Built-in conversation history management with token optimization
Quick Start
Installation
Add this to your Cargo.toml:
[dependencies]
groqai = "0.1.8"
tokio = { version = "1.47", features = ["full"] }
serde = { version = "1.0", features = ["derive"] }
Or install via cargo:
Basic Usage
use ;
async
API Reference
Chat Completions
Non-streaming Chat
use ;
let client = new?.build?;
let messages = vec!;
let response = client
.chat
.messages
.temperature
.send
.await?;
println!;
Streaming Chat
use StreamExt;
let mut stream = client
.chat
.messages
.stream
.send_stream
.await?;
while let Some = stream.next.await
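For readers curious about what a streaming client does under the hood, here is a minimal sketch of parsing a server-sent event stream. It assumes the OpenAI-style wire format, where each chunk arrives on a `data: ` line and the stream ends with `data: [DONE]`. The `SseEvent` type and both functions are hypothetical names for this illustration, not part of the SDK.

```rust
/// One parsed line of an event stream (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum SseEvent<'a> {
    /// A `data:` line carrying a payload (left unparsed here).
    Data(&'a str),
    /// The `[DONE]` sentinel that closes the stream.
    Done,
    /// Blank lines, comments, and fields this sketch does not handle.
    Ignored,
}

/// Classifies a single line of the stream.
pub fn parse_sse_line(line: &str) -> SseEvent<'_> {
    // Strip any trailing line terminator before inspecting the prefix.
    let line = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
    match line.strip_prefix("data:") {
        Some(rest) => {
            let payload = rest.trim_start();
            if payload == "[DONE]" {
                SseEvent::Done
            } else {
                SseEvent::Data(payload)
            }
        }
        None => SseEvent::Ignored,
    }
}

/// Collects payloads in order and stops at the `[DONE]` sentinel.
pub fn collect_payloads(raw: &str) -> Vec<&str> {
    let mut out = Vec::new();
    for line in raw.lines() {
        match parse_sse_line(line) {
            SseEvent::Data(payload) => out.push(payload),
            SseEvent::Done => break,
            SseEvent::Ignored => {}
        }
    }
    out
}
```

In a real client each payload would then be deserialized into a typed chunk; the sketch stops at the raw string so that it needs no external crates.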
Audio Processing
Transcription
use AudioTranscriptionRequest;
use std::path::PathBuf;
let request = AudioTranscriptionRequest ;
let transcription = client.audio.transcribe.await?;
println!;
Translation
use AudioTranslationRequest;
let request = AudioTranslationRequest ;
let translation = client.audio.translate.await?;
println!;
File Management
use FileCreateRequest;
use std::path::PathBuf;
// Upload a file
let request = new?;
let file = client.files.create.await?;
// List files
let files = client.files.list.await?;
for file in files.data
// Retrieve a file
let file = client.files.retrieve.await?;
// Delete a file
let deletion = client.files.delete.await?;
Batch Processing
use BatchCreateRequest;
// Create a batch job
let request = BatchCreateRequest ;
let batch = client.batches.create.await?;
println!;
// Check batch status
let batch = client.batches.retrieve.await?;
println!;
// List batches
let batches = client.batches.list.await?;
Model Information
// List available models
let models = client.models.list.await?;
for model in models.data
// Get model details
let model = client.models.retrieve.await?;
println!;
Configuration
Custom Configuration
use std::time::Duration;
use Url;
let client = new?
.base_url
.timeout
.build?;
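The builder calls above are fallible at two points: construction and `build`. The following sketch shows one way such a builder could be structured, using only the standard library. Everything here is an assumption made for illustration: the `ClientConfigBuilder`, `ClientConfig`, and `ConfigError` names, the default base URL, the 30-second default timeout, and the specific validation rules are not taken from the SDK.

```rust
use std::time::Duration;

/// Validated settings (hypothetical type, not the SDK's).
#[derive(Debug, Clone, PartialEq)]
pub struct ClientConfig {
    pub api_key: String,
    pub base_url: String,
    pub timeout: Duration,
}

#[derive(Debug, PartialEq)]
pub enum ConfigError {
    EmptyApiKey,
    InvalidBaseUrl(String),
    ZeroTimeout,
}

pub struct ClientConfigBuilder {
    api_key: String,
    base_url: String,
    timeout: Duration,
}

impl ClientConfigBuilder {
    /// Rejects blank keys up front, so later steps can rely on a key being present.
    pub fn new(api_key: impl Into<String>) -> Result<Self, ConfigError> {
        let api_key = api_key.into();
        if api_key.trim().is_empty() {
            return Err(ConfigError::EmptyApiKey);
        }
        Ok(Self {
            api_key,
            // Assumed defaults for this sketch.
            base_url: "https://api.groq.com/openai/v1".to_string(),
            timeout: Duration::from_secs(30),
        })
    }

    pub fn base_url(mut self, url: impl Into<String>) -> Self {
        self.base_url = url.into();
        self
    }

    pub fn timeout(mut self, timeout: Duration) -> Self {
        self.timeout = timeout;
        self
    }

    /// Checks the remaining invariants before handing out a config.
    pub fn build(self) -> Result<ClientConfig, ConfigError> {
        if !self.base_url.starts_with("https://") {
            return Err(ConfigError::InvalidBaseUrl(self.base_url));
        }
        if self.timeout.is_zero() {
            return Err(ConfigError::ZeroTimeout);
        }
        Ok(ClientConfig {
            api_key: self.api_key,
            base_url: self.base_url,
            timeout: self.timeout,
        })
    }
}
```

Deferring the URL and timeout checks to `build` keeps the setter methods infallible, so they chain without a `?` after each call.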
Using Proxy
let proxy = http?;
let client = new?
.proxy
.build?;
Error Handling
The library provides comprehensive error handling through the GroqError enum:
use GroqError;
match client.chat.messages.send.await
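The variants of `GroqError` are not listed here, so the sketch below uses a stand-in enum to illustrate the general pattern: map HTTP status codes to typed variants, then let callers decide what is worth retrying. The `ApiError` type, its variants, and the status mapping are assumptions for this example and may differ from the SDK's actual error type.

```rust
use std::fmt;
use std::time::Duration;

/// Stand-in error type for this sketch (not the SDK's `GroqError`).
#[derive(Debug, PartialEq)]
pub enum ApiError {
    InvalidApiKey,
    RateLimited { retry_after: Option<Duration> },
    Server { status: u16 },
    Other { status: u16 },
}

impl ApiError {
    /// Maps an HTTP status (plus an optional Retry-After value in seconds) to a variant.
    pub fn from_status(status: u16, retry_after_secs: Option<u64>) -> Self {
        match status {
            401 => ApiError::InvalidApiKey,
            429 => ApiError::RateLimited {
                retry_after: retry_after_secs.map(Duration::from_secs),
            },
            500..=599 => ApiError::Server { status },
            _ => ApiError::Other { status },
        }
    }

    /// Rate limits and server errors are usually transient; other errors are not.
    pub fn is_retryable(&self) -> bool {
        matches!(self, ApiError::RateLimited { .. } | ApiError::Server { .. })
    }
}

impl fmt::Display for ApiError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ApiError::InvalidApiKey => write!(f, "invalid API key"),
            ApiError::RateLimited { retry_after: Some(d) } => {
                write!(f, "rate limited, retry after {}s", d.as_secs())
            }
            ApiError::RateLimited { retry_after: None } => write!(f, "rate limited"),
            ApiError::Server { status } => write!(f, "server error (HTTP {})", status),
            ApiError::Other { status } => write!(f, "unexpected response (HTTP {})", status),
        }
    }
}

impl std::error::Error for ApiError {}
```

Matching on a typed enum like this lets the compiler flag any variant a caller forgot to handle, which string-based errors cannot do.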
Supported Models
The SDK supports all current Groq models with built-in type safety:
Chat Models
- Llama 3.1 and 3.3 Series:
  - llama-3.1-8b-instant - Fast responses for simple tasks
  - llama-3.1-70b-versatile - Balanced performance and capability
  - llama-3.1-405b-reasoning - Advanced reasoning and complex tasks
  - llama-3.3-70b-versatile - Latest model with enhanced capabilities
- Mixtral:
  - mixtral-8x7b-32768 - Large context window for complex conversations
- Gemma:
  - gemma2-9b-it - Efficient instruction-tuned model
- Qwen:
  - qwen2.5-72b-instruct - Multilingual capabilities
Audio Models
- Whisper:
  - whisper-large-v3 - State-of-the-art speech recognition and translation
Model Selection Helper
use KnownModel;
// Type-safe model selection
let model = Llama3_1_70bVersatile;
let response = client.chat.send.await?;
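To show why an enum helps with model selection, here is a small sketch of a type-safe model identifier that converts to and from the string IDs listed above. The `ModelId` type and its variant names are made up for this example; the SDK's `KnownModel` may be shaped differently.

```rust
use std::str::FromStr;

/// Hypothetical model identifier for this sketch (not the SDK's `KnownModel`).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ModelId {
    Llama8bInstant,
    Llama70bVersatile,
    WhisperLargeV3,
    /// Escape hatch for IDs the enum does not know about.
    Custom(String),
}

impl ModelId {
    /// The string sent to the API for this model.
    pub fn as_str(&self) -> &str {
        match self {
            ModelId::Llama8bInstant => "llama-3.1-8b-instant",
            ModelId::Llama70bVersatile => "llama-3.3-70b-versatile",
            ModelId::WhisperLargeV3 => "whisper-large-v3",
            ModelId::Custom(id) => id.as_str(),
        }
    }

    /// True for models meant for the audio endpoints.
    pub fn is_audio(&self) -> bool {
        matches!(self, ModelId::WhisperLargeV3)
    }
}

impl FromStr for ModelId {
    type Err = std::convert::Infallible;

    /// Unknown IDs are preserved as `Custom` instead of being rejected.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(match s {
            "llama-3.1-8b-instant" => ModelId::Llama8bInstant,
            "llama-3.3-70b-versatile" => ModelId::Llama70bVersatile,
            "whisper-large-v3" => ModelId::WhisperLargeV3,
            other => ModelId::Custom(other.to_string()),
        })
    }
}
```

The `Custom` variant is a deliberate trade-off: known models get compile-time checking, while newly released models remain usable before the enum is updated.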
Rate Limiting
The client includes built-in rate limiting with exponential backoff:
// Rate limiting is handled automatically
let response = client.chat.messages.send.await?;
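As a rough picture of what "exponential backoff with retry-after support" means, the sketch below computes the wait before each retry: the delay doubles per attempt up to a cap, and a server-provided Retry-After hint overrides the computed value. The `BackoffPolicy` type and its fields are hypothetical, and the SDK's actual policy (for example, whether it adds jitter) is not documented here.

```rust
use std::time::Duration;

/// Hypothetical retry policy for this sketch (not the SDK's type).
#[derive(Debug, Clone)]
pub struct BackoffPolicy {
    /// Delay before the first retry.
    pub base: Duration,
    /// Upper bound on any single delay.
    pub max: Duration,
    /// Number of retries allowed before giving up.
    pub max_retries: u32,
}

impl BackoffPolicy {
    /// Delay before retry number `attempt` (0-based), or `None` once retries are exhausted.
    /// A server-provided `retry_after` takes precedence, still capped at `max`.
    pub fn delay(&self, attempt: u32, retry_after: Option<Duration>) -> Option<Duration> {
        if attempt >= self.max_retries {
            return None;
        }
        if let Some(hint) = retry_after {
            return Some(hint.min(self.max));
        }
        // base * 2^attempt, falling back to the cap if the arithmetic overflows.
        let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
        let delay = self.base.checked_mul(factor).unwrap_or(self.max);
        Some(delay.min(self.max))
    }
}
```

With a base of 500 ms and a cap of 8 s, the computed delays are 0.5 s, 1 s, 2 s, 4 s, and then 8 s for every later attempt. Production retry loops often add random jitter on top so that many clients do not retry in lockstep.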
Advanced Features
Multi-Modal Messages
use ;
let messages = vec!;
Conversation History Management
// Built-in conversation management with token optimization
let mut conversation = Vec::new();
conversation.push;
// Automatic history trimming to stay within token limits
trim_conversation_history;
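One simple way to keep a conversation within a token budget is to drop the oldest non-system messages first. The sketch below does this with a crude estimate of four characters per token. The `Message` struct, `estimate_tokens`, and `trim_history` are hypothetical, and the SDK's own `trim_conversation_history` may use a different signature and a different counting method.

```rust
/// Minimal chat message for this sketch (not the SDK's message type).
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Rough heuristic: about four characters per token, rounded up.
pub fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Sums the estimated tokens across all messages.
fn total_tokens(history: &[Message]) -> usize {
    history.iter().map(|m| estimate_tokens(&m.content)).sum()
}

/// Removes the oldest non-system messages until the estimate fits `max_tokens`.
/// System messages are kept, so the result can still exceed the budget
/// if the system prompts alone are too long.
pub fn trim_history(history: &mut Vec<Message>, max_tokens: usize) {
    while total_tokens(history) > max_tokens {
        match history.iter().position(|m| m.role != "system") {
            Some(idx) => {
                history.remove(idx);
            }
            None => break,
        }
    }
}
```

Keeping system messages preserves the instructions that shape the model's behavior, at the cost of forgetting the earliest turns of the dialogue. An accurate budget would need the model's real tokenizer in place of the character heuristic.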
Enterprise Proxy Configuration
use Proxy;
let proxy = all?
.basic_auth;
let client = new?
.proxy
.timeout
.build?;
Examples
Check out the examples/ directory for comprehensive examples:
- cli_chat.rs - Interactive CLI chat application with streaming support
- chat_completion.rs - Basic chat completion
- streaming_chat.rs - Streaming responses
- audio_transcription.rs - Audio processing
- batch_processing.rs - Batch operations
- file_management.rs - File operations
- model_info.rs - Model information and capabilities
Requirements
- Rust 1.70 or later
- A valid Groq API key (get one at console.groq.com)
Project Status
This SDK is actively maintained and production-ready. Current version: 0.1.8
Roadmap
- ✅ Chat Completions (streaming & non-streaming)
- ✅ Audio Transcription & Translation
- ✅ File Management
- ✅ Batch Processing
- ✅ Model Information
- ✅ Fine-tuning Support
- ✅ Enterprise Features (proxy, rate limiting)
- 🔄 Function Calling (in progress)
- 📋 Vision API enhancements
- 📋 Advanced streaming features
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Groq for providing the lightning-fast AI inference API
- The Rust community for excellent async and HTTP libraries
- Contributors and users who help improve this SDK
Architecture
The SDK is built with a modular architecture:
- Transport Layer (transport.rs) - HTTP client with retry logic and rate limiting
- API Modules (api/) - Endpoint-specific implementations for each Groq service
- Type System (types.rs) - Strongly typed request/response structures
- Error Handling (error.rs) - Comprehensive error types with context
- Rate Limiting (rate_limit.rs) - Smart rate limiting with exponential backoff
- Client Builder (client.rs) - Flexible client configuration
Performance Considerations
- Async/Await: Built on Tokio for high-performance async operations
- Connection Pooling: Reuses HTTP connections for better performance
- Streaming: Efficient streaming for real-time applications
- Memory Management: Optimized for low memory footprint
- Rate Limiting: Prevents API quota exhaustion with smart backoff
Security Features
- API Key Validation: Validates the API key format when the client is built
- HTTPS Only: All communications use TLS encryption
- Proxy Support: Full support for corporate proxy environments
- Error Sanitization: Sensitive data is not logged in error messages
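Error sanitization can be approached in several ways. A common one, sketched below, is to wrap the key in a type whose `Debug` output is masked and to scrub the key from any message before it is logged. The `ApiKey` wrapper and `redact` function are hypothetical names for this illustration and do not describe the SDK's internals.

```rust
use std::fmt;

/// Wrapper that keeps the raw key out of debug output (hypothetical type).
pub struct ApiKey(String);

impl ApiKey {
    pub fn new(raw: impl Into<String>) -> Self {
        ApiKey(raw.into())
    }

    /// Explicit accessor for the one place that needs the raw value,
    /// such as building the Authorization header.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for ApiKey {
    /// Shows at most the first four characters, then a mask.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let visible: String = self.0.chars().take(4).collect();
        write!(f, "ApiKey({}***)", visible)
    }
}

/// Replaces every occurrence of the key in `message` with a placeholder.
pub fn redact(message: &str, key: &ApiKey) -> String {
    if key.0.is_empty() {
        return message.to_string();
    }
    message.replace(key.0.as_str(), "[REDACTED]")
}
```

Because `ApiKey` does not implement `Display`, formatting it with `{}` fails to compile, which makes accidental logging of the raw key harder.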
Testing
The SDK includes comprehensive tests:
# Run all tests
# Run specific test modules
Support
- 📖 Documentation
- 🐛 Issue Tracker
- 💬 Discussions
- 📧 Author
Note: This is an unofficial client SDK. For official support, please refer to the Groq documentation.