# api_xai
Comprehensive Rust client for X.AI's Grok API with enterprise reliability features.
## 🎯 Architecture: Stateless HTTP Client
This API crate is designed as a stateless HTTP client with zero persistence requirements. It provides:
- Direct HTTP calls to the X.AI Grok API
- In-memory operation state only (resets on restart)
- No external storage dependencies (databases, files, caches)
- No configuration persistence beyond environment variables
This ensures lightweight, containerized deployments and eliminates operational complexity.
## 🏛️ Governing Principle: "Thin Client, Rich API"
Expose all server-side functionality transparently while maintaining zero client-side intelligence or automatic behaviors.
Key principles:
- API Transparency: One-to-one mapping with X.AI Grok API endpoints
- Zero Automatic Behavior: No implicit decision-making or magic thresholds
- Explicit Control: Developer decides when, how, and why operations occur
- Configurable Reliability: Enterprise features available through explicit configuration
## Scope

### In Scope
- Chat completions (single and multi-turn)
- Streaming responses (Server-Sent Events)
- Tool/function calling
- Model listing and details
- Enterprise reliability (retry, circuit breaker, rate limiting, failover)
- Health checks (liveness/readiness probes)
- Token counting (local, using tiktoken)
- Response caching (LRU)
- Input validation
- CURL diagnostics for debugging
- Batch operations (parallel request orchestration)
- Performance metrics (Prometheus)
- Synchronous API wrapper
### Out of Scope

- Vision/multimodal (no X.AI API support)
- Audio processing (no X.AI API support)
- Embeddings (no X.AI API support)
- Safety settings/content moderation (no X.AI API endpoints)
- Model tuning/deployment (no X.AI API support)
- WebSocket streaming (X.AI uses SSE only)
## Features

**Core Capabilities:**
- Chat completions with full conversational support
- SSE streaming responses
- Complete function/tool calling integration
- Model management (list, retrieve)
**Enterprise Reliability:**
- Retry logic with exponential backoff and jitter
- Circuit breaker for failure threshold management
- Rate limiting with token bucket algorithm
- Multi-endpoint failover rotation
- Kubernetes-style health checks
- Structured logging with tracing
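To illustrate the retry behavior above (capped exponential backoff with "full jitter"), here is a minimal, dependency-free sketch. It is not the crate's actual implementation: the tiny xorshift stands in for a real RNG, and all constants are illustrative.

```rust
use std::time::Duration;

/// Delay before retry `attempt` (0-based): exponential growth from
/// `base_ms`, capped at `max_ms`, then "full jitter" picks a point in
/// [0, capped] so that concurrent clients de-correlate their retries.
fn backoff_delay(attempt: u32, base_ms: u64, max_ms: u64, seed: u64) -> Duration {
  let exp = base_ms.saturating_mul(1u64 << attempt.min(16));
  let capped = exp.min(max_ms);
  // xorshift64: a stand-in PRNG to keep this sketch dependency-free.
  let mut x = seed.wrapping_add(attempt as u64 + 1);
  x ^= x << 13;
  x ^= x >> 7;
  x ^= x << 17;
  Duration::from_millis(x % (capped + 1))
}

fn main() {
  for attempt in 0..5 {
    println!("attempt {attempt}: sleep {:?}", backoff_delay(attempt, 100, 10_000, 42));
  }
}
```

The jitter range `[0, capped]` (rather than `[capped/2, capped]`) trades a few fast retries for the best de-correlation between clients.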
**Client-Side Enhancements:**
- Token counting using tiktoken (GPT-4 encoding)
- LRU response caching
- Request parameter validation
- CURL command generation for debugging
- Parallel batch processing
- Prometheus metrics collection
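As an illustration of the CURL-diagnostics idea, the sketch below renders a chat request as a copy-pasteable command with the API key redacted. The endpoint and payload follow the OpenAI-compatible convention; the crate's actual `curl_diagnostics` output may differ.

```rust
/// Render a chat-completion POST as an equivalent curl command,
/// redacting the bearer token so the command is safe to share in
/// bug reports and logs.
fn to_curl(endpoint: &str, body_json: &str) -> String {
  format!(
    "curl -X POST '{}' \\\n  -H 'Authorization: Bearer <REDACTED>' \\\n  -H 'Content-Type: application/json' \\\n  -d '{}'",
    endpoint, body_json
  )
}

fn main() {
  // Model name "grok-beta" is an assumption for illustration.
  let cmd = to_curl(
    "https://api.x.ai/v1/chat/completions",
    r#"{"model":"grok-beta","messages":[{"role":"user","content":"Hi"}]}"#,
  );
  println!("{cmd}");
}
```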
## Installation

Add to your `Cargo.toml`:

```toml
[dependencies]
api_xai = { version = "0.1.0", features = ["full"] }
```
## Quick Start

### Basic Chat

The original snippet was garbled; the sketch below shows the intended shape. The `Client` type, `from_env`, and the builder methods are illustrative assumptions, so check the API reference for the crate's actual names:

```rust
use api_xai::Client; // hypothetical import path

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
  // Resolves the API key via the fallback chain described under Authentication.
  let client = Client::from_env()?;

  let response = client
    .chat("grok-beta") // model name is an assumption
    .message("user", "Explain Rust ownership in one sentence.")
    .send()
    .await?;

  println!("{}", response.text());
  Ok(())
}
```
### Streaming Chat

Likewise a hedged sketch: it keeps the surviving `StreamExt` import and assumes a `futures`-style stream of SSE chunks, with method names that may differ from the crate's actual API:

```rust
use api_xai::Client; // hypothetical import path
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
  let client = Client::from_env()?;

  let mut stream = client
    .chat("grok-beta") // model name is an assumption
    .message("user", "Stream a haiku about the borrow checker.")
    .stream()
    .await?;

  // Chunks arrive as Server-Sent Events; print each delta as it lands.
  while let Some(chunk) = stream.next().await {
    print!("{}", chunk?.delta());
  }
  Ok(())
}
```
## Authentication

### Option 1: Workspace Secret (Recommended)

Create `secret/-secrets.sh` in your project root. Only the shebang of the original snippet survived; a minimal reconstruction, assuming the conventional `XAI_API_KEY` variable name:

```bash
#!/bin/bash
export XAI_API_KEY="your-api-key-here"  # variable name is an assumption
```
### Option 2: Environment Variable

Set the API key directly in your shell environment.

The crate uses `workspace_tools` for secret management with an automatic fallback chain:

1. Workspace secrets (`./secret/-secrets.sh`)
2. Alternative files (`secrets.sh`, `.env`)
3. Environment variable
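The fallback chain amounts to "first source that yields a value wins". A minimal sketch, assuming plain environment-variable lookups (the crate's `workspace_tools` integration defines the real sources and names):

```rust
use std::env;

/// Return the first candidate key that resolves to a value.
fn first_available(keys: &[&str]) -> Option<String> {
  keys.iter().find_map(|k| env::var(k).ok())
}

fn main() {
  // "XAI_API_KEY" is an assumed variable name, not confirmed by the crate docs.
  match first_available(&["XAI_API_KEY"]) {
    Some(_) => println!("API key found"),
    None => println!("no API key configured"),
  }
}
```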
## Feature Flags

### Core Features

- `enabled` - Master switch for core functionality
- `streaming` - SSE streaming support
- `tool_calling` - Function calling and tools
### Enterprise Reliability

- `retry` - Exponential backoff retry logic
- `circuit_breaker` - Circuit breaker pattern
- `rate_limiting` - Token bucket rate limiting
- `failover` - Multi-endpoint failover
- `health_checks` - Health monitoring
- `structured_logging` - Tracing integration
### Client-Side Enhancements

- `count_tokens` - Local token counting (requires: `tiktoken-rs`)
- `caching` - Response caching (requires: `lru`)
- `input_validation` - Request validation
- `curl_diagnostics` - Debug utilities
- `batch_operations` - Parallel processing
- `performance_metrics` - Metrics collection (requires: `prometheus`)
- `sync_api` - Sync wrappers
### Presets

- `full` - All features enabled (default)
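Beyond the `full` preset, features can presumably be enabled selectively. A hypothetical `Cargo.toml` fragment (whether `enabled` must be listed explicitly alongside `default-features = false` is an assumption):

```toml
[dependencies]
api_xai = { version = "0.1.0", default-features = false, features = ["enabled", "streaming", "retry"] }
```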
## Testing

### Test Coverage
- 122 doc tests passing
- 107 integration tests passing
- 229 total tests with real API validation
- No-mockup policy: all tests use real API calls
## Documentation
- API Reference - Complete API documentation
- OpenAPI Summary - Endpoint reference
- Specification - Detailed project specification
- Examples - Real-world usage examples
## Dependencies
- reqwest: HTTP client with async support
- tokio: Async runtime
- serde: Serialization/deserialization
- workspace_tools: Secret management
- error_tools: Unified error handling
- tiktoken-rs: Token counting (optional)
- lru: Response caching (optional)
- prometheus: Metrics collection (optional)
All dependencies are workspace-managed for consistency.
## OpenAI Compatibility
The X.AI Grok API is OpenAI-compatible, using the same REST endpoint patterns and request/response formats. Token counting uses GPT-4 encoding (cl100k_base) via tiktoken for accurate counts.
## Contributing
- Follow established patterns in existing code
- Use 2-space indentation consistently
- Add tests for new functionality
- Update documentation for public APIs
- Ensure zero clippy warnings: `cargo clippy -- -D warnings`
- Follow the zero-tolerance mock policy (real API integration only)
- Follow the "Thin Client, Rich API" principle
## License
MIT