# X.AI Grok API Research Summary
**Date:** 2025-11-08
**Researcher:** AI Assistant
**Purpose:** Pre-development API discovery for api_xai crate implementation
---
## Key Findings
### ✅ Specification Accuracy Verified
The specification created in `/api/xai/spec.md` accurately predicted the X.AI API structure:
| Base URL | `https://api.x.ai/v1` | `https://api.x.ai/v1` | ✅ Correct |
| Default Model | `grok-beta` | `grok-beta` (Grok 4 Fast) | ✅ Correct |
| API Key Prefix | `xai-` | `xai-` | ✅ Correct |
| OpenAI Compatible | Yes | Yes (full compatibility) | ✅ Correct |
| Authentication | Bearer Token | Bearer Token | ✅ Correct |
| Streaming | SSE | SSE | ✅ Correct |
| Function Calling | Yes | Yes (+ agentic tools) | ✅ Correct + Bonus |
### 🎯 API Highlights
**Unique Features:**
1. **Agentic Server-Side Tool Calling** - Model autonomously executes tools without client intervention
- Built-in: web_search, x_search, code_execution
- No client-side orchestration needed
2. **Massive Context Windows**
- Grok 4: 256K tokens
- Grok 4 Fast: 2M tokens (2 million!)
3. **Flexible Message Roles** - No strict alternation required (unlike OpenAI)
4. **Real-time Search Integration** - Native web and X platform search
**Standard Features:**
- Chat completions API (`/v1/chat/completions`)
- SSE streaming with `stream: true`
- Traditional function calling (client-managed)
- Model listing (`/v1/models`)
- OpenAI SDK compatibility (just change base_url)
### 📊 Available Models
#### Grok 4 (Production)
- **ID:** `grok-4`
- **Context:** 256K tokens
- **Pricing:** $3/1M input, $15/1M output, $0.75/1M cached
- **Strengths:** Advanced reasoning, multimodal, tool use
#### Grok 4 Fast (Recommended Default)
- **ID:** `grok-beta`, `grok-4-fast-reasoning`, `grok-4-fast-non-reasoning`
- **Context:** 2M tokens
- **Pricing:** More cost-efficient (40% fewer thinking tokens)
- **Strengths:** Agentic search, cost-efficiency, unified architecture
### 🔐 Authentication
**Method:** Bearer Token in Authorization header
```
Authorization: Bearer xai-...
```
**Key Management:**
- Obtain from: https://console.x.ai
- Configurable permissions (endpoints + models)
- Environment variable: `XAI_API_KEY`
### 📝 Request Format (Chat)
```json
{
"model": "grok-2-1212",
"messages": [
{ "role": "system", "content": "..." },
{ "role": "user", "content": "..." }
],
"temperature": 0.7,
"max_tokens": 1000,
"stream": false
}
```
**Key Difference from OpenAI:**
- Message roles can be in any order (no alternation enforcement)
- Stateless (no automatic history)
### 🌊 Streaming
**Enable:** `"stream": true`
**Protocol:** Server-Sent Events (SSE)
**Format:**
```
data: {"choices":[{"delta":{"content":"Hello"}}]}
data: {"choices":[{"delta":{"content":" world"}}]}
data: [DONE]
```
**X.AI SDK Difference:** Uses `.stream()` method instead of `stream=True` parameter
### 🛠️ Function Calling
**Two Modes:**
1. **Traditional (Client-Managed)**
- Define tools in request
- Model returns tool_calls
- Client executes and sends results
- Similar to OpenAI
2. **Agentic (Server-Managed)** ⭐ Unique to X.AI
- Built-in tools: web_search, x_search, code_execution
- Model autonomously executes on server
- Real-time observability in streaming mode
- No client orchestration needed
### ⚠️ Error Codes
| 401 | Invalid API key | Check credentials |
| 429 | Rate limit | Exponential backoff |
| 498 | Capacity exceeded | Retry with delay |
| 500-503 | Server errors | Retry with backoff |
### 💰 Pricing
**Public Beta:** $25/month API credits (requires X Premium+ $40/month)
**Grok 4:** $3/1M input, $15/1M output
**Cached Inputs:** 75% discount ($0.75/1M for Grok 4)
---
## Implementation Implications
### ✅ Keep From Spec
1. **OpenAI Compatibility Strategy** - Correct approach
2. **Feature Gating** - All enterprise features optional
3. **Stateless Design** - No persistent state
4. **SSE Streaming** - Use `eventsource-stream` crate
5. **Bearer Auth** - Use `secrecy` crate for keys
### 📝 Consider Adding
1. **Agentic Tools Support** - Unique X.AI feature
- May require additional types for server-side tool responses
- Streaming observability for tool calls
2. **Large Context Handling** - 2M token context
- No special handling needed, but document the capability
3. **Cached Input Support** - Cost optimization
- May add parameter for cache control
### 🎯 Default Values
Based on research, recommended defaults:
```rust
pub const DEFAULT_MODEL : &str = "grok-2-1212";
pub const DEFAULT_BASE_URL : &str = "https://api.x.ai/v1";
pub const DEFAULT_TIMEOUT_SECS : u64 = 30;
pub const DEFAULT_MAX_TOKENS : u32 = 1000;
pub const DEFAULT_TEMPERATURE : f32 = 0.7;
```
---
## Simplified Implementation Path
Given OpenAI compatibility:
1. **Start with chat completions** - Most important endpoint
2. **Add streaming** - SSE with eventsource-stream
3. **Add function calling** - Traditional mode first
4. **Consider agentic tools** - Optional enhancement
5. **Models endpoint** - Simple GET request
**Complexity Assessment:** ⭐⭐ (Medium-Low)
- OpenAI compatibility simplifies implementation
- Can adapt OpenAI client patterns
- Primary work: types, error handling, streaming
---
## Code Examples Reference
### Minimal Chat (curl)
```bash
curl https://api.x.ai/v1/chat/completions \
-H "Authorization: Bearer $XAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"grok-2-1212","messages":[{"role":"user","content":"Hi"}]}'
```
### With OpenAI Python SDK
```python
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("XAI_API_KEY"),
base_url="https://api.x.ai/v1"
)
response = client.chat.completions.create(
model="grok-2-1212",
messages=[{"role":"user","content":"Hello"}]
)
```
**Takeaway:** Existing OpenAI client libraries work with just URL + key change!
---
## Documentation Sources
**Primary:**
- Official Docs: https://docs.x.ai (blocked during research)
- Console: https://console.x.ai
**Secondary (Used):**
- Third-party guides and tutorials
- API integration documentation
- Developer community resources
**Confidence Level:** ✅ High
- Multiple source corroboration
- Consistent information across sources
- OpenAI compatibility well-documented
---
## Next Steps for Implementation
1. ✅ **Specification validated** - No major changes needed
2. ⏭️ **Start Phase 1** - Implement core modules (error, secret, environment, client, chat)
3. ⏭️ **TDD Approach** - Write failing tests first
4. ⏭️ **Reference api_reference.md** - Use as implementation guide
**Estimated Complexity:** Simple to Medium
- OpenAI compatibility = simpler implementation
- Can reuse patterns from api_claude
- Main work: proper types and error handling
---
**Research Status:** ✅ COMPLETE
**Ready for Development:** YES
**Recommended Next Action:** Begin Phase 1 (MVP) implementation