# Tiny Dancer Admin API Documentation
## Overview
The Tiny Dancer Admin API provides a production-ready REST API for monitoring, health checks, and administration of the AI routing system. It's designed to integrate seamlessly with Kubernetes, Prometheus, and other cloud-native tools.
## Features
- **Health Checks**: Kubernetes-compatible liveness and readiness probes
- **Metrics Export**: Prometheus-compatible metrics endpoint
- **Hot Reloading**: Update models without downtime
- **Circuit Breaker Management**: Monitor and control circuit breaker state
- **Configuration Management**: View and update router configuration
- **Optional Authentication**: Bearer token authentication for admin endpoints
- **CORS Support**: Configurable CORS for web applications
## Quick Start
### Running the Server
```bash
# With admin API feature enabled
cargo run --example admin-server --features admin-api
```
### Basic Configuration
```rust
use ruvector_tiny_dancer_core::api::{AdminServer, AdminServerConfig};
use ruvector_tiny_dancer_core::router::Router;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let router = Router::default()?;
let config = AdminServerConfig {
bind_address: "0.0.0.0".to_string(),
port: 8080,
auth_token: Some("your-secret-token".to_string()),
enable_cors: true,
};
let server = AdminServer::new(Arc::new(router), config);
server.serve().await?;
Ok(())
}
```
## API Endpoints
### Health Checks
#### `GET /health`
Basic liveness probe that always returns 200 OK if the service is running.
**Response:**
```json
{
"status": "healthy",
"version": "0.1.0",
"uptime_seconds": 3600
}
```
**Use Case:** Kubernetes liveness probe
```yaml
livenessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 3
periodSeconds: 10
```
---
#### `GET /health/ready`
Readiness probe that checks if the service can accept traffic.
**Checks:**
- Circuit breaker state
- Model loaded status
**Response (Ready):**
```json
{
"ready": true,
"circuit_breaker": "closed",
"model_loaded": true,
"version": "0.1.0",
"uptime_seconds": 3600
}
```
**Response (Not Ready):**
```json
{
"ready": false,
"circuit_breaker": "open",
"model_loaded": true,
"version": "0.1.0",
"uptime_seconds": 3600
}
```
**Status Codes:**
- `200 OK`: Service is ready
- `503 Service Unavailable`: Service is not ready
**Use Case:** Kubernetes readiness probe
```yaml
readinessProbe:
httpGet:
path: /health/ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
```
---
### Metrics
#### `GET /metrics`
Exports metrics in Prometheus exposition format.
**Response Format:** `text/plain; version=0.0.4`
**Metrics Exported:**
```
# HELP tiny_dancer_requests_total Total number of routing requests
# TYPE tiny_dancer_requests_total counter
tiny_dancer_requests_total 12345
# HELP tiny_dancer_lightweight_routes_total Requests routed to lightweight model
# TYPE tiny_dancer_lightweight_routes_total counter
tiny_dancer_lightweight_routes_total 10000
# HELP tiny_dancer_powerful_routes_total Requests routed to powerful model
# TYPE tiny_dancer_powerful_routes_total counter
tiny_dancer_powerful_routes_total 2345
# HELP tiny_dancer_inference_time_microseconds Average inference time
# TYPE tiny_dancer_inference_time_microseconds gauge
tiny_dancer_inference_time_microseconds 450.5
# HELP tiny_dancer_latency_microseconds Latency percentiles
# TYPE tiny_dancer_latency_microseconds gauge
tiny_dancer_latency_microseconds{quantile="0.5"} 400
tiny_dancer_latency_microseconds{quantile="0.95"} 800
tiny_dancer_latency_microseconds{quantile="0.99"} 1200
# HELP tiny_dancer_errors_total Total number of errors
# TYPE tiny_dancer_errors_total counter
tiny_dancer_errors_total 5
# HELP tiny_dancer_circuit_breaker_trips_total Circuit breaker trip count
# TYPE tiny_dancer_circuit_breaker_trips_total counter
tiny_dancer_circuit_breaker_trips_total 2
# HELP tiny_dancer_uptime_seconds Service uptime
# TYPE tiny_dancer_uptime_seconds counter
tiny_dancer_uptime_seconds 3600
```
**Use Case:** Prometheus scraping
```yaml
scrape_configs:
- job_name: 'tiny-dancer'
static_configs:
- targets: ['localhost:8080']
metrics_path: '/metrics'
```
---
### Admin Endpoints
All admin endpoints support optional bearer token authentication.
#### `POST /admin/reload`
Hot reload the routing model from disk without restarting the service.
**Headers:**
```
Authorization: Bearer your-secret-token
```
**Response:**
```json
{
"success": true,
"message": "Model reloaded successfully"
}
```
**Status Codes:**
- `200 OK`: Model reloaded successfully
- `401 Unauthorized`: Invalid or missing authentication token
- `500 Internal Server Error`: Failed to reload model
**Example:**
```bash
curl -X POST http://localhost:8080/admin/reload \
-H "Authorization: Bearer your-token-here"
```
---
#### `GET /admin/config`
Get the current router configuration.
**Headers:**
```
Authorization: Bearer your-secret-token
```
**Response:**
```json
{
"model_path": "./models/fastgrnn.safetensors",
"confidence_threshold": 0.85,
"max_uncertainty": 0.15,
"enable_circuit_breaker": true,
"circuit_breaker_threshold": 5,
"enable_quantization": true,
"database_path": null
}
```
**Status Codes:**
- `200 OK`: Configuration retrieved
- `401 Unauthorized`: Invalid or missing authentication token
**Example:**
```bash
curl http://localhost:8080/admin/config \
-H "Authorization: Bearer your-token-here"
```
---
#### `PUT /admin/config`
Update the router configuration (runtime only, not persisted).
**Headers:**
```
Authorization: Bearer your-secret-token
Content-Type: application/json
```
**Request Body:**
```json
{
"confidence_threshold": 0.90,
"max_uncertainty": 0.10,
"circuit_breaker_threshold": 10
}
```
**Response:**
```json
{
"success": true,
"message": "Configuration updated",
"updated_fields": ["confidence_threshold", "max_uncertainty"]
}
```
**Status Codes:**
- `200 OK`: Configuration updated
- `401 Unauthorized`: Invalid or missing authentication token
- `501 Not Implemented`: Feature not yet implemented
**Note:** Currently returns 501 as runtime config updates require Router API extensions.
---
#### `GET /admin/circuit-breaker`
Get the current circuit breaker status.
**Headers:**
```
Authorization: Bearer your-secret-token
```
**Response:**
```json
{
"enabled": true,
"state": "closed",
"failure_count": 2,
"success_count": 1234
}
```
**Status Codes:**
- `200 OK`: Status retrieved
- `401 Unauthorized`: Invalid or missing authentication token
**Example:**
```bash
curl http://localhost:8080/admin/circuit-breaker \
-H "Authorization: Bearer your-token-here"
```
---
#### `POST /admin/circuit-breaker/reset`
Reset the circuit breaker to closed state.
**Headers:**
```
Authorization: Bearer your-secret-token
```
**Response:**
```json
{
"success": true,
"message": "Circuit breaker reset successfully"
}
```
**Status Codes:**
- `200 OK`: Circuit breaker reset
- `401 Unauthorized`: Invalid or missing authentication token
- `501 Not Implemented`: Feature not yet implemented
**Note:** Currently returns 501 as circuit breaker reset requires Router API extensions.
---
### System Information
#### `GET /info`
Get comprehensive system information.
**Response:**
```json
{
"version": "0.1.0",
"api_version": "v1",
"uptime_seconds": 3600,
"config": {
"model_path": "./models/fastgrnn.safetensors",
"confidence_threshold": 0.85,
"max_uncertainty": 0.15,
"enable_circuit_breaker": true,
"circuit_breaker_threshold": 5,
"enable_quantization": true,
"database_path": null
},
"circuit_breaker_enabled": true,
"metrics": {
"total_requests": 12345,
"lightweight_routes": 10000,
"powerful_routes": 2345,
"avg_inference_time_us": 450.5,
"p50_latency_us": 400,
"p95_latency_us": 800,
"p99_latency_us": 1200,
"error_count": 5,
"circuit_breaker_trips": 2
}
}
```
**Example:**
```bash
curl http://localhost:8080/info
```
---
## Authentication
The admin API supports optional bearer token authentication for admin endpoints.
### Configuration
```rust
let config = AdminServerConfig {
bind_address: "0.0.0.0".to_string(),
port: 8080,
auth_token: Some("your-secret-token-here".to_string()),
enable_cors: true,
};
```
### Usage
Include the bearer token in the Authorization header:
```bash
curl -H "Authorization: Bearer your-secret-token-here" \
http://localhost:8080/admin/reload
```
### Security Best Practices
1. **Always enable authentication in production**
2. **Use strong, random tokens** (minimum 32 characters)
3. **Rotate tokens regularly**
4. **Use HTTPS in production** (configure via reverse proxy)
5. **Limit admin API access** to internal networks only
6. **Monitor failed authentication attempts**
### Environment Variables
```bash
export TINY_DANCER_AUTH_TOKEN="your-secret-token-here"
export TINY_DANCER_BIND_ADDRESS="0.0.0.0"
export TINY_DANCER_PORT="8080"
```
---
## Kubernetes Integration
### Deployment Example
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: tiny-dancer
spec:
replicas: 3
selector:
matchLabels:
app: tiny-dancer
template:
metadata:
labels:
app: tiny-dancer
spec:
containers:
- name: tiny-dancer
image: tiny-dancer:latest
ports:
- containerPort: 8080
name: admin-api
env:
- name: TINY_DANCER_AUTH_TOKEN
valueFrom:
secretKeyRef:
name: tiny-dancer-secrets
key: auth-token
livenessProbe:
httpGet:
path: /health
port: admin-api
initialDelaySeconds: 3
periodSeconds: 10
readinessProbe:
httpGet:
path: /health/ready
port: admin-api
initialDelaySeconds: 5
periodSeconds: 5
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
```
### Service Example
```yaml
apiVersion: v1
kind: Service
metadata:
name: tiny-dancer
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/metrics"
spec:
selector:
app: tiny-dancer
ports:
- name: admin-api
port: 8080
targetPort: 8080
type: ClusterIP
```
---
## Monitoring with Grafana
### Prometheus Query Examples
```promql
# Request rate
rate(tiny_dancer_requests_total[5m])
# Error rate
rate(tiny_dancer_errors_total[5m]) / rate(tiny_dancer_requests_total[5m])
# P95 latency
tiny_dancer_latency_microseconds{quantile="0.95"}
# Lightweight routing ratio
tiny_dancer_lightweight_routes_total / tiny_dancer_requests_total
# Circuit breaker trips over time
increase(tiny_dancer_circuit_breaker_trips_total[1h])
```
### Dashboard Panels
1. **Request Rate**: Line graph of requests per second
2. **Error Rate**: Gauge showing error percentage
3. **Latency Percentiles**: Multi-line graph (P50, P95, P99)
4. **Routing Distribution**: Pie chart (lightweight vs powerful)
5. **Circuit Breaker Status**: Single stat panel
6. **Uptime**: Single stat panel
---
## Performance Considerations
### Metrics Collection
The metrics endpoint is designed for high-performance scraping:
- **No locks during read**: Uses atomic operations where possible
- **O(1) complexity**: All metrics are pre-aggregated
- **Minimal allocations**: Prometheus format generated on-the-fly
- **Scrape interval**: Recommended 15-30 seconds
### Health Check Latency
- Health check: ~10μs
- Readiness check: ~50μs (includes circuit breaker check)
### Memory Overhead
- Admin server: ~2MB base memory
- Per-connection overhead: ~50KB
- Metrics storage: ~1KB
---
## Error Handling
### Common Error Responses
#### 401 Unauthorized
```json
{
"error": "Missing or invalid Authorization header"
}
```
#### 500 Internal Server Error
```json
{
"success": false,
"message": "Failed to reload model: File not found"
}
```
#### 503 Service Unavailable
```json
{
"ready": false,
"circuit_breaker": "open",
"model_loaded": true,
"version": "0.1.0",
"uptime_seconds": 3600
}
```
---
## Production Checklist
- [ ] Enable authentication for admin endpoints
- [ ] Configure HTTPS via reverse proxy (nginx, Envoy, etc.)
- [ ] Set up Prometheus scraping
- [ ] Configure Grafana dashboards
- [ ] Set up alerts for error rate and latency
- [ ] Implement log aggregation
- [ ] Configure network policies (K8s)
- [ ] Set resource limits
- [ ] Enable CORS only for trusted origins
- [ ] Rotate authentication tokens regularly
- [ ] Monitor circuit breaker trips
- [ ] Set up automated model reload workflows
---
## Troubleshooting
### Server Won't Start
**Symptom:** `Failed to bind to 0.0.0.0:8080: Address already in use`
**Solution:** Change the port or stop the conflicting service:
```bash
lsof -i :8080
kill <PID>
```
### Authentication Failing
**Symptom:** `401 Unauthorized`
**Solution:** Check that the token matches exactly:
```bash
# Test with curl
curl -H "Authorization: Bearer your-token" http://localhost:8080/admin/config
```
### Metrics Not Updating
**Symptom:** Metrics show zero values
**Solution:** Ensure you're recording metrics after each routing operation:
```rust
use ruvector_tiny_dancer_core::api::record_routing_metrics;
// After routing
record_routing_metrics(&metrics, inference_time_us, lightweight_count, powerful_count);
```
---
## Future Enhancements
- [ ] Runtime configuration persistence
- [ ] Circuit breaker manual reset API
- [ ] WebSocket support for real-time metrics streaming
- [ ] OpenTelemetry integration
- [ ] Custom metric labels
- [ ] Rate limiting
- [ ] Request/response logging middleware
- [ ] Distributed tracing integration
- [ ] GraphQL API alternative
- [ ] Admin UI dashboard
---
## Support
For issues, questions, or contributions, please visit:
- GitHub: https://github.com/ruvnet/ruvector
- Documentation: https://docs.ruvector.io
---
## License
This API is part of the Tiny Dancer routing system and follows the same license terms.