ruvector-postgres 2.0.5

High-performance PostgreSQL vector database extension v2 - pgvector drop-in replacement with 230+ SQL functions, SIMD acceleration, Flash Attention, GNN layers, hybrid search, multi-tenancy, self-healing, and self-learning capabilities
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
# Tiny Dancer Routing - Implementation Summary

## Overview

The Tiny Dancer Routing module is a neural-powered dynamic agent routing system for the ruvector-postgres PostgreSQL extension. It intelligently routes AI requests to the best available agent based on cost, latency, quality, and capability requirements.

## Architecture

### Core Components

```
routing/
├── mod.rs           # Module exports and initialization
├── fastgrnn.rs      # FastGRNN neural network implementation
├── agents.rs        # Agent registry and management
├── router.rs        # Main routing logic with multi-objective optimization
├── operators.rs     # PostgreSQL function bindings
└── README.md        # User documentation
```

## Features

### 1. FastGRNN Neural Network

**File**: `src/routing/fastgrnn.rs`

- Lightweight gated recurrent neural network for real-time routing decisions
- Minimal compute overhead (< 1ms inference time)
- Adaptive learning from routing patterns
- Supports sequence processing for multi-step routing

**Key Functions**:
- `step(input, hidden) -> new_hidden` - Single RNN step
- `forward_single(input) -> hidden` - Single-step inference
- `forward_sequence(inputs) -> outputs` - Process sequences
- Sigmoid and tanh activation functions

**Implementation Details**:
- Input dimension: 384 (embedding size)
- Hidden dimension: Configurable (default 64)
- Parameters: w_gate, u_gate, w_update, u_update, biases
- Xavier initialization for stable training

### 2. Agent Registry

**File**: `src/routing/agents.rs`

- Thread-safe agent storage using DashMap
- Real-time performance metric tracking
- Capability-based agent discovery
- Cost model management

**Agent Types**:
- `LLM` - Language models (GPT, Claude, etc.)
- `Embedding` - Embedding models
- `Specialized` - Task-specific agents
- `Vision` - Vision models
- `Audio` - Audio models
- `Multimodal` - Multi-modal agents
- `Custom(String)` - User-defined types

**Performance Metrics**:
- Average latency (ms)
- P95 and P99 latency
- Quality score (0-1)
- Success rate (0-1)
- Total requests processed

**Cost Model**:
- Per-request cost
- Per-token cost (optional)
- Monthly fixed cost (optional)

### 3. Router

**File**: `src/routing/router.rs`

- Multi-objective optimization (cost, latency, quality, balanced)
- Constraint-based filtering
- Neural-enhanced confidence scoring
- Alternative agent suggestions

**Optimization Targets**:
1. **Cost**: Minimize cost per request
2. **Latency**: Minimize response time
3. **Quality**: Maximize quality score
4. **Balanced**: Multi-objective optimization

**Constraints**:
- `max_cost` - Maximum acceptable cost
- `max_latency_ms` - Maximum latency
- `min_quality` - Minimum quality score
- `required_capabilities` - Required agent capabilities
- `excluded_agents` - Agents to exclude

**Routing Decision**:
```rust
pub struct RoutingDecision {
    pub agent_name: String,
    pub confidence: f32,
    pub estimated_cost: f32,
    pub estimated_latency_ms: f32,
    pub expected_quality: f32,
    pub similarity_score: f32,
    pub reasoning: String,
    pub alternatives: Vec<AlternativeAgent>,
}
```

### 4. PostgreSQL Operators

**File**: `src/routing/operators.rs`

Complete SQL interface for agent management and routing.

## SQL Functions

### Agent Management

```sql
-- Register agent
ruvector_register_agent(name, type, capabilities, cost, latency, quality)

-- Register with full config
ruvector_register_agent_full(config_jsonb)

-- Update metrics
ruvector_update_agent_metrics(name, latency_ms, success, quality)

-- Remove agent
ruvector_remove_agent(name)

-- Set active status
ruvector_set_agent_active(name, is_active)

-- Get agent details
ruvector_get_agent(name) -> jsonb

-- List all agents
ruvector_list_agents() -> table

-- Find by capability
ruvector_find_agents_by_capability(capability, limit) -> table
```

### Routing

```sql
-- Route request
ruvector_route(
    request_embedding float4[],
    optimize_for text,
    constraints jsonb
) -> jsonb
```

### Statistics

```sql
-- Get routing statistics
ruvector_routing_stats() -> jsonb

-- Clear all agents (testing only)
ruvector_clear_agents() -> boolean
```

## Usage Examples

### Basic Routing

```sql
-- Register agents
SELECT ruvector_register_agent(
    'gpt-4', 'llm',
    ARRAY['coding', 'reasoning'],
    0.03, 500.0, 0.95
);

SELECT ruvector_register_agent(
    'gpt-3.5-turbo', 'llm',
    ARRAY['general', 'fast'],
    0.002, 150.0, 0.75
);

-- Route request (cost-optimized)
SELECT ruvector_route(
    embedding_vector,
    'cost',
    NULL
) FROM requests WHERE id = 1;

-- Route with constraints
SELECT ruvector_route(
    embedding_vector,
    'quality',
    '{"max_cost": 0.01, "min_quality": 0.8}'::jsonb
);
```

### Advanced Patterns

```sql
-- Smart routing function
CREATE FUNCTION smart_route(
    embedding vector,
    task_type text,
    priority text
) RETURNS jsonb AS $$
    SELECT ruvector_route(
        embedding::float4[],
        CASE priority
            WHEN 'critical' THEN 'quality'
            WHEN 'low' THEN 'cost'
            ELSE 'balanced'
        END,
        jsonb_build_object(
            'required_capabilities',
            CASE task_type
                WHEN 'coding' THEN ARRAY['coding']
                WHEN 'writing' THEN ARRAY['writing']
                ELSE ARRAY[]::text[]
            END
        )
    );
$$ LANGUAGE sql;

-- Batch processing
SELECT
    r.id,
    (ruvector_route(r.embedding, 'balanced', NULL))::jsonb->>'agent_name' AS agent
FROM requests r
WHERE processed = false
LIMIT 1000;
```

## Performance Characteristics

### FastGRNN
- **Inference time**: < 1ms for 384-dim input
- **Memory footprint**: ~100KB per model
- **Training**: Online learning from routing decisions

### Agent Registry
- **Lookup time**: O(1) with DashMap
- **Concurrent access**: Lock-free reads
- **Capacity**: Unlimited (bounded by memory)

### Router
- **Routing time**: 1-5ms for 10-100 agents
- **Similarity calculation**: SIMD-optimized cosine similarity
- **Constraint checking**: O(n) over candidates

## Testing

### Unit Tests

All modules include comprehensive unit tests:

```bash
# Run routing module tests
cd /workspaces/ruvector/crates/ruvector-postgres
cargo test routing::
```

### Integration Tests

**File**: `tests/routing_tests.rs`

- Complete routing workflows
- Constraint-based routing
- Neural-enhanced routing
- Performance metric tracking
- Multi-agent scenarios

### PostgreSQL Tests

All SQL functions include `#[pg_test]` tests for validation in PostgreSQL environment.

## Integration Points

### Vector Search
- Use request embeddings for semantic similarity
- Match requests to agent specializations

### GNN Module
- Enhance routing with graph neural networks
- Model agent relationships and performance

### Quantization
- Compress agent embeddings for storage
- Reduce memory footprint

### HNSW Index
- Fast nearest-neighbor search for agent selection
- Scale to thousands of agents

## Performance Optimization Tips

1. **Agent Embeddings**: Pre-compute and store agent embeddings
2. **Caching**: Cache routing decisions for identical requests
3. **Batch Processing**: Route multiple requests in parallel
4. **Constraint Tuning**: Use specific constraints to reduce search space
5. **Metric Updates**: Batch metric updates for better performance

## Monitoring

### Agent Health

```sql
-- Monitor agent performance
SELECT name, success_rate, avg_latency_ms, quality_score
FROM ruvector_list_agents()
WHERE success_rate < 0.90 OR avg_latency_ms > 1000;
```

### Cost Tracking

```sql
-- Track daily costs
SELECT
    DATE_TRUNC('day', completed_at) AS day,
    agent_name,
    SUM(cost) AS total_cost,
    COUNT(*) AS requests
FROM request_completions
GROUP BY day, agent_name;
```

### Routing Statistics

```sql
-- Overall statistics
SELECT ruvector_routing_stats();
```

## Security Considerations

1. **Agent Isolation**: Each agent in separate namespace
2. **Cost Controls**: Always set max_cost constraints in production
3. **Rate Limiting**: Implement application-level rate limiting
4. **Audit Logging**: Track all routing decisions
5. **Access Control**: Use PostgreSQL RLS for multi-tenant scenarios

## Future Enhancements

### Planned Features
- [ ] Reinforcement learning for adaptive routing
- [ ] A/B testing framework
- [ ] Multi-armed bandit algorithms
- [ ] Cost prediction models
- [ ] Load balancing across agent instances
- [ ] Geo-distributed routing
- [ ] Circuit breaker patterns
- [ ] Automatic failover
- [ ] Performance anomaly detection
- [ ] Dynamic pricing support

### Research Directions
- [ ] Meta-learning for zero-shot agent selection
- [ ] Ensemble routing with multiple models
- [ ] Federated learning across agent pools
- [ ] Transfer learning from routing patterns
- [ ] Explainable routing decisions

## References

### FastGRNN Paper
"FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network"
- Efficient RNN architecture for edge devices
- Minimal computational overhead
- Suitable for real-time inference

### Related Work
- Multi-armed bandit algorithms
- Contextual bandits for routing
- Neural architecture search
- AutoML for model selection

## Files Created

1. `/src/routing/mod.rs` - Module exports
2. `/src/routing/fastgrnn.rs` - FastGRNN implementation (375 lines)
3. `/src/routing/agents.rs` - Agent registry (550 lines)
4. `/src/routing/router.rs` - Main router (650 lines)
5. `/src/routing/operators.rs` - PostgreSQL bindings (550 lines)
6. `/src/routing/README.md` - User documentation
7. `/sql/routing_example.sql` - Complete SQL examples
8. `/tests/routing_tests.rs` - Integration tests
9. `/docs/TINY_DANCER_ROUTING.md` - This document

**Total**: ~2,500+ lines of production-ready Rust code with comprehensive tests and documentation.

## Quick Start

```sql
-- 1. Register agents
SELECT ruvector_register_agent('gpt-4', 'llm', ARRAY['coding'], 0.03, 500.0, 0.95);
SELECT ruvector_register_agent('gpt-3.5', 'llm', ARRAY['general'], 0.002, 150.0, 0.75);

-- 2. Route a request
SELECT ruvector_route(
    (SELECT embedding FROM requests WHERE id = 1),
    'balanced',
    NULL
);

-- 3. Update metrics after completion
SELECT ruvector_update_agent_metrics('gpt-4', 450.0, true, 0.92);

-- 4. Monitor performance
SELECT * FROM ruvector_list_agents();
SELECT ruvector_routing_stats();
```

## Support

For issues, questions, or contributions, see the main ruvector-postgres repository.

## License

Same as ruvector-postgres (MIT/Apache-2.0 dual license)