ruvector-postgres 0.2.3

High-performance PostgreSQL vector database extension - pgvector drop-in replacement with 53+ SQL functions, SIMD acceleration, hyperbolic embeddings, GNN layers, and self-learning capabilities
# RuVector-Postgres

[![Crates.io](https://img.shields.io/crates/v/ruvector-postgres.svg)](https://crates.io/crates/ruvector-postgres)
[![Documentation](https://docs.rs/ruvector-postgres/badge.svg)](https://docs.rs/ruvector-postgres)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![PostgreSQL](https://img.shields.io/badge/PostgreSQL-14--17-blue.svg)](https://www.postgresql.org/)
[![Docker](https://img.shields.io/badge/Docker-available-blue.svg)](https://hub.docker.com/r/ruvector/postgres)

**The most advanced PostgreSQL vector database extension.** A drop-in pgvector replacement with 53+ SQL functions, SIMD acceleration, 39 attention mechanisms, GNN layers, hyperbolic embeddings, and self-learning capabilities.

## Why RuVector?

| Feature | pgvector | RuVector-Postgres |
|---------|----------|-------------------|
| Vector Search | HNSW, IVFFlat | HNSW, IVFFlat (optimized) |
| Distance Metrics | 3 | 8+ (including hyperbolic) |
| **Attention Mechanisms** | - | **39 types** |
| **Graph Neural Networks** | - | **GCN, GraphSAGE, GAT** |
| **Hyperbolic Embeddings** | - | **Poincare, Lorentz** |
| **Sparse Vectors / BM25** | Partial | **Full support** |
| **Self-Learning** | - | **ReasoningBank** |
| **Agent Routing** | - | **Tiny Dancer** |
| **Graph/Cypher** | - | **Full support** |
| AVX-512/NEON SIMD | Partial | **Full** |
| Quantization | No | **Scalar, Product, Binary** |

## Installation

### Docker (Recommended)

```bash
docker run -d --name ruvector-pg \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  ruvector/postgres:latest
```

### From Source

```bash
# Install pgrx
cargo install cargo-pgrx --version "0.12.9" --locked
cargo pgrx init --pg16 $(which pg_config)

# Build and install
cd crates/ruvector-postgres
cargo pgrx install --release
```

### CLI Tool

```bash
npm install -g @ruvector/postgres-cli
ruvector-pg -c "postgresql://localhost:5432/mydb" install
```

## Quick Start

```sql
-- Create the extension
CREATE EXTENSION ruvector;

-- Create a table with vector column
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding ruvector(1536)
);

-- Create an HNSW index
CREATE INDEX ON documents USING ruhnsw (embedding ruvector_l2_ops);

-- Find similar documents
SELECT content, embedding <-> '[0.15, 0.25, ...]'::ruvector AS distance
FROM documents
ORDER BY distance
LIMIT 10;
```
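The query above passes the search vector as a text literal cast to `ruvector`. A client application usually builds that literal from a list of floats. Below is a minimal Python sketch of that step; `to_vector_literal` is a hypothetical helper name, and the bracketed, comma-separated format is assumed from the literal shown in the Quick Start query.

```python
import math
from typing import Sequence


def to_vector_literal(values: Sequence[float]) -> str:
    """Format floats as a '[v1,v2,...]' text literal (hypothetical helper)."""
    if not values:
        raise ValueError("vector must have at least one dimension")
    for v in values:
        if not math.isfinite(v):
            raise ValueError("vector components must be finite numbers")
    # repr() round-trips Python floats without losing precision.
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


print(to_vector_literal([0.15, 0.25, 0.5]))  # [0.15,0.25,0.5]
```

Passing the resulting string as a bound parameter and casting it in SQL (for example `$1::ruvector`) is generally safer than interpolating it into the query text.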

## 53+ SQL Functions

RuVector exposes all advanced AI capabilities as native PostgreSQL functions.

### Core Vector Operations

```sql
-- Distance metrics
SELECT ruvector_cosine_distance(a, b);
SELECT ruvector_l2_distance(a, b);
SELECT ruvector_inner_product(a, b);
SELECT ruvector_manhattan_distance(a, b);

-- Vector operations
SELECT ruvector_normalize(embedding);
SELECT ruvector_add(a, b);
SELECT ruvector_scalar_mul(embedding, 2.0);
```

### Hyperbolic Geometry (8 functions)

Perfect for hierarchical data like taxonomies, knowledge graphs, and org charts.

```sql
-- Poincare ball model
SELECT ruvector_poincare_distance(a, b, -1.0);  -- curvature -1

-- Lorentz hyperboloid model
SELECT ruvector_lorentz_distance(a, b, -1.0);

-- Hyperbolic operations
SELECT ruvector_mobius_add(a, b, -1.0);       -- Hyperbolic translation
SELECT ruvector_exp_map(base, tangent, -1.0); -- Tangent to manifold
SELECT ruvector_log_map(base, target, -1.0);  -- Manifold to tangent

-- Model conversion
SELECT ruvector_poincare_to_lorentz(poincare_vec, -1.0);
SELECT ruvector_lorentz_to_poincare(lorentz_vec, -1.0);

-- Minkowski inner product
SELECT ruvector_minkowski_dot(a, b);
```
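For reference, the textbook Poincare ball distance for curvature `-c` (with `c > 0`) is `(1/sqrt(c)) * arcosh(1 + 2c|u-v|^2 / ((1 - c|u|^2)(1 - c|v|^2)))`. The Python sketch below implements that formula. It assumes the third argument of `ruvector_poincare_distance` is the (negative) curvature, as the comment above suggests; whether the extension uses exactly this convention is not confirmed here.

```python
import math
from typing import Sequence


def poincare_distance(u: Sequence[float], v: Sequence[float],
                      curvature: float = -1.0) -> float:
    """Geodesic distance in the Poincare ball of the given negative curvature."""
    if curvature >= 0:
        raise ValueError("curvature must be negative for hyperbolic space")
    if len(u) != len(v):
        raise ValueError("dimension mismatch")
    c = -curvature  # positive magnitude of the curvature
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Points must lie strictly inside the ball of radius 1/sqrt(c).
    if c * sq_u >= 1.0 or c * sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the Poincare ball")
    arg = 1.0 + 2.0 * c * sq_diff / ((1.0 - c * sq_u) * (1.0 - c * sq_v))
    return math.acosh(arg) / math.sqrt(c)


# Distance from the origin to a point at radius 0.5 is ln(3), about 1.0986.
print(poincare_distance([0.0, 0.0], [0.5, 0.0]))
```

Distances grow rapidly as points approach the boundary of the ball, which is what lets a low-dimensional hyperbolic embedding represent deep hierarchies.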

### Sparse Vectors & BM25 (14 functions)

Full sparse vector support with text scoring.

```sql
-- Create sparse vectors
SELECT ruvector_sparse_create(ARRAY[0, 5, 10], ARRAY[0.5, 0.3, 0.2], 100);
SELECT ruvector_sparse_from_dense(dense_vector, 0.01);  -- threshold

-- Sparse operations
SELECT ruvector_sparse_dot(a, b);
SELECT ruvector_sparse_cosine(a, b);
SELECT ruvector_sparse_l2_distance(a, b);
SELECT ruvector_sparse_add(a, b);
SELECT ruvector_sparse_scale(vec, 2.0);
SELECT ruvector_sparse_normalize(vec);
SELECT ruvector_sparse_topk(vec, 10);  -- Top-k elements

-- Text scoring
SELECT ruvector_bm25_score(query_terms, doc_freqs, doc_len, avg_doc_len, total_docs);
SELECT ruvector_tf_idf(term_freq, doc_freq, total_docs);
```
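As a reference for what a BM25 score measures, here is a minimal Python sketch of the standard Okapi BM25 formula. The argument layout (per-term frequencies in the document, per-term document frequencies) and the defaults `k1 = 1.2` and `b = 0.75` are assumptions for illustration; the exact inputs and variant used by `ruvector_bm25_score` are not specified in this README.

```python
import math
from typing import Sequence


def bm25_score(term_freqs: Sequence[float], doc_freqs: Sequence[float],
               doc_len: float, avg_doc_len: float, total_docs: int,
               k1: float = 1.2, b: float = 0.75) -> float:
    """Okapi BM25 score of one document for a set of query terms.

    term_freqs[i] is how often query term i occurs in the document;
    doc_freqs[i] is how many documents contain query term i.
    """
    # Longer-than-average documents are penalised through this factor.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    score = 0.0
    for tf, df in zip(term_freqs, doc_freqs):
        if tf <= 0:
            continue  # term absent from the document contributes nothing
        # The "+ 1.0" inside the log keeps the IDF non-negative.
        idf = math.log((total_docs - df + 0.5) / (df + 0.5) + 1.0)
        score += idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)
    return score


print(bm25_score([2], [10], doc_len=100, avg_doc_len=100, total_docs=1000))
```

Rare terms (small document frequency) and repeated terms raise the score, while the term-frequency contribution saturates as `tf` grows.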

### 39 Attention Mechanisms

Full transformer-style attention in PostgreSQL.

```sql
-- Scaled dot-product attention
SELECT ruvector_attention_scaled_dot(query, keys, values);

-- Multi-head attention
SELECT ruvector_attention_multi_head(query, keys, values, num_heads);

-- Flash attention (memory efficient)
SELECT ruvector_attention_flash(query, keys, values, block_size);

-- Sparse attention patterns
SELECT ruvector_attention_sparse(query, keys, values, sparsity_pattern);

-- Linear attention (O(n) complexity)
SELECT ruvector_attention_linear(query, keys, values);

-- Causal/masked attention
SELECT ruvector_attention_causal(query, keys, values);

-- Cross attention
SELECT ruvector_attention_cross(query, context_keys, context_values);

-- Self attention
SELECT ruvector_attention_self(input, num_heads);
```
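To make the first call concrete, the Python sketch below computes scaled dot-product attention for a single query: `softmax(q . k_i / sqrt(d))` weights applied to the value vectors. It is a plain reference implementation of the standard formula, not the extension's code, and the list-of-lists input layout is an assumption.

```python
import math
from typing import List, Sequence


def scaled_dot_attention(query: Sequence[float],
                         keys: Sequence[Sequence[float]],
                         values: Sequence[Sequence[float]]) -> List[float]:
    """Attention output for one query over matching lists of keys and values."""
    if len(keys) != len(values) or not keys:
        raise ValueError("keys and values must be non-empty and equal in length")
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim_v = len(values[0])
    # Weighted sum of the value vectors, one output component at a time.
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_v)]


# Two identical keys receive equal weight, so the output is the mean value.
print(scaled_dot_attention([1.0, 0.0],
                           [[1.0, 0.0], [1.0, 0.0]],
                           [[1.0, 2.0], [3.0, 4.0]]))  # [2.0, 3.0]
```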

### Graph Neural Networks (5 functions)

GNN layers for graph-structured data.

```sql
-- GCN (Graph Convolutional Network)
SELECT ruvector_gnn_gcn_layer(features, adjacency, weights);

-- GraphSAGE (inductive learning)
SELECT ruvector_gnn_graphsage_layer(features, neighbor_features, weights);

-- GAT (Graph Attention Network)
SELECT ruvector_gnn_gat_layer(features, adjacency, attention_weights);

-- Message passing
SELECT ruvector_gnn_message_pass(node_features, edge_index, edge_weights);

-- Aggregation
SELECT ruvector_gnn_aggregate(messages, aggregation_type);  -- mean, max, sum
```
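For orientation, a single GCN layer in the commonly used Kipf and Welling form computes `ReLU(D^-1/2 (A + I) D^-1/2 H W)`. The Python sketch below implements that rule on small dense matrices. Whether `ruvector_gnn_gcn_layer` applies the same normalization and activation is an assumption; the helper names are illustrative only.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Dense matrix product of a (n x k) and b (k x m)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(features: Matrix, adjacency: Matrix, weights: Matrix) -> Matrix:
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(features)
    # Add self-loops so every node keeps part of its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]  # >= 1 for non-negative adjacency
    # Symmetric normalization by the square roots of both endpoint degrees.
    norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, features), weights)
    return [[max(0.0, x) for x in row] for row in out]


# Two connected nodes with features 1 and 3 are averaged to 2 each.
print(gcn_layer([[1.0], [3.0]], [[0.0, 1.0], [1.0, 0.0]], [[1.0]]))  # [[2.0], [2.0]]
```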

### Agent Routing - Tiny Dancer (11 functions)

Intelligent query routing to specialized AI agents.

```sql
-- Route query to best agent
SELECT ruvector_route_query(query_embedding, agent_registry);

-- Route with context
SELECT ruvector_route_with_context(query, context, agents);

-- Multi-agent routing
SELECT ruvector_multi_agent_route(query, agents, top_k);

-- Agent management
SELECT ruvector_register_agent(name, capabilities, embedding);
SELECT ruvector_update_agent_performance(agent_id, metrics);
SELECT ruvector_get_routing_stats();

-- Affinity calculation
SELECT ruvector_calculate_agent_affinity(query, agent);
SELECT ruvector_select_best_agent(query, agent_list);

-- Adaptive routing
SELECT ruvector_adaptive_route(query, context, learning_rate);

-- FastGRNN acceleration
SELECT ruvector_fastgrnn_forward(input, hidden, weights);
```

### Self-Learning / ReasoningBank (7 functions)

Adaptive search parameter optimization.

```sql
-- Record learning trajectory
SELECT ruvector_record_trajectory(input, output, success, context);

-- Get verdict on approach
SELECT ruvector_get_verdict(trajectory_id);

-- Memory distillation
SELECT ruvector_distill_memory(trajectories, compression_ratio);

-- Adaptive search
SELECT ruvector_adaptive_search(query, context, ef_search);

-- Learning feedback
SELECT ruvector_learning_feedback(search_id, relevance_scores);

-- Get learned patterns
SELECT ruvector_get_learning_patterns(context);

-- Optimize search parameters
SELECT ruvector_optimize_search_params(query_type, historical_data);
```

### Graph Storage & Cypher (8 functions)

Graph operations with Cypher query support.

```sql
-- Create graph elements
SELECT ruvector_graph_create_node(labels, properties, embedding);
SELECT ruvector_graph_create_edge(from_node, to_node, edge_type, properties);

-- Graph queries
SELECT ruvector_graph_get_neighbors(node_id, edge_type, depth);
SELECT ruvector_graph_shortest_path(start_node, end_node);
SELECT ruvector_graph_pagerank(edge_table, damping, iterations);

-- Cypher queries
SELECT ruvector_cypher_query('MATCH (n:Person)-[:KNOWS]->(m) RETURN n, m');

-- Traversal
SELECT ruvector_graph_traverse(start_node, direction, max_depth);

-- Similarity search on graph
SELECT ruvector_graph_similarity_search(query_embedding, node_type, top_k);
```
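The `damping` and `iterations` arguments of `ruvector_graph_pagerank` correspond to the usual power-iteration formulation of PageRank. The Python sketch below shows that formulation on an in-memory edge list. It is a generic reference, not the extension's implementation, and the handling of dangling nodes (spreading their rank evenly) is one common choice assumed here.

```python
from typing import Dict, Hashable, Iterable, Tuple


def pagerank(edges: Iterable[Tuple[Hashable, Hashable]],
             damping: float = 0.85,
             iterations: int = 20) -> Dict[Hashable, float]:
    """PageRank by power iteration over directed (source, target) edges."""
    edges = list(edges)
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    if n == 0:
        return {}
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is shared by all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


ranks = pagerank([("a", "b"), ("a", "c"), ("b", "c")])
print(max(ranks, key=ranks.get))  # c
```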

## Vector Types

### `ruvector(n)` - Dense Vector

```sql
CREATE TABLE items (embedding ruvector(1536));
-- Storage: 8 + (4 x dimensions) bytes
```

### `halfvec(n)` - Half-Precision Vector

```sql
CREATE TABLE items (embedding halfvec(1536));
-- Storage: 8 + (2 x dimensions) bytes (50% savings)
```

### `sparsevec(n)` - Sparse Vector

```sql
CREATE TABLE items (embedding sparsevec(50000));
INSERT INTO items VALUES ('{1:0.5, 100:0.8, 5000:0.3}/50000');
-- Storage: 12 + (8 x non_zero_elements) bytes
```
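The storage formulas in the comments above can be turned into a quick sizing estimate. The Python sketch below only restates those three formulas; the function names are illustrative, and per-row and index overhead in PostgreSQL is not included.

```python
def dense_bytes(dimensions: int) -> int:
    """ruvector(n): 8-byte header plus 4 bytes per dimension."""
    return 8 + 4 * dimensions


def half_bytes(dimensions: int) -> int:
    """halfvec(n): 8-byte header plus 2 bytes per dimension."""
    return 8 + 2 * dimensions


def sparse_bytes(non_zero_elements: int) -> int:
    """sparsevec(n): 12-byte header plus 8 bytes per non-zero element."""
    return 12 + 8 * non_zero_elements


print(dense_bytes(1536))   # 6152
print(half_bytes(1536))    # 3080
print(sparse_bytes(3))     # 36
```

For one million 1536-dimensional rows this works out to roughly 6.2 GB of vector data for `ruvector` and 3.1 GB for `halfvec`, before any table or index overhead.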

## Distance Operators

| Operator | Distance | Use Case |
|----------|----------|----------|
| `<->` | L2 (Euclidean) | General similarity |
| `<=>` | Cosine | Text embeddings |
| `<#>` | Inner Product | Normalized vectors |
| `<+>` | Manhattan (L1) | Sparse features |
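The four metrics in the table have the following textbook definitions, sketched here in Python for reference. One caveat: in pgvector the `<#>` operator returns the negative inner product so that ascending order puts the best match first. This README does not state whether RuVector follows the same sign convention, so the sketch returns the plain inner product.

```python
import math
from typing import Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product (larger means more similar)."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """One minus cosine similarity; 0 for parallel, 1 for orthogonal vectors."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


def manhattan_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance: summed absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


print(l2_distance([0.0, 0.0], [3.0, 4.0]))         # 5.0
print(manhattan_distance([1.0, 2.0], [4.0, 6.0]))  # 7.0
```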

## Index Types

### HNSW (Hierarchical Navigable Small World)

```sql
CREATE INDEX ON items USING ruhnsw (embedding ruvector_l2_ops)
WITH (m = 16, ef_construction = 64);

SET ruvector.ef_search = 100;  -- Tune search quality
```

### IVFFlat (Inverted File Flat)

```sql
CREATE INDEX ON items USING ruivfflat (embedding ruvector_l2_ops)
WITH (lists = 100);

SET ruvector.ivfflat_probes = 10;  -- Tune search quality
```

## Performance Benchmarks

*AMD EPYC 7763 (64 cores), 256GB RAM:*

| Operation | 10K vectors | 100K vectors | 1M vectors |
|-----------|-------------|--------------|------------|
| HNSW Build | 0.8s | 8.2s | 95s |
| HNSW Search (top-10) | 0.3ms | 0.5ms | 1.2ms |
| Cosine Distance | 0.01ms | 0.01ms | 0.01ms |
| Poincare Distance | 0.02ms | 0.02ms | 0.02ms |
| GCN Forward | 2.1ms | 18ms | 180ms |
| BM25 Score | 0.05ms | 0.08ms | 0.15ms |

*Single distance calculation (1536 dimensions):*

| Metric | AVX2 Time | Speedup vs Scalar |
|--------|-----------|-------------------|
| L2 (Euclidean) | 38 ns | 3.7x |
| Cosine | 51 ns | 3.7x |
| Inner Product | 36 ns | 3.7x |

## Use Cases

### Semantic Search with RAG

```sql
-- <=> returns cosine distance, so smaller values are closer matches
SELECT content, embedding <=> $query_embedding AS distance
FROM documents
WHERE category = 'technical'
ORDER BY distance
LIMIT 5;
```

### Knowledge Graph with Hierarchical Embeddings

```sql
-- Use hyperbolic embeddings for taxonomy
SELECT name, ruvector_poincare_distance(embedding, $query, -1.0) AS distance
FROM taxonomy_nodes
ORDER BY distance
LIMIT 10;
```

### Hybrid Search (Vector + BM25)

```sql
SELECT
    content,
    0.7 * (1.0 / (1.0 + (embedding <-> $query_vector))) +
    0.3 * ruvector_bm25_score(terms, doc_freqs, length, avg_len, total) AS score
FROM documents
ORDER BY score DESC
LIMIT 10;
```
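The score in the query above blends a distance-derived similarity with a BM25 text score. The Python sketch below reproduces the same arithmetic outside the database, which can help when tuning the weights. The 0.7 and 0.3 weights come from the query; the function name is illustrative.

```python
def hybrid_score(l2_distance: float, bm25: float,
                 vector_weight: float = 0.7,
                 text_weight: float = 0.3) -> float:
    """Weighted blend of vector similarity and BM25 text relevance."""
    if l2_distance < 0:
        raise ValueError("distance must be non-negative")
    # Map a distance in [0, inf) to a similarity in (0, 1].
    similarity = 1.0 / (1.0 + l2_distance)
    return vector_weight * similarity + text_weight * bm25


print(hybrid_score(0.0, 0.0))  # 0.7 (exact vector match, no text match)
```

Note that the similarity term is bounded by 1 while BM25 is unbounded, so the text term can dominate unless the BM25 scores are normalized to a comparable range first.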

### Multi-Agent Query Routing

```sql
SELECT ruvector_route_query(
    $user_query_embedding,
    (SELECT array_agg(row(name, capabilities)) FROM agents)
) AS best_agent;
```

### Graph Neural Network Inference

```sql
SELECT ruvector_gnn_gcn_layer(
    node_features,
    adjacency_matrix,
    trained_weights
) AS updated_features
FROM graph_nodes;
```

## CLI Tool

Install the CLI for easy management:

```bash
npm install -g @ruvector/postgres-cli

# Commands
ruvector-pg install                    # Install extension
ruvector-pg vector create table --dim 384 --index hnsw
ruvector-pg hyperbolic poincare-distance --a "[0.1,0.2]" --b "[0.3,0.4]"
ruvector-pg gnn gcn --features "[[...]]" --adj "[[...]]"
ruvector-pg graph query "MATCH (n) RETURN n"
ruvector-pg routing route --query "[...]" --agents agents.json
ruvector-pg learning adaptive-search --context "[...]"
ruvector-pg bench run --type all --size 10000
```

## Related Packages

- [`@ruvector/postgres-cli`](https://www.npmjs.com/package/@ruvector/postgres-cli) - CLI for RuVector PostgreSQL
- [`ruvector-core`](https://crates.io/crates/ruvector-core) - Core vector operations library
- [`ruvector-tiny-dancer`](https://crates.io/crates/ruvector-tiny-dancer) - Agent routing library

## Documentation

| Document | Description |
|----------|-------------|
| [docs/API.md](docs/API.md) | Complete SQL API reference |
| [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) | System architecture |
| [docs/SIMD_OPTIMIZATION.md](docs/SIMD_OPTIMIZATION.md) | SIMD details |
| [docs/guides/ATTENTION_QUICK_REFERENCE.md](docs/guides/ATTENTION_QUICK_REFERENCE.md) | Attention mechanisms |
| [docs/GNN_QUICK_REFERENCE.md](docs/GNN_QUICK_REFERENCE.md) | GNN layers |
| [docs/ROUTING_QUICK_REFERENCE.md](docs/ROUTING_QUICK_REFERENCE.md) | Tiny Dancer routing |
| [docs/LEARNING_MODULE_README.md](docs/LEARNING_MODULE_README.md) | ReasoningBank |

## Requirements

- PostgreSQL 14, 15, 16, or 17
- x86_64 (AVX2/AVX-512) or ARM64 (NEON)
- Linux, macOS, or Windows (WSL)

## License

MIT License - See [LICENSE](../../LICENSE)

## Contributing

Contributions welcome! See [CONTRIBUTING.md](../../CONTRIBUTING.md)