ruvector-postgres 2.0.5

High-performance PostgreSQL vector database extension v2 - pgvector drop-in replacement with 230+ SQL functions, SIMD acceleration, Flash Attention, GNN layers, hybrid search, multi-tenancy, self-healing, and self-learning capabilities
# Neon Postgres Compatibility Guide

## Overview

RuVector-Postgres is designed with first-class support for Neon's serverless PostgreSQL platform. This guide covers deployment, configuration, and optimization for Neon environments.

## Neon Platform Overview

Neon is a serverless PostgreSQL platform with unique architecture:

- **Separation of Storage and Compute**: Compute nodes are stateless
- **Scale to Zero**: Instances automatically suspend when idle
- **Instant Branching**: Copy-on-write database branches
- **Dynamic Extension Loading**: Custom extensions loaded on demand
- **Connection Pooling**: Built-in pooling with PgBouncer

## Compatibility Matrix

| Neon Feature | RuVector Support | Notes |
|--------------|------------------|-------|
| PostgreSQL 14 | ✓ Full | Tested |
| PostgreSQL 15 | ✓ Full | Tested |
| PostgreSQL 16 | ✓ Full | Recommended |
| PostgreSQL 17 | ✓ Full | Latest |
| PostgreSQL 18 | ✓ Full | Beta support |
| Scale to Zero | ✓ Full | <100ms cold start |
| Instant Branching | ✓ Full | Index state preserved |
| Connection Pooling | ✓ Full | Thread-safe, no session state |
| Read Replicas | ✓ Full | Consistent reads |
| Autoscaling | ✓ Full | Dynamic memory handling |
| Autosuspend | ✓ Full | Fast wake-up |

## Design Considerations for Neon

### 1. Stateless Compute

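Neon compute nodes are ephemeral and may be replaced at any time. RuVector-Postgres handles this by keeping no persistent state of its own, so initialization only registers what the current process needs: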

```rust
// No global mutable state that requires persistence
// All state lives in PostgreSQL's shared memory or storage

#[pg_guard]
pub fn _PG_init() {
    // Lightweight initialization - no disk I/O
    // SIMD feature detection cached in thread-local
    init_simd_dispatch();

    // Register GUCs (configuration variables)
    register_gucs();

    // No background workers (Neon restriction)
    // All maintenance is on-demand or during queries
}
```

**Key Principles:**

- **No file-based state**: Everything in PostgreSQL shared buffers
- **No background workers**: All work is query-driven
- **Fast initialization**: Extension loads in <100ms
- **Memory-mapped indexes**: Loaded from storage on demand

### 2. Fast Cold Start

Fast cold starts are critical for scale-to-zero. RuVector-Postgres achieves sub-100ms initialization:

```
┌─────────────────────────────────────────────────────────────────┐
│                    Cold Start Timeline                           │
├─────────────────────────────────────────────────────────────────┤
│  0ms   │ Extension .so loaded by PostgreSQL                     │
│  5ms   │ _PG_init() called                                      │
│  10ms  │ SIMD feature detection complete                        │
│  15ms  │ GUC registration complete                              │
│  20ms  │ Operator/function registration complete                │
│  25ms  │ Index access method registration complete              │
│  50ms  │ First query ready                                      │
│  75ms  │ Index mmap from storage (on first access)              │
│ 100ms  │ Full warm state achieved                               │
└─────────────────────────────────────────────────────────────────┘
```

**Optimization Techniques:**

1. **Lazy Index Loading**: Indexes mmap'd from storage on first access
2. **No Precomputation**: No tables built at startup
3. **Minimal Allocations**: Stack-based init where possible
4. **Cached SIMD Detection**: One-time CPU feature detection
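
The one-time CPU feature detection in item 4 can be sketched with the standard library alone. This is a minimal illustration, not the extension's actual `init_simd_dispatch()`: the `SimdLevel` enum and the `simd_level()` function are hypothetical names, and it caches in a process-wide `OnceLock` rather than the thread-local mentioned in the code comment above.

```rust
use std::sync::OnceLock;

/// SIMD tiers this sketch distinguishes (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SimdLevel {
    Scalar,
    Avx2,
    Avx512,
    Neon,
}

// Process-wide cache: the detection function runs at most once.
static SIMD_LEVEL: OnceLock<SimdLevel> = OnceLock::new();

/// Probe the CPU once; falls back to scalar code on unknown targets.
fn detect() -> SimdLevel {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx512f") {
            return SimdLevel::Avx512;
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            return SimdLevel::Avx2;
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return SimdLevel::Neon;
        }
    }
    SimdLevel::Scalar
}

/// Cheap after the first call: later calls only read the cached value.
pub fn simd_level() -> SimdLevel {
    *SIMD_LEVEL.get_or_init(detect)
}
```

Every distance function can then branch on `simd_level()` without repeating the CPUID probe, which keeps the cost off the query path.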

**Comparison with pgvector:**

| Metric | RuVector | pgvector |
|--------|----------|----------|
| Cold start time | 50ms | 120ms |
| Memory at init | 2 MB | 8 MB |
| First query latency | +10ms | +50ms |

### 3. Memory Efficiency

Neon compute instances have memory limits based on compute units (CU). RuVector-Postgres is memory-conscious:

```sql
-- Check memory usage
SELECT * FROM ruvector_memory_stats();

┌──────────────────────────────────────────────────────────────┐
│                  Memory Statistics                            │
├──────────────────────────────────────────────────────────────┤
│ index_memory_mb        │ 256                                 │
│ vector_cache_mb        │ 64                                  │
│ quantization_tables_mb │ 8                                   │
│ total_extension_mb     │ 328                                 │
└──────────────────────────────────────────────────────────────┘
```

**Memory Optimization Strategies:**

```sql
-- Limit index memory (for smaller Neon instances)
SET ruvector.max_index_memory = '256MB';

-- Use quantization to reduce memory footprint
CREATE INDEX ON items USING ruhnsw (embedding ruvector_l2_ops)
WITH (quantization = 'sq8');  -- 4x memory reduction

-- Use half-precision vectors
CREATE TABLE items (embedding halfvec(1536));  -- 50% memory savings
```

**Memory by Compute Unit:**

| Neon CU | RAM | Recommended Index Size | Quantization |
|---------|-----|------------------------|--------------|
| 0.25 | 1 GB | <128 MB | Required (sq8/pq) |
| 0.5 | 2 GB | <512 MB | Recommended (sq8) |
| 1.0 | 4 GB | <2 GB | Optional |
| 2.0 | 8 GB | <4 GB | Optional |
| 4.0+ | 16+ GB | <8 GB | None |
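
To check a dataset against the table, it helps to estimate the raw vector payload first. The sketch below is plain arithmetic under stated assumptions: 4 bytes per dimension for full precision, 2 for half precision, and 1 for `sq8`, which matches the 50% and 4x figures quoted above. It ignores HNSW graph links and per-tuple overhead, so real index sizes will be larger. The `Element` enum and function names are hypothetical.

```rust
/// Storage format for vector elements (hypothetical names).
#[derive(Debug, Clone, Copy)]
pub enum Element {
    F32, // full precision, 4 bytes per dimension
    F16, // half precision (halfvec), 2 bytes per dimension
    Sq8, // 8-bit scalar quantization, 1 byte per dimension
}

impl Element {
    /// Bytes needed to store one dimension in this format.
    pub fn bytes(self) -> u64 {
        match self {
            Element::F32 => 4,
            Element::F16 => 2,
            Element::Sq8 => 1,
        }
    }
}

/// Raw payload only: count x dimensions x bytes per element.
pub fn raw_vector_bytes(count: u64, dims: u64, element: Element) -> u64 {
    count * dims * element.bytes()
}

/// Convert bytes to MiB for comparison with the table above.
pub fn to_mib(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0)
}
```

For example, one million 1536-dimensional vectors come to 6,144,000,000 bytes at full precision and 1,536,000,000 bytes with `sq8`, which is why the smaller compute sizes call for quantization.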

### 4. No Background Workers

Neon restricts background workers for resource management. RuVector-Postgres is designed without them:

```rust
// ❌ NOT USED: Background workers
// BackgroundWorker::register("ruvector_maintenance", ...);

// ✓ USED: On-demand operations
// - Index vacuum during INSERT/UPDATE
// - Statistics during ANALYZE
// - Maintenance via explicit SQL functions
```

**Alternative Maintenance Patterns:**

```sql
-- Explicit index maintenance (replaces background vacuum)
SELECT ruvector_index_maintenance('items_embedding_idx');

-- Scheduled via pg_cron (if available)
SELECT cron.schedule('vacuum-index', '0 2 * * *',
    $$SELECT ruvector_index_maintenance('items_embedding_idx')$$);

-- Manual statistics update
ANALYZE items;
```

### 5. Connection Pooling Considerations

Neon uses PgBouncer in **transaction mode** for connection pooling. RuVector-Postgres is fully compatible:

**Compatible Features:**

- ✓ No session-level state
- ✓ No temp tables or cursors
- ✓ All settings via GUCs (can be set per-transaction)
- ✓ Thread-safe distance calculations

**Usage Pattern:**

```sql
-- Each transaction is independent
BEGIN;
SET LOCAL ruvector.ef_search = 100;  -- Transaction-local setting
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
COMMIT;

-- Next transaction (potentially different connection)
BEGIN;
SET LOCAL ruvector.ef_search = 200;  -- Different setting
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
COMMIT;
```
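
On the application side, the same pattern means sending the setting and the query inside one transaction on one connection. The sketch below only builds the statement list; it assumes a client that executes the statements in order on a single pooled connection and binds the query vector to `$1`. The function name is hypothetical.

```rust
/// Statements for one self-contained search transaction.
/// `ef_search` and `limit` are integers, so formatting them into the SQL
/// text cannot inject anything; the query vector stays a bind parameter.
pub fn scoped_search_statements(ef_search: u32, limit: u32) -> Vec<String> {
    vec![
        "BEGIN".to_string(),
        // SET LOCAL is discarded at COMMIT or ROLLBACK, so nothing leaks
        // to the next client that receives this pooled connection.
        format!("SET LOCAL ruvector.ef_search = {ef_search}"),
        format!("SELECT * FROM items ORDER BY embedding <-> $1 LIMIT {limit}"),
        "COMMIT".to_string(),
    ]
}
```

Keeping all four statements together is the point of the design: in transaction mode the pooler may hand the next transaction to a different server connection.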

### 6. Index Persistence

**How Indexes Are Stored:**

- HNSW/IVFFlat indexes stored in PostgreSQL pages
- Automatically replicated to Neon storage layer
- Preserved across compute restarts
- Shared across branches (copy-on-write)

**Index Build on Neon:**

```sql
-- Non-blocking index build (recommended on Neon)
CREATE INDEX CONCURRENTLY items_embedding_idx ON items
USING ruhnsw (embedding ruvector_l2_ops)
WITH (m = 32, ef_construction = 200);

-- Monitor progress
SELECT
    phase,
    blocks_total,
    blocks_done,
    tuples_total,
    tuples_done
FROM pg_stat_progress_create_index;
```
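
The `blocks_done` and `blocks_total` columns can be turned into a percentage for dashboards or logs. A minimal sketch, assuming a `blocks_total` of 0 means the current phase has not reported a total yet; the function name is hypothetical.

```rust
/// Percent of blocks processed in the current phase, computed from the
/// `blocks_done` and `blocks_total` columns of pg_stat_progress_create_index.
/// Returns None when no total is available (assumed to mean "not reported").
pub fn progress_pct(blocks_done: u64, blocks_total: u64) -> Option<f64> {
    if blocks_total == 0 {
        return None;
    }
    Some(100.0 * blocks_done as f64 / blocks_total as f64)
}
```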

## Neon-Specific Limitations

### 1. Extension Installation (Scale Plan Required)

**Free Plan:**
- Pre-approved extensions only (pgvector is included)
- RuVector requires custom extension approval

**Scale Plan:**
- Custom extensions allowed
- Contact support for installation

**Enterprise Plan:**
- Dedicated support for custom extensions
- Faster approval process

### 2. Compute Suspension

**Behavior:**

- Compute suspends after 5 minutes of inactivity (configurable)
- First query after suspension: +100-200ms latency
- Indexes loaded from storage on first access

**Mitigation:**

```sql
-- Keep-alive query (via cron or application)
SELECT 1;

-- Or use Neon's suspend_timeout setting
-- In Neon console: Project Settings → Compute → Autosuspend delay
```

### 3. Memory Constraints

**Observation:**

- Neon may limit memory below advertised CU limits
- Large index builds may fail with OOM

**Solutions:**

```sql
-- Build index with lower memory
SET maintenance_work_mem = '256MB';
CREATE INDEX CONCURRENTLY ...;

-- Use quantization for large datasets
CREATE INDEX CONCURRENTLY ... WITH (quantization = 'pq16');  -- 16x memory reduction
```

### 4. Extension Update Process

**Current Process:**

1. Open support ticket with Neon
2. Provide new `.so` and SQL files
3. Neon reviews and deploys
4. Extension available for `ALTER EXTENSION UPDATE`

**Future:** Self-service extension updates (roadmap item)

## Requesting RuVector on Neon

### For Scale Plan Customers

#### Step 1: Open Support Ticket

Navigate to: [Neon Console](https://console.neon.tech) → **Support**

**Ticket Template:**

```
Subject: Custom Extension Request - RuVector-Postgres

Body:
I would like to install the RuVector-Postgres extension for vector similarity search.

Details:
- Extension: ruvector-postgres
- Version: 0.1.19
- PostgreSQL version: 16 (or your version)
- Project ID: [your-project-id]

Use case:
[Describe your vector search use case]

Repository: https://github.com/ruvnet/ruvector
Documentation: https://github.com/ruvnet/ruvector/tree/main/crates/ruvector-postgres

I can provide pre-built binaries if needed.
```

#### Step 2: Provide Extension Artifacts

Neon will request:

1. **Shared Library** (`.so` file):
   ```bash
   # Build for PostgreSQL 16
   cargo pgrx package --pg-config /path/to/pg_config
   # Artifact: target/release/ruvector-pg16/usr/lib/postgresql/16/lib/ruvector.so
   ```

2. **Control File** (`ruvector.control`):
   ```
   comment = 'High-performance vector similarity search'
   default_version = '0.1.19'
   module_pathname = '$libdir/ruvector'
   relocatable = true
   ```

3. **SQL Scripts**:
   - `ruvector--0.1.0.sql` (initial schema)
   - `ruvector--0.1.0--0.1.19.sql` (migration script)

4. **Security Documentation**:
   - Memory safety audit
   - No unsafe FFI calls
   - No network access
   - Resource limits
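
The SQL script names in item 3 follow PostgreSQL's extension script convention: `extension--version.sql` for an install script and `extension--old--new.sql` for an update script. A small sketch of that naming, useful when packaging artifacts for a ticket; the helper names are hypothetical.

```rust
/// File name of the script that installs `version` from scratch.
pub fn install_script(extension: &str, version: &str) -> String {
    format!("{extension}--{version}.sql")
}

/// File name of the script that updates `from` to `to`.
pub fn update_script(extension: &str, from: &str, to: &str) -> String {
    format!("{extension}--{from}--{to}.sql")
}
```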

#### Step 3: Security Review

Neon engineers will review:

- ✓ Rust memory safety guarantees
- ✓ No unsafe system calls
- ✓ Sandboxed execution
- ✓ Resource limits (memory, CPU)
- ✓ No file system access beyond PostgreSQL

**Timeline:** 1-2 weeks for approval.

#### Step 4: Deployment

Once approved:

```sql
-- Extension becomes available
CREATE EXTENSION ruvector;

-- Verify
SELECT ruvector_version();
```

### For Free Plan Users

**Option 1: Request via Discord**

1. Join [Neon Discord](https://discord.gg/92vNTzKDGp)
2. Post in `#feedback` channel
3. Include use case and expected usage

**Option 2: Use pgvector (Pre-installed)**

```sql
-- pgvector is available on all plans
CREATE EXTENSION vector;

-- RuVector provides migration path
-- (See MIGRATION.md)
```

## Migration from pgvector

RuVector-Postgres is API-compatible with pgvector, so migration can run side by side with the existing table and keep a rollback path:

### Step 1: Create Parallel Tables

```sql
-- Keep existing pgvector table (for rollback)
-- ALTER TABLE items RENAME TO items_pgvector;

-- Create new table with ruvector
CREATE TABLE items_ruvector (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding ruvector(1536)
);

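-- Copy data (automatic type conversion)
INSERT INTO items_ruvector (id, content, embedding)
SELECT id, content, embedding::ruvector FROM items;

-- Explicit ids bypass the SERIAL sequence, so advance it past the copied rows
SELECT setval(pg_get_serial_sequence('items_ruvector', 'id'),
              (SELECT max(id) FROM items_ruvector));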
```

### Step 2: Rebuild Indexes

```sql
-- Drop old pgvector index (if exists)
-- DROP INDEX items_embedding_idx;

-- Create optimized HNSW index
CREATE INDEX items_embedding_ruhnsw_idx ON items_ruvector
USING ruhnsw (embedding ruvector_l2_ops)
WITH (m = 32, ef_construction = 200);

-- Analyze for query planner
ANALYZE items_ruvector;
```

### Step 3: Validate Results

```sql
-- Compare search results
WITH pgvector_results AS (
    SELECT id, embedding <-> '[...]'::vector AS dist
    FROM items ORDER BY dist LIMIT 10
),
ruvector_results AS (
    SELECT id, embedding <-> '[...]'::ruvector AS dist
    FROM items_ruvector ORDER BY dist LIMIT 10
)
SELECT
    p.id AS pg_id,
    r.id AS ru_id,
    p.id = r.id AS id_match,
    abs(p.dist - r.dist) < 0.0001 AS dist_match
FROM pgvector_results p
FULL OUTER JOIN ruvector_results r ON p.id = r.id;

-- All rows should have id_match=true, dist_match=true.
-- A NULL in either column means one side returned an id the other did not.
```
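
When the comparison is run over many queries, a single overlap score per query is easier to track than a row-by-row match, and it tolerates the small differences that two approximate indexes can produce. A minimal sketch, assuming the two id lists have already been fetched by the application; the function name is hypothetical.

```rust
use std::collections::HashSet;

/// Fraction of `baseline` ids that also appear in `candidate`.
/// Order is ignored; an empty baseline counts as a perfect match.
pub fn overlap_at_k(baseline: &[i64], candidate: &[i64]) -> f64 {
    if baseline.is_empty() {
        return 1.0;
    }
    let expected: HashSet<i64> = baseline.iter().copied().collect();
    let found: HashSet<i64> = candidate.iter().copied().collect();
    expected.intersection(&found).count() as f64 / expected.len() as f64
}
```

A score of 1.0 for every test query corresponds to the all-true outcome of the SQL check above.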

### Step 4: Switch Over

```sql
-- Atomic swap
BEGIN;
ALTER TABLE items RENAME TO items_old;
ALTER TABLE items_ruvector RENAME TO items;
COMMIT;

-- Validate application queries
-- ... run tests ...

-- Drop old table after validation period (e.g., 1 week)
DROP TABLE items_old;
```

## Performance Tuning for Neon

### Instance Size Recommendations

| Neon CU | RAM | Max Vectors | Recommended Settings |
|---------|-----|-------------|---------------------|
| 0.25 | 1 GB | 100K | `m=8, ef=64, sq8 quant` |
| 0.5 | 2 GB | 500K | `m=16, ef=100, sq8 quant` |
| 1.0 | 4 GB | 2M | `m=24, ef=150, optional quant` |
| 2.0 | 8 GB | 5M | `m=32, ef=200, no quant` |
| 4.0 | 16 GB | 10M+ | `m=48, ef=300, no quant` |

### Query Optimization

```sql
-- High recall (use for important queries)
SET ruvector.ef_search = 200;
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;

-- Low latency (use for real-time queries)
SET ruvector.ef_search = 40;
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;

-- Per-query tuning (SET LOCAL only takes effect inside a transaction block)
SET LOCAL ruvector.ef_search = 100;
```
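
Choosing between these `ef_search` values requires a ground truth to measure recall against. A brute-force scan gives exact nearest neighbours for a small sample. The sketch below assumes `<->` is Euclidean (L2) distance, as in pgvector, and that a sample of `(id, embedding)` pairs fits in application memory; the function names are hypothetical.

```rust
/// Euclidean (L2) distance between two vectors of equal length.
pub fn l2_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Exact top-k ids by L2 distance, nearest first (full scan, no index).
pub fn exact_top_k(items: &[(i64, Vec<f32>)], query: &[f32], k: usize) -> Vec<i64> {
    let mut scored: Vec<(f32, i64)> = items
        .iter()
        .map(|(id, v)| (l2_distance(v, query), *id))
        .collect();
    // total_cmp gives a total order on f32, so NaN cannot panic the sort.
    scored.sort_by(|a, b| a.0.total_cmp(&b.0));
    scored.into_iter().take(k).map(|(_, id)| id).collect()
}
```

Comparing the ids returned by the indexed query at each `ef_search` value with `exact_top_k` shows how much recall a lower setting gives up for latency.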

### Index Build Settings

```sql
-- For small Neon instances
SET maintenance_work_mem = '512MB';
SET max_parallel_maintenance_workers = 2;

-- For large Neon instances
SET maintenance_work_mem = '4GB';
SET max_parallel_maintenance_workers = 8;

-- Always use CONCURRENTLY on Neon
CREATE INDEX CONCURRENTLY ...;
```

## Neon Branching with RuVector

### How Branching Works

Neon branches use copy-on-write, so indexes are instantly available:

```
Parent Branch                Child Branch
┌─────────────┐             ┌─────────────┐
│ items       │             │ items       │ (copy-on-write)
│ ├─ data     │──shared────→│ ├─ data     │
│ └─ index    │──shared────→│ └─ index    │
└─────────────┘             └─────────────┘
                              Modify data
                            ┌─────────────┐
                            │ items       │
                            │ ├─ data     │ (diverged)
                            │ └─ index    │ (needs rebuild)
                            └─────────────┘
```
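
The sharing shown in the diagram can be illustrated with a toy copy-on-write model. This is an analogy only, not Neon's storage implementation: it copies a whole buffer on first write, whereas a real page store works at page granularity. The `Branch` type is hypothetical.

```rust
use std::sync::Arc;

/// A branch that shares its page buffer until it is written to.
pub struct Branch {
    pages: Arc<Vec<u8>>,
}

impl Branch {
    pub fn new(pages: Vec<u8>) -> Branch {
        Branch { pages: Arc::new(pages) }
    }

    /// Creating a child copies only the pointer, not the data.
    pub fn fork(&self) -> Branch {
        Branch { pages: Arc::clone(&self.pages) }
    }

    /// make_mut clones the buffer only if another branch still shares it.
    pub fn write(&mut self, offset: usize, value: u8) {
        Arc::make_mut(&mut self.pages)[offset] = value;
    }

    pub fn read(&self, offset: usize) -> u8 {
        self.pages[offset]
    }

    /// True while both branches point at the same underlying buffer.
    pub fn shares_storage_with(&self, other: &Branch) -> bool {
        Arc::ptr_eq(&self.pages, &other.pages)
    }
}
```

Forking is constant-time regardless of data size, and the parent's contents are untouched when the child diverges.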

### Branch Creation Workflow

```sql
-- In parent branch: Create index
CREATE INDEX items_embedding_idx ON items
USING ruhnsw (embedding ruvector_l2_ops);

-- Create child branch via Neon Console or API
-- Index is instantly available (no rebuild needed)

-- In child branch: Index is read-only until data changes
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
-- Uses parent's index ✓

-- After INSERT/UPDATE in child:
-- Index diverges and needs rebuild
INSERT INTO items VALUES (...);
REINDEX INDEX items_embedding_idx;  -- or CREATE INDEX CONCURRENTLY
```

### Branch-Specific Tuning

```sql
-- Development branch: Faster queries, lower recall
ALTER DATABASE dev_branch SET ruvector.ef_search = 20;

-- Staging branch: Balanced
ALTER DATABASE staging SET ruvector.ef_search = 100;

-- Production branch: High recall
ALTER DATABASE prod SET ruvector.ef_search = 200;
```

## Monitoring on Neon

### Extension Metrics

```sql
-- Index statistics
SELECT * FROM ruvector_index_stats();

┌────────────────────────────────────────────────────────────────┐
│                    Index Statistics                             │
├────────────────────────────────────────────────────────────────┤
│ index_name              │ items_embedding_idx                  │
│ index_size_mb           │ 512                                  │
│ vector_count            │ 1000000                              │
│ dimensions              │ 1536                                 │
│ build_time_seconds      │ 45.2                                 │
│ fragmentation_pct       │ 2.3                                  │
└────────────────────────────────────────────────────────────────┘
```

### Query Performance

```sql
-- Explain analyze for vector queries
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * FROM items
ORDER BY embedding <-> '[0.1, 0.2, ...]'::ruvector
LIMIT 10;

-- Output includes:
-- - Index Scan using items_embedding_idx
-- - Distance calculations: 15000
-- - Buffers: shared hit=250, read=10
-- - Execution time: 12.5ms
```

### Neon Metrics Integration

Use Neon's monitoring dashboard:

1. **Query Time**: Track vector query latencies
2. **Buffer Hit Ratio**: Monitor index cache efficiency
3. **Compute Usage**: Track CPU during index builds
4. **Memory Usage**: Monitor vector memory consumption

## Troubleshooting

### Cold Start Slow

**Symptom:** First query after suspend takes >500ms

**Diagnosis:**

```sql
-- Check extension load time
SELECT extname, extversion FROM pg_extension WHERE extname = 'ruvector';

-- Check SIMD detection
SELECT ruvector_simd_info();
```

**Solution:**

- Expected: 100-200ms for first query
- If >500ms: Contact Neon support (compute issue)
- Use keep-alive queries to prevent suspension
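
These thresholds can be applied automatically to first-query timings collected by the application. In this sketch the 200ms and 500ms cut-offs come from the list above, while the middle `Elevated` band is an assumption, since the guide does not say how to treat values between them. The names are hypothetical.

```rust
use std::time::Duration;

/// Outcome of checking a first-query latency after a suspend.
#[derive(Debug, PartialEq, Eq)]
pub enum ColdStart {
    Expected,       // up to 200ms
    Elevated,       // 200-500ms: assumed worth watching, not yet a ticket
    ContactSupport, // above 500ms
}

/// Classify a measured first-query latency against the thresholds above.
pub fn classify_cold_start(first_query: Duration) -> ColdStart {
    let ms = first_query.as_millis();
    if ms <= 200 {
        ColdStart::Expected
    } else if ms <= 500 {
        ColdStart::Elevated
    } else {
        ColdStart::ContactSupport
    }
}
```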

### Memory Pressure

**Symptom:** Index build fails with OOM

**Diagnosis:**

```sql
-- Check current memory usage
SELECT * FROM ruvector_memory_stats();

-- Check Neon compute size
SELECT current_setting('shared_buffers');
```

**Solution:**

```sql
-- Reduce index memory
SET ruvector.max_index_memory = '128MB';

-- Use aggressive quantization
CREATE INDEX ... WITH (quantization = 'pq16');

-- Upgrade Neon compute unit
-- Neon Console → Project Settings → Compute → Scale up
```

### Index Build Timeout

**Symptom:** `CREATE INDEX` times out on large dataset

**Solution:**

```sql
-- Always use CONCURRENTLY
CREATE INDEX CONCURRENTLY items_embedding_idx ON items
USING ruhnsw (embedding ruvector_l2_ops);

-- Split into batches
CREATE TABLE items_batch_1 AS SELECT * FROM items LIMIT 100000;
CREATE INDEX ... ON items_batch_1;
-- Repeat for batches, then UNION ALL
```
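
The batch boundaries can be computed up front. A minimal sketch that yields `(offset, limit)` windows, assuming each batch query adds a stable ordering such as `ORDER BY id` so that windows neither overlap nor skip rows; the function name is hypothetical.

```rust
/// Split `total_rows` into consecutive (offset, limit) windows.
/// The last window is shorter when the total is not a multiple of the size.
pub fn batch_windows(total_rows: u64, batch_size: u64) -> Vec<(u64, u64)> {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut windows = Vec::new();
    let mut offset = 0;
    while offset < total_rows {
        let limit = batch_size.min(total_rows - offset);
        windows.push((offset, limit));
        offset += limit;
    }
    windows
}
```

For 250,000 rows and a batch size of 100,000 this gives three windows, the last covering 50,000 rows.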

### Connection Pool Compatibility

**Symptom:** Settings not persisting across queries

**Cause:** PgBouncer transaction mode resets session state

**Solution:**

```sql
-- Use SET LOCAL (transaction-scoped)
BEGIN;
SET LOCAL ruvector.ef_search = 100;
SELECT ... ORDER BY embedding <-> query;
COMMIT;

-- Or set a per-database default (applies to new connections)
ALTER DATABASE mydb SET ruvector.ef_search = 100;
```

## Support Resources

- **Neon Documentation**: https://neon.tech/docs
- **RuVector GitHub**: https://github.com/ruvnet/ruvector
- **RuVector Issues**: https://github.com/ruvnet/ruvector/issues
- **Neon Discord**: https://discord.gg/92vNTzKDGp
- **Neon Support**: console.neon.tech → Support (Scale plan+)