hive-gpu 0.2.0

High-performance GPU acceleration for vector operations with Device Info API (Metal, CUDA, ROCm)
# hive-gpu - API Reference


## Overview


This document provides a comprehensive reference for the hive-gpu API. For tutorials and examples, see the [README](../README.md) and [examples](../examples/).

---

## Core Types


### `GpuVector`


Represents a vector with its associated data and metadata.

```rust
pub struct GpuVector {
    pub id: String,
    pub data: Vec<f32>,
    pub metadata: HashMap<String, String>,
}
```

#### Methods


##### `new`

```rust
pub fn new(id: String, data: Vec<f32>) -> Self
```

Creates a new GPU vector with the given ID and data.

**Parameters:**
- `id`: Unique identifier for the vector
- `data`: Vector data (f32 values)

**Example:**
```rust
let vector = GpuVector::new("vec_1".to_string(), vec![1.0, 2.0, 3.0]);
```

##### `with_metadata`

```rust
pub fn with_metadata(
    id: String,
    data: Vec<f32>,
    metadata: HashMap<String, String>
) -> Self
```

Creates a new GPU vector with metadata.

**Example:**
```rust
let mut metadata = HashMap::new();
metadata.insert("source".to_string(), "document_1".to_string());
let vector = GpuVector::with_metadata("vec_1".into(), vec![1.0, 2.0], metadata);
```

##### `dimension`

```rust
pub fn dimension(&self) -> usize
```

Returns the dimension (length) of the vector.

##### `memory_size`

```rust
pub fn memory_size(&self) -> usize
```

Returns the approximate memory usage in bytes.
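A plain-Rust sketch of these two accessors, mirroring the struct definition above (the exact byte accounting of `memory_size` is internal to the crate; the version below is an assumed approximation):

```rust
use std::collections::HashMap;

// Stand-in mirroring the GpuVector definition above.
struct GpuVector {
    id: String,
    data: Vec<f32>,
    metadata: HashMap<String, String>,
}

impl GpuVector {
    fn new(id: String, data: Vec<f32>) -> Self {
        Self { id, data, metadata: HashMap::new() }
    }

    // Dimension is simply the number of f32 components.
    fn dimension(&self) -> usize {
        self.data.len()
    }

    // Approximate memory: 4 bytes per f32 component, plus the id and
    // metadata strings. (The real method may account differently.)
    fn memory_size(&self) -> usize {
        self.data.len() * 4
            + self.id.len()
            + self.metadata.iter().map(|(k, v)| k.len() + v.len()).sum::<usize>()
    }
}

fn main() {
    let v = GpuVector::new("vec_1".to_string(), vec![1.0, 2.0, 3.0]);
    println!("dimension = {}, approx bytes = {}", v.dimension(), v.memory_size());
}
```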

---

### `GpuDistanceMetric`


Distance metric for vector similarity computation.

```rust
pub enum GpuDistanceMetric {
    Cosine,
    Euclidean,
    DotProduct,
}
```

#### Variants


- **`Cosine`**: Cosine similarity (1 - cosine distance)
  - Range: [0, 2] (higher = more similar)
  - Best for: semantic similarity, normalized vectors

- **`Euclidean`**: Euclidean (L2) distance
  - Range: [0, ∞) (lower = more similar)
  - Best for: spatial data, absolute distances

- **`DotProduct`**: Dot product similarity
  - Range: (-∞, ∞) (higher = more similar)
  - Best for: magnitude-aware similarity
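For intuition, the three metrics can be computed on the CPU as follows (a plain-Rust sketch; the actual GPU kernels are internal to the crate):

```rust
// Dot product of two equal-length slices.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

// Euclidean (L2) distance: lower means more similar.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

// Cosine similarity: 1.0 for parallel vectors, 0.0 for orthogonal.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    dot(a, b) / (dot(a, a).sqrt() * dot(b, b).sqrt())
}

fn main() {
    let a = [1.0, 0.0];
    let b = [0.0, 1.0];
    println!(
        "dot = {}, euclidean = {}, cosine = {}",
        dot(&a, &b),
        euclidean(&a, &b),
        cosine_similarity(&a, &b)
    );
}
```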

---

### `GpuSearchResult`


Result of a vector similarity search.

```rust
pub struct GpuSearchResult {
    pub id: String,
    pub score: f32,
    pub index: usize,
}
```

#### Fields


- **`id`**: Vector identifier
- **`score`**: Similarity score (interpretation depends on metric)
- **`index`**: Internal storage index

---

### `HnswConfig`


Configuration for HNSW (Hierarchical Navigable Small World) graph.

```rust
pub struct HnswConfig {
    pub max_connections: usize,      // Default: 16
    pub ef_construction: usize,      // Default: 100
    pub ef_search: usize,            // Default: 50
    pub max_level: usize,            // Default: 8
    pub level_multiplier: f32,       // Default: 0.5
    pub seed: Option<u64>,           // Default: None
}
```

#### Fields


- **`max_connections`**: Maximum bidirectional links per node (M parameter)
  - Higher = better recall, more memory
  - Typical range: 8-64
  
- **`ef_construction`**: Size of dynamic candidate list during construction
  - Higher = better quality, slower construction
  - Typical range: 100-500
  
- **`ef_search`**: Size of dynamic candidate list during search
  - Higher = better recall, slower search
  - Typical range: 50-500
  
- **`max_level`**: Maximum number of layers in the hierarchy
  - Typical range: 6-10
  
- **`level_multiplier`**: Level assignment probability
  - Default: 0.5
  
- **`seed`**: Random seed for reproducible level assignment
  - None = random seed

**Example:**
```rust
let config = HnswConfig {
    max_connections: 32,
    ef_construction: 200,
    ef_search: 100,
    max_level: 8,
    level_multiplier: 0.5,
    seed: Some(42),
};
```

---

### `GpuDeviceInfo`


Comprehensive GPU device information including memory, capabilities, and backend-specific details.

```rust
pub struct GpuDeviceInfo {
    pub name: String,                          // Device name (e.g., "Apple M2 Pro")
    pub backend: String,                       // Backend type (e.g., "Metal", "CUDA")
    pub total_vram_bytes: u64,                 // Total VRAM in bytes
    pub available_vram_bytes: u64,             // Currently available VRAM
    pub used_vram_bytes: u64,                  // Currently used VRAM
    pub driver_version: String,                // Driver version string
    pub compute_capability: Option<String>,    // Compute capability (CUDA) or architecture
    pub max_threads_per_block: u32,            // Maximum threads per block/workgroup
    pub max_shared_memory_per_block: u64,      // Maximum shared memory per block (bytes)
    pub device_id: u32,                        // Device ID
    pub pci_bus_id: Option<String>,            // PCI bus ID (if available)
}
```

#### Methods


##### `vram_usage_percent`

```rust
pub fn vram_usage_percent(&self) -> f64
```

Calculate VRAM usage percentage (0.0 to 100.0).

**Returns:** VRAM usage percentage

**Example:**
```rust
let context = MetalNativeContext::new()?;
let info = context.device_info()?;
println!("VRAM usage: {:.1}%", info.vram_usage_percent());
```
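The percentage is presumably derived from the struct's VRAM fields as `used / total * 100`; a self-contained sketch of that arithmetic (with a guard for a zero total, which is an assumption about edge-case behavior):

```rust
// Hypothetical stand-alone version of the calculation.
fn vram_usage_percent(used_vram_bytes: u64, total_vram_bytes: u64) -> f64 {
    if total_vram_bytes == 0 {
        return 0.0;
    }
    used_vram_bytes as f64 / total_vram_bytes as f64 * 100.0
}

fn main() {
    // 6 GB used of 16 GB total -> 37.5%
    let gb = 1024_u64.pow(3);
    println!("{:.1}%", vram_usage_percent(6 * gb, 16 * gb));
}
```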

##### `has_available_vram`

```rust
pub fn has_available_vram(&self, required_bytes: u64) -> bool
```

Check if the specified amount of VRAM is available.

**Parameters:**
- `required_bytes`: Amount of VRAM required in bytes

**Returns:** `true` if sufficient VRAM is available, `false` otherwise

**Example:**
```rust
let info = context.device_info()?;
if info.has_available_vram(1024 * 1024 * 1024) { // 1 GB
    println!("Sufficient VRAM available");
} else {
    println!("Insufficient VRAM");
}
```

##### `available_vram_mb`

```rust
pub fn available_vram_mb(&self) -> u64
```

Get available VRAM in megabytes (convenience method).

**Returns:** Available VRAM in MB

**Example:**
```rust
let info = context.device_info()?;
println!("Available: {} MB", info.available_vram_mb());
```

##### `total_vram_mb`

```rust
pub fn total_vram_mb(&self) -> u64
```

Get total VRAM in megabytes (convenience method).

**Returns:** Total VRAM in MB

**Example:**
```rust
let info = context.device_info()?;
println!("Total: {} MB", info.total_vram_mb());
```

#### Usage Example


```rust
use hive_gpu::metal::MetalNativeContext;
use hive_gpu::traits::GpuContext;

// Create context
let context = MetalNativeContext::new()?;

// Get device information
let info = context.device_info()?;

// Inspect device properties
println!("Device: {}", info.name);
println!("Backend: {}", info.backend);
println!("Total VRAM: {} MB", info.total_vram_mb());
println!("Available VRAM: {} MB", info.available_vram_mb());
println!("Usage: {:.1}%", info.vram_usage_percent());
println!("Driver: {}", info.driver_version);
println!("Max threads/block: {}", info.max_threads_per_block);

// Check if sufficient VRAM for operation
let required_vram = 2 * 1024 * 1024 * 1024; // 2 GB
if info.has_available_vram(required_vram) {
    // Proceed with operation
} else {
    println!("Insufficient VRAM for operation");
}
```

---

### `GpuMemoryStats`


GPU memory usage statistics.

```rust
pub struct GpuMemoryStats {
    pub total_allocated: usize,
    pub available: usize,
    pub utilization: f32,
    pub buffer_count: usize,
}
```

---

## Core Traits


### `GpuContext`


Factory trait for creating GPU vector storage instances.

```rust
pub trait GpuContext {
    fn create_storage(
        &self,
        dimension: usize,
        metric: GpuDistanceMetric
    ) -> Result<Box<dyn GpuVectorStorage>>;

    fn create_storage_with_config(
        &self,
        dimension: usize,
        metric: GpuDistanceMetric,
        config: HnswConfig
    ) -> Result<Box<dyn GpuVectorStorage>>;

    fn memory_stats(&self) -> GpuMemoryStats;
    fn device_info(&self) -> Result<GpuDeviceInfo>;
}
```

#### Methods


##### `create_storage`


Creates a new vector storage with default configuration.

**Parameters:**
- `dimension`: Vector dimension (must be consistent for all vectors)
- `metric`: Distance metric to use

**Returns:** `Result<Box<dyn GpuVectorStorage>>`

**Example:**
```rust
let context = MetalNativeContext::new()?;
let storage = context.create_storage(128, GpuDistanceMetric::Cosine)?;
```

##### `create_storage_with_config`


Creates a new vector storage with HNSW configuration.

**Parameters:**
- `dimension`: Vector dimension
- `metric`: Distance metric
- `config`: HNSW configuration

**Example:**
```rust
let config = HnswConfig {
    max_connections: 32,
    ef_construction: 200,
    ..Default::default()
};
let storage = context.create_storage_with_config(128, GpuDistanceMetric::Cosine, config)?;
```

---

### `GpuVectorStorage`


Main interface for vector operations.

```rust
pub trait GpuVectorStorage {
    fn add_vectors(&mut self, vectors: &[GpuVector]) -> Result<Vec<usize>>;
    fn search(&self, query: &[f32], limit: usize) -> Result<Vec<GpuSearchResult>>;
    fn remove_vectors(&mut self, ids: &[String]) -> Result<()>;
    fn vector_count(&self) -> usize;
    fn dimension(&self) -> usize;
    fn get_vector(&self, id: &str) -> Result<Option<GpuVector>>;
    fn clear(&mut self) -> Result<()>;
}
```

#### Methods


##### `add_vectors`


Adds multiple vectors to storage.

**Parameters:**
- `vectors`: Slice of `GpuVector` to add

**Returns:** `Result<Vec<usize>>` - Internal indices of added vectors

**Errors:**
- `InvalidDimension`: Vector dimension doesn't match storage
- `OutOfMemory`: Insufficient GPU memory
- `GpuOperationFailed`: GPU computation failed

**Example:**
```rust
let vectors = vec![
    GpuVector::new("v1".into(), vec![1.0, 2.0, 3.0]),
    GpuVector::new("v2".into(), vec![4.0, 5.0, 6.0]),
];
let indices = storage.add_vectors(&vectors)?;
```

**Performance:**
- Time Complexity: O(n × d) where n = vector count, d = dimension
- With HNSW: O(n × log(n) × d)
- Batching is more efficient than individual additions

##### `search`


Searches for k nearest neighbors.

**Parameters:**
- `query`: Query vector (must match storage dimension)
- `limit`: Maximum number of results (k)

**Returns:** `Result<Vec<GpuSearchResult>>` - Sorted by similarity (descending)

**Errors:**
- `InvalidDimension`: Query dimension doesn't match storage
- `InvalidOperation`: Empty storage
- `GpuOperationFailed`: GPU computation failed

**Example:**
```rust
let query = vec![1.0, 2.0, 3.0];
let results = storage.search(&query, 10)?;
for result in results {
    println!("{}: {}", result.id, result.score);
}
```

**Performance:**
- Time Complexity: O(n × d) brute-force
- With HNSW: O(log(n) × d)
- GPU-accelerated: Up to 100x faster than CPU
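Conceptually, the brute-force path with the Cosine metric scores every stored vector against the query, sorts descending, and keeps the top `limit` (a plain-Rust sketch for intuition, not the crate's GPU implementation):

```rust
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

// Returns (id, score) pairs sorted by similarity, descending.
fn brute_force_search(
    stored: &[(String, Vec<f32>)],
    query: &[f32],
    limit: usize,
) -> Vec<(String, f32)> {
    let mut scored: Vec<(String, f32)> = stored
        .iter()
        .map(|(id, data)| (id.clone(), cosine(data, query)))
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(limit);
    scored
}

fn main() {
    let stored = vec![
        ("v1".to_string(), vec![1.0, 0.0]),
        ("v2".to_string(), vec![0.0, 1.0]),
        ("v3".to_string(), vec![1.0, 1.0]),
    ];
    // A query close to v1 ranks v1 first, then v3.
    for (id, score) in brute_force_search(&stored, &[1.0, 0.1], 2) {
        println!("{}: {:.3}", id, score);
    }
}
```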

##### `remove_vectors`


Removes vectors by their IDs.

**Parameters:**
- `ids`: Slice of vector IDs to remove

**Returns:** `Result<()>`

**Example:**
```rust
storage.remove_vectors(&["v1".to_string(), "v2".to_string()])?;
```

##### `vector_count`


Returns the total number of vectors in storage.

```rust
let count = storage.vector_count();
```

##### `dimension`


Returns the vector dimension of the storage.

```rust
let dim = storage.dimension();
```

##### `get_vector`


Retrieves a vector by ID.

**Parameters:**
- `id`: Vector ID

**Returns:** `Result<Option<GpuVector>>` - `None` if not found

**Example:**
```rust
if let Some(vector) = storage.get_vector("v1")? {
    println!("Found vector with {} dimensions", vector.dimension());
}
```

##### `clear`


Removes all vectors from storage.

```rust
storage.clear()?;
assert_eq!(storage.vector_count(), 0);
```

---

### `GpuBackend`


Backend information and capabilities.

```rust
pub trait GpuBackend {
    fn device_info(&self) -> GpuDeviceInfo;
    fn supports_operations(&self) -> GpuCapabilities;
    fn memory_stats(&self) -> GpuMemoryStats;
}
```

---

## Backend Implementations


### Metal Native (macOS)


#### `MetalNativeContext`


Metal backend context for Apple Silicon.

```rust
impl MetalNativeContext {
    pub fn new() -> Result<Self>;
    pub fn device(&self) -> &MetalDevice;
    pub fn command_queue(&self) -> &CommandQueue;
    pub fn device_name(&self) -> String;
    pub fn supports_mps(&self) -> bool;
    pub fn max_threadgroup_size(&self) -> MTLSize;
    pub fn max_buffer_size(&self) -> u64;
}
```

**Example:**
```rust
use hive_gpu::metal::context::MetalNativeContext;
use hive_gpu::traits::{GpuContext, GpuVectorStorage};

let context = MetalNativeContext::new()?;
println!("Device: {}", context.device_name());

let mut storage = context.create_storage(128, GpuDistanceMetric::Cosine)?;
```

---

## Error Handling


### `HiveGpuError`


Main error type for the library.

```rust
pub enum HiveGpuError {
    NoDeviceAvailable,
    DeviceNotSupported(String),
    OutOfMemory(String),
    AllocationFailed(String),
    InvalidDimension { expected: usize, got: usize },
    InvalidOperation(String),
    InvalidConfiguration(String),
    ShaderCompilationFailed(String),
    GpuOperationFailed(String),
    BackendError(String),
    Io(std::io::Error),
    Other(String),
}
```

#### Variants


- **`NoDeviceAvailable`**: No GPU device found
- **`DeviceNotSupported`**: Device doesn't support required features
- **`OutOfMemory`**: Insufficient GPU memory
- **`AllocationFailed`**: Buffer allocation failed
- **`InvalidDimension`**: Vector dimension mismatch
- **`InvalidOperation`**: Operation not allowed in current state
- **`InvalidConfiguration`**: Invalid configuration parameters
- **`ShaderCompilationFailed`**: GPU shader compilation error
- **`GpuOperationFailed`**: GPU computation error
- **`BackendError`**: Backend-specific error
- **`Io`**: I/O error
- **`Other`**: Other errors

---

## Backend Detection


### Functions


#### `detect_available_backends`


```rust
pub fn detect_available_backends() -> Vec<GpuBackendType>
```

Returns a list of all available GPU backends on the system.

**Example:**
```rust
use hive_gpu::backends::detector::detect_available_backends;

let backends = detect_available_backends();
for backend in backends {
    println!("Available: {}", backend);
}
```

#### `select_best_backend`


```rust
pub fn select_best_backend() -> Result<GpuBackendType>
```

Selects the best available backend based on performance priority.

**Priority:** Metal > CUDA > CPU

**Example:**
```rust
use hive_gpu::backends::detector::select_best_backend;

let best = select_best_backend()?;
println!("Using backend: {}", best);
```

---

## Feature Flags


Enable specific backends via Cargo features:

```toml
[dependencies]
# Pick one of the following lines (they are alternatives, not simultaneous entries):
hive-gpu = { version = "0.1", features = ["metal-native"] }  # macOS
hive-gpu = { version = "0.1", features = ["cuda"] }          # NVIDIA
hive-gpu = { version = "0.1", features = ["rocm"] }          # AMD GPU
hive-gpu = { version = "0.1", features = ["metal-native", "cuda", "rocm"] }  # All native backends
```

### Available Features


- **`metal-native`** (default): Pure Metal backend for Apple Silicon
- **`cuda`**: CUDA backend for NVIDIA GPUs (planned)
- **`rocm`**: ROCm backend for AMD GPUs (planned)

---

## Performance Characteristics


### Time Complexity


| Operation | Brute Force | With HNSW |
|-----------|-------------|-----------|
| Add Vector | O(d) | O(log n × d) |
| Search | O(n × d) | O(log n × d) |
| Remove Vector | O(n) | O(log n) |

Where:
- `n` = number of vectors
- `d` = vector dimension

### Memory Usage


| Component | Memory | Notes |
|-----------|--------|-------|
| Vector Data | n × d × 4 bytes | f32 values |
| HNSW Graph | n × M × 8 bytes | M = max_connections |
| Metadata | n × ~64 bytes | ID + metadata |
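The table translates directly into a rough estimator (a sketch; the ~64-byte metadata figure is the approximation from the table, not an exact measurement):

```rust
// Rough host-side memory estimate in bytes, following the table above.
fn estimated_memory_bytes(n: usize, d: usize, max_connections: usize) -> usize {
    let vector_data = n * d * 4;              // f32 components
    let hnsw_graph = n * max_connections * 8; // graph links (M = max_connections)
    let metadata = n * 64;                    // id + metadata (approximate)
    vector_data + hnsw_graph + metadata
}

fn main() {
    // 704_000_000 bytes for 1M vectors of 128 dims with M = 16.
    let bytes = estimated_memory_bytes(1_000_000, 128, 16);
    println!("{} bytes (~{} MiB)", bytes, bytes / (1024 * 1024));
}
```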

### GPU Memory (VRAM)


All vector data and HNSW graphs are stored entirely in GPU memory (VRAM) for maximum performance. CPU memory is only used for:
- Initial vector upload
- Final search results download
- Metadata storage

---

## Best Practices


### 1. Batch Operations


```rust
// ✅ GOOD: Batch addition
// ✅ GOOD: Batch addition
let vectors = (0..1000)
    .map(|i| GpuVector::new(format!("v{}", i), vec![0.0; 128]))
    .collect::<Vec<_>>();
storage.add_vectors(&vectors)?;

// ❌ BAD: Individual additions
for i in 0..1000 {
    storage.add_vectors(&[GpuVector::new(format!("v{}", i), vec![0.0; 128])])?;  // Slow!
}
```

### 2. Dimension Consistency


```rust
// ✅ GOOD: All vectors have same dimension
let storage = context.create_storage(128, GpuDistanceMetric::Cosine)?;
let v1 = GpuVector::new("v1".into(), vec![0.0; 128]);
let v2 = GpuVector::new("v2".into(), vec![0.0; 128]);

// ❌ BAD: Dimension mismatch
let v3 = GpuVector::new("v3".into(), vec![0.0; 64]);  // Error!
```

### 3. HNSW Configuration


```rust
// For high recall (accuracy):
let config = HnswConfig {
    max_connections: 32,
    ef_construction: 200,
    ef_search: 100,
    ..Default::default()
};

// For high speed:
let config = HnswConfig {
    max_connections: 16,
    ef_construction: 100,
    ef_search: 50,
    ..Default::default()
};
```

### 4. Error Handling


```rust
// ✅ GOOD: Handle errors properly
match storage.search(&query, 10) {
    Ok(results) => process_results(results),
    Err(HiveGpuError::InvalidDimension { expected, got }) => {
        eprintln!("Dimension mismatch: expected {}, got {}", expected, got);
    }
    Err(e) => eprintln!("Search failed: {}", e),
}

// ❌ BAD: Unwrap
let results = storage.search(&query, 10).unwrap();  // Panic on error!
```

---

## Version Compatibility


| hive-gpu | Rust | Metal | CUDA | ROCm |
|----------|------|-------|------|------|
| 0.1.x | 1.85+ | 0.27+ | N/A | N/A |
| 0.2.x (planned) | 1.85+ | 0.27+ | TBD | TBD |

---

*Last Updated: 2025-01-03*
*API Version: 0.1.6*