GPU acceleration for vector operations using CUDA
This module provides GPU acceleration for:
- Distance calculations (cosine, euclidean, etc.)
- Batch vector operations
- Parallel search algorithms
- Matrix operations for embeddings
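For reference, the first two operations listed above can be sketched in pure Rust as they would behave on the CPU fallback path. This is an illustrative sketch, not the crate's actual implementation:

```rust
/// Cosine distance: 1 - (a . b) / (|a| * |b|).
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (norm_a * norm_b)
}

/// Euclidean (L2) distance.
fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

fn main() {
    let a = [1.0f32, 0.0, 0.0];
    let b = [0.0f32, 1.0, 0.0];
    // Orthogonal unit vectors: cosine distance 1.0, Euclidean distance sqrt(2).
    println!("cosine = {}", cosine_distance(&a, &b));
    println!("euclidean = {}", euclidean_distance(&a, &b));
}
```

The GPU versions compute the same quantities, batched across many vector pairs in parallel.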
CUDA Feature Gating (Pure Rust Policy)
GPU acceleration is optional and properly feature-gated:
- Default build: 100% pure Rust, no CUDA required, CPU implementations only
- With the `cuda` feature: GPU acceleration when the CUDA toolkit is installed
- With the `cuda` feature but no toolkit: graceful fallback to CPU implementations
All CUDA-dependent code is gated with `#[cfg(all(feature = "cuda", cuda_runtime_available))]`
to ensure the crate builds successfully regardless of CUDA availability.
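The gating pattern described above can be sketched as follows. The function names (`dot`, `dot_cpu`, `dot_gpu`) are hypothetical; this shows the shape of the pattern, not the crate's actual code:

```rust
/// CPU path: pure Rust, always compiled.
fn dot_cpu(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// GPU path: only compiled when the `cuda` feature is enabled AND the
/// build detected a CUDA runtime (the `cuda_runtime_available` cfg).
#[cfg(all(feature = "cuda", cuda_runtime_available))]
fn dot_gpu(a: &[f32], b: &[f32]) -> f32 {
    // ... would launch a CUDA kernel here ...
    unimplemented!()
}

/// Public entry point: uses the GPU path when it was compiled in,
/// otherwise falls back to CPU, so the crate builds either way.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    #[cfg(all(feature = "cuda", cuda_runtime_available))]
    {
        return dot_gpu(a, b);
    }
    dot_cpu(a, b)
}

fn main() {
    println!("{}", dot(&[1.0, 2.0, 3.0], &[4.0, 5.0, 6.0])); // 32
}
```

Because the GPU branch is removed entirely at compile time when either condition is unmet, a default build carries no CUDA dependency at all.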
Re-exports
pub use accelerator::create_default_accelerator;
pub use accelerator::create_memory_optimized_accelerator;
pub use accelerator::create_performance_accelerator;
pub use accelerator::is_gpu_available;
pub use accelerator::GpuAccelerator;
pub use buffer::GpuBuffer;
pub use config::GpuConfig;
pub use config::OptimizationLevel;
pub use config::PrecisionMode;
pub use device::GpuDevice;
pub use index::AdvancedGpuVectorIndex;
pub use index::BatchVectorProcessor;
pub use index::GpuVectorIndex;
pub use index_builder::BatchSizeCalculator;
pub use index_builder::ComputedBatch;
pub use index_builder::GpuBatchDistanceComputer;
pub use index_builder::GpuDistanceMetric;
pub use index_builder::GpuHnswIndexBuilder;
pub use index_builder::GpuIndexBuildStats;
pub use index_builder::GpuIndexBuilderConfig;
pub use index_builder::GpuIndexOptimizer;
pub use index_builder::GpuMemoryBudget;
pub use index_builder::HnswGraph;
pub use index_builder::HnswNode;
pub use index_builder::IncrementalGpuIndexBuilder;
pub use index_builder::IndexedBatch;
pub use index_builder::PipelinedIndexBuilder;
pub use index_builder::PreparedBatch;
pub use load_balancer::GpuLoadBalancer;
pub use load_balancer::SimpleGpuDevice;
pub use load_balancer::WorkloadChunk;
pub use load_balancer::WorkloadDistributor;
pub use memory_pool::GpuMemoryPool;
pub use multi_gpu::GpuDeviceMetrics;
pub use multi_gpu::GpuTaskOutput;
pub use multi_gpu::GpuTaskResult;
pub use multi_gpu::LoadBalancingStrategy;
pub use multi_gpu::MultiGpuConfig;
pub use multi_gpu::MultiGpuConfigFactory;
pub use multi_gpu::MultiGpuManager;
pub use multi_gpu::MultiGpuStats;
pub use multi_gpu::MultiGpuTask;
pub use multi_gpu::TaskPriority;
pub use performance::GpuPerformanceStats;
pub use types::GpuExecutionConfig;
pub use kernels::*;
Modules
- accelerator: Main GPU accelerator implementation
- buffer: GPU memory buffer management
- config: GPU configuration structures and enums
- device: GPU device information and management
- index: GPU-accelerated vector index implementations
- index_builder: GPU-accelerated HNSW index construction
- kernels: CUDA kernel implementations for various vector operations
- load_balancer: GPU load balancing for distributing index-building work across multiple devices
- memory_pool: GPU memory pool management for efficient allocation and reuse
- multi_gpu: Multi-GPU load balancing for distributed vector index operations
- performance: GPU performance monitoring and statistics
- runtime: CUDA runtime types and utilities
- types: Common GPU types and structures