Expand description
Embedding backend abstraction layer.
Defines the EmbedBackend trait that all embedding backends (CPU, CUDA,
MLX) implement, plus the Encoding input type and BackendKind
discriminant. Use load_backend to construct a backend by kind.
Modules§
- arch
- Model architecture trait and variant enum.
- blas_
info - Runtime BLAS detection and optimization recommendations.
- driver
- Hardware-agnostic compute driver trait.
- generic
- Generic backend that pairs a
Driverwith aModelArch.
Structs§
- Encoding
- Pre-tokenized encoding ready for inference.
- Inference
Opts - Inference optimization parameters passed through to model loading.
Enums§
- Backend
Kind - Available embedding backend implementations.
- Device
Hint - Device hint passed to
load_backend.
Traits§
- Embed
Backend - Trait for embedding backends.
Functions§
- detect_
backends - Detect all available backends and load them.
- load_
backend - Construct an embedding backend of the given kind.