pub struct EmbeddingConfig {
    pub provider: EmbeddingProvider,
    pub model: String,
    pub dimensions: usize,
    pub endpoint: Option<String>,
    pub api_key: Option<String>,
    pub api_version: Option<String>,
    pub max_completion_tokens: usize,
    pub batch_size: usize,
    pub mock: bool,
    pub mock_mode: MockVectorMode,
    pub onnx: OnnxEmbeddingConfig,
    pub huggingface_tokenizer: Option<String>,
}
Unified embedding configuration.
Provider-agnostic; holds fields for all supported backends.
Load it from environment variables via EmbeddingConfig::from_env, or construct it
programmatically, then call EmbeddingConfig::create_engine on the result.
Environment variables (match Python SDK names):
EMBEDDING_PROVIDER: backend selection (default: openai; onnx on Android)
MOCK_EMBEDDING: set to true/1/yes to force mock mode
EMBEDDING_MODEL: model identifier
EMBEDDING_DIMENSIONS: vector size
EMBEDDING_ENDPOINT: API endpoint URL
EMBEDDING_API_KEY: API key (fallback: LLM_API_KEY)
EMBEDDING_API_VERSION: API version string
EMBEDDING_MAX_COMPLETION_TOKENS: maximum tokens (default: 8191)
EMBEDDING_BATCH_SIZE: texts per embedding request (default: 36)
EMBEDDING_ONNX_BATCH_SIZE: ONNX inference batch size (default: 32; onnx feature only)
HUGGINGFACE_TOKENIZER: HuggingFace tokenizer identifier
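The resolution rules in the list above (defaults, the LLM_API_KEY fallback, the truthy values for MOCK_EMBEDDING) can be sketched with the standard library alone. This is an illustration, not the crate's from_env: EnvSettings, settings_from_lookup and settings_from_env are hypothetical names, the case-insensitive matching and the fall-back-to-default behaviour for unparsable numbers are assumptions, and the Android-specific onnx default is left out.

```rust
/// Hypothetical subset of the settings described above (not the crate's type).
#[derive(Debug, PartialEq)]
struct EnvSettings {
    provider: String,
    mock: bool,
    api_key: Option<String>,
    max_completion_tokens: usize,
    batch_size: usize,
}

/// Resolve settings through a lookup function, so callers can supply a map
/// instead of mutating the real process environment.
fn settings_from_lookup<F>(lookup: F) -> EnvSettings
where
    F: Fn(&str) -> Option<String>,
{
    // Assumption: "true", "1" and "yes" are matched case-insensitively.
    let mock = lookup("MOCK_EMBEDDING")
        .map(|v| matches!(v.trim().to_ascii_lowercase().as_str(), "true" | "1" | "yes"))
        .unwrap_or(false);

    // Assumption: a missing or unparsable number falls back to the documented default.
    let parse_or = |name: &str, default: usize| {
        lookup(name)
            .and_then(|v| v.trim().parse::<usize>().ok())
            .unwrap_or(default)
    };

    EnvSettings {
        // Documented default outside Android.
        provider: lookup("EMBEDDING_PROVIDER").unwrap_or_else(|| "openai".to_string()),
        mock,
        // EMBEDDING_API_KEY wins; LLM_API_KEY is only consulted when it is unset.
        api_key: lookup("EMBEDDING_API_KEY").or_else(|| lookup("LLM_API_KEY")),
        max_completion_tokens: parse_or("EMBEDDING_MAX_COMPLETION_TOKENS", 8191),
        batch_size: parse_or("EMBEDDING_BATCH_SIZE", 36),
    }
}

/// Convenience wrapper that reads the real process environment.
fn settings_from_env() -> EnvSettings {
    settings_from_lookup(|name| std::env::var(name).ok())
}
```

Passing the lookup as a closure keeps the parsing logic testable without touching global process state.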
Fields
provider: EmbeddingProvider
Which backend to use for embedding generation.
model: String
Model identifier. For ONNX this is informational; for API providers this is sent in the request body. Default depends on provider (BGE-Small-v1.5 for ONNX, empty for others).
dimensions: usize
Embedding vector dimensionality. Must match the model output.
endpoint: Option<String>
API endpoint URL (used by OpenAI-compatible and Ollama providers).
api_key: Option<String>
API key. Reads EMBEDDING_API_KEY first, falls back to LLM_API_KEY.
api_version: Option<String>
API version string (e.g. “2023-05-15” for Azure OpenAI).
max_completion_tokens: usize
Maximum tokens for completion requests (default: 8191).
batch_size: usize
Number of texts to send in a single embedding request (default: 36).
Matches the Python SDK and stays within the small client-batch limits of
the self-hosted servers this adapter targets (e.g. TEI defaults to 32).
Raise it via EMBEDDING_BATCH_SIZE for providers that accept larger
batches. For the OpenAI-compatible engine, up to MAX_CONCURRENT_BATCHES
sub-batches are also dispatched concurrently.
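To make the effect of batch_size concrete, here is a minimal sketch of splitting an input list into request-sized batches. split_into_batches is a hypothetical helper, not part of the crate, and concurrent dispatch of the resulting batches is not shown.

```rust
/// Split `texts` into batches of at most `batch_size` items each.
/// Hypothetical helper for illustration only.
fn split_into_batches(texts: &[String], batch_size: usize) -> Vec<&[String]> {
    // `chunks` panics on a chunk size of zero, so clamp to at least one item.
    texts.chunks(batch_size.max(1)).collect()
}
```

With the default of 36, an input of 80 texts produces three requests carrying 36, 36 and 8 texts.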
mock: bool
If true, use mock embeddings regardless of provider.
Overrides provider to Mock. Set via MOCK_EMBEDDING=true.
mock_mode: MockVectorMode
How the mock engine generates vectors when provider is Mock.
Defaults to MockVectorMode::Zero. Set via MOCK_EMBEDDING=deterministic
to derive content-stable vectors from sha256(text).
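The difference between the two mock modes can be illustrated as follows. This sketch is not the crate's mock engine: to stay dependency-free it swaps in an FNV-1a hash where the real engine is described as using sha256(text), and the way hash bits are mapped to vector components is an assumption. zero_vector and deterministic_vector are hypothetical names.

```rust
/// FNV-1a 64-bit hash, used here only as a dependency-free stand-in for SHA-256.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Zero mode: every text maps to the same all-zero vector.
fn zero_vector(dimensions: usize) -> Vec<f32> {
    vec![0.0; dimensions]
}

/// Deterministic mode: the same text always yields the same vector.
fn deterministic_vector(text: &str, dimensions: usize) -> Vec<f32> {
    (0..dimensions)
        .map(|i| {
            // Mix the component index into the input so each dimension differs.
            let mut input = text.as_bytes().to_vec();
            input.extend_from_slice(&(i as u64).to_le_bytes());
            let h = fnv1a64(&input);
            // Map the top 24 bits of the hash to the range [-1.0, 1.0).
            ((h >> 40) as f32 / (1u64 << 23) as f32) - 1.0
        })
        .collect()
}
```

Content-stable vectors are useful in tests that check deduplication or nearest-neighbour ordering, where all-zero vectors would make every text indistinguishable.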
onnx: OnnxEmbeddingConfig
ONNX-specific configuration. Only consulted when provider is Onnx or Fastembed.
huggingface_tokenizer: Option<String>
HuggingFace tokenizer identifier for chunking token counting.
When set, used by HuggingFaceTokenCounter in the chunking crate.
Implementations
impl EmbeddingConfig
pub fn from_env() -> Self
Load configuration from environment variables.
Reads the same env var names as the Python SDK so that a shared .env file
works across both implementations without modification.
pub fn effective_provider(&self) -> EmbeddingProvider
Returns the effective provider, substituting Mock when self.mock is true.
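The substitution rule is small enough to sketch with stand-in types. Provider and Config below are hypothetical and only mirror the documented behaviour; the real EmbeddingProvider has more variants, and the OpenAi variant name is an assumption.

```rust
/// Stand-in for EmbeddingProvider, reduced to three variants for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Provider {
    OpenAi,
    Onnx,
    Mock,
}

/// Stand-in for the two EmbeddingConfig fields involved in the decision.
struct Config {
    provider: Provider,
    mock: bool,
}

impl Config {
    /// The mock flag takes precedence over whatever provider was configured.
    fn effective_provider(&self) -> Provider {
        if self.mock {
            Provider::Mock
        } else {
            self.provider
        }
    }
}
```

Keeping the flag separate from provider means the configured backend survives a temporary switch to mock mode.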
pub async fn create_engine(&self) -> EmbeddingResult<Arc<dyn EmbeddingEngine>>
Create an embedding engine based on this configuration.
Dispatches to the appropriate engine implementation based on
EmbeddingConfig::effective_provider. Providers not yet implemented
return EmbeddingError::NotImplemented.
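The dispatch pattern described here, one match arm per provider with an error for backends that are not available, might look roughly like the synchronous sketch below. Engine, ZeroEngine, EngineError, Backend and build_engine are all hypothetical stand-ins; the real method is async and returns EmbeddingResult<Arc<dyn EmbeddingEngine>>.

```rust
use std::sync::Arc;

/// Stand-in for the EmbeddingEngine trait.
trait Engine {
    fn embed(&self, text: &str) -> Vec<f32>;
}

/// Minimal mock engine that returns all-zero vectors.
struct ZeroEngine {
    dimensions: usize,
}

impl Engine for ZeroEngine {
    fn embed(&self, _text: &str) -> Vec<f32> {
        vec![0.0; self.dimensions]
    }
}

/// Stand-in for EmbeddingError, reduced to the one variant mentioned above.
#[derive(Debug, PartialEq)]
enum EngineError {
    NotImplemented(String),
}

/// Stand-in for the effective provider.
#[derive(Debug, Clone, Copy)]
enum Backend {
    Mock,
    Onnx,
}

/// Build an engine for `backend`, or report that the backend is unavailable.
fn build_engine(backend: Backend, dimensions: usize) -> Result<Arc<dyn Engine>, EngineError> {
    match backend {
        Backend::Mock => Ok(Arc::new(ZeroEngine { dimensions })),
        // In this sketch every other backend is treated as not yet implemented.
        other => Err(EngineError::NotImplemented(format!("{other:?}"))),
    }
}
```

Returning a trait object behind Arc lets callers share one engine across tasks without knowing which backend was selected.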
Trait Implementations
impl Clone for EmbeddingConfig
fn clone(&self) -> EmbeddingConfig
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for EmbeddingConfig
impl Default for EmbeddingConfig
impl<'de> Deserialize<'de> for EmbeddingConfig
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Auto Trait Implementations
impl Freeze for EmbeddingConfig
impl RefUnwindSafe for EmbeddingConfig
impl Send for EmbeddingConfig
impl Sync for EmbeddingConfig
impl Unpin for EmbeddingConfig
impl UnsafeUnpin for EmbeddingConfig
impl UnwindSafe for EmbeddingConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> DeserializeOwned for T
where
    T: for<'de> Deserialize<'de>,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.