Module embedding

Standalone embedding service for generating dataset embeddings.

This service is decoupled from harvesting: it only processes datasets that are already stored in the database and still have no embedding (embedding IS NULL). This decoupling enables:

  • Harvesting metadata without an embedding API key
  • Switching embedding providers without re-harvesting
  • Backfilling embeddings after outages
  • Independent scaling of harvest and embedding workloads
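
The backfill pattern described above can be illustrated with a small in-memory sketch. Everything in it is a hypothetical stand-in rather than the actual ceres_core API: the `Dataset` struct, the `EmbeddingProvider` trait, the toy `ByteStatsProvider`, the `failed` counter, and the free function `embed_pending` are assumptions made for illustration, and a `Vec` plays the role of the database. An `Option` set to `None` stands in for `embedding IS NULL`.

```rust
// Hypothetical, simplified stand-ins; NOT the real ceres_core types.
#[derive(Debug, Clone)]
struct Dataset {
    title: String,
    // `None` plays the role of `embedding IS NULL` in the database.
    embedding: Option<Vec<f32>>,
}

/// Assumed provider abstraction: anything that maps text to a vector.
trait EmbeddingProvider {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String>;
}

/// Toy provider for illustration only: returns [byte length, byte sum].
struct ByteStatsProvider;

impl EmbeddingProvider for ByteStatsProvider {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String> {
        if text.is_empty() {
            return Err("cannot embed empty text".to_string());
        }
        let sum: u32 = text.bytes().map(u32::from).sum();
        Ok(vec![text.len() as f32, sum as f32])
    }
}

/// Counters for one run; `failed` is an assumed field.
#[derive(Debug, Default, PartialEq)]
struct Stats {
    embedded: usize,
    failed: usize,
}

/// Embeds every dataset that has no embedding yet, attempting at most
/// `limit` datasets when a limit is given. Datasets that already have an
/// embedding are left untouched, so the run is safe to repeat.
fn embed_pending<P: EmbeddingProvider>(
    datasets: &mut [Dataset],
    provider: &P,
    limit: Option<usize>,
) -> Stats {
    let mut stats = Stats::default();
    let limit = limit.unwrap_or(usize::MAX);

    // Only rows without an embedding are considered pending.
    let pending = datasets
        .iter_mut()
        .filter(|d| d.embedding.is_none())
        .take(limit);

    for dataset in pending {
        match provider.embed(&dataset.title) {
            Ok(vector) => {
                dataset.embedding = Some(vector);
                stats.embedded += 1;
            }
            // A failed dataset stays pending and is retried on the next run.
            Err(_) => stats.failed += 1,
        }
    }
    stats
}
```

Because the function selects only datasets without an embedding, a run after an outage or a provider switch picks up exactly the remaining work. In this sketch, switching providers without re-harvesting would amount to clearing the stored embeddings and running the function again with a different `EmbeddingProvider` implementation.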

Example

use ceres_core::embedding::EmbeddingService;

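// `store`, `embedding_provider`, `reporter`, and `cancel_token` are assumed
// to have been set up by the caller; their construction is not shown here.
let service = EmbeddingService::new(store, embedding_provider);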

// Embed all pending datasets
let stats = service.embed_pending(None, &reporter, cancel_token).await?;
println!("Embedded {} datasets", stats.embedded);

Structs

EmbeddingService
Standalone service for generating embeddings for datasets already in the database.
EmbeddingStats
Statistics from an embedding run.