//! lunaris-embed: real `Embedder` impls for the Phase 2 hot path (INGEST-02).
//!
//! - **Default backend** (`feature = "candle"`): `CandleEmbeddingGemma` loads the
//! EmbeddingGemma 300M tokenizer and token-embedding matrix from a local cache,
//! mean-pools the embedded token vectors per input, and L2-normalises the result
//! to a 768-d unit vector. Returns an actionable `LunarisError::Storage` error
//! when the weights cache is missing.
//! - **Alt backend** (`feature = "ollama"`): `OllamaEmbedder` — POSTs each batch
//! to `<endpoint>/api/embed` (Ollama's `embed` HTTP API), validates the response
//! shape against [`Embedder::dim`], and returns 768-d rows. Requests use a 10s
//! HTTP timeout (CLAUDE.md: "design for failure — timeouts").
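//!
//! As a rough illustration of the pooling step in the default backend, the
//! sketch below mean-pools per-token vectors and L2-normalises the result. The
//! function name `mean_pool_l2` and the `Vec<f32>` row layout are assumptions
//! made for this example, not the crate's actual internals, and the toy input
//! uses 2-d vectors instead of 768-d ones.
//!
//! ```rust
//! // Hypothetical helper: mean-pool token vectors, then scale to unit L2 norm.
//! // Returns `None` for empty input, ragged rows, or a zero-norm mean.
//! fn mean_pool_l2(tokens: &[Vec<f32>]) -> Option<Vec<f32>> {
//!     let dim = tokens.first()?.len();
//!     if dim == 0 || tokens.iter().any(|t| t.len() != dim) {
//!         return None;
//!     }
//!     // Sum the token vectors component-wise.
//!     let mut pooled = vec![0.0f32; dim];
//!     for t in tokens {
//!         for (acc, x) in pooled.iter_mut().zip(t) {
//!             *acc += *x;
//!         }
//!     }
//!     // Divide by the token count to get the mean vector.
//!     let n = tokens.len() as f32;
//!     for acc in pooled.iter_mut() {
//!         *acc /= n;
//!     }
//!     // Scale to unit length; a zero vector cannot be normalised.
//!     let norm = pooled.iter().map(|x| x * x).sum::<f32>().sqrt();
//!     if norm == 0.0 {
//!         return None;
//!     }
//!     for acc in pooled.iter_mut() {
//!         *acc /= norm;
//!     }
//!     Some(pooled)
//! }
//!
//! fn main() {
//!     // mean = [3, 4], norm = 5, so the unit vector is [0.6, 0.8].
//!     let tokens = vec![vec![3.0, 0.0], vec![3.0, 8.0]];
//!     let v = mean_pool_l2(&tokens).expect("non-empty, non-zero input");
//!     assert!((v[0] - 0.6).abs() < 1e-6);
//!     assert!((v[1] - 0.8).abs() < 1e-6);
//! }
//! ```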
//!
//! Phase 1's [`lunaris_core::StubEmbedder`] remains the deterministic test impl —
//! ingest tests inject it via the `Lunaris::with_embedder` escape hatch so they
//! don't pay model-load latency. Production callers get `CandleEmbeddingGemma`
//! by default through `Lunaris::open(url)` (Plan 02-01 Task 3).
//!
//! ## Latency-budget escape hatch: swapping backends
//!
//! Per `02-01-PLAN.md` critical constraints: if candle local inference busts
//! the per-batch budget on the dev box (8ms p50 / 20ms p99 per blueprint §4.1),
//! callers swap to `OllamaEmbedder` via `Lunaris::with_embedder(Arc::new(...))`.
//! The trait shape does NOT change either way — that's the whole point of the
//! Phase 1 [`Embedder`] interface lock.
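//!
//! The swap described above might look roughly like the sketch below. Every
//! name in it (`SketchEmbedder`, `Pipeline`, `LocalBackend`, `RemoteBackend`,
//! `WrongDim`) is a hypothetical stand-in: the real [`Embedder`] signature and
//! `Lunaris::with_embedder` are not reproduced here, and `String` stands in
//! for the crate's error type. It also shows one way a caller could check
//! returned rows against `dim()`.
//!
//! ```rust
//! use std::sync::Arc;
//!
//! // Hypothetical trait with the same role as `Embedder`; signature assumed.
//! trait SketchEmbedder {
//!     fn dim(&self) -> usize;
//!     fn embed(&self, batch: &[&str]) -> Result<Vec<Vec<f32>>, String>;
//! }
//!
//! struct LocalBackend; // stands in for an in-process backend
//! struct RemoteBackend; // stands in for an HTTP-backed backend
//! struct WrongDim; // misbehaving backend, used to exercise validation
//!
//! impl SketchEmbedder for LocalBackend {
//!     fn dim(&self) -> usize { 4 }
//!     fn embed(&self, batch: &[&str]) -> Result<Vec<Vec<f32>>, String> {
//!         Ok(batch.iter().map(|_| vec![0.5; 4]).collect())
//!     }
//! }
//! impl SketchEmbedder for RemoteBackend {
//!     fn dim(&self) -> usize { 4 }
//!     fn embed(&self, batch: &[&str]) -> Result<Vec<Vec<f32>>, String> {
//!         Ok(batch.iter().map(|_| vec![0.25; 4]).collect())
//!     }
//! }
//! impl SketchEmbedder for WrongDim {
//!     fn dim(&self) -> usize { 4 }
//!     fn embed(&self, batch: &[&str]) -> Result<Vec<Vec<f32>>, String> {
//!         Ok(batch.iter().map(|_| vec![0.0; 3]).collect())
//!     }
//! }
//!
//! // The consumer holds a trait object, so swapping backends needs no other change.
//! struct Pipeline { embedder: Arc<dyn SketchEmbedder> }
//!
//! impl Pipeline {
//!     fn with_embedder(embedder: Arc<dyn SketchEmbedder>) -> Self {
//!         Pipeline { embedder }
//!     }
//!     // Embeds a batch and rejects responses whose shape disagrees with `dim()`.
//!     fn ingest(&self, batch: &[&str]) -> Result<usize, String> {
//!         let rows = self.embedder.embed(batch)?;
//!         let dim = self.embedder.dim();
//!         if rows.len() != batch.len() || rows.iter().any(|r| r.len() != dim) {
//!             return Err(format!("expected {} rows of dim {}", batch.len(), dim));
//!         }
//!         Ok(rows.len())
//!     }
//! }
//!
//! fn main() {
//!     let local = Pipeline::with_embedder(Arc::new(LocalBackend));
//!     let remote = Pipeline::with_embedder(Arc::new(RemoteBackend));
//!     assert_eq!(local.ingest(&["a", "b"]), Ok(2));
//!     assert_eq!(remote.ingest(&["a", "b"]), Ok(2));
//!     assert!(Pipeline::with_embedder(Arc::new(WrongDim)).ingest(&["a"]).is_err());
//! }
//! ```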

// RFC 0007 §3 — FallbackEmbedder<P, F> static-dispatch combinator with
// per-instance CircuitBreaker. Always built; mirrors lunaris-extract::fallback.
pub use ;
pub use ;
pub use ;
pub use ;
// Re-export the trait from core so callers can `use lunaris_embed::Embedder` in
// one import alongside the concrete backends.
pub use Embedder;