quiverdb-providers 0.29.0

Opt-in, provider-agnostic embedding & reranking adapters (ADR-0047/0058).

The Quiver engine is deliberately model-agnostic: it stores and searches float vectors and knows nothing about embedding models. This crate is the edge adapter that lets an operator accept text and turn it into a stored or searched vector without the client running an embedding model, which removes the single biggest source of friction in RAG setups. It lives in its own lean crate (no axum/tonic) so that both the network server (quiver-server) and the in-process MCP server (quiver-mcp) can share it without either pulling in the other's dependency tree (ADR-0058). It is never used by quiver-core or the quiver-embed engine crate, so library-mode users pay nothing.

Design (ADR-0047)

Provider-agnostic. [EmbeddingProvider] / [RerankProvider] are traits; OpenAI-compatible servers (OpenAI, Ollama's /v1 endpoint, vLLM, LM Studio, llama.cpp, …) share one HTTP adapter parameterized by base URL + auth, Cohere has its own shape, and a deterministic [FakeEmbedder]/[FakeReranker] backs tests and the acceptance script. No vendor is hard-coded; selection is config.
Opt-in, per collection, default off. Configured in the server config ([embedding.<collection>] / [rerank.<collection>]) rather than in the on-disk descriptor, so the engine and the crash gate are untouched.
No secrets on disk. Config stores the name of an environment variable ([EmbeddingConfig::api_key_env]); the value is resolved at registry-build time and never persisted.
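
The three points above can be illustrated with a minimal, std-only sketch. Every name in it (SketchEmbedder, SketchFakeEmbedder, SketchEmbeddingConfig, resolve_api_key, build_registry) is hypothetical and chosen for illustration; the crate's real [EmbeddingProvider] trait, its method signatures, and its registry type may look quite different. The sketch assumes a trait with a single batch embed method, a fake that derives each vector from a hash of the text, and a registry keyed by collection name in which a missing entry means embedding is off.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's embedding trait (assumed shape).
pub trait SketchEmbedder {
    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, String>;
}

/// Deterministic fake: the same text always maps to the same vector.
pub struct SketchFakeEmbedder {
    pub dim: usize,
}

impl SketchEmbedder for SketchFakeEmbedder {
    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, String> {
        Ok(texts.iter().map(|t| fake_vector(t, self.dim)).collect())
    }
}

/// Seeds a xorshift64 generator with an FNV-1a hash of the text and
/// emits `dim` components in [-1, 1).
fn fake_vector(text: &str, dim: usize) -> Vec<f32> {
    let mut state: u64 = 0xcbf2_9ce4_8422_2325;
    for b in text.bytes() {
        state ^= u64::from(b);
        state = state.wrapping_mul(0x0000_0100_0000_01b3);
    }
    if state == 0 {
        state = 1; // xorshift must not start from zero
    }
    (0..dim)
        .map(|_| {
            state ^= state << 13;
            state ^= state >> 7;
            state ^= state << 17;
            // Top 24 bits scaled to [0, 2), then shifted to [-1, 1).
            (state >> 40) as f32 / (1u64 << 23) as f32 - 1.0
        })
        .collect()
}

/// Hypothetical per-collection config: it holds the *name* of an
/// environment variable, never the secret itself.
pub struct SketchEmbeddingConfig {
    pub api_key_env: Option<String>,
    pub dim: usize,
}

/// Resolves the key through `lookup` (in production this would wrap
/// `std::env::var`); the resolved value is returned, not stored in config.
pub fn resolve_api_key(
    cfg: &SketchEmbeddingConfig,
    lookup: impl Fn(&str) -> Option<String>,
) -> Result<Option<String>, String> {
    match &cfg.api_key_env {
        None => Ok(None),
        Some(name) => lookup(name.as_str())
            .map(Some)
            .ok_or_else(|| format!("environment variable {} is not set", name)),
    }
}

/// Builds a per-collection registry. A collection with no entry has
/// embedding disabled, which is the default. This sketch only ever
/// constructs the fake; a real registry would pick an adapter from config.
pub fn build_registry(
    configs: &HashMap<String, SketchEmbeddingConfig>,
    lookup: impl Fn(&str) -> Option<String>,
) -> Result<HashMap<String, Box<dyn SketchEmbedder>>, String> {
    let mut registry: HashMap<String, Box<dyn SketchEmbedder>> = HashMap::new();
    for (collection, cfg) in configs {
        // Resolved at build time; a real adapter would receive this key.
        let _key = resolve_api_key(cfg, &lookup)?;
        registry.insert(
            collection.clone(),
            Box::new(SketchFakeEmbedder { dim: cfg.dim }),
        );
    }
    Ok(registry)
}
```

In a server, the lookup argument would plausibly be `|name| std::env::var(name).ok()`; passing it as a closure keeps the resolution step testable without touching the process environment.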

Testing honesty

The pure request-build and response-parse functions are unit-tested, and the fake provider exercises the full text-in/text-out path. The methods that make a live HTTP call ([OpenAiCompatEmbedder::embed], [CohereEmbedder::embed], [CohereReranker::rerank]) are thin shells around those tested helpers and a ureq call. Live network calls are not exercised in CI; that gap is stated here, not faked.
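
The split between tested pure helpers and an untested network shell might look roughly like the following std-only sketch. All names here (SketchRequest, build_embed_request, check_embeddings, embed_via) are hypothetical and are not the crate's actual helpers. It assumes the OpenAI-style embeddings request shape (POST to `<base URL>/embeddings` with a JSON body carrying `model` and `input`, plus a bearer token), and it replaces the ureq call and JSON decoding with an injected closure so the example needs no external crates.

```rust
/// Everything needed to issue the HTTP call, as plain data (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct SketchRequest {
    pub url: String,
    pub headers: Vec<(String, String)>,
    pub body: String,
}

/// Minimal JSON string escaping, sufficient for this sketch.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Pure request builder: no I/O, so it can be unit-tested directly.
pub fn build_embed_request(
    base_url: &str,
    api_key: Option<&str>,
    model: &str,
    inputs: &[&str],
) -> SketchRequest {
    let mut headers = vec![("Content-Type".to_string(), "application/json".to_string())];
    if let Some(key) = api_key {
        headers.push(("Authorization".to_string(), format!("Bearer {}", key)));
    }
    let items: Vec<String> = inputs.iter().map(|t| json_string(t)).collect();
    SketchRequest {
        url: format!("{}/embeddings", base_url.trim_end_matches('/')),
        headers,
        body: format!(
            "{{\"model\":{},\"input\":[{}]}}",
            json_string(model),
            items.join(",")
        ),
    }
}

/// Pure response check on already-decoded vectors: one vector per input,
/// each with the expected dimension.
pub fn check_embeddings(
    vectors: Vec<Vec<f32>>,
    expected_count: usize,
    expected_dim: usize,
) -> Result<Vec<Vec<f32>>, String> {
    if vectors.len() != expected_count {
        return Err(format!("expected {} vectors, got {}", expected_count, vectors.len()));
    }
    if let Some(bad) = vectors.iter().find(|v| v.len() != expected_dim) {
        return Err(format!("expected dimension {}, got {}", expected_dim, bad.len()));
    }
    Ok(vectors)
}

/// Thin shell: build, send, check. `send` stands in for the HTTP call and
/// JSON decode, so tests can inject a canned response.
pub fn embed_via(
    base_url: &str,
    api_key: Option<&str>,
    model: &str,
    inputs: &[&str],
    dim: usize,
    send: impl FnOnce(&SketchRequest) -> Result<Vec<Vec<f32>>, String>,
) -> Result<Vec<Vec<f32>>, String> {
    let request = build_embed_request(base_url, api_key, model, inputs);
    let decoded = send(&request)?;
    check_embeddings(decoded, inputs.len(), dim)
}
```

Because the only untested piece in this arrangement is the closure passed as `send`, the shell stays small enough to review by eye, which is the property the paragraph above relies on.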