// bext_plugin_api/search.rs
//! Search-client capability trait and types.
//!
//! A `SearchClientPlugin` is the runtime-side face of a search backend:
//! issue queries, push documents into an index, delete documents by id.
//! Backends fall into two families:
//!
//! 1. **Dedicated search engines** (`@bext/search-meili`,
//!    `@bext/search-typesense`, `@bext/search-elastic`) — an external
//!    service holds the inverted index and handles ranking.
//! 2. **SQL full-text** (`@bext/search-pg`) — the existing Postgres
//!    instance runs `to_tsvector` / `plainto_tsquery` on a regular table.
//!    No new infrastructure, good enough for "I want search and I have
//!    one database" sites.
//!
//! The trait stays sync to match the rest of `bext-plugin-api`. Backends
//! that speak native async (the Meilisearch SDK, the Elasticsearch Rust
//! client) either use their blocking sibling or own a small tokio runtime
//! and call `block_on` — the same pattern `@bext/auth-jwt`'s JWKS fetcher
//! and `@bext/flags-openfeature` use. Plugins cannot expose async across
//! the sandbox boundary, so the host-facing shape is sync.
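//!
//! A minimal sketch of the "own a small tokio runtime and call
//! `block_on`" option described above. `BlockingBridge` and
//! `fetch_hit_count` are hypothetical names used only for illustration,
//! not part of this crate, and the block assumes `tokio` is available as
//! a dependency, so it is marked `ignore`:
//!
//! ```rust,ignore
//! use tokio::runtime::{Builder, Runtime};
//!
//! /// Hypothetical wrapper: owns a runtime so its callers stay sync.
//! struct BlockingBridge {
//!     rt: Runtime,
//! }
//!
//! impl BlockingBridge {
//!     fn new() -> std::io::Result<Self> {
//!         // A current-thread runtime avoids spawning a worker pool.
//!         let rt = Builder::new_current_thread().enable_all().build()?;
//!         Ok(Self { rt })
//!     }
//!
//!     /// Sync facade over an async call. Must not be called from inside
//!     /// another tokio runtime, where `block_on` panics.
//!     fn hit_count(&self, text: &str) -> usize {
//!         self.rt.block_on(fetch_hit_count(text))
//!     }
//! }
//!
//! /// Stand-in for a vendor SDK's async query method.
//! async fn fetch_hit_count(text: &str) -> usize {
//!     text.split_whitespace().count()
//! }
//! ```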
//!
//! ## Query shape is intentionally small
//!
//! `SearchQuery` carries a text string, equality filters on attributes,
//! a limit, and an offset. That covers the 80% case (`autocomplete`,
//! `search within category`, `keyword + facet`) without leaking any
//! vendor's query DSL into the trait. Two escape hatches exist for
//! richer needs:
//!
//! * A backend that wants a raw JSON query can accept it in `text` and
//!   document the shape — the trait does not parse `text`.
//! * A backend can expose its own richer API *behind* `SearchClientPlugin`
//!   at construction time, then narrow down to the trait when called
//!   from capability-dispatching code.
//!
//! The alternative — growing a rich shared query DSL — is the trap the
//! [architecture doc](../../plan/ecosystem/00-architecture.md) calls out:
//! vendor-coupled shapes end up looking like whichever backend shipped
//! first and never fit the next one cleanly.
//!
//! ## Document and hit payloads are JSON strings
//!
//! `Document::fields_json` and `SearchHit::source_json` are plain JSON
//! strings, not `serde_json::Value`s. This matches the Session capability
//! carrying session data, the Lifecycle capability carrying event
//! payloads, and the Feature Flag capability carrying structured flag
//! values. The reason is the same every time: the WASM / QuickJS / nsjail
//! sandbox ABI is flatter when it only has to transport bytes, and the
//! host-facing code pays one `serde_json::from_str` at the edge instead
//! of shoving a fully-typed value across the boundary.
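//!
//! A short sketch of the caller's side of these two conventions: build a
//! small `SearchQuery`, then decode `source_json` once at the edge. The
//! crate path `bext_plugin_api::search`, the `first_title` helper, the
//! `"articles"` index, and the `category` / `title` fields are all
//! assumptions made for illustration, so the block is marked `ignore`:
//!
//! ```rust,ignore
//! use bext_plugin_api::search::{SearchClientPlugin, SearchError, SearchQuery};
//!
//! fn first_title(client: &dyn SearchClientPlugin) -> Result<Option<String>, SearchError> {
//!     let query = SearchQuery {
//!         text: "rust".to_string(),
//!         // Equality filters are ANDed together.
//!         filters: vec![("category".to_string(), "blog".to_string())],
//!         limit: 10,
//!         ..Default::default() // offset stays 0
//!     };
//!     let results = client.search("articles", &query)?;
//!
//!     // One parse at the edge; an empty or unparseable payload yields `None`.
//!     Ok(results.hits.first().and_then(|hit| {
//!         let value: serde_json::Value = serde_json::from_str(&hit.source_json).ok()?;
//!         value.get("title")?.as_str().map(str::to_owned)
//!     }))
//! }
//! ```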

/// A single document to push into an index.
///
/// `id` is the stable external identifier — backends use it to
/// deduplicate on re-index and as the target for `delete`. `fields_json`
/// is a JSON object encoded as a string; the backend decides which
/// top-level keys are searchable vs filterable vs stored. The trait does
/// not validate the JSON beyond requiring it to be parseable.
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct Document {
    /// Stable, caller-supplied id. Backends use this for upsert /
    /// delete semantics.
    pub id: String,
    /// JSON object as a string. Top-level keys map to indexable fields.
    pub fields_json: String,
}

/// A query to issue against a named index.
///
/// Deliberately minimal — see the module docs for the rationale.
#[derive(Debug, Clone, Default, serde::Serialize, serde::Deserialize)]
pub struct SearchQuery {
    /// Free-form text query. Empty string means "match all", subject to
    /// filters. Backends may also accept a raw DSL here if they choose.
    pub text: String,
    /// Attribute equality filters, applied as `AND`. Each pair is
    /// `(field, value)` — backends translate them to their native filter
    /// shape (`attribute = "value"` for Meili, `WHERE col = $1` for pg).
    /// Richer filter trees are out of scope for the shared shape.
    pub filters: Vec<(String, String)>,
    /// Maximum number of hits to return. `0` means "use the backend's
    /// default". Callers that want a cap should set an explicit value.
    pub limit: u32,
    /// Number of hits to skip from the start, for offset-based
    /// pagination.
    pub offset: u32,
}

/// A single search hit returned by the backend.
///
/// `score` is a backend-defined relevance number. It is not normalised
/// across backends — callers that rank across multiple providers should
/// do their own re-ranking. `source_json` is the indexed document
/// re-serialised as a JSON string, matching the `Document::fields_json`
/// convention on the write side.
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct SearchHit {
    /// The document's stable id — same value the caller passed to
    /// `index`.
    pub id: String,
    /// Relevance score from the backend. Comparable within a single
    /// result set, not across backends or across queries.
    pub score: f32,
    /// Stored document payload as a JSON string. Empty string if the
    /// backend chose not to return the source (some providers make this
    /// configurable).
    pub source_json: String,
}

/// Result of a single `search` call.
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct SearchResults {
    /// Hits, ordered by the backend's relevance ranking.
    pub hits: Vec<SearchHit>,
    /// Total matches in the index (not just `hits.len()`). Backends that
    /// cannot compute a total cheaply return an estimate; callers that
    /// need exactness either page through all hits or use a backend that
    /// supports it.
    pub total: u64,
    /// Wall-clock time the backend reports for the query in
    /// milliseconds. `0` if the backend does not expose timing.
    pub took_ms: u32,
}

/// Typed error returned by every `SearchClientPlugin` method.
///
/// Flat enum, not `Result<_, String>`, because classification matters:
/// the capability dispatcher distinguishes "you asked for something that
/// does not exist" from "you asked for something you cannot see" from
/// "your query itself was malformed" from "the backend blew up". Each
/// variant carries a message for operator-facing logs; callers should
/// match on the variant, not inspect the string.
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub enum SearchError {
    /// Index does not exist in the backend. Not the same as "empty
    /// index"; that returns `Ok(SearchResults { hits: vec![], .. })`.
    IndexNotFound(String),
    /// Authentication or authorisation failed. Wrong API key, wrong
    /// role, wrong network. Distinct from `Backend` so the dispatcher
    /// can escalate credentials issues without paging on transport
    /// flakes.
    AccessDenied(String),
    /// The query itself was malformed — syntax error, unsupported
    /// filter shape, out-of-range offset. Caller error, not backend
    /// fault. Recoverable by fixing the input.
    BadQuery(String),
    /// Everything else: network failure, backend 5xx, driver panic,
    /// timeout. Not classified further because the caller cannot
    /// recover from any of them except by retrying or alerting.
    Backend(String),
}

impl std::fmt::Display for SearchError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::IndexNotFound(msg) => write!(f, "search index not found: {msg}"),
            Self::AccessDenied(msg) => write!(f, "search access denied: {msg}"),
            Self::BadQuery(msg) => write!(f, "search bad query: {msg}"),
            Self::Backend(msg) => write!(f, "search backend error: {msg}"),
        }
    }
}

impl std::error::Error for SearchError {}

/// A search backend.
///
/// The runtime holds one instance per configured backend and dispatches
/// `search.query`, `search.index`, and `search.delete` host calls
/// through it. All three methods are sync; backends that need async
/// transport wrap it internally. Every method takes an explicit index
/// name so a single backend can host many logical collections — this
/// matches Meili, Elastic, Typesense, and the pg-FTS convention of
/// "one index == one table".
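///
/// # Example
///
/// A minimal in-memory sketch of an implementation, mainly to show how
/// the three methods and the error variants fit together. `MemoryBackend`
/// is hypothetical, the crate path `bext_plugin_api::search` is an
/// assumption, naive substring matching stands in for real ranking, the
/// default limit of 20 is arbitrary, and `filters` are ignored, so the
/// block is marked `ignore`:
///
/// ```rust,ignore
/// use std::collections::HashMap;
/// use std::sync::Mutex;
///
/// use bext_plugin_api::search::{
///     Document, SearchClientPlugin, SearchError, SearchHit, SearchQuery, SearchResults,
/// };
///
/// #[derive(Default)]
/// struct MemoryBackend {
///     // index name -> (document id -> fields_json)
///     indexes: Mutex<HashMap<String, HashMap<String, String>>>,
/// }
///
/// impl SearchClientPlugin for MemoryBackend {
///     fn name(&self) -> &str {
///         "memory"
///     }
///
///     fn search(&self, index: &str, query: &SearchQuery) -> Result<SearchResults, SearchError> {
///         let indexes = self.indexes.lock().map_err(|e| SearchError::Backend(e.to_string()))?;
///         let docs = indexes
///             .get(index)
///             .ok_or_else(|| SearchError::IndexNotFound(index.to_string()))?;
///         // Substring match over the raw JSON; an empty `text` matches all.
///         // `HashMap` iteration order is arbitrary, so hits are unranked.
///         let matches: Vec<_> = docs
///             .iter()
///             .filter(|(_, json)| json.contains(query.text.as_str()))
///             .collect();
///         // `limit == 0` means "backend default"; this sketch picks 20.
///         let limit = if query.limit == 0 { 20 } else { query.limit as usize };
///         let hits = matches
///             .iter()
///             .skip(query.offset as usize)
///             .take(limit)
///             .map(|(id, json)| SearchHit {
///                 id: id.to_string(),
///                 score: 1.0,
///                 source_json: json.to_string(),
///             })
///             .collect();
///         Ok(SearchResults { hits, total: matches.len() as u64, took_ms: 0 })
///     }
///
///     fn index(&self, index: &str, docs: Vec<Document>) -> Result<(), SearchError> {
///         let mut indexes = self.indexes.lock().map_err(|e| SearchError::Backend(e.to_string()))?;
///         let slot = indexes.entry(index.to_string()).or_default();
///         for doc in docs {
///             // Inserting an existing id replaces the old payload: upsert.
///             slot.insert(doc.id, doc.fields_json);
///         }
///         Ok(())
///     }
///
///     fn delete(&self, index: &str, ids: Vec<String>) -> Result<(), SearchError> {
///         let mut indexes = self.indexes.lock().map_err(|e| SearchError::Backend(e.to_string()))?;
///         if let Some(slot) = indexes.get_mut(index) {
///             for id in ids {
///                 slot.remove(&id); // missing ids are silently ignored
///             }
///         }
///         Ok(())
///     }
/// }
/// ```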
pub trait SearchClientPlugin: Send + Sync {
    /// Unique identifier for this backend (e.g. `"meili"`, `"pg"`).
    fn name(&self) -> &str;

    /// Execute a query against `index`. See `SearchQuery` for the shape.
    ///
    /// An empty result is `Ok(SearchResults { hits: vec![], .. })`.
    /// `IndexNotFound` means the index itself is missing; `AccessDenied`
    /// means authentication or authorisation failed; `BadQuery` means the
    /// query was malformed; `Backend` is everything else.
    fn search(&self, index: &str, query: &SearchQuery) -> Result<SearchResults, SearchError>;

    /// Upsert documents into `index`. Documents with an id that already
    /// exists are replaced; new ids are inserted. Bulk semantics — the
    /// call is one round-trip per backend batch, not per document.
    fn index(&self, index: &str, docs: Vec<Document>) -> Result<(), SearchError>;

    /// Delete documents from `index` by id. Missing ids are silently
    /// ignored — deleting a non-existent document is not an error (same
    /// convention as Redis `DEL`, S3 `DeleteObject`, and every other
    /// idempotent delete in the plugin API).
    fn delete(&self, index: &str, ids: Vec<String>) -> Result<(), SearchError>;

    /// Health check. Default: always healthy. Remote backends should
    /// override to ping their transport so the runtime can route
    /// around a dead provider without blowing up in `search`.
    fn is_healthy(&self) -> bool {
        true
    }
}