// bext-plugin-api 0.2.0: plugin trait definitions and shared types for bext, the public ABI for plugin authors.
//! Search-client capability trait and types.
//!
//! A `SearchClientPlugin` is the runtime-side face of a search backend:
//! issue queries, push documents into an index, delete documents by id.
//! Backends fall into two families:
//!
//! 1. **Dedicated search engines** (`@bext/search-meili`,
//!    `@bext/search-typesense`, `@bext/search-elastic`) — an external
//!    service holds the inverted index and handles ranking.
//! 2. **SQL full-text** (`@bext/search-pg`) — the existing Postgres
//!    instance runs `to_tsvector` / `plainto_tsquery` on a regular table.
//!    No new infrastructure, good enough for "I want search and I have
//!    one database" sites.
//!
//! The trait stays sync to match the rest of `bext-plugin-api`. Backends
//! that speak native async (the Meilisearch SDK, the Elasticsearch Rust
//! client) either use their blocking sibling or own a small tokio runtime
//! and call `block_on` — the same pattern `@bext/auth-jwt`'s JWKS fetcher
//! and `@bext/flags-openfeature` use. Plugins cannot expose async across
//! the sandbox boundary, so the host-facing shape is sync.
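//!
//! A minimal, dependency-free sketch of that sync-over-async shape. The
//! `FakeBackend` type, its methods, and the tiny `block_on` executor are
//! hypothetical stand-ins so the example builds with only `std`; a real
//! backend would call `block_on` on a tokio runtime it owns, as described
//! above.
//!
//! ```rust
//! use std::future::Future;
//! use std::pin::pin;
//! use std::sync::Arc;
//! use std::task::{Context, Poll, Wake, Waker};
//! use std::thread::{self, Thread};
//!
//! // Wakes the blocked thread when the future can make progress.
//! struct ThreadWaker(Thread);
//!
//! impl Wake for ThreadWaker {
//!     fn wake(self: Arc<Self>) {
//!         self.0.unpark();
//!     }
//! }
//!
//! // Drive `fut` to completion on the current thread (stand-in for a
//! // runtime's `block_on`).
//! fn block_on<F: Future>(fut: F) -> F::Output {
//!     let mut fut = pin!(fut);
//!     let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
//!     let mut cx = Context::from_waker(&waker);
//!     loop {
//!         match fut.as_mut().poll(&mut cx) {
//!             Poll::Ready(out) => return out,
//!             Poll::Pending => thread::park(),
//!         }
//!     }
//! }
//!
//! struct FakeBackend;
//!
//! impl FakeBackend {
//!     // Stand-in for a natively async client call.
//!     async fn fetch_count_async(&self, index: &str) -> usize {
//!         index.len()
//!     }
//!
//!     // Sync wrapper: the shape the host-facing trait expects.
//!     fn fetch_count(&self, index: &str) -> usize {
//!         block_on(self.fetch_count_async(index))
//!     }
//! }
//! ```
//!
//! The host only ever sees `fetch_count`; the async call stays an
//! implementation detail of the backend.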
//!
//! ## Query shape is intentionally small
//!
//! `SearchQuery` carries a text string, equality filters on attributes,
//! a limit, and an offset. That covers the 80% case (`autocomplete`,
//! `search within category`, `keyword + facet`) without leaking any
//! vendor's query DSL into the trait. Two escape hatches exist for
//! richer needs:
//!
//! * A backend that wants a raw JSON query can accept it in `text` and
//!   document the shape — the trait does not parse `text`.
//! * A backend can expose its own richer API *behind* `SearchClientPlugin`
//!   at construction time, then narrow down to the trait when called
//!   from capability-dispatching code.
//!
//! The alternative — growing a rich shared query DSL — is the trap the
//! [architecture doc](../../plan/ecosystem/00-architecture.md) calls out:
//! vendor-coupled shapes end up looking like whichever backend shipped
//! first and never fit the next one cleanly.
//!
//! ## Document and hit payloads are JSON strings
//!
//! `Document::fields_json` and `SearchHit::source_json` are plain JSON
//! strings, not `serde_json::Value`s. This matches the Session capability
//! carrying session data, the Lifecycle capability carrying event
//! payloads, and the Feature Flag capability carrying structured flag
//! values. The reason is the same every time: the WASM / QuickJS / nsjail
//! sandbox ABI is flatter when it only has to transport bytes, and the
//! host-facing code pays one `serde_json::from_str` at the edge instead
//! of shoving a fully-typed value across the boundary.

/// A single document to push into an index.
///
/// `id` is the stable external identifier — backends use it to
/// deduplicate on re-index and as the target for `delete`. `fields_json`
/// is a JSON object encoded as a string; the backend decides which
/// top-level keys are searchable vs filterable vs stored. The trait
/// places no constraint on the JSON beyond requiring it to be parseable.
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct Document {
    /// Stable, caller-supplied id. Backends use this for upsert /
    /// delete semantics.
    pub id: String,
    /// JSON object as a string. Top-level keys map to indexable fields.
    pub fields_json: String,
}

/// A query to issue against a named index.
///
/// Deliberately minimal — see the module docs for the rationale.
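///
/// A minimal sketch of how a backend might translate `filters` into its
/// native filter syntax. The helper names `meili_filter` and
/// `pg_where_clause` are hypothetical, and the escaping rule is an
/// assumption; the point is only that each `(field, value)` pair becomes
/// one equality test, joined with `AND`.
///
/// ```rust
/// // Build a Meili-style filter expression: `a = "x" AND b = "y"`.
/// // Assumption: embedded quotes and backslashes are backslash-escaped.
/// fn meili_filter(filters: &[(String, String)]) -> String {
///     filters
///         .iter()
///         .map(|(field, value)| {
///             let escaped = value.replace('\\', "\\\\").replace('"', "\\\"");
///             format!("{field} = \"{escaped}\"")
///         })
///         .collect::<Vec<_>>()
///         .join(" AND ")
/// }
///
/// // Build a parameterised SQL predicate plus its bind values:
/// // `col_a = $1 AND col_b = $2`. Field names are interpolated into the
/// // SQL text, so a real backend must check them against known columns.
/// fn pg_where_clause(filters: &[(String, String)]) -> (String, Vec<String>) {
///     let clause = filters
///         .iter()
///         .enumerate()
///         .map(|(i, (field, _))| format!("{field} = ${}", i + 1))
///         .collect::<Vec<_>>()
///         .join(" AND ");
///     let params = filters.iter().map(|(_, value)| value.clone()).collect();
///     (clause, params)
/// }
/// ```
///
/// An empty `filters` vector yields an empty expression in both helpers,
/// which a backend would treat as "no filter".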
#[derive(Debug, Clone, Default, serde::Serialize, serde::Deserialize)]
pub struct SearchQuery {
    /// Free-form text query. Empty string means "match all", subject to
    /// filters. Backends may also accept a raw DSL here if they choose.
    pub text: String,
    /// Attribute equality filters, applied as `AND`. Each pair is
    /// `(field, value)` — backends translate them to their native filter
    /// shape (`attribute = "value"` for Meili, `WHERE col = $1` for pg).
    /// Richer filter trees are out of scope for the shared shape.
    pub filters: Vec<(String, String)>,
    /// Maximum number of hits to return. `0` means "use the backend's
    /// default". Callers that want a cap should set an explicit value.
    pub limit: u32,
    /// Number of hits to skip from the start, for offset-based pagination.
    pub offset: u32,
}

/// A single search hit returned by the backend.
///
/// `score` is a backend-defined relevance number. It is not normalised
/// across backends — callers that rank across multiple providers should
/// do their own re-ranking. `source_json` is the indexed document
/// re-serialised as a JSON string, matching the `Document::fields_json`
/// convention on the write side.
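///
/// One crude way a caller might put scores from different backends on a
/// common scale before merging is min-max normalisation within each
/// result set. This is an illustrative technique, not something the
/// plugin API provides; `normalise_scores` is a hypothetical helper.
///
/// ```rust
/// // Rescale scores to the range 0.0..=1.0 within one result set.
/// // If all scores are equal, every hit gets 1.0.
/// fn normalise_scores(scores: &[f32]) -> Vec<f32> {
///     let min = scores.iter().copied().fold(f32::INFINITY, f32::min);
///     let max = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
///     let range = max - min;
///     scores
///         .iter()
///         .map(|s| if range > 0.0 { (s - min) / range } else { 1.0 })
///         .collect()
/// }
/// ```
///
/// Normalised scores are still only a heuristic: they say where a hit
/// sits within its own result set, not how relevant it is in absolute
/// terms.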
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct SearchHit {
    /// The document's stable id — same value the caller passed to
    /// `index`.
    pub id: String,
    /// Relevance score from the backend. Comparable within a single
    /// result set, not across backends or across queries.
    pub score: f32,
    /// Stored document payload as a JSON string. Empty string if the
    /// backend chose not to return the source (some providers make this
    /// configurable).
    pub source_json: String,
}

/// Result of a single `search` call.
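///
/// A small sketch of offset-based paging driven by `total`. The helper
/// `next_offset` is hypothetical and treats `total` as exact; with a
/// backend that only estimates the total, the empty-page check is the
/// safer stop signal.
///
/// ```rust
/// use std::convert::TryFrom;
///
/// // Returns the offset for the next page, or `None` once the caller
/// // has reached the end of the result set.
/// fn next_offset(offset: u32, hits_len: usize, total: u64) -> Option<u32> {
///     if hits_len == 0 {
///         return None; // empty page: nothing further to fetch
///     }
///     let next = offset as u64 + hits_len as u64;
///     if next >= total {
///         None
///     } else {
///         // `offset` is a `u32` in `SearchQuery`, so give up on overflow.
///         u32::try_from(next).ok()
///     }
/// }
/// ```
///
/// A caller would feed the returned value into `SearchQuery::offset` for
/// the following `search` call and stop on `None`.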
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct SearchResults {
    /// Hits, ordered by the backend's relevance ranking.
    pub hits: Vec<SearchHit>,
    /// Total matches in the index (not just `hits.len()`). Backends that
    /// cannot compute a total cheaply return an estimate; callers that
    /// need exactness either page through all hits or use a backend that
    /// supports it.
    pub total: u64,
    /// Wall-clock time the backend reports for the query in
    /// milliseconds. `0` if the backend does not expose timing.
    pub took_ms: u32,
}

/// Typed error returned by every `SearchClientPlugin` method.
///
/// Flat enum, not `Result<_, String>`, because classification matters:
/// the capability dispatcher distinguishes "you asked for something that
/// does not exist" from "you asked for something you cannot see" from
/// "your query itself was malformed" from "the backend blew up". Each
/// variant carries a message for operator-facing logs; callers should
/// match on the variant, not inspect the string.
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub enum SearchError {
    /// Index does not exist in the backend. Not the same as "empty
    /// index"; that returns `Ok(SearchResults { hits: vec![], .. })`.
    IndexNotFound(String),
    /// Authentication or authorisation failed: wrong API key, wrong
    /// role, or wrong network. Distinct from `Backend` so the dispatcher
    /// can escalate credential issues without paging on transport
    /// flakes.
    AccessDenied(String),
    /// The query itself was malformed — syntax error, unsupported
    /// filter shape, out-of-range offset. Caller error, not backend
    /// fault. Recoverable by fixing the input.
    BadQuery(String),
    /// Everything else: network failure, backend 5xx, driver panic,
    /// timeout. Not classified further because the caller cannot
    /// recover from any of them except by retrying or alerting.
    Backend(String),
}

impl std::fmt::Display for SearchError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::IndexNotFound(msg) => write!(f, "search index not found: {msg}"),
            Self::AccessDenied(msg) => write!(f, "search access denied: {msg}"),
            Self::BadQuery(msg) => write!(f, "search bad query: {msg}"),
            Self::Backend(msg) => write!(f, "search backend error: {msg}"),
        }
    }
}

impl std::error::Error for SearchError {}

/// A search backend.
///
/// The runtime holds one instance per configured backend and dispatches
/// `search.query`, `search.index`, and `search.delete` host calls
/// through it. All three methods are sync; backends that need async
/// transport wrap it internally. Every method takes an explicit index
/// name so a single backend can host many logical collections — this
/// matches Meili, Elastic, Typesense, and the pg-FTS convention of
/// "one index == one table".
pub trait SearchClientPlugin: Send + Sync {
    /// Unique identifier for this backend (e.g. `"meili"`, `"pg"`).
    fn name(&self) -> &str;

    /// Execute a query against `index`. See `SearchQuery` for the shape.
    ///
    /// An empty result is `Ok(SearchResults { hits: vec![], .. })`.
    /// `IndexNotFound` means the index itself is missing; `AccessDenied`
    /// means authentication or authorisation failed; `BadQuery` means
    /// the query was malformed; `Backend` is everything else.
    fn search(&self, index: &str, query: &SearchQuery) -> Result<SearchResults, SearchError>;

    /// Upsert documents into `index`. Documents with an id that already
    /// exists are replaced; new ids are inserted. Bulk semantics — the
    /// call is one round-trip per backend batch, not per document.
    fn index(&self, index: &str, docs: Vec<Document>) -> Result<(), SearchError>;

    /// Delete documents from `index` by id. Missing ids are silently
    /// ignored — deleting a non-existent document is not an error (same
    /// convention as Redis `DEL`, S3 `DeleteObject`, and every other
    /// idempotent delete in the plugin API).
    fn delete(&self, index: &str, ids: Vec<String>) -> Result<(), SearchError>;

    /// Health check. Default: always healthy. Remote backends should
    /// override to ping their transport so the runtime can route
    /// around a dead provider without blowing up in `search`.
    fn is_healthy(&self) -> bool {
        true
    }
}