Crate diesel_paradedb

Expand description

Diesel ORM bindings for ParadeDB’s pg_search extension.

ParadeDB adds a Postgres type (pdb.query), two custom operators (||| and @@@), and a family of query-builder functions in the pdb schema. Mainline Diesel does not know about any of these, so queries that use them must otherwise be written with sql_query. This crate bridges that gap: once imported, BM25 queries, match-all filters, relevance scoring, and faceting aggregates compose with regular Diesel select / filter / order calls and are checked at compile time.

§What is modelled

sql_types::ParadeQuery — Diesel SQL-type marker that mirrors ParadeDB’s pdb.query type. Every pdb::* builder function returns this type, and the @@@ operator expects it on the right.
Infix operators [TermMatch] (|||) and [ParadeMatch] (@@@), exposed via the fluent extension traits TermMatchDsl / ParadeMatchDsl (.term_match(...) and .parade_match(...)).
Scalar function bindings: pdb_all, pdb_score.
Aggregate / window expressions: pdb_agg_terms, pdb_agg_avg, pdb_agg_stats, pdb_agg_value_count. Each returns a PdbAgg that can be used directly (aggregate form) or via .over_all() (window form, which attaches a facet to every row of a non-grouped Top-K BM25 query — the shape ParadeDB documents for search-with-facets).

§What is NOT modelled

The BM25 index access method itself. Indexes do not live in Diesel’s type system — CREATE INDEX ... USING bm25 (...) belongs in your migrations. This crate type-checks queries that use a BM25 index; existence of the index is a runtime contract.
The legacy paradedb.searchqueryinput type. ParadeDB exposes both pdb.query (newer) and paradedb.searchqueryinput (legacy). This crate models only the newer pdb.query.

§Quick start

use diesel::prelude::*;
use diesel_paradedb::{pdb_all, pdb_score, ParadeMatchDsl, TermMatchDsl};

// Top-5 BM25 hits with score, ordered by relevance.
let rows: Vec<(i32, String, f32)> = my_table::table
    .select((my_table::id, my_table::text, pdb_score(my_table::id)))
    .filter(my_table::text.term_match("service outage"))
    .order(pdb_score(my_table::id).desc())
    .limit(5)
    .load(conn)?;

// Match-all + sentiment facet via the window form.
use diesel_paradedb::pdb_agg_terms;
let (id, facet): (i32, serde_json::Value) = my_table::table
    .select((my_table::id, pdb_agg_terms("sentiment", 10).over_all()))
    .filter(my_table::id.parade_match(pdb_all()))
    .order(pdb_score(my_table::id).desc())
    .limit(1)
    .get_result(conn)?;

§Diesel version compatibility

This crate pins to a single Diesel minor. diesel-paradedb 0.1.x targets diesel = "~2.2". A Diesel 2.3 release will trigger diesel-paradedb 0.2; a Diesel 3.0 will trigger diesel-paradedb 3.0 (the crate’s major tracks Diesel’s).

§Async usage

Works unchanged with diesel-async: this crate only adds new expression types and operators, all of which use Diesel’s standard QueryFragment / Expression traits. No async feature flag is required.

Modules§

sql_types: SQL-type markers exposed for use in Queryable derives and explicit type annotations on select(...) clauses. The convention mirrors pgvector’s pgvector::sql_types::Vector re-export.

Structs§

PdbAgg: Aggregate form of pdb.agg(<json>). Use the pdb_agg_* constructors rather than building this directly — they keep the JSON config private and consistent across call sites.
PdbAggOver: Window form of pdb.agg(<json>) OVER (). Produced by PdbAgg::over_all. Behaves as a non-aggregate expression so it can coexist with normal SELECT columns — which is the entire point of using OVER () in the search-with-facets pattern.

Traits§

ParadeMatchDsl: Fluent .parade_match(...) for structured ParadeDB queries against any column.
TermMatchDsl: Fluent .term_match(...) for BM25 term queries against a Text column.

Functions§

pdb_agg_avg: pdb.agg('{"avg": {"field": <field>}}') — average of a numeric fast field.
pdb_agg_stats: pdb.agg('{"stats": {"field": <field>}}') — count / min / max / sum / avg in one pass.
pdb_agg_terms: pdb.agg('{"terms": {"field": <field>, "size": <size>}}') — bucket-count aggregate over the distinct values of a fast field.
pdb_agg_value_count: pdb.agg('{"value_count": {"field": <field>}}') — non-null value count.
pdb_all: pdb.all() — match-all ParadeDB query. Pair with @@@ to filter every row through the BM25 index (e.g. so a window-form pdb.agg(...) has a Top-K context to compute over).
pdb_score: pdb.score(key) — BM25 relevance score for the row. Only meaningful inside a query whose @@@ / ||| filter targets the same BM25 index; outside that context Postgres returns NULL at runtime. Modelled as Float4 (non-nullable in the typical hot path).

Type Aliases§

pdb_all: The return type of pdb_all()
pdb_score: The return type of pdb_score()