flusso-query — the typed query client
flusso keeps an OpenSearch index in sync with Postgres from a declarative schema
(the write side; see README.md). That schema is a contract: it
fixes the shape of every document — which fields exist, their types, which are
nested arrays, which are scalars. flusso enforces that contract on the write
side; flusso-query enforces it on the read side.
flusso-query does not generate the document types for you. You write the
document struct by hand and keep full control — its derives, field types, which
fields it projects, how doc keys map to Rust names. #[derive(FlussoDocument)]
then, at compile time and with no database, does two things against the resolved
schema:
- Validates that every struct field lines up with the schema — the field exists, its Rust type matches, its nullability matches. A struct that has drifted stops compiling, pointing at the offending field.
- Generates the typed query surface from the schema — field handles,
get/query, the schema hash — so a service queries the index like a typed function: field names checked at compile time, operators that only exist for the field types that support them, results that deserialize into your struct.
The query surface comes from the full schema, not from the struct — so you can
filter or sort on a field even if your struct doesn't deserialize it. When the
schema changes and the index is rebuilt, anything that no longer fits stops
compiling at cargo build.
Quick reference
| Looking for… | Jump to |
|---|---|
| A full worked example (structs + query) | A query, from the caller's seat |
| Which operators each field type exposes | What the field type buys you |
and/or/not, clause vs combinator style, count/ids |
Composing queries |
| Per-query options, compound/scoring, standalone, sort, search-level | Query options and extra query types |
Nested arrays — filter by vs filter of (any/all, filter_nested) |
Filtering nested collections |
| Several searches, one round-trip | _msearch |
| One blended result list across indexes | Combined search |
How the macro finds the schema; path for nested structs |
Binding to the schema |
flusso type → Rust type → handle |
flusso types → Rust types |
| Anything the typed builder can't express | The escape hatch |
| Reading a prefixed deployment | Resolving the index name |
A query, from the caller's seat
A service that searches the users index from the schema guide. You write
the structs; #[derive(FlussoDocument)] validates them and generates the query
surface. The only input is the index name — it finds flusso.toml itself (see
Binding to the schema).
The document structs
use ;
/// A `users` document — *you* write this. It's a **projection**: it deserializes
/// the fields below and omits the rest of the index (addresses, profile,
/// avgOrderValue, …), which the derive allows. The derive checks every field
/// against the `users` schema and hangs the typed query surface off `User`.
// ← the only input: which index
/// The `account` object — a same-row sub-object. A nested/object struct validates
/// against its `path` in the same index; it has no entry points of its own.
The query
async
What you can't write
The compiler refuses the mistakes:
User::email().matches(..)—matchesis a text operator andemailis akeyword.User::full_name().eq(..)—full_nameis analyzedtext, so it has no exacteq.User::nmae()— there is no such handle.hit.source.totl— no such field.
And the struct itself can't drift. Declaring email: i32, or email: Option<String> when the field is required, or a field totl the schema doesn't
have, is a compile error from the derive. The schema has been lifted into the
type system, so the compiler checks both your queries and your struct against it.
OS_PASSWORDis just a variable name picked for the example. The full env-var story (secrets, the reserved deployment overrides,FLUSSO_CONFIG) lives in the configuration guide.
What the field type buys you
The handle a schema field generates determines the operators in scope: an operator that doesn't make sense for a field's type doesn't exist on its handle, so the mistake is a compile error rather than a 400 from OpenSearch.
| Handle | Operators |
|---|---|
Keyword |
eq, any_of, prefix, wildcard, regexp, fuzzy, exists |
Text |
matches, match_phrase, match_phrase_prefix, matches_fuzzy, any_of (exact, via .keyword), exists — no exact eq (it's analyzed) |
Bool |
eq, exists, asc/desc |
Number |
eq, any_of, lt, lte, gt, gte, between, exists |
Date |
eq, any_of, lt, lte, gt, gte, between, exists |
Object<S> |
exists (a same-document sub-object — an object field or a to-one join (belongs_to/has_one); S is the enclosing scope). Its sub-fields are flattened, so query them via the child struct's dotted-path handles (Account::tier()), not through this handle |
Nested<S, T> |
any(q) / all(q) to match parents and lift the child query into scope S — q is a child query built from T's handles (merging); matching(q) (with .sort/.size) to shape what's returned — see Filtering nested collections; plus exists |
Geo |
within(Distance::km(12.0), center), within_box, within_polygon, exists; sort with distance_from(center) (nearest-first sugar) or distance_sort(center, order, DistanceUnit) |
TextMap |
key(k) → a Text leaf for one key; search(q) (cross-key, .prefer(key, weight) / .only_preferred()); has_key(k), exists |
KeywordMap |
key(k) → a Keyword leaf for one key; has_key(k), exists — no search (exact-match, use key(..).eq(..)) |
NumberMap |
key(k) → a Number leaf for one key (eq/lt/gte/between/…); has_key(k), exists |
DateMap |
key(k) → a Date leaf for one key (eq/gte/between/…); has_key(k), exists |
Binary |
exists — base64-encoded, not searchable |
Json |
exists, raw(serde_json::Value) — the untyped fallback |
Each operator's argument is typed by kind, not by one fixed Rust type — a
handle accepts any value implementing FlussoValue<kind::…> (which requires
serde::Serialize):
- Numerics are split per type (
kind::Byte/Short/Integer/Long/Float/Double/Decimal), and a value is accepted only if it widens into that kind without loss. So aLongfield takes any integer, aDoubletakesf32/f64or a small int, aDecimaltakesDecimal(decimalfeature) or an integer — but a float on an integer field, or ani64on aShort, is a compile error. Bare literals work where lossless:eq(5)onlong/integer/double/decimal,eq(5.0)onfloat/double. Abyte/shortfield needs a typed literal (eq(5i16)), since5defaults toi32(which would narrow). Keywordtakes aString/&stror a#[derive(FlussoValue)]keyword enum/newtype (matched against its serde string form).Booltakes aboolor a bool#[derive(FlussoValue)]newtype.Datetakes aString/&strISO-8601 literal or — with thechronofeature — aNaiveDate/NaiveDateTime/DateTime<Utc>.
A custom money/quantity type queries with no cast — Order::total().eq(Money(d))
— as long as it's a FlussoValue of the field's kind (see below). A value of the
wrong kind is a compile error (Order::created_at().gte(5), Order::age().eq(1.5)).
Subfield accessors. flusso's sink auto-enriches text/keyword fields
(auto_subfields, on by default) with exact / sortable / searchable subfields,
and the handles expose them — no Keyword::at("code.keyword") string path:
full_name // Text → analyzed full-text match
full_name.keyword // Keyword → exact / wildcard / prefix
full_name.keyword_lowercase // Keyword → case-insensitive match / sort
email.text // Text → full-text over a keyword field
(A wildcard belongs on .keyword(), not the analyzed handle, which matches
tokens not the whole value.) These accessors exist only when the subfield is
actually provisioned — and the derive enforces it at compile time: a
text/keyword handle is stamped with subfields only when every OpenSearch
sink has auto_subfields on and the field declares no custom fields.
Otherwise the handle is …<NoSubfields> and .keyword() / .text() /
.keyword_lowercase() (and the Text::any_of / Text::asc sugar built on
them) simply don't exist — calling one is a compile error, not a runtime 400.
Sorting is similar — sort(…) accepts handles whose type is sortable (numbers,
dates, keywords, booleans, and text). Text::asc/desc is sugar that sorts
via the case-insensitive .keyword_lowercase subfield automatically; reach for
User::full_name().keyword().desc() when you want an exact-case sort instead.
A geo_point sorts by proximity with User::location().distance_from(center).
A few clauses span more than one field, so they're free functions:
multi_match("ada", [User::full_name(), User::bio()]) runs one analyzed query
across several Text fields (weight one with User::full_name().boosted(3.0)).
The typed surface is broad — see Query options and extra query
types for the per-query options
(boost/case_insensitive/fuzziness/…), the compound/scoring queries
(constant_score/dis_max/function_score/boosting), the standalone ones
(ids/query_string/simple_query_string/script_score/…), and the
search-level controls (min_score/collapse/search_after/highlight/…).
What's left for the raw hatch is knn, geo_shape, span
and parent/child queries — types with no corresponding flusso field.
Composing queries
A handle's operator produces a Query<S> — carrying the scope it was built
in. The root document and any flattened object/to-one-join sub-field share the
scope Root; only a nested array introduces a fresh scope, tagged with its
element type (Query<Order>).
Queries compose with and / or / not within a scope, and the Search
builder exposes the bool-query clauses directly:
// Combinator style.
let q = email.eq
.and
.and;
query.query.send.await?;
// Clause style — `filter`/`must_not` don't score, `query`(=must)/`should` do.
query
.query // scored
.filter // filtered, cached, no score
.must_not
.should
.send
.await?;
query/filter/must_not/should accept anything that is a Query<Root>, so
the two styles mix freely. Which clause to reach for:
| Clause | Scores? | Bool slot | Use for |
|---|---|---|---|
query |
yes | must |
full-text relevance you want ranked |
filter |
no (cached) | filter |
exact constraints — terms, ranges, nested any |
must_not |
no | must_not |
exclusions |
should |
yes | should |
optional boosts; a real constraint with min_should_match |
A built search can finish as a count instead of a page: .count() sends the
same query to _count and returns the number of matching documents (u64) —
cheaper than send() when the hits aren't needed. Sort, from/size, and
filter_nested projections are ignored; they never change which documents match.
let open_orders: u64 = query
.filter
.count
.await?;
Or as an id page: .ids() runs the same search with _source: false and
returns Vec<String> — the matching document ids (root primary keys,
stringified), in order, no sources fetched. Sort and from/size apply as in
send(); filter_nested projections are dropped. The cheap way to feed another
lookup — e.g. search in OpenSearch, then load the full rows from Postgres:
let user_ids: = query
.filter
.sort
.size
.ids
.await?;
Query options and extra query types
Each leaf operator returns a small builder that carries that query's options
plus the universal boost(f32) and name(&str) (_name, surfaced in a hit's
matched_queries). With no option set it renders the DSL shorthand; set one and
it expands. A builder drops straight into a clause (it's an AsQuery), so no
.build() is needed:
query
.should // weighted text
.should
.should // typo-tolerant
.min_should_match // make should a real filter
.filter // uuid keyword (feature)
.filter // enum keyword
.sort // null-aware sort
.send.await?;
The options per query type (all optional; plus the universal boost/name on every builder):
| Query | Options |
|---|---|
term / prefix / wildcard / regexp |
case_insensitive |
prefix / wildcard |
rewrite |
regexp |
flags, max_determinized_states |
fuzzy |
fuzziness, prefix_length, max_expansions, transpositions |
matches |
fuzziness, operator, minimum_should_match, prefix_length, analyzer, zero_terms_query, lenient |
| phrase matches | slop, analyzer |
multi_match |
type, operator, fuzziness, tie_breaker, minimum_should_match |
| range | format, time_zone, relation |
within (geo) |
distance_type, validation_method |
nested any |
score_mode, ignore_unmapped |
The enumerable options are closed enums, not strings — a typo is a compile
error, not a 400. operator/default_operator take Operator { And, Or };
fuzziness takes Fuzziness { Auto, AutoBounds(u32, u32), Edits(u32) };
multi_match's type takes MultiMatchType; zero_terms_query takes
ZeroTermsQuery { None, All }; a range relation takes RangeRelation { Intersects, Contains, Within }; function_score's score_mode/boost_mode
take ScoreMode/BoostMode; a nested score_mode takes NestedScoreMode
(which, unlike ScoreMode, has None for a filter-only nested clause). Geo's
distance_type/validation_method take DistanceType/ValidationMethod, and
a sort's numeric_type / Sort::script type take NumericType /
ScriptSortType. minimum_should_match takes a MinimumShouldMatch
(2/.into() for a count, MinimumShouldMatch::percent(75), or ::raw("3<90%")
for the combining mini-language). Genuinely open-ended params (analyzer,
format, time_zone, unmapped_type, flags) stay String.
.or()/.and()on a builder needuse flusso_query::AsQuery;in scope (the combinators are provided methods on that trait). Composing via theSearchclauses (.should()/.filter()/…) needs no import.
Bool / compound & scoring. Search::min_should_match(n) (or
Query::min_should_match on an or-group) turns a top-level free-text should
group into a real constraint instead of scoring-only. The scoring wrappers are
free functions: constant_score(filter), dis_max([..]).tie_breaker(..),
boosting(positive, negative, negative_boost), and
function_score(query).weight(..)/.weight_when(.., filter)/.boost_mode(..).
Standalone query types (free functions, each an AsQuery): ids([..])
(get-many-by-_id), query_string(..) / simple_query_string(..) /
combined_fields(.., [fields]) (user-facing full-text), and the relevance ones
script(..), script_score(query, source), distance_feature(..),
rank_feature(..), more_like_this([fields], [like]). match_bool_prefix is a
Text operator (search-as-you-type).
Sort. .asc()/.desc() on a sortable handle (number, date, keyword, bool,
or text — the last routing through .keyword_lowercase) return a Sort
builder: chain .missing_first()/.missing_last()/.missing(v),
.mode(SortMode::..), .unmapped_type(..)/.numeric_type(..)/.format(..), or
.nested(path)/.nested_filtered(path, q). Also Sort::score() (by _score),
Sort::script(ScriptSortType, source, order), and Geo::distance_from(center)
/ Geo::distance_sort(center, order, DistanceUnit) for proximity sorting. A
geo radius is a typed Distance (Distance::km(12.0) / ::miles(5.0) /
::meters(800.0) / …), so a malformed "12 km" can't reach the query.
Search-level controls on the Search builder: min_score,
track_total_hits, track_scores, search_after([..]) (deep pagination),
collapse(field), post_filter(q), and highlight(Highlight::new().field(..)).
Queries are values; the client appears once
Type::query() takes no client: a Search<T> is a plain, Cloneable value with
no lifetime. Build it anywhere — a helper, a struct field, a cached "prepared
search" — and hand a &Client to the terminal (send / ids / count, all
&self) when it's time to run. One built query can run many times, against
different clients, or with per-call tweaks via clone():
let page = busy_users.send.await?; // run it
let next = busy_users.from.send.await?; // tweak a copy
Several searches, one round-trip (_msearch)
Independent typed searches — different indexes, different document types — can
share one HTTP round-trip. client.msearch(…) takes a tuple of &Search<T>
(arity 1–8) and returns one typed SearchResponse per slot, in order:
let users_q = query.query.size;
let orders_q = query.filter.size;
let = client.msearch.await?;
The "search page with separate sections" primitive: each slot keeps its own query,
sort, pagination, and filter_nested projections, and decodes into its own type.
A slot-level failure fails the whole call with an error naming the slot (no partial
results). For many searches of one type there's
client.msearch_all(&searches) → Vec<SearchResponse<T>>.
One blended result list (combined search)
The other multi-index shape: one query over several indexes, hits ranked together in a single list — the "one search box over everything" primitive. You declare which document types blend by writing an enum with one variant per type:
/// One item in the storefront's global search.
// the `derive` feature, like FlussoDocument
let page = query // MultiSearch<StoreItem> — client-free too
.query
.size
.send
.await?;
for hit in page.hits
Root-scope queries compose across document types (Query<Root> carries no
document type), and each hit is decoded into the variant that owns its index.
The query goes through the stable {INDEX}_{HASH} alias, but OpenSearch reports
the concrete index behind it ({INDEX}_{HASH}_{generation}) in each hit — so the
crate normalizes that generation suffix before dispatch; you just match on
hit.source. A hit from an index no variant claims is an error, not a skip.
count(&client) works on the union too.
Two semantics to know:
- A query on a field that exists in only one of the indexes is fine — it just doesn't match in the others.
- A sort on such a field is rejected by OpenSearch unless it carries an
unmapped_type. Sort the blended list by relevance, or on fields all the union's indexes share.
The derive validates the enum's shape (single-field tuple variants, no duplicate
payload types) and generates the trait's two members. Without the derive
feature, the impl is two short members written by hand — a TARGETS const listing
each variant's (INDEX, SCHEMA_HASH) and a decode that matches on
Type::physical_index().
Building a child filter and merging it into the parent
Because the scope is part of the type, a query is a value you can build, name,
store, and reuse. A nested child struct (one whose path ends in a nested
array) carries its own field handles tagged with the child scope — so they produce
Query<Order>, not Query<Root>:
// Built from Order's own handles. Reusable — a plain function returning a query:
To merge a child filter into a parent, lift it through the nesting that holds
it: User::orders().any(child) (or .all(child)) takes a Query<Order> and
returns a Query<Root> — a nested clause at the orders path — which composes
with parent-scope queries like any other:
let q = email.eq
.and; // Query<Order> → lifted → Query<Root>
query.filter.send.await?;
The scope tag keeps this honest: User::email().and(Order::status().eq(…)) does
not compile — you can't and a Query<Root> with a Query<Order>; the child
query has to be lifted through User::orders() first. A child constraint can never
be silently applied at the wrong level.
Lifting composes through depth: Order::items().any(Item::quantity().gt(1)) is a
Query<Order>, which User::orders().any(…) then lifts the rest of the way to
Query<Root>.
Optional filters
Callers build queries from optional inputs — request params, form fields — and the
if let Some(x) = … { q = q.filter(…) } dance breaks the fluent chain. The
primitive that fixes it: Option<Q> is itself a Query, where None
contributes nothing in any clause — must_not(None) excludes nothing, and(None)
is the identity. So every clause and combinator accepts an optional; you .map the
value into the handle:
query
.filter // skipped when None
.filter
.send.await?;
// Composes inside and/or too — a None branch just drops out:
let q = email.eq
.and; // None → just the email clause
A named filter_some(value, |v| …) sugar that drops the .map is an obvious
follow-on, left out of the first cut.
Filtering nested collections
orders is a nested array, and there are two independent things you might
filter — flusso keeps them separate:
- Filter by nested — choose which users come back, based on their orders.
The
any/allyou've already seen: it's aQuery, so it goes infilter/query/etc. A matching user still carries its wholeordersarray. - Filter of nested — shape the
ordersarray each user comes back with, without changing which users return. A separate clause,filter_nested.
They compose: use either alone, or both together (often with the same predicate).
filter_nested — shaping the returned array
let page = query
// filter BY: only users with a delivered order
.filter
// filter OF: and within each, keep only the delivered orders, newest first, ≤5
.filter_nested
.send.await?;
for hit in &page.hits
User::orders().matching(q) is a nested projection: q is a Query<Order>
built from Order's handles, plus optional .sort(Order::field().desc()),
.size(n), .from(n). matching itself is optional — drop it to keep every child
but still sort or cap the array.
Because filter_nested does not touch which parents match, a user with no
delivered orders still comes back — with orders: []. Pair it with a
filter(User::orders().any(…)) when you also want to drop those users.
filter_nested always replaces hit.source.<path> with the matched subset:
the client fetches the nested matches and substitutes them for that field before
deserializing.
Not yet built: a
.keep_source()opt-out that leaves the stored array intact and a typedhit.nested(handle)side accessor. Todayfilter_nestedalways replaces the array insource.
Depth
filter_nested shapes one nested level — orders. You can still match on deeper
nesting from inside the predicate
(Order::items().any(Item::quantity().gt(1))), and the returned orders honor it;
returning a filtered items array inside each returned order is a deeper
inner-hits case left to the raw hatch for now.
Results
get returns Option<T>; search returns SearchResponse<T>, where T is the
struct you wrote. There is no serde_json::Value in the common path.
Binding to the schema
The macro validates against the resolved mapping — flusso's
IndexMapping: every field with a
concrete type, whether it is nullable, its nested children, and the schema
hash.
flusso's schemas are self-describing: every leaf declares its type and
whether it's required, and joins/objects/aggregates have structural types — so the
mapping resolves with no database, exactly as flusso build does when it writes
flusso.lock. The client reuses that resolution. (See the
schema guide for the
schema format and the configuration guide
for how the index is written.)
The one input: the index name
You never point the macro at a file. You name the index, and the macro finds the schema:
At compile time the macro:
- Locates
flusso.tomlby walking up from the consuming crate'sCARGO_MANIFEST_DIR. (Override with#[flusso(index = "users", config = "…")]or theFLUSSO_CONFIGenv var — see the configuration guide.) - Selects the
[[index]]whosenamematches"users"— the reason an index name is required, since oneflusso.tomldefines several. - Loads that index's
schema:file (resolved relative toflusso.toml) and resolves theIndexMappingin-process — the same resolutionflusso buildperforms. Hermetic: no database, no network. - Tracks
flusso.tomland every schema file it read as build inputs (viainclude_bytes!), so editing the config or a schema retriggers compilation and a drifted struct fails the next build.
The resolved mapping's content hash is the binding's SCHEMA_HASH — the same
hash flusso build writes into flusso.lock and the engine folds into the
physical index name, so binding and index are provably the same schema version.
There is no build.rs, no generated .rs file to include!, and no committed
mapping artifact to keep in sync — the struct is the only file you maintain.
A nested or object struct names its path
A same-row object, a to-one join, or a nested join is its own struct, validated
against a dotted path into the same index — account, orders,
orders.items. It declares the same index so the macro resolves the same config,
then walks to that path's children:
These contribute field validation and their own field handles —
Order::status(), Order::total(), … — producing Query<Order> values you can
compose, store, and lift into a parent query (see Building a child
filter). Only the
root struct gets entry points (get/query) and SCHEMA_HASH.
What the derive expands to
#[derive(FlussoDocument)] on the root User emits (roughly):
// Each nested path has ONE handle namespace whose functions build a `Query` in
// that path's scope. When you wrote a struct for the path, that struct IS the
// namespace — its derive adds the handles, covering the full sub-schema (not just
// the fields it deserializes), producing `Query<Order>`. A `nested` array
// introduces its own scope: `Order`'s handles are tagged `<Order>` (the root and
// flattened objects stay `<Root>`); they must be lifted before joining a root query.
// For a nested path you DIDN'T give a struct, the root derive generates a
// handles-only namespace named `<Path>Fields`, so it's still queryable:
;
What the derive checks
For each field the struct declares, the macro resolves the matching schema field by
its document key — honoring #[serde(rename = "…")] and a container
#[serde(rename_all = "…")], so the struct's serde config and flusso's validation
agree — then checks three things:
| Check | Pass | Compile error |
|---|---|---|
| field exists | the doc key is in the schema | no field `totl` in index `users` (span on the field) |
| type matches | leaf Rust type matches the field's type |
email is `keyword` → expected `String`, found `i32` |
| nullability matches | Option<_> iff the field is nullable |
email is required → expected `String`, found `Option<String>` |
The rules that make this full control rather than a straitjacket:
- Partial projections are allowed. Leaving schema fields off your struct is fine — you only deserialize the subset you declare. Only the three checks above fail.
- Type matching is by leaf identifier +
Optionshape. The macro can't resolve arbitrary type aliases, so it compares the final path segment (String,i32,f64,OffsetDateTime, …) and theOption<_>wrapper against the type table. For anobjectfield it expects a struct, for anestedfield aVec<_>, and defers the inner field checks to that struct's ownFlussoDocumentderive. - Escape hatches. A field typed
serde_json::Valueopts out of type checking.#[flusso(skip)]drops a field from validation entirely — for a computed or app-only field not backed by the index (pair it with#[serde(skip)]or#[serde(default)]).
flusso types → Rust types
The type the derive expects for each schema type (the same bridge the
schema guide defines, with the Rust side added). Declare something
else and it won't compile (modulo the leaf-identifier rule above).
flusso type |
OpenSearch | Rust type | Field handle |
|---|---|---|---|
text |
text |
String |
Text |
identifier |
text |
String |
Text |
keyword |
keyword |
String (or a FlussoValue newtype) |
Keyword |
enum |
keyword |
String or a #[derive(FlussoValue)] enum |
Keyword |
uuid |
keyword |
String, or uuid::Uuid (uuid feature) |
Keyword |
boolean |
boolean |
bool |
Bool |
short |
short |
i16 |
Number<kind::Short> |
integer |
integer |
i32 |
Number<kind::Integer> |
long |
long |
i64 |
Number<kind::Long> |
float |
float |
f32 |
Number<kind::Float> |
double |
double |
f64 |
Number<kind::Double> |
decimal |
double |
Decimal (decimal feature) or f64 |
Number<kind::Decimal> |
date |
date |
time::Date (feature) |
Date |
timestamp |
date |
time::OffsetDateTime (feature) |
Date |
binary |
binary |
String (base64) |
Binary |
json |
object |
serde_json::Value |
Json |
geo_point |
geo_point |
GeoPoint ({ lat, lon }) |
Geo |
custom { opensearch } |
(given) | matching scalar, else serde_json::Value |
by OS type |
object |
object |
a struct | Object |
join belongs_to / has_one |
object |
Option< a struct > |
Object |
join has_many / many_to_many |
nested |
Vec< a struct > |
Nested<S, T> |
Decimals query without a cast. type: decimal maps to OpenSearch double
(lossy in storage) but gets a Number<kind::Decimal> handle, distinct from a
true double's Number<kind::Double> — so it accepts a Decimal or a
losslessly-widening integer (eq(5)), but not an f64 (that's lossy, a
compile error). To pass a rust_decimal::Decimal, enable the decimal
feature on flusso-query (it re-exports Decimal and implements
FlussoValue<kind::Decimal> for it); the document field can be Decimal or
f64 (both deserialize from the stored double). Query precision is f64-bound
(serde_json has no arbitrary precision) — fine for a double-stored column; if
you need exact querying, declare a custom scaled_float (also kind::Decimal)
and note this limit.
Dates are behind a feature so a caller picks time or chrono (or String
for raw ISO-8601); the derive accepts whichever leaf type the chosen feature settles
on.
Custom value types — #[derive(FlussoValue)]. A newtype inherits its inner
type's kinds automatically, so struct Money(Decimal) is a decimal value and
struct Sku(String) a keyword + text value — no kind tag needed, and each is
queryable and rejected exactly where the inner type would be. FlussoValue
requires serde::Serialize (a supertrait), so derive it too. An enum has no
inner type, so it needs an explicit string kind — #[flusso(keyword)] (default)
or #[flusso(text)]; numeric/date tags don't exist (use a newtype, which
inherits). Enum keyword fields stay typed — never #[flusso(skip)] them.
// Newtype: inherits Decimal's kinds — a `decimal`/`scaled_float` value.
;
// Enum: string-valued, so a kind tag is required (keyword default).
tier.eq; // term against "pro"
balance.gte; // a decimal value, no cast
uuid::Uuid is a keyword value behind the uuid feature — id / foreign-key
fields stay in the struct as Uuid (no #[flusso(skip)], no
Keyword::at("…")), and Customer::owner_id().eq(some_uuid) works without
.to_string().
Nullability is declared, not guessed
A field is T or Option<T>, and the derive checks it against the resolved
mapping. Nullability comes straight from the schema with no database round-trip: a
leaf states it with required, and joins, objects, and aggregates carry it
structurally. ResolvedField records the resulting nullable: bool; the derive
requires nullable: false → T, nullable: true → Option<T>.
| Field source | nullable |
Why |
|---|---|---|
root primary_key column |
false |
forced non-null — it backs the document id |
join primary_key field |
false |
forced non-null, just like the root key |
leaf column, required: true |
false |
declared non-null |
leaf column, required: false |
true |
nullable by default |
object |
false |
always assembled from the same row |
join belongs_to / has_one (object) |
true |
there may be no related row |
join has_many / many_to_many |
false |
a Vec, empty when there are none, never null |
aggregate count |
false |
a non-null long — zero rows is 0, not null |
aggregate avg |
true |
a nullable double — null over zero rows |
aggregate sum / min / max |
true |
null over zero rows; the result mirrors the column |
required is rejected by the schema on joins and aggregates precisely because
their nullability is structural — so there's nothing for the author to declare.
The escape hatch
Anything the typed builder can't express stays reachable, and still deserializes into the typed struct:
let page: = query
.raw
.send
.await?;
raw takes the OpenSearch query DSL verbatim — the pressure-release valve for the
few types with no flusso field (knn/vector, geo_shape, span and parent/child
queries) and for percolators — without dropping to an untyped client or losing
typed results. Most of what used to need it (function_score, script,
constant_score, query_string, search_after, …) is now in the typed surface.
Resolving the index name
The physical index carries the hash suffix (users_3f2a1b9c) — exactly what
the OpenSearch sink writes — and rotates on a structural schema change.
Because the binding is generated from the schema at compile time, the derive
knows that hash and emits it as a const: User::INDEX is the physical name, and
get/query use it. So User::query() addresses the right index directly, with
the hash hidden from the caller — no convenience alias required.
This is self-correcting: a structural schema change rotates the hash and changes
the resolved mapping, so the next cargo build regenerates the binding against the
new physical index. (User::INDEX and User::SCHEMA_HASH are exposed for logging,
admin, or a hand-built Search.)
The convenience alias (
users→ current generation) is still worthwhile for clients that don't recompile against the schema — dynamic/scripting use, dashboards. For a derived binding it's unnecessary: the compile-time hash is the stable name.
Reading a prefixed deployment
If flusso runs with an index prefix (FLUSSO_INDEX_PREFIX,
e.g. dev_ so it writes dev_users_<hash>), tell the client the same prefix:
let client = connect?
.index_prefix;
The prefix is applied at runtime, on the transport — the derive still bakes the
unprefixed User::INDEX/SCHEMA_HASH, and the client prepends the prefix to every
request path. So one compiled consumer binary serves every environment: point it
at dev or staging by setting FLUSSO_INDEX_PREFIX, no rebuild. It must match the
writer's prefix exactly, or queries hit an empty (or wrong) index.
Out of scope for the first cut
- Search aggregations (facets, histograms, cardinality). The typed surface is
filter/query/sort + typed hits first; aggregations need their own typed result
tree, and the
rawhatch covers them in the meantime. - Writes. flusso owns the index; the client never upserts or deletes.
- Correlating hits across indexes. Both multi-index shapes ship — one blended
result list and independent searches
via
_msearch— but correlating hits across indexes remains the caller's job. - Scroll pagination.
from/sizeandsearch_after(deep pagination) ship; a scroll cursor is a follow-on. - Generating the document struct. By design — the developer owns the struct.
Where this lands in the workspace
| Crate | Role |
|---|---|
flusso-query |
Runtime: the Client transport, the field-handle/Query/Search builder, SearchResponse. Generic over the developer's document types. Targets OpenSearch / Elasticsearch (shared DSL). Re-exports the derive behind a derive feature, so callers use flusso_query::FlussoDocument. |
flusso-query-derive |
The #[derive(FlussoDocument)] proc-macro crate. At compile time it discovers flusso.toml, resolves the named index's IndexMapping from the self-describing schema (no database), validates the annotated struct, and emits the field handles, entry points, and schema hash. Reuses schema-config-toml, schema-index-yaml, and schema-core. |
Both crates sit above schema-core and depend only downward — no dependency on the
engine, the sources, or the sinks. They share only the domain model, the one thing
the read and write sides must agree on.