One field of a document: a name, optional OpenSearch mapping options passed
through to the index, and a source saying where its value
comes from. A leaf field’s type is declared on its source (a Column’s ty, an
Aggregate’s value_type) so the document shape is known without a database.
A geographic point built from two same-row columns. Resolves to an
OpenSearch geo_point; the document carries { "lat": …, "lon": … }, or
SQL NULL when either column is null (so a nullable point is absent rather
than {lat: null, lon: null}, which OpenSearch would reject).
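
The all-or-nothing rule for the point can be sketched in a few lines. This is a minimal illustration, not the crate's implementation: the `GeoPoint` struct and the `geo_point` function are hypothetical names, and the two columns are modelled as `Option<f64>`.

```rust
/// Hypothetical in-memory form of the `{ "lat": …, "lon": … }` value.
#[derive(Debug, PartialEq)]
struct GeoPoint {
    lat: f64,
    lon: f64,
}

/// Builds a point only when both same-row columns are non-null.
/// A null in either column yields `None`, so the field is absent
/// instead of carrying a half-filled object.
fn geo_point(lat: Option<f64>, lon: Option<f64>) -> Option<GeoPoint> {
    match (lat, lon) {
        (Some(lat), Some(lon)) => Some(GeoPoint { lat, lon }),
        _ => None,
    }
}
```

Returning `Option<GeoPoint>` rather than a struct of two optional numbers makes the invalid half-null state unrepresentable.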
Folds rows from a related table into the document. The kind
names the relationship — which side carries the key, and whether one row or
many fold in; filters, order_by, and limit narrow and shape the rows
that come back.
Per-backend configuration for an OpenSearch destination. The Sink enum that
selects between this and StdoutSink is a composition concern and lives in
the schema crate; the backend sinks read these settings directly.
One field within an IndexMapping: the document key it lands under, its
resolved Mapping (the mapping_type is always present), whether the
value can be null, and the fields nested under it for object / nested
types.
How an aggregate’s related rows tie back to the parent: a direct FK on the
aggregated table, or a junction table. (Joins carry their key inside
JoinKind; an aggregate always spans many rows, so belongs_to has no
aggregate counterpart.)
How the source connection is specified: a full URL (literal or from env) or
the parts to assemble one. Resolution happens at runtime, so a configured
value can be overridden by SOURCE_URL_VAR in the running environment.
What the pipeline does when a sink rejects a document at the item level —
it accepted the batch but refused a specific document (a mapping conflict, a
malformed value). Distinct from a flush-wide failure, which always stops the
run. Set globally on the config and overridable per index (both live in the
schema crate’s Config/Index, which assemble this policy).
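
A rough sketch of how a global policy with a per-index override might be combined and applied. The enum name `ItemErrorPolicy`, its variants `Skip` and `Stop`, and both functions are assumptions made for illustration; the source does not name the actual variants.

```rust
/// Hypothetical policy variants; the real names are not given in the docs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ItemErrorPolicy {
    /// Drop the rejected document and keep going.
    Skip,
    /// Abort the run on the first rejected document.
    Stop,
}

/// The per-index setting, when present, takes precedence over the global one.
fn effective_policy(
    global: ItemErrorPolicy,
    per_index: Option<ItemErrorPolicy>,
) -> ItemErrorPolicy {
    per_index.unwrap_or(global)
}

/// Applies the policy to the ids of documents the sink refused.
/// Returns how many documents were skipped, or an error naming the
/// first rejected document when the policy is `Stop`.
fn apply_policy(policy: ItemErrorPolicy, rejected: &[&str]) -> Result<usize, String> {
    match policy {
        ItemErrorPolicy::Skip => Ok(rejected.len()),
        ItemErrorPolicy::Stop => match rejected.first() {
            Some(id) => Err(format!("sink rejected document {}", id)),
            None => Ok(0),
        },
    }
}
```

A flush-wide failure would not pass through this path at all, since it stops the run regardless of policy.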
Where a field’s value comes from. The shapes are mutually exclusive — a field
is exactly one of them — which is why this is an enum rather than a bag of
optional column / relation / fields that can contradict each other.
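
To illustrate why an enum rules out contradictory configurations, here is a simplified sketch. The variant names and their payloads are assumptions; the real enum has more shapes and richer payloads than shown.

```rust
/// Simplified, hypothetical stand-in for the field source enum.
/// A value is exactly one variant, so "both a column and a relation"
/// cannot be expressed.
#[derive(Debug)]
enum FieldSource {
    Column { column: String, ty: String },
    Relation { relation: String },
    Object { fields: Vec<String> },
}

/// Every consumer handles each shape once; the compiler rejects a
/// `match` that forgets a variant.
fn describe(source: &FieldSource) -> String {
    match source {
        FieldSource::Column { column, ty } => format!("column {} ({})", column, ty),
        FieldSource::Relation { relation } => format!("relation {}", relation),
        FieldSource::Object { fields } => format!("object with {} fields", fields.len()),
    }
}
```

With a struct of three optional members, the same code would need a runtime check for zero or several members being set.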
A condition on which rows a join or aggregate sees. Either a structured
comparison (NullCheckFilter, ValueOpFilter) or a RawFilter of
verbatim SQL for cases the structured forms don’t cover.
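
The split between structured and raw filters could look roughly like the following. The field names (`column`, `is_null`, `op`, `value`) and the helper functions are hypothetical, and values are pasted into the SQL text only to keep the sketch short; real code would presumably bind them as parameters.

```rust
/// Hypothetical shape of the filter enum, one variant per form.
enum Filter {
    NullCheck { column: String, is_null: bool },
    ValueOp { column: String, op: String, value: String },
    /// Verbatim SQL, emitted as written.
    Raw(String),
}

/// Renders one filter as a SQL condition.
fn to_sql(filter: &Filter) -> String {
    match filter {
        Filter::NullCheck { column, is_null: true } => format!("{} IS NULL", column),
        Filter::NullCheck { column, is_null: false } => format!("{} IS NOT NULL", column),
        Filter::ValueOp { column, op, value } => format!("{} {} {}", column, op, value),
        // Parenthesised so an `OR` inside raw SQL cannot change the
        // meaning of the surrounding `AND` chain.
        Filter::Raw(sql) => format!("({})", sql),
    }
}

/// Joins all filters into one conjunction.
fn where_clause(filters: &[Filter]) -> String {
    filters.iter().map(to_sql).collect::<Vec<_>>().join(" AND ")
}
```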
A relation’s key, viewed uniformly across joins and aggregates — the three
physical shapes a “these tables connect” fact can take. Traversal code
(document SQL, reverse resolution) matches on this instead of caring whether
the relation is a join or an aggregate.
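
As a loose illustration of matching on the key shape alone, the sketch below assumes the three shapes are "key on the parent row", "key on the related row", and "junction table". Those variant names, their fields, and the `extra_tables` helper are guesses for the sake of the example, not the crate's definitions.

```rust
/// Hypothetical rendering of the three physical key shapes.
enum RelationKey {
    /// The parent row holds the foreign key.
    OnParent { column: String },
    /// The related row holds the foreign key back to the parent.
    OnRelated { column: String },
    /// A separate table links parent and related rows.
    Junction { table: String },
}

/// Lists tables, beyond the parent and the related table, that a
/// traversal has to touch. Only a junction adds one. The caller never
/// needs to know whether the key came from a join or an aggregate.
fn extra_tables(key: &RelationKey) -> Vec<&str> {
    match key {
        RelationKey::OnParent { .. } | RelationKey::OnRelated { .. } => Vec::new(),
        RelationKey::Junction { table } => vec![table.as_str()],
    }
}
```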
A value resolved at runtime: either a literal baked into the config or a
reference to an environment variable read when the pipeline runs. Deferring
resolution is what lets a compiled config travel without its secrets — a
literal is carried as-is, an Env reference carries only the
variable name, and the real value is read in the environment that runs it.
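
A minimal sketch of deferred resolution, under stated assumptions: the type name `RuntimeValue` and the `resolve` method are hypothetical, and the environment is passed in as a lookup closure so the example stays deterministic. In a running pipeline the closure would typically wrap `std::env::var`.

```rust
/// Hypothetical stand-in for a runtime-resolved value.
#[derive(Debug, Clone, PartialEq)]
enum RuntimeValue {
    /// Carried in the config as-is.
    Literal(String),
    /// Only the variable name is stored; the secret never enters the config.
    Env(String),
}

impl RuntimeValue {
    /// Resolves the value against an environment lookup.
    /// Returns `None` when an `Env` reference names an unset variable.
    fn resolve<F>(&self, lookup: F) -> Option<String>
    where
        F: Fn(&str) -> Option<String>,
    {
        match self {
            RuntimeValue::Literal(value) => Some(value.clone()),
            RuntimeValue::Env(name) => lookup(name),
        }
    }
}
```

For real use, a call such as `value.resolve(|name| std::env::var(name).ok())` reads the process environment at the moment the pipeline runs.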
Tells the engine to treat a row as deleted rather than present, keyed off a
mapped field or a raw column. The optional when narrows it to rows matching a
set of filters.
The reserved environment variable that supplies / overrides the source
connection URL. The source is a singleton, so one well-known name (the
12-factor convention) is unambiguous.
Resolve a required sink value, with reserved as the deployment override
variable. Same precedence as resolve_connection_url: an explicit Env
reference wins; otherwise reserved overrides the literal; otherwise the
literal.
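
The three-step precedence can be sketched as follows. This is an illustration of the stated rule only: `RuntimeValue`, `resolve_required`, the error type, and the closure-based environment lookup are all assumptions, and the real function's signature is not shown in the source.

```rust
/// Hypothetical stand-in for a configured value.
enum RuntimeValue {
    Literal(String),
    Env(String),
}

/// Resolves a required value with `reserved` as the override variable.
/// 1. An explicit `Env` reference wins and must be set.
/// 2. Otherwise the `reserved` variable, if set, overrides the literal.
/// 3. Otherwise the literal is used.
fn resolve_required<F>(
    value: &RuntimeValue,
    reserved: &str,
    lookup: F,
) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    match value {
        RuntimeValue::Env(name) => lookup(name)
            .ok_or_else(|| format!("environment variable {} is not set", name)),
        RuntimeValue::Literal(literal) => {
            Ok(lookup(reserved).unwrap_or_else(|| literal.clone()))
        }
    }
}
```

Note that an explicit `Env` reference is not affected by the reserved variable at all, which keeps a deliberate per-value choice from being silently replaced by a deployment-wide override.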