Crate schema

§flusso-schema

The front door to the configuration layer: load a flusso configuration into a validated model.

load takes the path to a flusso.toml, reads the source and sinks from it, resolves and parses every *.schema.yml the file references, and hands back a single Config.

The format-specific crates (schema-config-toml, schema-index-yaml) and the core model (schema-core) sit underneath. Downstream code depends only on this crate and reaches the core types through its re-exports.

§Example

let config = schema::load("flusso.toml")?;

for (name, index) in &config.indexes {
    println!("{name}: table {} ({} fields)", index.schema.table, index.schema.fields.len());
}

§2-schema — config & schema loading

This group turns config files into the validated Config.

| Crate | Path | Role |
| --- | --- | --- |
| schema | . | The front door: load() reads a flusso.toml + its *.schema.yml files into one validated Config, and re-exports the schema-core vocabulary. |
| schema-config-toml | 1-config-toml | Parses flusso.toml → neutral entities. |
| schema-index-yaml | 1-index-yaml | Parses *.schema.yml → core types. |

Each parser crate works in two stages: parse (serde → neutral entities), then convert (entities → model). The flusso.toml → Config conversion lives in schema itself, as a composition step, so the toml parser stays free of Config.
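
The two-stage split can be sketched roughly as follows. This is a minimal illustration, not the crates' actual code: RawIndexEntry, IndexRef, ConvertError, and the `name = path` line format are all hypothetical, and a hand-written line parser stands in for serde so the sketch needs only the standard library.

```rust
use std::collections::BTreeMap;

/// Stage-one output: a neutral entity that mirrors the file, unvalidated.
/// (Hypothetical; the real parser crates produce these with serde.)
#[derive(Debug, PartialEq)]
struct RawIndexEntry {
    name: String,
    schema: String,
}

/// Stage-two output: a validated model type. (Hypothetical stand-in.)
#[derive(Debug, PartialEq)]
struct IndexRef {
    name: String,
    schema_path: String,
}

#[derive(Debug, PartialEq)]
enum ConvertError {
    EmptyName,
    NotASchemaFile(String),
}

/// Stage 1: parse. Reads `name = path` lines into neutral entities and
/// checks syntax only; lines without `=` are skipped.
fn parse(text: &str) -> Vec<RawIndexEntry> {
    text.lines()
        .filter_map(|line| line.split_once('='))
        .map(|(name, schema)| RawIndexEntry {
            name: name.trim().to_string(),
            schema: schema.trim().to_string(),
        })
        .collect()
}

/// Stage 2: convert. Validates each entity and builds the model.
fn convert(raw: Vec<RawIndexEntry>) -> Result<BTreeMap<String, IndexRef>, ConvertError> {
    let mut out = BTreeMap::new();
    for entry in raw {
        if entry.name.is_empty() {
            return Err(ConvertError::EmptyName);
        }
        if !entry.schema.ends_with(".schema.yml") {
            return Err(ConvertError::NotASchemaFile(entry.schema));
        }
        out.insert(
            entry.name.clone(),
            IndexRef { name: entry.name, schema_path: entry.schema },
        );
    }
    Ok(out)
}
```

Keeping validation out of `parse` means a syntax error and a semantic error surface from different stages, which is presumably part of why the neutral entities exist.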

Part of the flusso library crates.

Modules§

common
config
traits

Structs§

Aggregate
Reduces rows from a related table to a single value — a count, sum, or extreme. The key connects the tables; filters restrict which rows count.
Column
A column-backed field: the column to read, its declared type and nullability, the transforms to apply, and a default to coalesce nulls to.
ColumnName
Compiled
A compiled configuration: the validated Config plus the provenance needed to read it safely.
Config
A whole deployment: where data comes from, where it goes, and what to build.
ConnectionUrl
ConnectionUrlFromPartsBuilder
Use builder syntax to set the inputs and finish with call().
ContentHash
DatabaseSchema
Field
One field of a document: a name, optional OpenSearch mapping options passed through to the index, and a source saying where its value comes from. A leaf field’s type is declared on its source (a Column’s ty, an Aggregate’s value_type) so the document shape is known without a database.
FieldName
Geo
A geographic point built from two same-row columns. Resolves to an OpenSearch geo_point; the document carries { "lat": …, "lon": … }, or SQL NULL when either column is null (so a nullable point is absent rather than {lat: null, lon: null}, which OpenSearch would reject).
HttpUrl
Index
One index in a Config, paired with whether it is built on this run.
IndexMapping
A fully-resolved mapping for one index: every field typed and ready for a sink to translate into its native mapping format.
IndexName
IndexSchema
The shape of a single search document: a root table and the fields built from its columns and related tables.
Join
Folds rows from a related table into the document. The kind names the relationship — which side carries the key, and whether one row or many fold in; filters, order_by, and limit narrow and shape the rows that come back.
Mapping
OpenSearch mapping. mapping_type is required; all other properties are passed through as-is.
NullCheckFilter
is_null / is_not_null — no value operand.
OpensearchSink
Per-backend configuration for an OpenSearch destination. The Sink enum that selects between this and StdoutSink is a composition concern and lives in the schema crate; the backend sinks read these settings directly.
OrderBy
RawFilter
RawFilterValue
ResolvedField
One field within an IndexMapping: the document key it lands under, its resolved Mapping (the mapping_type is always present), whether the value can be null, and the fields nested under it for object / nested types.
SchemaPath
ServerConfig
Bind addresses for the two operational HTTP surfaces, as configured in flusso.toml’s [server] table. Parsed and validated at config-read time (see schema_config_toml’s BindAddress), so these are real socket addresses by the time they reach the binary. The binary then layers FLUSSO_* env vars and CLI flags on top, and those overrides win.
SinkName
SoftDeleteColumn
SoftDeleteField
Source
The database that documents are read from. Today that’s always Postgres.
StdoutSink
TableName
Through
A junction table linking two sides of a many-to-many relation.
ValueOpFilter
All other filter ops — value operand matches the operator’s arity.
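
As an illustration of the null rule described under Geo, here is a minimal sketch. GeoPoint, geo_point, and to_json are hypothetical names, and the page suggests the real rule is applied in SQL rather than in Rust; the sketch only shows the intended outcome, where a point with either half missing collapses to a single null.

```rust
/// A resolved point with both coordinates present. (Hypothetical type.)
#[derive(Debug, PartialEq, Clone, Copy)]
struct GeoPoint {
    lat: f64,
    lon: f64,
}

/// Combine two nullable same-row columns: the point exists only when
/// both columns are non-null.
fn geo_point(lat: Option<f64>, lon: Option<f64>) -> Option<GeoPoint> {
    match (lat, lon) {
        (Some(lat), Some(lon)) => Some(GeoPoint { lat, lon }),
        _ => None,
    }
}

/// Render the JSON fragment a document would carry: an object with both
/// keys, or a bare null, never an object with null members.
fn to_json(point: Option<GeoPoint>) -> String {
    match point {
        Some(p) => format!("{{\"lat\":{},\"lon\":{}}}", p.lat, p.lon),
        None => "null".to_string(),
    }
}
```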

Enums§

AggregateKey
How an aggregate’s related rows tie back to the parent — a direct FK on the aggregated table, or a junction table. (Joins carry their key inside JoinKind; aggregates are inherently over-many, so belongs_to has no aggregate counterpart.)
AggregateOp
ColumnNameError
CompileError
ConnectionSpec
How the source connection is specified: a full URL (literal or from env) or the parts to assemble one. Resolution happens at runtime, so a configured value can be overridden by SOURCE_URL_VAR in the running environment.
ConnectionUrlError
DatabaseSchemaError
Direction
FailurePolicy
What the pipeline does when a sink rejects a document at the item level — it accepted the batch but refused a specific document (a mapping conflict, a malformed value). Distinct from a flush-wide failure, which always stops the run. Set globally on the config and overridable per index (both live in the schema crate’s Config/Index, which assemble this policy).
FieldNameError
FieldSource
Where a field’s value comes from. The shapes are mutually exclusive — a field is exactly one of them — which is why this is an enum rather than a bag of optional column / relation / fields that can contradict each other.
Filter
A condition on which rows a join or aggregate sees. Either a structured comparison (NullCheckFilter, ValueOpFilter) or a RawFilter of verbatim SQL for cases the structured forms don’t cover.
FilterOp
FilterValue
Typed filter value whose arity matches the operator: In/NotIn → List, Between → Range, everything else → Single.
FlussoType
The declared type of a leaf field — the single vocabulary that bridges a Postgres column type and an OpenSearch mapping type.
GenericValue
The canonical value vocabulary every layer trades in — the middle type between a source and a sink.
HttpUrlError
IndexNameError
JoinKind
The relationship a join expresses. The verb carries both the cardinality and — the part that matters — which table holds the key:
LoadError
MappingType
NullOp
RawFilterValueError
Relation
How a field draws on a related table: either folding its rows in as nested documents (Join) or reducing them to a single value (Aggregate).
RelationKey
A relation’s key, viewed uniformly across joins and aggregates — the three physical shapes a “these tables connect” fact can take. Traversal code (document SQL, reverse resolution) matches on this instead of caring whether the relation is a join or an aggregate.
ResolveError
Scheme
Secret
A value resolved at runtime: either a literal baked into the config or a reference to an environment variable read when the pipeline runs. Deferring resolution is what lets a compiled config travel without its secrets — a literal is carried as-is, an Env reference carries only the variable name, and the real value is read in the environment that runs it.
Sink
A destination for built documents: an OpenSearch cluster, or stdout for inspecting output during development.
SinkNameError
SoftDelete
Tells the engine to treat a row as deleted rather than present, keyed off a mapped field or a raw column. The optional when narrows it to rows matching a set of filters.
SourceType
TableNameError
TextAnalysis
Which analyzer toolkit the sink wires its flusso_* analyzers onto.
Transform
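
The operator/arity pairing described under FilterValue could be checked along these lines. This is a sketch under assumptions: Op, Value, Arity, and check are hypothetical names, the operator set is a small invented subset (the variants of FilterOp are not listed on this page), and the summary is read as In/NotIn taking a list, Between taking a range, and every other operator taking a single value.

```rust
/// A small, hypothetical subset of filter operators.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Eq,
    Gt,
    In,
    NotIn,
    Between,
}

/// The three operand shapes. (Hypothetical stand-in for FilterValue.)
#[derive(Debug, PartialEq)]
enum Value {
    Single(String),
    List(Vec<String>),
    Range(String, String),
}

#[derive(Debug, PartialEq)]
enum Arity {
    Single,
    List,
    Range,
}

/// The operand shape each operator requires.
fn expected_arity(op: Op) -> Arity {
    match op {
        Op::In | Op::NotIn => Arity::List,
        Op::Between => Arity::Range,
        Op::Eq | Op::Gt => Arity::Single,
    }
}

/// Reject an operand whose shape does not match its operator.
fn check(op: Op, value: &Value) -> Result<(), String> {
    let actual = match value {
        Value::Single(_) => Arity::Single,
        Value::List(_) => Arity::List,
        Value::Range(_, _) => Arity::Range,
    };
    let expected = expected_arity(op);
    if actual == expected {
        Ok(())
    } else {
        Err(format!("{op:?} expects {expected:?}, got {actual:?}"))
    }
}
```

Splitting null checks into their own type (NullCheckFilter) fits the same idea: an operator with no operand never has to carry a placeholder value.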

Constants§

CONFIG_SCHEMA
The JSON Schema describing a flusso.toml config file, embedded from this crate’s schemas/ directory for editor assist and programmatic access (both re-exported from schema and emitted by flusso schema config). Kept in lockstep with this parser by schema’s schema_drift test.
FORMAT_VERSION
The artifact format version. Bumped on any incompatible change to the serialized shape so a binary refuses an artifact it can’t read, rather than misinterpreting it.
INDEX_SCHEMA
The JSON Schema (authored as YAML) describing a *.schema.yml index file, embedded from this crate’s schemas/ directory for editor assist and programmatic access (both re-exported from schema and emitted by flusso schema index). Kept in lockstep with this parser by schema’s schema_drift test.
SOURCE_URL_VAR
The reserved environment variable that supplies / overrides the source connection URL. The source is a singleton, so one well-known name (the 12-factor convention) is unambiguous.
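
The refusal behaviour described for FORMAT_VERSION might look roughly like the sketch below. Everything here is an assumption made for illustration: the value 1, the DecodeError type, and the 4-byte little-endian header. The real artifact is MessagePack, and this page does not show how the version is stored inside it.

```rust
/// Hypothetical value; the real constant's value is not shown on this page.
const FORMAT_VERSION: u32 = 1;

#[derive(Debug, PartialEq)]
enum DecodeError {
    Truncated,
    VersionMismatch { found: u32, expected: u32 },
}

/// Prefix the payload with the format version (4 bytes, little-endian).
fn encode(payload: &[u8]) -> Vec<u8> {
    let mut out = FORMAT_VERSION.to_le_bytes().to_vec();
    out.extend_from_slice(payload);
    out
}

/// Check the version before touching the payload, so an artifact written
/// by an incompatible build is refused instead of being misread.
fn decode(bytes: &[u8]) -> Result<&[u8], DecodeError> {
    if bytes.len() < 4 {
        return Err(DecodeError::Truncated);
    }
    let (head, payload) = bytes.split_at(4);
    let found = u32::from_le_bytes([head[0], head[1], head[2], head[3]]);
    if found != FORMAT_VERSION {
        return Err(DecodeError::VersionMismatch {
            found,
            expected: FORMAT_VERSION,
        });
    }
    Ok(payload)
}
```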

Traits§

ParseFrom

Functions§

compile
Compile a flusso.toml (and the schemas it references) into a Compiled envelope. Needs neither a database nor any secret to be set — schemas are self-describing and secrets are deferred.
from_bytes
Decode a Compiled envelope from MessagePack bytes, checking the format version.
http_url
Build an HttpUrl from a resolved string, mapping the validation error into a ResolveError. Used by sink resolution.
load
Loads a full Config from a TOML config file at config_path.
load_compiled
Read a compiled artifact from path and return its Config.
resolve_connection_url
Resolve the source connection URL, with SOURCE_URL_VAR as the deployment override. Precedence, highest first:
resolve_optional
Resolve an optional sink value. Same precedence as resolve_required, plus: when the config omits it, reserved fills it if set, otherwise None.
resolve_required
Resolve a required sink value, with reserved as the deployment override variable. Same precedence as resolve_connection_url: an explicit Env reference wins; otherwise reserved overrides the literal; otherwise the literal.
sink_var_prefix
The per-sink reserved-variable prefix: the sink’s name, uppercased, so several OpenSearch sinks never collide (<NAME>_OPENSEARCH_URL, …).
to_bytes
Serialize a Compiled envelope to its MessagePack bytes.
validate_index_prefix
Check that prefix is a legal leading fragment of an OpenSearch index name.
write
Write a Compiled envelope to path as MessagePack.
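
The precedence described for resolve_required and resolve_optional, together with the sink_var_prefix naming, can be sketched as follows. The signatures are assumed, not taken from the crate: the real functions presumably read the process environment, while this version takes a map so the precedence can be exercised directly. The Secret and ResolveError shapes are simplified stand-ins, and treating an unset explicit Env reference as an error is also an assumption.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the crate's Secret: a literal or an env reference.
enum Secret {
    Literal(String),
    Env(String),
}

/// Simplified stand-in for ResolveError.
#[derive(Debug, PartialEq)]
enum ResolveError {
    MissingEnv(String),
}

/// The environment, passed explicitly for testability (an assumption).
type Env = HashMap<String, String>;

/// Per-sink prefix: the sink's name, uppercased.
fn sink_var_prefix(sink_name: &str) -> String {
    sink_name.to_uppercase()
}

/// Precedence, highest first: an explicit Env reference, then the
/// reserved variable, then the literal from the config.
fn resolve_required(value: &Secret, reserved: &str, env: &Env) -> Result<String, ResolveError> {
    match value {
        Secret::Env(var) => env
            .get(var)
            .cloned()
            .ok_or_else(|| ResolveError::MissingEnv(var.clone())),
        Secret::Literal(lit) => Ok(env.get(reserved).cloned().unwrap_or_else(|| lit.clone())),
    }
}

/// Same precedence; when the config omits the value, the reserved
/// variable fills it if set, otherwise the result is None.
fn resolve_optional(
    value: Option<&Secret>,
    reserved: &str,
    env: &Env,
) -> Result<Option<String>, ResolveError> {
    match value {
        Some(secret) => resolve_required(secret, reserved, env).map(Some),
        None => Ok(env.get(reserved).cloned()),
    }
}
```

For a sink named `search`, the reserved variable under this sketch would be `SEARCH_OPENSEARCH_URL`: setting it overrides a literal URL in the config, but not an explicit Env reference to some other variable.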