Module schema


Schema classification: recognize the structure of a parsed event.

Real-world streams mix log schemas: one feed can carry ECS-normalized events, raw (rendered) Windows Event Log, flat Sysmon JSON, CEF, OCSF, or vendor-specific shapes. The wire format is often JSON in every case, with only the field names differing. This module therefore recognizes which schema a parsed event belongs to from its content (marker fields and values), not from the input format, so it works regardless of how the event arrived.

Classification is declarative: each SchemaSignature is a set of SchemaPredicates that must all hold (logical AND). The SchemaClassifier returns the highest-specificity signature that matches, breaking ties by name for determinism. Returning None means the event matched no signature (“unknown”), which is the actionable signal for surfacing unsupported schemas.
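The selection rule described above can be sketched in a few lines. This is a minimal, standalone illustration and not this module's API: the `Event` alias, the `Predicate`, `Signature`, `classify`, and `event_from` names, the two predicate forms shown, and the choice that the lexicographically smaller name wins a tie are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical flat event: dotted field path -> string value.
type Event = HashMap<String, String>;

/// Hypothetical predicate forms; the real `SchemaPredicate` may differ.
enum Predicate {
    FieldPresent(String),
    Equals { field: String, value: String },
}

impl Predicate {
    fn holds(&self, event: &Event) -> bool {
        match self {
            Predicate::FieldPresent(field) => event.contains_key(field),
            Predicate::Equals { field, value } => event.get(field) == Some(value),
        }
    }
}

/// Hypothetical signature: a name, a specificity, and predicates joined by AND.
struct Signature {
    name: String,
    specificity: u32,
    predicates: Vec<Predicate>,
}

impl Signature {
    /// Every predicate must hold (logical AND).
    fn matches(&self, event: &Event) -> bool {
        self.predicates.iter().all(|p| p.holds(event))
    }
}

/// Returns the name and specificity of the best matching signature,
/// or `None` when nothing matches ("unknown").
fn classify<'a>(signatures: &'a [Signature], event: &Event) -> Option<(&'a str, u32)> {
    signatures
        .iter()
        .filter(|s| s.matches(event))
        // Highest specificity wins; on a tie the smaller name wins
        // (assumed tie direction), so the result is deterministic.
        .max_by(|a, b| {
            a.specificity
                .cmp(&b.specificity)
                .then_with(|| b.name.cmp(&a.name))
        })
        .map(|s| (s.name.as_str(), s.specificity))
}

/// Small helper for building an example event from string pairs.
fn event_from(pairs: &[(&str, &str)]) -> Event {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

In this sketch a signature with an empty predicate list matches every event, which is one way a low-specificity fallback could be modeled.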

Built-in signatures cover ecs, ocsf, windows_eventlog, sysmon, cef, and a low-specificity generic_json fallback for structured events that match no specific security schema. Users extend the set with their own signatures loaded from YAML (see parse_schema_signatures).

Detection-side only: this recognizes events so callers can route them to the right field-mapping pipeline. It does not collect, transport, or normalize events.

Structs

FieldValueConfig
A { field: ..., value: ... } pair used by the equals and matches predicate forms.
RoutingConfig
The routing: section of a schema config file.
RoutingPlan
A resolved routing plan: the deduplicated pipeline-sets to build one engine each, plus the schema-to-set mapping and the unknown-handling policy.
SchemaBinding
A schema -> pipelines binding: events recognized as schema are evaluated against the engine built from pipelines.
SchemaClassifier
Recognizes the schema of parsed events from a set of signatures.
SchemaCountEntry
One per-schema counter as exposed via SchemaObserver::snapshot.
SchemaMatch
The result of classifying an event: the matched schema name and the specificity of the signature that matched.
SchemaObservation
Immutable snapshot of a SchemaObserver at one moment.
SchemaObserver
Opt-in counter that classifies each observed event and tallies per-schema (and unknown) totals. Mirrors the design of FieldObserver: shared behind an Arc, cheap repeated snapshots, monotonic lifetime counters for a Prometheus bridge. The schema set is small and bounded, so there is no key cap.
SchemaPredicateConfig
A predicate as written in YAML: a single-key map, for example field_present: ecs.version or equals: { field: type, value: alert }. Exactly one form must be set per list entry.
SchemaSignature
A named schema recognizer: every predicate must hold for the signature to match. Higher specificity wins when several signatures match the same event. Multiple signatures may share a name (for example several distinct ways to recognize Sysmon); the classifier reports the name.
SchemaSignatureConfig
A signature as written in YAML.
SchemaSignaturesFile
Top-level YAML document holding a schemas: list and an optional routing: section.
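The observer pattern described for SchemaObserver (shared behind an Arc, monotonic counters, cheap snapshots) could look roughly like the following. This is a sketch under stated assumptions, not the crate's implementation: `Observer`, `Observation`, `record`, `snapshot`, and `shared_observer` are hypothetical names, and the sketch takes an already-classified schema name instead of classifying the event itself.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical observer: per-schema totals plus a separate unknown counter.
/// Counters only ever increase, so they can feed monotonic metrics.
#[derive(Default)]
struct Observer {
    per_schema: Mutex<BTreeMap<String, u64>>,
    unknown: AtomicU64,
}

/// Hypothetical immutable snapshot of the counters at one moment.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Observation {
    /// (schema name, count), sorted by name because the source is a BTreeMap.
    per_schema: Vec<(String, u64)>,
    unknown: u64,
}

impl Observer {
    /// Record one classification result; `None` stands for "unknown".
    fn record(&self, schema: Option<&str>) {
        match schema {
            Some(name) => {
                let mut map = self.per_schema.lock().unwrap();
                *map.entry(name.to_string()).or_insert(0) += 1;
            }
            None => {
                self.unknown.fetch_add(1, Ordering::Relaxed);
            }
        }
    }

    /// Copy the current totals out. The map is small and bounded by the
    /// number of schema names, so the copy is cheap.
    fn snapshot(&self) -> Observation {
        let map = self.per_schema.lock().unwrap();
        Observation {
            per_schema: map.iter().map(|(k, v)| (k.clone(), *v)).collect(),
            unknown: self.unknown.load(Ordering::Relaxed),
        }
    }
}

/// Convenience constructor for sharing one observer across threads.
fn shared_observer() -> Arc<Observer> {
    Arc::new(Observer::default())
}
```

A mutex-guarded map keeps the sketch short; an implementation with a fixed schema set might prefer one atomic per schema to avoid locking on the hot path.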

Enums

OnUnknown
What to do with an event whose schema matched no signature.
RouteDecision
The decision for one event, produced by RoutingPlan::decide.
SchemaError
Errors raised while loading user schema signatures.
SchemaPredicate
A single condition over a parsed event used to recognize a schema.
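To illustrate how a routing plan might deduplicate pipeline-sets and decide per event, here is a minimal standalone sketch. It does not reproduce RoutingPlan, RouteDecision, or OnUnknown: the `Plan`, `Decision`, and `UnknownPolicy` types, their variants, and the `resolve` and `decide` signatures are hypothetical, and treating a recognized but unbound schema like an unknown one is an assumption of the example.

```rust
use std::collections::HashMap;

/// Hypothetical policy for events whose schema is unknown.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnknownPolicy {
    Skip,
    UseSet(usize),
}

/// Hypothetical per-event decision: which engine to use, or skip.
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Engine(usize),
    Skip,
}

/// Hypothetical resolved plan.
struct Plan {
    /// Deduplicated pipeline-sets; one engine would be built per entry.
    pipeline_sets: Vec<Vec<String>>,
    /// Schema name -> index into `pipeline_sets`.
    schema_to_set: HashMap<String, usize>,
    on_unknown: UnknownPolicy,
}

impl Plan {
    /// Build a plan from (schema, pipelines) bindings. Bindings that list
    /// the same pipelines in the same order share one set.
    fn resolve(bindings: &[(&str, Vec<&str>)], on_unknown: UnknownPolicy) -> Plan {
        let mut pipeline_sets: Vec<Vec<String>> = Vec::new();
        let mut schema_to_set = HashMap::new();
        for (schema, pipelines) in bindings {
            let set: Vec<String> = pipelines.iter().map(|p| p.to_string()).collect();
            let index = match pipeline_sets.iter().position(|existing| *existing == set) {
                Some(i) => i,
                None => {
                    pipeline_sets.push(set);
                    pipeline_sets.len() - 1
                }
            };
            schema_to_set.insert(schema.to_string(), index);
        }
        Plan { pipeline_sets, schema_to_set, on_unknown }
    }

    /// Decide for one event given its classified schema (`None` = unknown).
    /// A schema without a binding falls through to the unknown policy.
    fn decide(&self, schema: Option<&str>) -> Decision {
        match schema.and_then(|s| self.schema_to_set.get(s)) {
            Some(&i) => Decision::Engine(i),
            None => match self.on_unknown {
                UnknownPolicy::Skip => Decision::Skip,
                UnknownPolicy::UseSet(i) => Decision::Engine(i),
            },
        }
    }
}
```

Deduplicating by the ordered pipeline list means two schemas bound to the same pipelines share an engine, while the same pipelines in a different order would produce a separate set in this sketch.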

Functions

builtin_schema_names
Distinct built-in schema names, most specific first.
load_schema_config
Load both the user signatures and the optional routing section from a YAML file path.
load_schema_signatures
Load user schema signatures from a YAML file path.
parse_schema_config
Parse both the user signatures and the optional routing section from a YAML string.
parse_schema_signatures
Parse user schema signatures from a YAML string.