Schema classification: recognize the structure of a parsed event.
Real-world streams mix log schemas: one feed can carry ECS-normalized events, raw (rendered) Windows Event Log, flat Sysmon JSON, CEF, OCSF, or vendor-specific shapes, and the wire format is often still JSON while only the field names differ. This module recognizes which schema a parsed event belongs to from its content (marker fields and values), not from the input format, so it works regardless of how the event arrived.
Classification is declarative: each SchemaSignature is a set of
SchemaPredicates that must all hold (logical AND). The
SchemaClassifier returns the highest-specificity
signature that matches, breaking ties by name for determinism. Returning
None means the event matched no signature (“unknown”), which is the
actionable signal for surfacing unsupported schemas.
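The matching rule above can be sketched in a few lines. This is only an illustration under stated assumptions, not the crate's implementation: the `Event`, `Predicate`, `Signature`, and `classify` names are hypothetical stand-ins, the event is simplified to a flat map of dotted field names to string values, and the tie-break direction (lexicographically smaller name wins) is assumed, since the text only says ties are broken by name.

```rust
use std::collections::HashMap;

/// Hypothetical flat stand-in for a parsed event: dotted field name -> string value.
type Event = HashMap<String, String>;

/// Hypothetical predicate forms, modeled on the `field_present` and `equals` YAML forms.
enum Predicate {
    FieldPresent(String),
    Equals { field: String, value: String },
}

impl Predicate {
    fn holds(&self, event: &Event) -> bool {
        match self {
            Predicate::FieldPresent(field) => event.contains_key(field),
            Predicate::Equals { field, value } => event.get(field) == Some(value),
        }
    }
}

/// Hypothetical signature: a name, a specificity, and predicates that must all hold.
struct Signature {
    name: String,
    specificity: u32,
    predicates: Vec<Predicate>,
}

/// Returns (name, specificity) of the best match, or None for "unknown".
fn classify<'a>(signatures: &'a [Signature], event: &Event) -> Option<(&'a str, u32)> {
    signatures
        .iter()
        // Logical AND: every predicate of the signature must hold.
        .filter(|sig| sig.predicates.iter().all(|p| p.holds(event)))
        // Highest specificity wins; on a tie the smaller name wins (assumed direction).
        .max_by(|a, b| {
            a.specificity
                .cmp(&b.specificity)
                .then_with(|| b.name.cmp(&a.name))
        })
        .map(|sig| (sig.name.as_str(), sig.specificity))
}
```

In this sketch a signature with an empty predicate list matches every event, which is one way a low-specificity fallback could be expressed. A `None` result corresponds to the "unknown" case.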
Built-in signatures cover ecs, ocsf, windows_eventlog, sysmon,
cef, and a low-specificity generic_json fallback for structured events
that match no specific security schema. Users extend the set with their own
signatures loaded from YAML (see parse_schema_signatures).
Detection-side only: this recognizes events so callers can route them to the right field-mapping pipeline. It does not collect, transport, or normalize events.
Structs§
- FieldValueConfig
  A `{ field: ..., value: ... }` pair used by the `equals` and `matches` predicate forms.
- RoutingConfig
  The `routing:` section of a schema config file.
- RoutingPlan
  A resolved routing plan: the deduplicated pipeline-sets to build one engine each, plus the schema-to-set mapping and the unknown-handling policy.
- SchemaBinding
  A `schema -> pipelines` binding: events recognized as `schema` are evaluated against the engine built from `pipelines`.
- SchemaClassifier
  Recognizes the schema of parsed events from a set of signatures.
- SchemaCountEntry
  One per-schema counter as exposed via `SchemaObserver::snapshot`.
- SchemaMatch
  The result of classifying an event: the matched schema name and the specificity of the signature that matched.
- SchemaObservation
  Immutable snapshot of a `SchemaObserver` at one moment.
- SchemaObserver
  Opt-in counter that classifies each observed event and tallies per-schema (and unknown) totals. Mirrors the design of `FieldObserver`: shared behind an `Arc`, cheap repeated snapshots, monotonic lifetime counters for a Prometheus bridge. The schema set is small and bounded, so there is no key cap.
- SchemaPredicateConfig
  A predicate as written in YAML: a single-key map, for example `field_present: ecs.version` or `equals: { field: type, value: alert }`. Exactly one form must be set per list entry.
- SchemaSignature
  A named schema recognizer: every predicate must hold for the signature to match. Higher `specificity` wins when several signatures match the same event. Multiple signatures may share a `name` (for example several distinct ways to recognize Sysmon); the classifier reports the name.
- SchemaSignatureConfig
  A signature as written in YAML.
- SchemaSignaturesFile
  Top-level YAML document holding a `schemas:` list and an optional `routing:` section.
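The counting behavior described for SchemaObserver (per-schema and unknown totals, shared behind an `Arc`, cheap snapshots) might look roughly like the sketch below. It is an assumption-laden illustration, not the crate's code: `CountingObserver` and `Snapshot` are hypothetical names, and the sketch takes an already-classified schema name (`None` for unknown) instead of classifying the event itself.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Hypothetical observer: monotonic counters that only ever increase.
#[derive(Default)]
struct CountingObserver {
    per_schema: Mutex<BTreeMap<String, u64>>,
    unknown: AtomicU64,
}

/// Hypothetical immutable snapshot of the counters at one moment.
#[derive(Debug, PartialEq)]
struct Snapshot {
    per_schema: Vec<(String, u64)>, // sorted by schema name
    unknown: u64,
}

impl CountingObserver {
    /// Record one event; `schema` is the classification result (None = unknown).
    fn observe(&self, schema: Option<&str>) {
        match schema {
            Some(name) => {
                let mut counts = self.per_schema.lock().unwrap();
                *counts.entry(name.to_string()).or_insert(0) += 1;
            }
            None => {
                self.unknown.fetch_add(1, Ordering::Relaxed);
            }
        }
    }

    /// Copy the current totals; the observer keeps counting afterwards.
    fn snapshot(&self) -> Snapshot {
        let counts = self.per_schema.lock().unwrap();
        Snapshot {
            per_schema: counts.iter().map(|(k, v)| (k.clone(), *v)).collect(),
            unknown: self.unknown.load(Ordering::Relaxed),
        }
    }
}
```

Because every method takes `&self`, a value like this can be wrapped in an `Arc` and shared across threads. The map stays small as long as the set of schema names is bounded, which is the reason the description gives for having no key cap.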
Enums§
- OnUnknown
  What to do with an event whose schema matched no signature.
- RouteDecision
  The decision for one event, produced by `RoutingPlan::decide`.
- SchemaError
  Errors raised while loading user schema signatures.
- SchemaPredicate
  A single condition over a parsed event used to recognize a schema.
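To make the routing items concrete, here is a minimal sketch of how schema-to-pipelines bindings could be resolved into deduplicated pipeline-sets and then used to decide per event. Everything in it is hypothetical: `Plan`, `Decision`, `UnknownPolicy` and its `Skip`/`Fallback` variants, and `build_plan` are illustrative names, not the actual `RoutingPlan`, `RouteDecision`, or `OnUnknown` definitions, whose variants are not listed here. The sketch also assumes that pipeline order matters, so two sets are equal only if they list the same pipelines in the same order.

```rust
use std::collections::HashMap;

/// Hypothetical unknown-handling policy.
#[derive(Debug, Clone, Copy, PartialEq)]
enum UnknownPolicy {
    Skip,
    Fallback(usize), // index of the pipeline-set to use instead
}

/// Hypothetical per-event decision.
#[derive(Debug, PartialEq)]
enum Decision {
    Engine(usize), // index into `Plan::pipeline_sets`
    Skip,
}

/// Hypothetical resolved plan: one engine would be built per pipeline-set.
struct Plan {
    pipeline_sets: Vec<Vec<String>>,
    schema_to_set: HashMap<String, usize>,
    on_unknown: UnknownPolicy,
}

fn build_plan(bindings: &[(&str, Vec<&str>)], on_unknown: UnknownPolicy) -> Plan {
    let mut pipeline_sets: Vec<Vec<String>> = Vec::new();
    let mut schema_to_set = HashMap::new();
    for (schema, pipelines) in bindings {
        let set: Vec<String> = pipelines.iter().map(|p| p.to_string()).collect();
        // Reuse an identical set if one exists, so schemas can share an engine.
        let index = match pipeline_sets.iter().position(|existing| *existing == set) {
            Some(i) => i,
            None => {
                pipeline_sets.push(set);
                pipeline_sets.len() - 1
            }
        };
        schema_to_set.insert(schema.to_string(), index);
    }
    Plan { pipeline_sets, schema_to_set, on_unknown }
}

impl Plan {
    /// `schema` is the classifier result; None means no signature matched.
    /// Assumption: a recognized schema without a binding is treated like unknown.
    fn decide(&self, schema: Option<&str>) -> Decision {
        match schema.and_then(|name| self.schema_to_set.get(name)) {
            Some(&index) => Decision::Engine(index),
            None => match self.on_unknown {
                UnknownPolicy::Skip => Decision::Skip,
                UnknownPolicy::Fallback(index) => Decision::Engine(index),
            },
        }
    }
}
```

Deduplicating by pipeline-set means two schemas bound to the same pipelines share one engine, which keeps the number of engines proportional to the number of distinct sets and not to the number of schemas.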
Functions§
- builtin_schema_names
  Distinct built-in schema names, most specific first.
- load_schema_config
  Load both the user signatures and the optional routing section from a YAML file path.
- load_schema_signatures
  Load user schema signatures from a YAML file path.
- parse_schema_config
  Parse both the user signatures and the optional routing section from a YAML string.
- parse_schema_signatures
  Parse user schema signatures from a YAML string.