quamina-rs
Rust port of quamina, a pattern-matching library for filtering JSON objects.
In Quamina, you add Patterns to a quamina instance, then match Events against it. Quamina tells you which Patterns matched. It does this fast—millions of events per second, regardless of how many patterns you have.
Try it online in the playground to test patterns against JSON events in your browser.
Quick Start
use Quamina;
let mut q = new;
q.add_pattern?;
q.add_pattern?;
let event = r#"{"status": "error", "level": 2}"#;
let matches = q.matches_for_event?;
// matches: ["p1", "p2"]
Patterns
A Pattern is a JSON object whose field values are arrays. A field matches if any element of its array matches the event's value (OR), and a Pattern matches only if every field it mentions matches (AND).
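The AND-across-fields, OR-within-a-field rule can be illustrated with a small standalone sketch. It does not use the quamina-rs API: `ToyPattern`, `ToyEvent`, `toy_matches`, and `example_event` are hypothetical names, events are flattened to string values only, and the real library compiles patterns into an automaton rather than scanning them like this.

```rust
use std::collections::HashMap;

/// Toy pattern: field name -> list of acceptable string values.
type ToyPattern = HashMap<&'static str, Vec<&'static str>>;
/// Toy event: flat field name -> single string value.
type ToyEvent = HashMap<&'static str, &'static str>;

/// Returns true when every field named in the pattern is present in the
/// event (AND) and the event's value equals at least one listed value (OR).
fn toy_matches(pattern: &ToyPattern, event: &ToyEvent) -> bool {
    pattern.iter().all(|(field, allowed)| {
        event
            .get(field)
            .map_or(false, |value| allowed.contains(value))
    })
}

/// A sample event with two fields, used for illustration only.
fn example_event() -> ToyEvent {
    HashMap::from([("status", "error"), ("region", "eu")])
}
```

With `example_event()`, a toy pattern listing `status: ["warn", "error"]` matches, while one that also requires `region: ["us"]` does not, because every mentioned field has to match.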
Given this event:
These patterns all match it:
Pattern types
Exact match — value must equal exactly:
Prefix/Suffix — string starts or ends with:
Wildcard — wild-card matching:
* matches any sequence. Use \* to match a literal asterisk, \\ for a literal backslash.
Quamina's legacy shellstyle matcher is also available, but you should avoid it: it doesn't support the \* or \\ escapes, and it may go away entirely in the long run. Prefer wildcard, or regexp whenever you are comfortable with its performance and syntax.
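To make the wildcard escape rules concrete, here is a minimal standalone sketch of a matcher that treats * as "any sequence" and honours \* and \\. It is not the library's implementation (which compiles wildcards into its automaton). `Token`, `tokenize`, and `wildcard_match` are hypothetical names, and returning `None` for a dangling or unknown escape is an assumption about error handling, not documented behaviour.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Star,
    Literal(char),
}

/// Splits a pattern into tokens, resolving `\*` and `\\` escapes.
/// Returns None for a dangling or unsupported escape (an assumption).
fn tokenize(pattern: &str) -> Option<Vec<Token>> {
    let mut tokens = Vec::new();
    let mut chars = pattern.chars();
    while let Some(c) = chars.next() {
        match c {
            '*' => tokens.push(Token::Star),
            '\\' => match chars.next() {
                Some(e) if e == '*' || e == '\\' => tokens.push(Token::Literal(e)),
                _ => return None,
            },
            other => tokens.push(Token::Literal(other)),
        }
    }
    Some(tokens)
}

/// Matches `text` against `pattern`, where an unescaped `*` matches any
/// (possibly empty) sequence of characters.
fn wildcard_match(pattern: &str, text: &str) -> Option<bool> {
    let tokens = tokenize(pattern)?;
    let text: Vec<char> = text.chars().collect();
    let (mut t, mut p) = (0usize, 0usize);
    // Token index of the most recent star and the text index it was tried at.
    let mut backtrack: Option<(usize, usize)> = None;
    while t < text.len() {
        match tokens.get(p) {
            Some(Token::Literal(c)) if *c == text[t] => {
                t += 1;
                p += 1;
            }
            Some(Token::Star) => {
                backtrack = Some((p, t));
                p += 1;
            }
            _ => match backtrack {
                // Let the last star absorb one more character and retry.
                Some((star_p, star_t)) => {
                    p = star_p + 1;
                    t = star_t + 1;
                    backtrack = Some((star_p, star_t + 1));
                }
                None => return Some(false),
            },
        }
    }
    // The text is consumed; only trailing stars may remain in the pattern.
    Some(tokens[p..].iter().all(|tok| *tok == Token::Star))
}
```

In this sketch `wildcard_match("a*c", "abc")` returns `Some(true)`, while `wildcard_match(r"a\*c", "abc")` returns `Some(false)` because the escaped asterisk only matches a literal `*`.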
Exists — field presence:
Anything-but — match unless value is in list:
Equals-ignore-case — case-insensitive:
Numeric — comparisons:
CIDR — IP address ranges:
Regexp — I-Regexp (RFC 9485):
Regexp uses ~ as the escape character to stay compatible with Quamina. It also supports ~d for digits, ~p{L} for Unicode letters, and ~b/~B for word boundaries. Word boundaries carry a performance penalty whose size depends on the patterns and events in use.
APIs
Creating and configuring
use ;
// Simple
let q = new;
// With options
let q = new
.with_media_type?
.with_auto_rebuild
.build?;
// With custom ID type
let q = new;
// With custom pattern complexity limits
let q = new
.with_max_pattern_depth
.with_max_fields_per_pattern
.with_arena_byte_budget
.with_max_states_per_pattern
.build?;
Adding and removing patterns
q.add_pattern?;
q.delete_patterns?;
q.clear;
Matching
let matches = q.matches_for_event?; // Vec of matching IDs
let matched = q.has_matches?; // bool
let count = q.count_matches?; // number of matches
Errors
add_pattern returns an error if the pattern JSON is malformed, uses invalid syntax, or exceeds complexity limits. matches_for_event returns an error if the event isn't valid JSON.
match q.add_pattern
Concurrency
A single Quamina instance can be safely shared across threads via Arc. Matching uses thread-local buffers, so multiple threads calling matches_for_event() on the same Arc<Quamina> run in parallel without contention.
Pattern addition (add_pattern) requires &mut self. For concurrent writes, wrap in a lock:
let q = new;
clone() rebuilds the automaton from stored patterns. It's not a cheap operation for instances with many patterns.
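The lock-wrapping approach for concurrent writes might look like the following sketch. It is built on a hypothetical stand-in type, `ToyMatcher`, so that it compiles without the crate; only the shape of the methods (`add_pattern` taking `&mut self`, `matches_for_event` taking `&self`) mirrors the description above. Choosing `RwLock` over `Mutex` is an assumption made here so that readers do not block each other.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for a matcher: writers need `&mut self`,
/// readers only need `&self`.
#[derive(Default)]
struct ToyMatcher {
    patterns: Vec<(String, String)>, // (id, substring the event must contain)
}

impl ToyMatcher {
    fn add_pattern(&mut self, id: &str, needle: &str) {
        self.patterns.push((id.to_string(), needle.to_string()));
    }

    fn matches_for_event(&self, event: &str) -> Vec<String> {
        self.patterns
            .iter()
            .filter(|(_, needle)| event.contains(needle.as_str()))
            .map(|(id, _)| id.clone())
            .collect()
    }
}

/// Adds a pattern under a write lock, then matches from four reader threads.
fn shared_matching_demo() -> Vec<Vec<String>> {
    let shared = Arc::new(RwLock::new(ToyMatcher::default()));

    // A writer holds the exclusive lock only while it updates the patterns.
    shared.write().unwrap().add_pattern("p1", "error");

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // Read locks are shared, so these threads can match in parallel.
                let guard = shared.read().unwrap();
                guard.matches_for_event(r#"{"status": "error"}"#)
            })
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

If patterns are only added during startup, the lock can be dropped entirely and the finished instance shared through a plain `Arc`, as described at the start of this section.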
Performance
Matching time is nearly independent of pattern count. All patterns compile into a single automaton, so 10 patterns and 10,000 patterns have similar matching speed.
Pattern count scaling
On an M4 Max:
| Patterns | Match time |
|---|---|
| 100 | ~110 ns |
| 10,000 | ~90 ns |
In the table above, match time does not grow between 100 and 10,000 patterns, because all patterns share one automaton.
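One way to build intuition for this is a toy byte trie in which every pattern's value is merged into one shared structure. This is only a sketch of the general idea, not the automaton quamina-rs actually builds: `ToyTrie`, `build_toy_trie`, and the `value-N` strings are hypothetical, and only exact string matches are handled.

```rust
use std::collections::HashMap;

/// Toy byte trie: exact-match values from all patterns share one structure.
#[derive(Default)]
struct ToyTrie {
    children: HashMap<u8, ToyTrie>,
    pattern_ids: Vec<usize>, // patterns accepting the value that ends here
}

impl ToyTrie {
    fn insert(&mut self, value: &str, pattern_id: usize) {
        let mut node = self;
        for byte in value.bytes() {
            node = node.children.entry(byte).or_default();
        }
        node.pattern_ids.push(pattern_id);
    }

    /// Follows one edge per input byte. Returns the number of edges followed
    /// and the IDs of matching patterns, so the work is bounded by the
    /// value's length rather than by the number of patterns.
    fn lookup(&self, value: &str) -> (usize, Vec<usize>) {
        let mut node = self;
        let mut steps = 0;
        for byte in value.bytes() {
            match node.children.get(&byte) {
                Some(next) => node = next,
                None => return (steps, Vec::new()),
            }
            steps += 1;
        }
        (steps, node.pattern_ids.clone())
    }
}

/// Builds a trie holding `pattern_count` patterns, one value per pattern.
fn build_toy_trie(pattern_count: usize) -> ToyTrie {
    let mut trie = ToyTrie::default();
    for id in 0..pattern_count {
        trie.insert(&format!("value-{}", id), id);
    }
    trie
}
```

Looking up `"value-42"` follows 8 edges whether the trie holds 100 or 10,000 patterns, which is the property the benchmark reflects.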
Event benchmarks
| Benchmark | Time | Description |
|---|---|---|
| citylots | ~1,400 ns | 4 patterns, 206 KB of GeoJSON |
| nested field match | ~4,400 ns | 9 KB JSON, deeply nested field |
| early field exit | ~180 ns | 9 KB JSON, matching field near the top |
Pattern type benchmarks
| Benchmark | Time | Description |
|---|---|---|
| exact_match | ~56 ns | Single exact match |
| nested_match | ~83 ns | Exact match on a nested key |
| regex_match | ~48 ns | Simple regex (eager DFA after compile) |
| anything_but_match | ~65 ns | anything-but with 3 excluded values |
| numeric_range_two_sided | ~72 ns | Two-sided range (>= 0, < 100) |
| 100_prefix_patterns | ~117 ns | 100 prefix patterns merged into one automaton |
| shellstyle_26_patterns | ~97 ns | 26 shellstyle patterns (A*–Z*) |
| regexp_plus_long | ~260 ns | [a-z]+ on a 100-char value |
What affects performance
- Unique fields: More unique field paths across patterns = more work per event
- Event size: Larger JSON takes longer to parse and flatten
- Pattern complexity: Regexps with Unicode categories (e.g., ~p{L}) are slower to compile
Running benchmarks
Limitations
Patterns are subject to complexity limits to prevent resource exhaustion from deeply nested or extremely wide patterns:
| Limit | Default | Builder method |
|---|---|---|
| Max nesting depth | 256 | with_max_pattern_depth |
| Max fields per pattern | 256 | with_max_fields_per_pattern |
| Arena byte budget | 10 MB | with_arena_byte_budget |
| Max states per pattern | 1024 | with_max_states_per_pattern |
Patterns exceeding these limits return QuaminaError::PatternTooComplex. The defaults are generous enough for any realistic use case; they're primarily a safety net against adversarial input.
Other limitations:
- Only JSON events are supported (media type application/json)
- Pattern field names are case-sensitive
- shellstyle patterns don't support \* or \\ escapes; prefer wildcard or regexp instead
Credits
All credit goes to Tim and the other contributors to the original Go version. Tim's Quamina Diary also explains how automata-based matching works.
The last-synced upstream Go commit is tracked in .go-upstream-sync. Run just upstream to check for new changes.
License
Apache 2.0