Available on crate feature strmatch only.
Regex-shaped patterns, fast-path dispatch.
Operators write a regex (the most familiar pattern language).
strmatch classifies it into one of four tiers and dispatches at
match time via the cheapest engine that’s correct:
- Byte (≤ 30 ns) – direct byte ops: memchr / memchr2 / memchr3 / single-byte starts_with / ends_with / ==.
- Literal (≤ 200 ns) – single multi-byte literal: memmem::Finder / multi-byte starts_with / ends_with / ==.
- LiteralSet (≤ 500 ns) – aho-corasick over ≥ 2 literals (optional uniform anchor checked after the AC scan). The regex engine is never invoked.
- Regex (engine-bounded) – falls through to regex_automata::meta::Regex, which has its own internal prefilter pipeline; cost depends on pattern and haystack.
Budgets are typical for a modern x86 server on a ~200-byte
haystack; see MatcherTier::typical_budget_ns and
benches/strmatch.rs.
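As a rough illustration of the tiering above, the sketch below classifies a pattern by its shape alone. It is not strmatch's classifier: the Tier enum and the classify and is_plain_literal functions are hypothetical, and the sketch is deliberately conservative. Any metacharacter, escape, or anchored alternation goes to the Regex tier, whereas a real classifier would likely recognise more cases.

```rust
/// Hypothetical stand-in for the four tiers described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    Byte,
    Literal,
    LiteralSet,
    Regex,
}

/// True if `s` is non-empty and contains no regex metacharacters,
/// so it can only match itself.
fn is_plain_literal(s: &str) -> bool {
    !s.is_empty() && !s.chars().any(|c| r"\.+*?()[]{}|^$".contains(c))
}

/// Classify a pattern by shape (assumption: no flags, no escapes).
fn classify(pattern: &str) -> Tier {
    let alternatives: Vec<&str> = pattern.split('|').collect();
    if alternatives.len() >= 2 {
        // Anchors inside an alternation bind per branch, so this sketch
        // only accepts unanchored branches as a literal set.
        return if alternatives.iter().all(|alt| is_plain_literal(alt)) {
            Tier::LiteralSet
        } else {
            Tier::Regex
        };
    }

    // Single branch: a leading `^` or trailing `$` only changes which
    // byte-level operation is used, not the tier.
    let body = pattern.strip_prefix('^').unwrap_or(pattern);
    let body = body.strip_suffix('$').unwrap_or(body);
    if !is_plain_literal(body) {
        return Tier::Regex;
    }
    if body.len() == 1 {
        Tier::Byte
    } else {
        Tier::Literal
    }
}
```

Under these assumptions, classify("^/") yields Tier::Byte and classify(r"\w+@\w+") yields Tier::Regex.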
§Anti-spam discipline
When a pattern compiles to MatcherTier::Regex (the engine
fall-back), strmatch emits one WARN per distinct pattern per
process, capped at 10 distinct WARNs total. After the cap, further
fall-through patterns log at DEBUG. A counter
hyperi_strmatch_regex_fallback_total increments on every
fall-through regardless of log level – operators can scrape that
rather than rely on logs.
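One way to picture this discipline is a small limiter that tracks which patterns have already produced a WARN. This is a minimal sketch, not the crate's implementation: FallbackLimiter, Level, and record are hypothetical names, the atomic field merely stands in for the hyperi_strmatch_regex_fallback_total counter, and logging a repeat of an already-warned pattern at DEBUG is an assumption.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Log level the sketch picks for one fall-through event.
#[derive(Debug, PartialEq, Eq)]
enum Level {
    Warn,
    Debug,
}

/// Hypothetical limiter: one WARN per distinct pattern, at most `cap` WARNs.
struct FallbackLimiter {
    cap: usize,
    warned: Mutex<HashSet<String>>,
    total: AtomicU64, // stand-in for the fall-through counter
}

impl FallbackLimiter {
    fn new(cap: usize) -> Self {
        Self {
            cap,
            warned: Mutex::new(HashSet::new()),
            total: AtomicU64::new(0),
        }
    }

    /// Record one fall-through and decide which level to log it at.
    fn record(&self, pattern: &str) -> Level {
        // The counter moves on every fall-through, whatever the log level.
        self.total.fetch_add(1, Ordering::Relaxed);

        let mut warned = self.warned.lock().unwrap();
        // WARN only for a pattern not seen before, and only below the cap.
        // Once the cap is reached nothing more is inserted, so the set
        // stays bounded.
        if warned.len() < self.cap && warned.insert(pattern.to_owned()) {
            Level::Warn
        } else {
            Level::Debug
        }
    }

    fn total(&self) -> u64 {
        self.total.load(Ordering::Relaxed)
    }
}
```

Keeping the counter separate from the log decision is what lets a scraper see the true fall-through rate after the WARN cap has been hit.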
§Quality gates
Use StrMatcher::builder with StrMatcherBuilder::min_tier to
reject (or loudly warn about) patterns that fall below an
operator-chosen tier. Useful for hot-path configs where regex
fall-through is unacceptable.
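The gate itself can be thought of as a rank comparison followed by a policy decision. The sketch below is illustrative only: GateTier, Policy, and check_min_tier are hypothetical, the numeric ranks are an assumption (only their order matters), and the Warn variant is a guess at how the "loudly warn" behaviour might be modelled.

```rust
/// Hypothetical mirror of the tier enum; a higher rank means a faster tier.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GateTier {
    Regex,
    LiteralSet,
    Literal,
    Byte,
}

impl GateTier {
    /// Assumed numeric ranks; only their relative order matters here.
    fn rank(self) -> u8 {
        match self {
            GateTier::Regex => 0,
            GateTier::LiteralSet => 1,
            GateTier::Literal => 2,
            GateTier::Byte => 3,
        }
    }
}

/// Hypothetical policy: refuse the build, or accept it with a warning.
#[derive(Debug, Clone, Copy)]
enum Policy {
    Reject,
    Warn,
}

/// Returns Ok(None) when the tier is acceptable, Ok(Some(warning)) when the
/// pattern is accepted with a warning, and Err(message) when it is refused.
fn check_min_tier(
    actual: GateTier,
    min: GateTier,
    policy: Policy,
) -> Result<Option<String>, String> {
    if actual.rank() >= min.rank() {
        return Ok(None);
    }
    let msg = format!(
        "pattern classified as tier {:?}, below minimum tier {:?}",
        actual, min
    );
    match policy {
        Policy::Reject => Err(msg),
        Policy::Warn => Ok(Some(msg)),
    }
}
```

With a minimum of LiteralSet and the Reject policy, a Regex-tier pattern is refused while Byte, Literal, and LiteralSet patterns pass.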
§Example
use hyperi_rustlib::strmatch::{MatcherTier, OnBelowMin, StrMatcher};
// Byte tier -- anchored single byte, dispatches to hay.first() == Some(b)
let m = StrMatcher::new(r"^/")?;
assert_eq!(m.tier(), MatcherTier::Byte);
assert!(m.is_match(b"/api/v1/orders"));
// Literal tier -- multi-byte literal, dispatches to memmem
let m = StrMatcher::new(r"AKIA")?;
assert_eq!(m.tier(), MatcherTier::Literal);
assert!(m.is_match(b"... AKIA1234 ..."));
// LiteralSet tier -- alternation, dispatches to AhoCorasick
let m = StrMatcher::new(r"AKIA|ghp_|sk_live_")?;
assert_eq!(m.tier(), MatcherTier::LiteralSet);
assert!(m.is_match(b"github token: ghp_abcdef"));
// Regex tier -- falls through to engine; refuse the build instead
let err = StrMatcher::builder()
    .min_tier(MatcherTier::LiteralSet)
    .on_below_min(OnBelowMin::Reject)
    .build(r"\w+@\w+")
    .unwrap_err();
assert!(err.to_string().contains("tier"));

Structs§
- Match – Byte offsets of a match. End is exclusive: &hay[start..end] is the matched slice.
- SetMatch – Like Match but also identifies which input pattern matched in a StrMatcherSet.
- StrMatcher – Compiled pattern with tier-aware dispatch.
- StrMatcherBuilder – Builder for StrMatcher. Carries minimum-tier policy and the case-insensitivity flag.
- StrMatcherSet – Multi-pattern matcher.

Enums§
- BuildError – Failure modes during construction.
- MatcherTier – Which engine class a StrMatcher is dispatching to. Tiers are ordered by cost: Byte > Literal > LiteralSet > Regex (higher means faster). Use Self::rank for min_tier comparisons.
- OnBelowMin – What to do when a pattern’s classification falls below the builder’s min_tier.