multimatch 0.1.0

Multi-pattern matching engine — Aho-Corasick + regex with optional Hyperscan SIMD acceleration

multimatch — Multi-pattern matching engine

Why

Scanners typically spend most of their time matching long text against many signatures. multimatch puts literal and regex signature sets behind one API, optimized for batch scanning and reuse.

It can scan secrets, response patterns, shellcode markers, and sink hints across byte streams without forcing your code to juggle multiple matcher implementations.

The API composes naturally with scanclient response bytes, secreport findings, and custom scanners.

Quick Start

use multimatch::{from_literals, MatchResult, Scanner};

fn main() {
    let engine = from_literals(&["password", "api_key"]).unwrap();
    let matches: Vec<MatchResult> = engine.scan(b"This request contains api_key=abc123");

    for m in matches {
        println!("match id {} at {}..{}", m.pattern_id, m.start, m.end);
    }
}

Features

  • Convenience constructors: from_literals, from_regexes, from_pairs.
  • PatternSetBuilder for rich pattern configuration (literal/regex + case-insensitive).
  • Scanner trait with scan, is_match, and pattern_count.
  • Shared engine for both literals and regexes; invalid patterns are reported as a MatchError when the set is compiled.
  • Lightweight matching with few dependencies, aimed at high-throughput scanners.

TOML Configuration

multimatch does not use TOML.

API Overview

  • PatternSet: compiled matcher set (implements Scanner).
  • PatternSetBuilder: add patterns and compile.
  • PatternDef / PatternKind: individual pattern representation.
  • MatchEngine: compiled execution engine.
  • MatchError: parser/compile errors.
  • MatchResult: matched pattern_id and byte span.
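
The byte span carried by a match can be used to slice the scanned input. The sketch below does not use the crate: `Span` is a local stand-in for the three fields printed in the Quick Start, and `matched_bytes` and `with_context` are hypothetical helper names. It assumes `usize` fields and an exclusive `end`, neither of which is confirmed by this README.

```rust
/// Local stand-in for the fields printed in the Quick Start example.
/// The real `multimatch::MatchResult` may use different integer types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub pattern_id: usize,
    pub start: usize,
    /// Assumed exclusive, like the end of a Rust range `start..end`.
    pub end: usize,
}

/// Returns the matched bytes, or `None` if the span is inverted
/// or reaches past the end of `input`.
pub fn matched_bytes<'a>(input: &'a [u8], span: &Span) -> Option<&'a [u8]> {
    input.get(span.start..span.end)
}

/// Returns the match plus up to `radius` bytes of context on each side,
/// clamped to the bounds of `input`.
pub fn with_context<'a>(input: &'a [u8], span: &Span, radius: usize) -> Option<&'a [u8]> {
    // Reject invalid spans first so the clamping below cannot hide them.
    matched_bytes(input, span)?;
    let lo = span.start.saturating_sub(radius);
    let hi = span.end.saturating_add(radius).min(input.len());
    input.get(lo..hi)
}
```

Using `slice::get` rather than direct indexing means a stale or mismatched span yields `None` instead of a panic, which matters when spans and inputs travel separately through a reporting pipeline.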

Examples

1) Quick scanner for literals

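use multimatch::{PatternSet, Scanner};

let patterns = PatternSet::builder()
    .add_literal("token", 0)
    .add_literal_ci("bearer", 1) // case-insensitive literal
    .build()
    .unwrap();

// "BEARER" matches the case-insensitive literal registered with id 1.
let all = patterns.scan_str("Authorization: BEARER abc");
assert!(!all.is_empty());
// "abc" contains neither literal, so it must not match.
assert!(!patterns.is_match("abc".as_bytes()));
println!("patterns: {}", patterns.pattern_count());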

2) Mixed regex set for secret-format detection

use multimatch::{MatchResult, PatternSet, Scanner};

let patterns = PatternSet::builder()
    .add_regex(r"AKIA[0-9A-Z]{16}", 0)
    .add_literal("api_key", 1)
    .build()
    .unwrap();

// The regex needs exactly 16 characters after "AKIA"; this is AWS's documented example key ID.
let matches: Vec<MatchResult> = patterns.scan("key=AKIAIOSFODNN7EXAMPLE".as_bytes());

3) Implement Scanner for a custom matcher wrapper

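use std::sync::atomic::{AtomicUsize, Ordering};

use multimatch::{MatchResult, PatternSet, Scanner};

struct CountingScanner {
    inner: PatternSet,
    // Atomic so the counter can be updated through `&self`.
    seen: AtomicUsize,
}

impl Scanner for CountingScanner {
    fn scan(&self, input: &[u8]) -> Vec<MatchResult> {
        let matches = self.inner.scan(input);
        // Record how many matches this wrapper has returned so far.
        self.seen.fetch_add(matches.len(), Ordering::Relaxed);
        matches
    }

    fn is_match(&self, input: &[u8]) -> bool {
        self.inner.is_match(input)
    }

    fn pattern_count(&self) -> usize {
        self.inner.pattern_count()
    }
}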

Traits

multimatch defines the Scanner trait as the common interface for scan implementations. You can wrap or replace the core matcher, and downstream code that accepts a Scanner keeps working unchanged.
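
To illustrate what depending only on the trait looks like, here is a minimal sketch that compiles without the crate. `MatchResult` and `Scanner` are local stand-ins whose method signatures mirror the `impl Scanner` in example 3. `NaiveLiterals` and `hits_per_pattern` are hypothetical names invented for this sketch. The naive substring loop is not how multimatch matches, and whether the crate's own trait can be used as `&dyn Scanner` is an assumption.

```rust
// Local stand-ins so the sketch is self-contained; field types are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MatchResult {
    pub pattern_id: usize,
    pub start: usize,
    pub end: usize,
}

pub trait Scanner {
    fn scan(&self, input: &[u8]) -> Vec<MatchResult>;
    fn is_match(&self, input: &[u8]) -> bool;
    fn pattern_count(&self) -> usize;
}

/// Hypothetical naive matcher: tries every literal at every offset.
/// It exists only to give the downstream function something to call.
pub struct NaiveLiterals {
    literals: Vec<Vec<u8>>,
}

impl NaiveLiterals {
    pub fn new(literals: &[&str]) -> Self {
        Self { literals: literals.iter().map(|s| s.as_bytes().to_vec()).collect() }
    }
}

impl Scanner for NaiveLiterals {
    fn scan(&self, input: &[u8]) -> Vec<MatchResult> {
        let mut out = Vec::new();
        for (pattern_id, lit) in self.literals.iter().enumerate() {
            if lit.is_empty() {
                continue; // `windows(0)` would panic
            }
            for (start, window) in input.windows(lit.len()).enumerate() {
                if window == lit.as_slice() {
                    out.push(MatchResult { pattern_id, start, end: start + lit.len() });
                }
            }
        }
        out
    }

    fn is_match(&self, input: &[u8]) -> bool {
        !self.scan(input).is_empty()
    }

    fn pattern_count(&self) -> usize {
        self.literals.len()
    }
}

/// Downstream code: scans a batch with one reused scanner and counts hits
/// per pattern. Assumes pattern ids are dense indices starting at 0;
/// ids outside that range are ignored.
pub fn hits_per_pattern(scanner: &dyn Scanner, inputs: &[&[u8]]) -> Vec<usize> {
    let mut counts = vec![0; scanner.pattern_count()];
    for input in inputs {
        for m in scanner.scan(input) {
            if let Some(count) = counts.get_mut(m.pattern_id) {
                *count += 1;
            }
        }
    }
    counts
}
```

Because `hits_per_pattern` only names the trait, a wrapper such as the CountingScanner from example 3 could be passed in place of the naive matcher without touching the function. A generic parameter (`impl Scanner`) would work equally well if dynamic dispatch is not wanted.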

Related Crates

License

MIT, Corum Collective LLC

Docs: https://docs.rs/multimatch

Santh ecosystem: https://santh.io