Module regex_engine


NFA-based regex engine — zero-dependency, deterministic, linear-time.

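Compiles regex patterns into NFA states, then executes them via Thompson NFA simulation. There is no backtracking and no per-match heap allocation beyond the initial state-set swap buffers. Matching is guaranteed to take O(n×m) time in the worst case, where n is the haystack length in bytes and m is the number of NFA states.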
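To make the state-set approach concrete, here is a minimal sketch of a Thompson-style simulation. It is not this module's implementation: the `Inst` enum, `add_state`, `full_match`, and `demo_program` are hypothetical names invented for illustration, the program for `a(b|c)*d` is written out by hand rather than compiled, and only a whole-input match is checked.

```rust
// Hypothetical instruction set; the module's real NFA representation is not shown here.
#[derive(Clone, Copy)]
enum Inst {
    Byte(u8),            // consume exactly this byte, then continue at pc + 1
    Split(usize, usize), // epsilon transition to both targets
    Jmp(usize),          // epsilon transition to one target
    Match,               // accepting state
}

// Add `pc` and everything reachable from it through epsilon transitions.
// `on_set` deduplicates, so each state enters the set at most once per step.
fn add_state(prog: &[Inst], pc: usize, set: &mut Vec<usize>, on_set: &mut [bool]) {
    if on_set[pc] {
        return;
    }
    on_set[pc] = true;
    match prog[pc] {
        Inst::Jmp(target) => add_state(prog, target, set, on_set),
        Inst::Split(a, b) => {
            add_state(prog, a, set, on_set);
            add_state(prog, b, set, on_set);
        }
        Inst::Byte(_) | Inst::Match => set.push(pc),
    }
}

// Returns true when the program accepts the entire input.
// The two state sets are allocated once and swapped after every byte.
fn full_match(prog: &[Inst], input: &[u8]) -> bool {
    let mut current: Vec<usize> = Vec::with_capacity(prog.len());
    let mut next: Vec<usize> = Vec::with_capacity(prog.len());
    let mut on_set = vec![false; prog.len()];

    add_state(prog, 0, &mut current, &mut on_set);
    for &byte in input {
        on_set.fill(false);
        for &pc in &current {
            if let Inst::Byte(want) = prog[pc] {
                if want == byte {
                    add_state(prog, pc + 1, &mut next, &mut on_set);
                }
            }
        }
        std::mem::swap(&mut current, &mut next);
        next.clear();
    }
    current.iter().any(|&pc| matches!(prog[pc], Inst::Match))
}

// Hand-assembled program for the pattern a(b|c)*d.
fn demo_program() -> Vec<Inst> {
    vec![
        Inst::Byte(b'a'),  // 0
        Inst::Split(2, 7), // 1: enter the loop body or leave the loop
        Inst::Split(3, 5), // 2: choose b or c
        Inst::Byte(b'b'),  // 3
        Inst::Jmp(6),      // 4
        Inst::Byte(b'c'),  // 5
        Inst::Jmp(1),      // 6: back to the loop head
        Inst::Byte(b'd'),  // 7
        Inst::Match,       // 8
    ]
}
```

Each input byte visits every live state at most once, which is where the O(n×m) bound comes from: n steps, each touching at most m states.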

§Public API

  Function   Purpose
  is_match   Test whether the pattern matches anywhere
  find       Return the first match span (start, end)
  find_all   Return all non-overlapping match spans
  split      Split input by pattern, returning segment spans
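The relationship between match spans and segment spans can be sketched as follows. This is an illustration only: `Span` and `segments_between` are hypothetical names, spans are assumed to be half-open byte ranges `(start, end)`, and empty leading or trailing segments are assumed to be kept. The actual signatures and edge-case behaviour of `find_all` and `split` are not documented on this page.

```rust
// Assumed convention: byte offsets, start inclusive, end exclusive.
type Span = (usize, usize);

// Given sorted, non-overlapping match spans, return the spans of the
// text between them (the non-matching segments).
fn segments_between(haystack_len: usize, matches: &[Span]) -> Vec<Span> {
    let mut segments = Vec::with_capacity(matches.len() + 1);
    let mut cursor = 0;
    for &(start, end) in matches {
        segments.push((cursor, start)); // text before this match
        cursor = end;                   // resume after the match
    }
    segments.push((cursor, haystack_len)); // text after the last match
    segments
}

// Helper for the illustration: turn spans back into string slices.
fn slices<'a>(haystack: &'a str, spans: &[Span]) -> Vec<&'a str> {
    spans.iter().map(|&(s, e)| &haystack[s..e]).collect()
}
```

For the haystack `"a, b,c"` and a separator pattern such as `,\s*`, the match spans would be `(1, 3)` and `(4, 5)`, so the segments come out as `(0, 1)`, `(3, 4)`, and `(5, 6)`.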

§Supported syntax (Perl-spirit subset)

  .         any byte except \n (any byte with the `s` flag)
  \d        ASCII digit [0-9]
  \w        ASCII word  [a-zA-Z0-9_]
  \s        ASCII whitespace [\t\n\r\x0C\x20]
  \D \W \S  negated classes
  [abc]     character class
  [^abc]    negated character class
  [a-z]     character range
  a|b       alternation
  (...)     grouping
  *         zero or more (greedy)
  +         one or more (greedy)
  ?         zero or one (greedy)
  *? +? ??  non-greedy (lazy) variants
  ^         start of input (or line in `m` mode)
  $         end of input (or line in `m` mode)
  \b        word boundary
  \\        literal backslash
  \xNN      hex byte
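The ASCII classes listed above can be written as plain byte predicates. The sketch below is illustrative: the function names are hypothetical, and the `\b` check uses the usual definition (exactly one neighbouring byte is a word byte), which this module is assumed, not confirmed, to follow.

```rust
// \d : ASCII digit [0-9]
fn is_digit(b: u8) -> bool {
    b.is_ascii_digit()
}

// \w : ASCII word byte [a-zA-Z0-9_]
fn is_word(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

// \s : ASCII whitespace [\t\n\r\x0C\x20]
fn is_space(b: u8) -> bool {
    matches!(b, b'\t' | b'\n' | b'\r' | 0x0C | b' ')
}

// \b : true when the bytes on either side of `pos` differ in "wordness".
// Positions before the first byte and after the last count as non-word.
fn is_word_boundary(haystack: &[u8], pos: usize) -> bool {
    let before = pos > 0 && is_word(haystack[pos - 1]);
    let after = pos < haystack.len() && is_word(haystack[pos]);
    before != after
}
```

The negated classes `\D`, `\W`, and `\S` are simply the logical negation of these predicates. Bytes outside the ASCII range never count as digits, word bytes, or whitespace here.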

§Flags

  Flag  Meaning
  i     Case-insensitive (ASCII only)
  m     Multiline (^/$ match line boundaries)
  s     Dotall (. matches \n)
  x     Extended (whitespace ignored, # comments)
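As a rough illustration of what the `m` and `i` flags change, the checks below show one way the anchor and byte comparisons could be expressed. All function names are hypothetical, `pos` is assumed to lie in `0..=haystack.len()`, and `$` is assumed to match only at the very end of input when `m` is off; how this module stores or receives flags is not shown on this page.

```rust
// ^ : start of input, or just after a '\n' when multiline is on.
fn at_line_start(haystack: &[u8], pos: usize, multiline: bool) -> bool {
    pos == 0 || (multiline && haystack[pos - 1] == b'\n')
}

// $ : end of input, or just before a '\n' when multiline is on.
fn at_line_end(haystack: &[u8], pos: usize, multiline: bool) -> bool {
    pos == haystack.len() || (multiline && haystack[pos] == b'\n')
}

// Byte comparison with optional ASCII-only case folding (the `i` flag).
// Non-ASCII bytes are compared exactly, so 0xC9 and 0xE9 stay distinct.
fn bytes_eq(a: u8, b: u8, ignore_case: bool) -> bool {
    if ignore_case {
        a.eq_ignore_ascii_case(&b)
    } else {
        a == b
    }
}
```

With the haystack `"ab\ncd"`, position 3 (the start of `cd`) satisfies `^` only in multiline mode, and position 2 (just before the newline) satisfies `$` only in multiline mode.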

§Functions

  find      Find the byte-offset span of the first match in haystack.
  find_all  Find all non-overlapping match spans in haystack.
  is_match  Test whether pattern matches anywhere inside haystack.
  split     Split haystack by a regex pattern, returning byte ranges of non-matching segments.