Crate rules [−] [src]
Rules
Rules uses regular expressions to do pattern matching using syntax based upon the Perl 6 regex grammar. The Perl 6 grammar has been heavily revised from Perl 5 and should not be equated with it. This may look nothing like any regex you have seen before.
Note
The only real currently available method is is_match()
.
Syntax
Currently, this is designed for ASCII and may not behave properly with Unicode.
Whitespace is generally ignored so that a regex can be more readable and less dense.
r"fred" // Normal way
r"f r e d" // Completely equivalent
// Will match `apples_oranges` or any other deliminator
r"apples . oranges"
Literals
Alphanumerics, underscores (_
), and everything enclosed within
quotes ("
) and ticks ('
) are the only literals.
hello_world // Matches `hello_world`.
"carrot cake" // Matches `carrot cake`.
'apple pie' // Matches `apple pie`.
Everything else must be escaped with a backslash (\*
) to literally match.
it\'s\ my\ birthday // Matches `it's my birthday`.
Chevrons: <>
Chevrons are considered a metacharacter grouping operator whose behaviour changes depending on the first character found inside. The behavior for each different character is:
First character | Example | Result |
---|---|---|
Whitespace | < big small > |
Alternative quotes matches [ 'big' | 'small' ] |
alphabetic | <alpha> |
Named character class which capture |
? |
<?before foo> |
A positive zero width assertion |
! |
<!before foo> |
A negative zero width assertion |
[ |
<[ ab ]> |
A character class matches [ 'a' | 'b' ] |
- |
<-[a] + [b]> |
Negated character class: [ab] negated |
+ |
<+ [a] > |
Doesn't modify the class. |
Lookaround
- lookahead -
foo <?after bar>
matchesfoo
infoobar
- negative lookahead -
foo <!after bar>
matchesfoo
infoobaz
- lookbehind -
<?before foo> bar
matchesbar
infoobar
- negative lookbehind -
<!before foo> bar
matchesbar
insushibar
An example with both:
<?before foo> bar <?after baz>
matches bar
in foobarbaz
Set operators
These operators can be applied to groups which will be analyzed later:
+ Union // [123] + [345] = [12345]
| Union // Same
& Intersection // [123] & [345] = [3]
- Difference // [123] - [345] = [12]
^ Symmetric difference // [123] ^ [345] = [1245]
Character classes
Default character classes
Character | Matches | Inverse |
---|---|---|
. |
Any character | N/A |
\d |
Digit | \D |
\h |
Horizontal whitespace | \H |
\n |
Newline | \N |
\s |
Any whitespace | \S |
\t |
Tab | \T |
\w |
Alphanumeric or _ |
\W |
Custom character classes
Characters inside a set of <[ ]>
form a custom character
class:
// Matches `a` or `b` or `c`
<[ a b c ]>
// `..` expresses a range so this matches
// from `a` to `g` or a digit
<[ a .. g \d ]>
// The `[]` bind the sets together into (non-capturing)
// groups so set operators can be used.
<[0-9] - [13579]> // Matches an even number
<\d - [13579]> // Same
Comments
Comments are allowed inside a regex.
// This matches `myregex`
r"my // This is a comment which goes to the end of the line
regex"
Modules
re |