// What: `fn requires_resharp(src: &str) -> bool` returns `true` when
// `src` contains any feature the `regex` crate cannot parse
// OR would parse with semantics different from resharp's.
// Three feature families trigger true:
// 1. Set-algebra operators: unescaped `&` or `~(` outside a
// character class (resharp's intersection / complement).
// 2. Lookaround groups: `(?=`, `(?!`, `(?<=`, `(?<!`. The
// `regex` crate rejects these with "look-around, including
// look-ahead and look-behind, is not supported"; resharp
// accepts them.
// 3. Bare `_` outside a character class. Resharp treats `_`
// as a universal wildcard (matches any single character),
// while the `regex` crate treats it as a literal underscore.
// Routing a rule like `pre_post` to the `regex` crate
// would silently change its meaning -- the rule author
// wrote a wildcard pattern, the matcher searched for a
// literal eight-byte string. Escaped (`\_`) and class-
// internal `_` (`[_]`, `[A-Z_]`) stay literal in both engines
// and do not trigger this branch.
// Conservative: any of the above triggers true, even for a
// pattern the regex crate would also have accepted (a false
// positive costs nothing beyond using the slower engine).
// Why: We need to dispatch each rule to its engine at compile time.
// This shallow string scan avoids invoking either engine's
// parser; the actual parse happens once via the chosen
// engine. Regex character classes can contain `&` and parens
// as literal bytes (e.g. `[&a-z]`, `[()]`) without those
// characters carrying their group/algebra meaning, so we
// track class membership and skip class interiors. Named
// captures `(?<name>` / `(?P<name>` and non-capturing groups
// `(?:` must NOT trigger -- the regex crate handles them --
// so the lookbehind discriminator is "the byte after `(?<`
// is `=` or `!`", not "the regex contains `(?<`".
// TS map: `function requiresResharp(src: string): boolean`.
//
// In TS you'd write (pseudocode):
// ```ts
// function requiresResharp(src: string): boolean {
// // walk bytes, skip \X escapes, track class membership,
// // return true on outside-class `&`, `~(`, bare `_`, or any
// // of `(?=`, `(?!`, `(?<=`, `(?<!`.
// }
// ```
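//
// A minimal Rust sketch of the scan described above, not the actual
// implementation. Assumptions: the name `requires_resharp_sketch` is
// hypothetical; class membership is tracked with a simple bracket-depth
// counter, so a literal leading `]` (as in `[]_]`) is not special-cased
// and can cause a conservative `true`; an `_` inside a capture name
// such as `(?<my_name>` also returns `true` in this sketch.

```rust
/// Hypothetical sketch: returns true when `src` uses a feature that
/// should be routed to resharp instead of the `regex` crate.
fn requires_resharp_sketch(src: &str) -> bool {
    let bytes = src.as_bytes();
    let mut i = 0;
    // Nesting depth of `[...]`; 0 means we are outside any class.
    let mut class_depth = 0usize;

    while i < bytes.len() {
        let b = bytes[i];

        // Skip `\X` escapes everywhere. Every trigger is ASCII, so
        // stepping over two bytes is safe even if X is multi-byte UTF-8.
        if b == b'\\' {
            i += 2;
            continue;
        }

        // Inside a class only bracket nesting matters; `&`, `_` and
        // parens are plain literals there.
        if class_depth > 0 {
            match b {
                b'[' => class_depth += 1,
                b']' => class_depth -= 1,
                _ => {}
            }
            i += 1;
            continue;
        }

        let rest = &bytes[i..];
        match b {
            b'[' => class_depth = 1,
            // Family 1 (intersection) and family 3 (bare wildcard).
            b'&' | b'_' => return true,
            // Family 1 (complement): `~` counts only when followed by `(`.
            b'~' if rest.starts_with(b"~(") => return true,
            // Family 2 (lookaround). `(?<name>`, `(?P<name>` and `(?:`
            // do not match any of these four prefixes.
            b'(' if rest.starts_with(b"(?=")
                || rest.starts_with(b"(?!")
                || rest.starts_with(b"(?<=")
                || rest.starts_with(b"(?<!") =>
            {
                return true
            }
            _ => {}
        }
        i += 1;
    }
    false
}
```

// Under these assumptions `pre_post` and `(?<=a)b` route to resharp,
// while `pre\_post`, `[A-Z_]+` and `(?<name>x)` stay on the `regex` crate.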