pub struct RegexOptionsBuilder { /* private fields */ }
A builder for a Regex to allow configuring options.
Implementations§
impl RegexOptionsBuilder
pub fn build(&self, pattern: String) -> Result<Regex>
Build a Regex from the given pattern.
Returns an Error if the pattern could not be parsed.
pub fn case_insensitive(&mut self, yes: bool) -> &mut Self
Enable or disable case-insensitive matching. This lets you control case sensitivity through the builder instead of through an inline flag (such as (?i)) in the raw pattern string that will be parsed.
Default is false.
pub fn multi_line(&mut self, yes: bool) -> &mut Self
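Enable or disable multi-line mode. When enabled, ^ and $ match at the start and end of each line rather than only at the start and end of the input.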
pub fn ignore_whitespace(&mut self, yes: bool) -> &mut Self
Enable or disable ignoring insignificant whitespace in the pattern (see verbose_mode).
pub fn dot_matches_new_line(&mut self, yes: bool) -> &mut Self
Enable or disable the “dot matches any character” flag.
When this is enabled, . will match any character. When it’s disabled, then . will match any character
except for a new line character.
pub fn crlf(&mut self, yes: bool) -> &mut Self
Enable or disable the CRLF mode flag (R).
When enabled, \r\n is treated as a single line ending for the purposes of
^ and $ in multi-line mode, instead of treating \r and \n as separate
line endings.
By default, this is disabled. It may be selectively enabled in the regular
expression by using the R flag, e.g. (?mR) or (?Rm).
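To make the difference concrete, the following std-only sketch lists the byte offsets at which $ could match in multi-line mode, with and without CRLF mode. The helper `line_end_positions` is hypothetical and not part of fancy-regex. It models the behavior documented for the underlying regex crate, under the assumption that without CRLF mode only \n terminates a line, while with CRLF mode \r, \n and \r\n all do, and the position between \r and \n is never a line end.

```rust
/// Hypothetical helper (not part of fancy-regex): returns every byte offset
/// at which `$` may match in multi-line mode under the stated assumptions.
fn line_end_positions(haystack: &str, crlf: bool) -> Vec<usize> {
    let bytes = haystack.as_bytes();
    (0..=bytes.len())
        .filter(|&at| {
            if at == bytes.len() {
                return true; // end of input always counts as a line end
            }
            match bytes[at] {
                // `\n` ends a line, except in CRLF mode when it is the second
                // half of a `\r\n` pair (the position between them is skipped).
                b'\n' => !(crlf && at > 0 && bytes[at - 1] == b'\r'),
                // `\r` only ends a line when CRLF mode is on.
                b'\r' => crlf,
                _ => false,
            }
        })
        .collect()
}

fn main() {
    let haystack = "a\r\nb";
    // Without CRLF mode: before the `\n` (offset 2) and at the end (offset 4).
    println!("{:?}", line_end_positions(haystack, false));
    // With CRLF mode: before the whole `\r\n` pair (offset 1) and at the end.
    println!("{:?}", line_end_positions(haystack, true));
}
```

In practice this matters for patterns such as (?m)^\w+$ on Windows-style input: without CRLF mode the \r stays in front of the line end and has to be matched explicitly.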
pub fn verbose_mode(&mut self, yes: bool) -> &mut Self
Enable verbose mode in the regular expression.
The same as ignore_whitespace.
When enabled, verbose mode permits insignificant whitespace in many
places in the regular expression, as well as comments. Comments are
started using # and continue until the end of the line.
By default, this is disabled. It may be selectively enabled in the
regular expression by using the x flag regardless of this setting.
pub fn unicode_mode(&mut self, yes: bool) -> &mut Self
Enable or disable the Unicode flag (u) by default.
By default this is enabled. It may alternatively be selectively
disabled in the regular expression itself via the u flag.
Note that unless “allow invalid UTF-8” is enabled (it’s disabled by default), a regular expression will fail to parse if Unicode mode is disabled and a sub-expression could possibly match invalid UTF-8.
WARNING: Unicode mode can greatly increase the size of the compiled
DFA, which can noticeably impact both memory usage and compilation
time. This is especially noticeable if your regex contains character
classes like \w that are impacted by whether Unicode is enabled or
not. If Unicode is not necessary, you are encouraged to disable it.
pub fn backtrack_limit(&mut self, limit: usize) -> &mut Self
Limit for how many times backtracking should be attempted for fancy regexes (where
backtracking is used). If this limit is exceeded, execution returns an error with
Error::BacktrackLimitExceeded.
This prevents a regex with catastrophic backtracking from running for too long.
Default is 1_000_000 (1 million).
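The idea of a step budget can be illustrated with a toy backtracking matcher. This is only a sketch of the general technique, not fancy-regex's implementation: `match_here`, `is_match_with_limit` and the `LimitExceeded` type are hypothetical names, the matcher only understands literals, . and x*, and it counts every recursive call as one step, which is an assumption about what a "step" is.

```rust
/// Hypothetical error type for this sketch; fancy-regex reports the real
/// condition through its own error type.
#[derive(Debug, PartialEq)]
struct LimitExceeded;

/// Toy backtracking matcher supporting literals, `.` and `x*`. It must match
/// the whole of `text`. Every call counts as one step against `limit`.
fn match_here(
    pat: &[u8],
    text: &[u8],
    steps: &mut usize,
    limit: usize,
) -> Result<bool, LimitExceeded> {
    *steps += 1;
    if *steps > limit {
        return Err(LimitExceeded); // give up instead of running on
    }
    if pat.is_empty() {
        return Ok(text.is_empty());
    }
    if pat.len() >= 2 && pat[1] == b'*' {
        // Greedy: take the longest run first, then back off one at a time.
        let mut run = 0;
        while run < text.len() && (pat[0] == b'.' || pat[0] == text[run]) {
            run += 1;
        }
        loop {
            if match_here(&pat[2..], &text[run..], steps, limit)? {
                return Ok(true);
            }
            if run == 0 {
                return Ok(false);
            }
            run -= 1; // backtrack: give one character back
        }
    }
    if !text.is_empty() && (pat[0] == b'.' || pat[0] == text[0]) {
        return match_here(&pat[1..], &text[1..], steps, limit);
    }
    Ok(false)
}

fn is_match_with_limit(pattern: &str, text: &str, limit: usize) -> Result<bool, LimitExceeded> {
    let mut steps = 0;
    match_here(pattern.as_bytes(), text.as_bytes(), &mut steps, limit)
}

fn main() {
    let text = "a".repeat(30);
    // Several adjacent `a*` followed by a `b` that never appears forces the
    // matcher to try a very large number of ways to split the run of `a`s.
    println!("{:?}", is_match_with_limit("a*a*a*a*a*a*b", &text, 1_000_000));
    // An unambiguous pattern finishes in a handful of steps.
    println!("{:?}", is_match_with_limit("a*b", "aaab", 1_000_000));
}
```

The first call runs out of budget and returns the error, while the second succeeds after three steps. A lower limit fails faster on pathological input at the cost of rejecting some legitimate but expensive matches.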
pub fn delegate_size_limit(&mut self, limit: usize) -> &mut Self
Set the approximate size limit of the compiled regular expression.
This option is forwarded from the wrapped regex crate. Note that depending on the
regex features used, there may be multiple delegated sub-regexes fed to the regex crate. As
such the actual limit is closer to <number of delegated regexes> * delegate_size_limit.
pub fn delegate_dfa_size_limit(&mut self, limit: usize) -> &mut Self
Set the approximate size of the cache used by the DFA.
This option is forwarded from the wrapped regex crate. Note that depending on the
regex features used, there may be multiple delegated sub-regexes fed to the regex crate. As
such the actual limit is closer to <number of delegated regexes> * delegate_dfa_size_limit.
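As a rough worked example of the multiplication described for both size limits, the sketch below computes the approximate total for an assumed number of delegated sub-regexes. The helper `approx_total_limit` is hypothetical, and the delegate count and the 10 MiB figure are illustrative inputs, not values reported by the library.

```rust
/// Hypothetical helper: approximate overall bound when each delegated
/// sub-regex is given its own `per_delegate_limit` bytes.
fn approx_total_limit(delegates: usize, per_delegate_limit: usize) -> usize {
    // Saturate rather than overflow for very large limits.
    delegates.saturating_mul(per_delegate_limit)
}

fn main() {
    // Assumed numbers: 3 delegated sub-regexes, 10 MiB limit each.
    let per_delegate = 10 * (1 << 20);
    println!("{} bytes", approx_total_limit(3, per_delegate));
}
```

When a hard overall budget matters, it may be worth dividing that budget by the expected number of delegates before passing it to the builder.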
pub fn find_not_empty(&mut self, yes: bool) -> &mut Self
Require that matches are non-empty (i.e. match at least one character).
When this is enabled, any match attempt that would result in a zero-length match is rejected.
Default is false.
N.B. When find_not_empty is set and analysis determines the pattern will only ever
produce an empty match, compiling the regex will return
CompileError::PatternCanNeverMatch instead of silently constructing a regex that can never
return a result. This catches the user error at compile time rather than allowing the
combination to execute pointlessly at runtime.
pub fn ignore_numbered_groups_when_named_groups_exist(&mut self, yes: bool) -> &mut Self
Treat unnamed capture groups as non-capturing when named groups exist. Prevents accessing capture groups by number from within the pattern (backrefs, subroutine calls) when named groups are present.
pub fn oniguruma_mode(&mut self, yes: bool) -> &mut Self
Attempts to better match Oniguruma’s default behavior.
Currently this amounts to changing behavior with:
§Left and right word bounds
fancy-regex follows the default of other regex engines such as the regex crate itself
where \< and \> correspond to a left and right word-bound respectively. This
differs from Oniguruma’s defaults which treat them as matching the literals < and >.
When this option is set, using \< and \> in the pattern will match the literals
< and > instead of word bounds.
§Repetition/Quantifiers on empty groups
fancy-regex would normally reject patterns like (?:)+ because the + has nothing
to target. In Oniguruma mode, the empty repeat is silently dropped at parse time.
§Example
use fancy_regex::{Regex, RegexBuilder};
let haystack = "turbo::<Fish>";
let regex = r"\<\w*\>";
// By default `\<` and `\>` will match the start and end of a word boundary
let word_bounds_regex = Regex::new(regex).unwrap();
let word_bounds = word_bounds_regex.find(haystack).unwrap().unwrap();
assert_eq!(word_bounds.as_str(), "turbo");
// With the option set they instead match the literal `<` and `>` characters
let literals_regex = RegexBuilder::new(regex).oniguruma_mode(true).build().unwrap();
let literals = literals_regex.find(haystack).unwrap().unwrap();
assert_eq!(literals.as_str(), "<Fish>");