Struct pcre2::bytes::RegexBuilder

pub struct RegexBuilder { /* private fields */ }

A builder for configuring the compilation of a PCRE2 regex.

Implementations

impl RegexBuilder

pub fn new() -> RegexBuilder

Create a new builder with a default configuration.

pub fn build(&self, pattern: &str) -> Result<Regex, Error>

Compile the given pattern into a PCRE2 regex using the current configuration.

If there was a problem compiling the pattern, then an error is returned.

pub fn caseless(&mut self, yes: bool) -> &mut RegexBuilder

Enables case insensitive matching.

If the utf option is also set, then Unicode case folding is used to determine case insensitivity. When the utf option is not set, then only standard ASCII case insensitivity is considered.

This option corresponds to the i flag.

pub fn dotall(&mut self, yes: bool) -> &mut RegexBuilder

Enables “dot all” matching.

When enabled, the . metacharacter in the pattern matches any character, including \n. When disabled (the default), . will match any character except for \n.

This option corresponds to the s flag.

pub fn extended(&mut self, yes: bool) -> &mut RegexBuilder

Enable “extended” mode in the pattern, where whitespace is ignored.

This option corresponds to the x flag.

pub fn multi_line(&mut self, yes: bool) -> &mut RegexBuilder

Enable multiline matching mode.

When enabled, the ^ and $ anchors match at the start and end of every line, in addition to the beginning and end of the subject string. When disabled, ^ and $ match only at the beginning and end of the subject string.

This option corresponds to the m flag.

pub fn crlf(&mut self, yes: bool) -> &mut RegexBuilder

Enable matching of CRLF as a line terminator.

When enabled, anchors such as ^ and $ will match any of the following as a line terminator: \r, \n or \r\n.

This is disabled by default, in which case, only \n is recognized as a line terminator.

pub fn ucp(&mut self, yes: bool) -> &mut RegexBuilder

Enable Unicode matching mode.

When enabled, the following patterns become Unicode aware: \b, \B, \d, \D, \s, \S, \w, \W.

When set, this implies UTF matching mode. It is not possible to enable Unicode matching mode without enabling UTF matching mode.

This is disabled by default.

pub fn utf(&mut self, yes: bool) -> &mut RegexBuilder

Enable UTF matching mode.

When enabled, characters are treated as sequences of code units that make up a single codepoint instead of as single bytes. For example, this will cause . to match any single UTF-8 encoded codepoint, whereas when this is disabled, . will match any single byte (except for \n in both cases, unless “dot all” mode is enabled).

This is disabled by default.

pub fn disable_utf_check(&mut self) -> &mut RegexBuilder

👎Deprecated since 0.2.4: now a no-op due to new PCRE2 features

This is now deprecated and is a no-op.

Previously, this option permitted disabling PCRE2’s UTF-8 validity check, which could result in undefined behavior if the haystack was not valid UTF-8. But PCRE2 introduced a new option, PCRE2_MATCH_INVALID_UTF, in 10.34 which this crate always sets. When this option is enabled, PCRE2 claims to not have undefined behavior when the haystack is invalid UTF-8.

Therefore, disabling the UTF-8 check is not something that is exposed by this crate.

pub fn jit(&mut self, yes: bool) -> &mut RegexBuilder

Enable PCRE2’s JIT and return an error if it’s not available.

This generally speeds up matching quite a bit. The downside is that it can increase the time it takes to compile a pattern.

If the JIT isn’t available or if JIT compilation returns an error, then regex compilation will fail with the corresponding error.

This is disabled by default, and always overrides jit_if_available.

pub fn jit_if_available(&mut self, yes: bool) -> &mut RegexBuilder

Enable PCRE2’s JIT if it’s available.

This generally speeds up matching quite a bit. The downside is that it can increase the time it takes to compile a pattern.

If the JIT isn’t available or if JIT compilation returns an error, then a debug message with the error will be emitted and the regex will otherwise silently fall back to non-JIT matching.

This is disabled by default, and always overrides jit.

pub fn max_jit_stack_size(&mut self, bytes: Option<usize>) -> &mut RegexBuilder

Set the maximum size of PCRE2’s JIT stack, in bytes. If the JIT is not enabled, then this has no effect.

When None is given, no custom JIT stack will be created, and instead, the default JIT stack is used. When the default is used, its maximum size is 32 KB.

When this is set, then a new JIT stack will be created with the given maximum size as its limit.

Increasing the stack size can be useful for larger regular expressions.

By default, this is set to None.