Struct grep_pcre2::RegexMatcherBuilder

pub struct RegexMatcherBuilder { /* fields omitted */ }

A builder for configuring the compilation of a PCRE2 regex.

Methods

impl RegexMatcherBuilder

pub fn new() -> RegexMatcherBuilder

Create a new matcher builder with a default configuration.

pub fn build(&self, pattern: &str) -> Result<RegexMatcher, Error>

Compile the given pattern into a PCRE2 matcher using the current configuration.

If there was a problem compiling the pattern, then an error is returned.

pub fn caseless(&mut self, yes: bool) -> &mut RegexMatcherBuilder

Enables case insensitive matching.

If the utf option is also set, then Unicode case folding is used to determine case insensitivity. When the utf option is not set, then only standard ASCII case insensitivity is considered.

This option corresponds to the i flag.
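The difference between ASCII-only and Unicode-aware case insensitivity can be illustrated with the standard library alone. The sketch below is not how PCRE2 implements case folding; the helper names are hypothetical, and simple lowercasing is used as a rough stand-in for Unicode case folding.

```rust
/// Hypothetical helper: ASCII-only comparison, roughly what caseless
/// matching considers when the utf option is not set.
fn eq_ascii_caseless(a: &str, b: &str) -> bool {
    a.eq_ignore_ascii_case(b)
}

/// Hypothetical helper: Unicode-aware comparison, approximated here by
/// lowercasing both sides (an assumption, not true case folding).
fn eq_unicode_caseless(a: &str, b: &str) -> bool {
    a.to_lowercase() == b.to_lowercase()
}
```

With these helpers, "FOO" and "foo" compare equal either way, while "\u{c9}" (É) and "\u{e9}" (é) compare equal only under the Unicode-aware comparison.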

pub fn case_smart(&mut self, yes: bool) -> &mut RegexMatcherBuilder

Whether to enable "smart case" or not.

When smart case is enabled, the builder will automatically enable case insensitive matching based on how the pattern is written. Namely, case insensitive mode is enabled when both of the following things are believed to be true:

  1. The pattern contains at least one literal character. For example, a\w contains a literal (a) but \w does not.
  2. Of the literals in the pattern, none of them are considered to be uppercase according to Unicode. For example, foo\pL has no uppercase literals but Foo\pL does.

Note that the implementation of this is not perfect. Namely, \p{Ll} will prevent case insensitive matching even though it is part of a meta sequence. This bug will probably never be fixed.
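The two rules can be modeled with a small standalone function. This is a simplified sketch under stated assumptions, not the builder's actual implementation: the function name is hypothetical, every escape sequence is treated as a non-literal, and only a handful of common metacharacters are recognized.

```rust
/// Hypothetical helper: returns true when the smart case rules described
/// above would enable case insensitive matching for `pattern`.
fn smart_case_insensitive(pattern: &str) -> bool {
    let mut chars = pattern.chars().peekable();
    let mut saw_literal = false;
    while let Some(c) = chars.next() {
        match c {
            '\\' => {
                // Treat the whole escape as a meta sequence, not a literal.
                if let Some(e) = chars.next() {
                    if e == 'p' || e == 'P' {
                        // Skip the property name: `{...}` or a single letter.
                        if chars.peek() == Some(&'{') {
                            for d in chars.by_ref() {
                                if d == '}' {
                                    break;
                                }
                            }
                        } else {
                            chars.next();
                        }
                    }
                }
            }
            // Common metacharacters do not count as literals.
            '.' | '*' | '+' | '?' | '(' | ')' | '[' | ']' | '{' | '}' | '|' | '^' | '$' => {}
            _ => {
                // Rule 2: any uppercase literal disables smart case.
                if c.is_uppercase() {
                    return false;
                }
                // Rule 1: remember that at least one literal was seen.
                saw_literal = true;
            }
        }
    }
    saw_literal
}
```

Under this model, a\w and foo\pL enable case insensitive matching, while \w (no literal) and Foo\pL (uppercase literal) do not.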

pub fn dotall(&mut self, yes: bool) -> &mut RegexMatcherBuilder

Enables "dot all" matching.

When enabled, the . metacharacter in the pattern matches any character, including \n. When disabled (the default), . will match any character except for \n.

This option corresponds to the s flag.

pub fn extended(&mut self, yes: bool) -> &mut RegexMatcherBuilder

Enable "extended" mode in the pattern, where whitespace is ignored.

This option corresponds to the x flag.

pub fn multi_line(&mut self, yes: bool) -> &mut RegexMatcherBuilder

Enable multiline matching mode.

When enabled, the ^ and $ anchors will match both at the beginning and end of a subject string, in addition to matching at the start of a line and the end of a line. When disabled, the ^ and $ anchors will only match at the beginning and end of a subject string.
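As a rough model of the rule above, the following sketch lists the byte offsets at which ^ could match. The function name is hypothetical, only \n is treated as a line terminator, and the handling of a trailing newline is an assumption about PCRE2's behavior rather than something this page states.

```rust
/// Hypothetical helper: byte offsets where `^` could match in `subject`.
fn caret_positions(subject: &str, multi_line: bool) -> Vec<usize> {
    // `^` always matches at the very beginning of the subject.
    let mut positions = vec![0];
    if multi_line {
        for (i, b) in subject.bytes().enumerate() {
            // Assumption: `^` does not match after a newline that ends
            // the subject, so only internal newlines start a new line.
            if b == b'\n' && i + 1 < subject.len() {
                positions.push(i + 1);
            }
        }
    }
    positions
}
```

For the subject "a\nb\n", this yields only offset 0 when multi-line mode is disabled, and offsets 0 and 2 when it is enabled.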

This option corresponds to the m flag.

pub fn crlf(&mut self, yes: bool) -> &mut RegexMatcherBuilder

Enable matching of CRLF as a line terminator.

When enabled, anchors such as ^ and $ will match any of the following as a line terminator: \r, \n or \r\n.

This is disabled by default, in which case, only \n is recognized as a line terminator.
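The effect on what counts as a line can be sketched by splitting a subject on the recognized terminators. This is only an illustration of the rule as described here, with a hypothetical function name; it does not call into PCRE2.

```rust
/// Hypothetical helper: splits `subject` into lines. With `crlf` set,
/// `\r`, `\n` and `\r\n` all end a line; otherwise only `\n` does.
fn split_lines(subject: &str, crlf: bool) -> Vec<&str> {
    let bytes = subject.as_bytes();
    let mut lines = Vec::new();
    let mut start = 0;
    let mut i = 0;
    while i < bytes.len() {
        // Length of the line terminator starting at `i`, or 0 if none.
        let term_len = match bytes[i] {
            b'\n' => 1,
            b'\r' if crlf => {
                if bytes.get(i + 1) == Some(&b'\n') { 2 } else { 1 }
            }
            _ => 0,
        };
        if term_len > 0 {
            lines.push(&subject[start..i]);
            i += term_len;
            start = i;
        } else {
            i += 1;
        }
    }
    // Keep a final line that has no terminator.
    if start < bytes.len() {
        lines.push(&subject[start..]);
    }
    lines
}
```

For the subject "a\r\nb\rc\nd", enabling crlf gives four lines (a, b, c, d), while the default gives three, with the \r bytes left inside the first two lines.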

pub fn word(&mut self, yes: bool) -> &mut RegexMatcherBuilder

Require that all matches occur on word boundaries.

Enabling this option is subtly different than putting \b assertions on both sides of your pattern. In particular, a \b assertion requires that one side of it match a word character while the other match a non-word character. This option, in contrast, merely requires that one side match a non-word character.

For example, \b-2\b will not match foo -2 bar since - is not a word character. However, -2 with this word option enabled will match the -2 in foo -2 bar.
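One way to read the difference is sketched below with two standalone checks over ASCII word characters. Both function names are hypothetical, and the second check is an interpretation of the description above (the characters just outside the match must not be word characters), not the crate's actual implementation.

```rust
/// ASCII word characters only, for simplicity.
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// `\b` semantics: exactly one neighbor of `pos` is a word character.
fn is_boundary(haystack: &[u8], pos: usize) -> bool {
    let before = pos > 0 && is_word_byte(haystack[pos - 1]);
    let after = pos < haystack.len() && is_word_byte(haystack[pos]);
    before != after
}

/// Word option semantics as interpreted here: neither the byte before
/// `start` nor the byte at `end` may be a word character.
fn is_word_match(haystack: &[u8], start: usize, end: usize) -> bool {
    let before = start > 0 && is_word_byte(haystack[start - 1]);
    let after = end < haystack.len() && is_word_byte(haystack[end]);
    !before && !after
}
```

In foo -2 bar, the -2 spans bytes 4..6. There is no \b boundary at offset 4, because both the space and the - are non-word characters, so \b-2\b fails. The word option check accepts the same span, because the bytes on either side of it are spaces.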

pub fn ucp(&mut self, yes: bool) -> &mut RegexMatcherBuilder

Enable Unicode matching mode.

When enabled, the following patterns become Unicode aware: \b, \B, \d, \D, \s, \S, \w, \W.

When set, this implies UTF matching mode. It is not possible to enable Unicode matching mode without enabling UTF matching mode.

This is disabled by default.

pub fn utf(&mut self, yes: bool) -> &mut RegexMatcherBuilder

Enable UTF matching mode.

When enabled, characters are treated as sequences of code units that make up a single codepoint instead of as single bytes. For example, this will cause . to match any single UTF-8 encoded codepoint, whereas when this is disabled, . will match any single byte (except for \n in both cases, unless "dot all" mode is enabled).
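The byte versus codepoint distinction can be made concrete with a small model of how many bytes a single . would consume at the start of a subject. The function name and its flags are hypothetical, and the sketch validates the whole subject in UTF mode purely for simplicity.

```rust
/// Hypothetical helper: number of bytes one `.` would consume at the
/// start of `subject`, or None if it cannot match there.
fn dot_width(subject: &[u8], utf: bool, dotall: bool) -> Option<usize> {
    let first = *subject.first()?;
    // `.` skips `\n` unless "dot all" mode is enabled.
    if first == b'\n' && !dotall {
        return None;
    }
    if !utf {
        // Without UTF mode, `.` consumes exactly one byte.
        return Some(1);
    }
    // With UTF mode, `.` consumes one whole codepoint.
    let text = std::str::from_utf8(subject).ok()?;
    text.chars().next().map(char::len_utf8)
}
```

For the two-byte encoding of é, this model returns 2 with UTF mode enabled and 1 with it disabled.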

Note that when UTF matching mode is enabled, every search performed will do a UTF-8 validation check, which can impact performance. The UTF-8 check can be disabled via the disable_utf_check option, but it is undefined behavior to enable UTF matching mode and search invalid UTF-8.

This is disabled by default.

pub unsafe fn disable_utf_check(&mut self) -> &mut RegexMatcherBuilder

When UTF matching mode is enabled, this will disable the UTF checking that PCRE2 will normally perform automatically. If UTF matching mode is not enabled, then this has no effect.

UTF checking is enabled by default when UTF matching mode is enabled. If UTF matching mode is enabled and UTF checking is enabled, then PCRE2 will return an error if you attempt to search a subject string that is not valid UTF-8.

Safety

It is undefined behavior to disable the UTF check in UTF matching mode and search a subject string that is not valid UTF-8. When the UTF check is disabled, callers must guarantee that the subject string is valid UTF-8.
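One way a caller might uphold this guarantee is to validate the subject once up front and only search data that passed. The sketch below uses only the standard library; the function name is hypothetical, and how the validated text is then handed to a matcher is left out.

```rust
/// Hypothetical helper: returns the subject as `&str` only when it is
/// valid UTF-8, so later searches can rely on that invariant.
fn validated_subject(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}
```

Because a Rust &str is always valid UTF-8, the bytes of any value returned here (via as_bytes) satisfy the requirement above. Subjects that fail validation should not be searched with the UTF check disabled.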

pub fn jit(&mut self, yes: bool) -> &mut RegexMatcherBuilder

Enable PCRE2's JIT and return an error if it's not available.

This generally speeds up matching quite a bit. The downside is that it can increase the time it takes to compile a pattern.

If the JIT isn't available or if JIT compilation returns an error, then regex compilation will fail with the corresponding error.

This is disabled by default, and always overrides jit_if_available.

pub fn jit_if_available(&mut self, yes: bool) -> &mut RegexMatcherBuilder

Enable PCRE2's JIT if it's available.

This generally speeds up matching quite a bit. The downside is that it can increase the time it takes to compile a pattern.

If the JIT isn't available or if JIT compilation returns an error, then a debug message with the error will be emitted and the regex will otherwise silently fall back to non-JIT matching.

This is disabled by default, and always overrides jit.

pub fn max_jit_stack_size(
    &mut self,
    bytes: Option<usize>
) -> &mut RegexMatcherBuilder

Set the maximum size of PCRE2's JIT stack, in bytes. If the JIT is not enabled, then this has no effect.

When None is given, no custom JIT stack will be created, and instead, the default JIT stack is used. When the default is used, its maximum size is 32 KB.

When this is set, then a new JIT stack will be created with the given maximum size as its limit.

Increasing the stack size can be useful for larger regular expressions.

By default, this is set to None.

Trait Implementations

impl Clone for RegexMatcherBuilder

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for RegexMatcherBuilder

Auto Trait Implementations

Blanket Implementations

impl<T> ToOwned for T where
    T: Clone

type Owned = T

impl<T> From<T> for T

impl<T, U> Into<U> for T where
    U: From<T>,

impl<T, U> TryFrom<U> for T where
    U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.