Struct pcre2::bytes::Regex

pub struct Regex { /* fields omitted */ }

A compiled PCRE2 regular expression.

This regex is safe to use from multiple threads simultaneously. For best performance, however, give each thread its own clone of the regex rather than sharing a single instance.
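As a rough illustration of the clone-per-thread advice, the sketch below gives every worker thread its own clone of a matcher. It is written against a generic `M: Clone + Send` rather than this crate's types so that it stays self-contained. `search_per_thread` is a hypothetical helper, not part of pcre2.

```rust
use std::thread;

/// Runs `search` over each subject on its own thread. Every thread receives
/// a private clone of `matcher` instead of sharing one instance.
/// (Hypothetical helper for illustration; not part of the pcre2 crate.)
fn search_per_thread<M, F>(matcher: &M, subjects: Vec<Vec<u8>>, search: F) -> Vec<bool>
where
    M: Clone + Send + 'static,
    F: Fn(&M, &[u8]) -> bool + Send + Copy + 'static,
{
    let handles: Vec<_> = subjects
        .into_iter()
        .map(|subject| {
            // One clone per thread, made before the thread starts.
            let local = matcher.clone();
            thread::spawn(move || search(&local, &subject))
        })
        .collect();

    // Join in spawn order so results line up with the input subjects.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}
```

Regex is listed under the trait implementations as Clone, Send and Sync, so it should satisfy the `M` bound. The closure passed as `search` would then call is_match and decide how to treat an error result.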

Methods

impl Regex

Compiles a regular expression using the default configuration.

Once compiled, the regex can be reused for any number of searches.

If an invalid expression is given, then an error is returned.

To configure compilation options for the regex, use the RegexBuilder.

Returns true if and only if the regex matches the subject string given.

Example

Test if some text contains at least one word with exactly 13 ASCII word bytes:

use pcre2::bytes::Regex;

let text = b"I categorically deny having triskaidekaphobia.";
assert!(Regex::new(r"\b\w{13}\b")?.is_match(text)?);

Returns the start and end byte range of the leftmost-first match in subject. If no match exists, then None is returned.

Example

Find the start and end location of the first word with exactly 13 ASCII word bytes:

use pcre2::bytes::Regex;

let text = b"I categorically deny having triskaidekaphobia.";
let mat = Regex::new(r"\b\w{13}\b")?.find(text)?.unwrap();
assert_eq!((mat.start(), mat.end()), (2, 15));

Returns an iterator for each successive non-overlapping match in subject, returning the start and end byte indices with respect to subject.

Example

Find the start and end location of every word with exactly 13 ASCII word bytes:

use pcre2::bytes::Regex;

let text = b"Retroactively relinquishing remunerations is reprehensible.";
for result in Regex::new(r"\b\w{13}\b")?.find_iter(text) {
    let mat = result?;
    println!("{:?}", mat);
}
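The idea of successive non-overlapping matches can be sketched without the crate: call a search-from-offset function repeatedly, resuming where the previous match ended. Both `find_all` and `find_word13_at` (a hand-written stand-in for the pattern \b\w{13}\b) are hypothetical helpers, and this is not necessarily how find_iter is implemented internally.

```rust
fn is_word(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Stand-in for searching `\b\w{13}\b` from offset `start`. It inspects the
/// byte before each candidate, so it sees the context around the offset.
fn find_word13_at(subject: &[u8], start: usize) -> Option<(usize, usize)> {
    let mut i = start;
    while i < subject.len() {
        let word_start = is_word(subject[i]) && (i == 0 || !is_word(subject[i - 1]));
        if word_start {
            let len = subject[i..].iter().take_while(|&&b| is_word(b)).count();
            if len == 13 {
                return Some((i, i + len));
            }
            i += len; // skip the rest of a word with the wrong length
        } else {
            i += 1;
        }
    }
    None
}

/// Collects non-overlapping matches by resuming each search at the end of
/// the previous match.
fn find_all<F>(subject: &[u8], find_at: F) -> Vec<(usize, usize)>
where
    F: Fn(&[u8], usize) -> Option<(usize, usize)>,
{
    let mut out = Vec::new();
    let mut pos = 0;
    let mut last_end = None;
    while pos <= subject.len() {
        let (start, end) = match find_at(subject, pos) {
            Some(range) => range,
            None => break,
        };
        if start == end {
            // An empty match must still make progress.
            pos = end + 1;
            // Skip an empty match that touches the previous match.
            if last_end == Some(end) {
                continue;
            }
        } else {
            pos = end;
        }
        last_end = Some(end);
        out.push((start, end));
    }
    out
}
```

Applied to the text from the example above, `find_all(text, find_word13_at)` yields the ranges (0, 13), (14, 27), (28, 41) and (45, 58).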

Returns the capture groups corresponding to the leftmost-first match in subject. Capture group 0 always corresponds to the entire match. If no match is found, then None is returned.

Examples

Say you have some text with movie names and their release years, like "'Citizen Kane' (1941)". It'd be nice if we could search for text looking like that, while also extracting the movie name and its release year separately.

use pcre2::bytes::Regex;

let re = Regex::new(r"'([^']+)'\s+\((\d{4})\)")?;
let text = b"Not my favorite movie: 'Citizen Kane' (1941).";
let caps = re.captures(text)?.unwrap();
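// Each group is returned as an Option by get, so a missing group can be
// handled without panicking.
assert_eq!(caps.get(1).unwrap().as_bytes(), &b"Citizen Kane"[..]);
assert_eq!(caps.get(2).unwrap().as_bytes(), &b"1941"[..]);
assert_eq!(caps.get(0).unwrap().as_bytes(), &b"'Citizen Kane' (1941)"[..]);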
// You can also access the groups by index using the Index notation.
// Note that this will panic on an invalid index.
assert_eq!(&caps[1], b"Citizen Kane");
assert_eq!(&caps[2], b"1941");
assert_eq!(&caps[0], b"'Citizen Kane' (1941)");

Note that the full match is at capture group 0. Each subsequent capture group is numbered by the position of its opening parenthesis.

We can make this example a bit clearer by using named capture groups:

use pcre2::bytes::Regex;

let re = Regex::new(r"'(?P<title>[^']+)'\s+\((?P<year>\d{4})\)")?;
let text = b"Not my favorite movie: 'Citizen Kane' (1941).";
let caps = re.captures(text)?.unwrap();
assert_eq!(caps.name("title").unwrap().as_bytes(), &b"Citizen Kane"[..]);
assert_eq!(caps.name("year").unwrap().as_bytes(), &b"1941"[..]);
assert_eq!(caps.get(0).unwrap().as_bytes(), &b"'Citizen Kane' (1941)"[..]);
// You can also access the groups by name using the Index notation.
// Note that this will panic on an invalid group name.
assert_eq!(&caps["title"], b"Citizen Kane");
assert_eq!(&caps["year"], b"1941");
assert_eq!(&caps[0], b"'Citizen Kane' (1941)");

Here we name the capture groups, which we can access with the name method or the Index notation with a &str. Note that the named capture groups are still accessible with get or the Index notation with a usize.

The 0th capture group is always unnamed, so it must always be accessed with get(0) or [0].

Returns an iterator over all the non-overlapping capture groups matched in subject. This is operationally the same as find_iter, except it yields information about capturing group matches.

Example

We can use this to find all movie titles and their release years in some text, where the movie is formatted like "'Title' (xxxx)":

use std::str;

use pcre2::bytes::Regex;

let re = Regex::new(r"'(?P<title>[^']+)'\s+\((?P<year>\d{4})\)")?;
let text = b"'Citizen Kane' (1941), 'The Wizard of Oz' (1939), 'M' (1931).";
for result in re.captures_iter(text) {
    let caps = result?;
    let title = str::from_utf8(&caps["title"]).unwrap();
    let year = str::from_utf8(&caps["year"]).unwrap();
    println!("Movie: {}, Released: {}", title, year);
}
// Output:
// Movie: Citizen Kane, Released: 1941
// Movie: The Wizard of Oz, Released: 1939
// Movie: M, Released: 1931

impl Regex

Advanced or "lower level" search methods.

Returns the same as is_match, but starts the search at the given offset.

The significance of the starting point is that it takes the surrounding context into consideration. For example, the \A anchor can only match when start == 0.

Returns the same as find, but starts the search at the given offset.

The significance of the starting point is that it takes the surrounding context into consideration. For example, the \A anchor can only match when start == 0.

This is like captures, but uses CaptureLocations instead of Captures in order to amortize allocations.

To create a CaptureLocations value, use the Regex::capture_locations method.

If the search succeeds, this returns the overall match, which is always equivalent to the 0th capture group.

Returns the same as captures_read, but starts the search at the given offset and populates the capture locations given.

The significance of the starting point is that it takes the surrounding context into consideration. For example, the \A anchor can only match when start == 0.
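To see why searching at an offset is not the same as slicing the subject first, here is a small hand-written model of two context-sensitive assertions, \A and \b. The helper names (`at_text_start`, `at_word_boundary`) are hypothetical and the code does not call into PCRE2. It only mimics the rule that an assertion evaluated at position `pos` may look at the bytes before `pos`.

```rust
fn is_word(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Model of `\A`: holds only at the very start of the subject.
fn at_text_start(_subject: &[u8], pos: usize) -> bool {
    pos == 0
}

/// Model of `\b`: holds when exactly one of the two neighbouring bytes is a
/// word byte. The byte before `pos` is part of the surrounding context.
fn at_word_boundary(subject: &[u8], pos: usize) -> bool {
    let before = pos > 0 && is_word(subject[pos - 1]);
    let after = pos < subject.len() && is_word(subject[pos]);
    before != after
}
```

For the subject b"foobar" and offset 3, both helpers return false, because the position sits in the middle of a word and is not the start of the text. For the slice &subject[3..] at position 0, both return true, since slicing discards the preceding bytes.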

impl Regex

Auxiliary methods.

Returns the original pattern string for this regex.

Returns a sequence of all capturing groups and their names, if present.

The length of the slice returned is always equal to the result of captures_len, which is the number of capturing groups (including the capturing group for the entire pattern).

Each entry in the slice is the name of the corresponding capturing group, if one exists. The first capturing group (at index 0) is always unnamed.

Capturing groups are indexed by the order of the opening parenthesis.

Returns the number of capturing groups in the pattern.

This is always 1 more than the number of syntactic groups in the pattern, since the first group always corresponds to the entire match.
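The numbering rule can be illustrated with a deliberately simplified scanner that lists groups in the order of their opening parenthesis. `group_names` is a hypothetical helper, not part of the crate. It only understands plain groups, named groups, `(?...)` and `(*...)` constructs that do not capture, backslash escapes and simple character classes. Full PCRE2 syntax (for example \Q...\E, branch reset groups or extended-mode comments) is out of scope.

```rust
/// Simplified model: one entry per capture group, in order of opening
/// parenthesis. Entry 0 stands for the whole match and is always unnamed.
/// (Hypothetical helper; handles only a small subset of PCRE2 syntax.)
fn group_names(pattern: &str) -> Vec<Option<String>> {
    let bytes = pattern.as_bytes();
    let mut names = vec![None]; // group 0: the entire match
    let mut in_class = false;
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            // Skip the escaped byte, so `\(` is treated as a literal.
            b'\\' => i += 1,
            b'[' if !in_class => in_class = true,
            b']' if in_class => in_class = false,
            b'(' if !in_class => {
                let rest = &pattern[i + 1..];
                let named = ["?P<", "?<", "?'"]
                    .iter()
                    .find_map(|prefix| rest.strip_prefix(*prefix));
                if let Some(after) = named {
                    // `(?<=` and `(?<!` are lookbehinds, not groups.
                    if !(after.starts_with('=') || after.starts_with('!')) {
                        let name: String = after
                            .chars()
                            .take_while(|c| c.is_alphanumeric() || *c == '_')
                            .collect();
                        names.push(Some(name));
                    }
                } else if !rest.starts_with('?') && !rest.starts_with('*') {
                    names.push(None); // plain, unnamed capturing group
                }
            }
            _ => {}
        }
        i += 1;
    }
    names
}
```

For the named movie pattern used earlier, this returns [None, Some("title"), Some("year")]. The length is 3: the two syntactic groups plus group 0 for the entire match.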

Returns an empty set of capture locations that can be reused in multiple calls to captures_read or captures_read_at.

Trait Implementations

impl Clone for Regex

Returns a copy of the value.

Performs copy-assignment from source.

impl Debug for Regex

Formats the value using the given formatter.

Auto Trait Implementations

impl Send for Regex

impl Sync for Regex