Struct pcre2::bytes::Regex
A compiled PCRE2 regular expression.
This regex is safe to use from multiple threads simultaneously. For top performance, it is better to clone a new regex for each thread.
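The per-thread clone advice can be sketched with the standard library alone. In the sketch below, `Needle` and `match_per_thread` are hypothetical stand-ins invented for illustration and are not part of this crate; the same shape should apply to any matcher that is `Clone` and `Send`, assuming `Regex` is cloneable as the sentence above implies.

```rust
use std::thread;

/// Hypothetical stand-in for a compiled matcher (not a pcre2 type).
#[derive(Clone)]
struct Needle(Vec<u8>);

impl Needle {
    /// Plain substring search; an empty needle never matches in this toy.
    fn is_match(&self, subject: &[u8]) -> bool {
        !self.0.is_empty()
            && subject.windows(self.0.len()).any(|w| w == self.0.as_slice())
    }
}

/// Spawns one thread per subject and hands each thread its own clone of
/// the matcher, so no thread shares matcher state with another.
fn match_per_thread(matcher: &Needle, subjects: Vec<Vec<u8>>) -> Vec<bool> {
    let handles: Vec<_> = subjects
        .into_iter()
        .map(|subject| {
            let local = matcher.clone(); // each thread owns its own copy
            thread::spawn(move || local.is_match(&subject))
        })
        .collect();
    // Join in spawn order so results line up with the input subjects.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}
```

Sharing one value by reference would also be sound for a `Sync` type; cloning per thread trades a little memory for avoiding contention on shared internal state.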
Methods
impl Regex
pub fn new(pattern: &str) -> Result<Regex, Error>
Compiles a regular expression using the default configuration.
Once compiled, it can be used repeatedly to search subject byte strings.
If an invalid expression is given, then an error is returned.
To configure compilation options for the regex, use the RegexBuilder.
pub fn is_match(&self, subject: &[u8]) -> Result<bool, Error>
Returns true if and only if the regex matches the subject string given.
Example
Test if some text contains at least one word with exactly 13 ASCII word bytes:
    use pcre2::bytes::Regex;

    let text = b"I categorically deny having triskaidekaphobia.";
    assert!(Regex::new(r"\b\w{13}\b")?.is_match(text)?);
pub fn find<'s>(&self, subject: &'s [u8]) -> Result<Option<Match<'s>>, Error>
Returns the start and end byte range of the leftmost-first match in subject. If no match exists, then None is returned.
Example
Find the start and end location of the first word with exactly 13 ASCII word bytes:
    use pcre2::bytes::Regex;

    let text = b"I categorically deny having triskaidekaphobia.";
    let mat = Regex::new(r"\b\w{13}\b")?.find(text)?.unwrap();
    assert_eq!((mat.start(), mat.end()), (2, 15));
pub fn find_iter<'r, 's>(&'r self, subject: &'s [u8]) -> Matches<'r, 's>
Returns an iterator for each successive non-overlapping match in subject, returning the start and end byte indices with respect to subject.
Example
Find the start and end location of every word with exactly 13 ASCII word bytes:
    use pcre2::bytes::Regex;

    let text = b"Retroactively relinquishing remunerations is reprehensible.";
    for result in Regex::new(r"\b\w{13}\b")?.find_iter(text) {
        let mat = result?;
        println!("{:?}", mat);
    }
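To make the meaning of the pattern concrete, here is a plain-Rust cross-check that does not use this crate. It assumes the default, non-Unicode meaning of \w (ASCII letters, digits and underscore), under which \b\w{13}\b matches exactly the maximal runs of 13 word bytes. The helper name `word_runs_of_len` is hypothetical.

```rust
/// Returns the (start, end) byte ranges of every maximal run of ASCII word
/// bytes (`[A-Za-z0-9_]`) whose length is exactly `n`, scanning left to
/// right so the ranges never overlap.
fn word_runs_of_len(subject: &[u8], n: usize) -> Vec<(usize, usize)> {
    let is_word = |b: u8| b.is_ascii_alphanumeric() || b == b'_';
    let mut ranges = Vec::new();
    let mut i = 0;
    while i < subject.len() {
        if !is_word(subject[i]) {
            i += 1;
            continue;
        }
        // Extend to the end of the current run of word bytes.
        let start = i;
        while i < subject.len() && is_word(subject[i]) {
            i += 1;
        }
        if i - start == n {
            ranges.push((start, i));
        }
    }
    ranges
}
```

For the subject used above, this sketch yields the ranges (0, 13), (14, 27), (28, 41) and (45, 58), one per 13-letter word.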
pub fn captures<'s>(
&self,
subject: &'s [u8]
) -> Result<Option<Captures<'s>>, Error>
Returns the capture groups corresponding to the leftmost-first match in subject. Capture group 0 always corresponds to the entire match. If no match is found, then None is returned.
Examples
Say you have some text with movie names and their release years, like "'Citizen Kane' (1941)". It'd be nice if we could search for text looking like that, while also extracting the movie name and its release year separately.
    use pcre2::bytes::Regex;

    let re = Regex::new(r"'([^']+)'\s+\((\d{4})\)")?;
    let text = b"Not my favorite movie: 'Citizen Kane' (1941).";
    let caps = re.captures(text)?.unwrap();
    // Access the groups by index using the Index notation.
    // Note that this will panic on an invalid index.
    assert_eq!(&caps[1], &b"Citizen Kane"[..]);
    assert_eq!(&caps[2], &b"1941"[..]);
    assert_eq!(&caps[0], &b"'Citizen Kane' (1941)"[..]);
    // The right-hand side can also be a byte string literal without the
    // explicit slice.
    assert_eq!(&caps[1], b"Citizen Kane");
    assert_eq!(&caps[2], b"1941");
    assert_eq!(&caps[0], b"'Citizen Kane' (1941)");
Note that the full match is at capture group 0. Each subsequent capture group is indexed by the order of its opening (.
We can make this example a bit clearer by using named capture groups:
    use pcre2::bytes::Regex;

    let re = Regex::new(r"'(?P<title>[^']+)'\s+\((?P<year>\d{4})\)")?;
    let text = b"Not my favorite movie: 'Citizen Kane' (1941).";
    let caps = re.captures(text)?.unwrap();
    // Access the groups by name using the Index notation.
    // Note that this will panic on an invalid group name.
    assert_eq!(&caps["title"], &b"Citizen Kane"[..]);
    assert_eq!(&caps["year"], &b"1941"[..]);
    assert_eq!(&caps[0], &b"'Citizen Kane' (1941)"[..]);
    // The right-hand side can also be a byte string literal without the
    // explicit slice.
    assert_eq!(&caps["title"], b"Citizen Kane");
    assert_eq!(&caps["year"], b"1941");
    assert_eq!(&caps[0], b"'Citizen Kane' (1941)");
Here we name the capture groups, which we can access with the name method or the Index notation with a &str. Note that the named capture groups are still accessible with get or the Index notation with a usize.
The 0th capture group is always unnamed, so it must always be accessed with get(0) or [0].
pub fn captures_iter<'r, 's>(
&'r self,
subject: &'s [u8]
) -> CaptureMatches<'r, 's>
Returns an iterator over all the non-overlapping capture groups matched in subject. This is operationally the same as find_iter, except it yields information about capturing group matches.
Example
We can use this to find all movie titles and their release years in some text, where the movie is formatted like "'Title' (xxxx)":
    use std::str;
    use pcre2::bytes::Regex;

    let re = Regex::new(r"'(?P<title>[^']+)'\s+\((?P<year>\d{4})\)")?;
    let text = b"'Citizen Kane' (1941), 'The Wizard of Oz' (1939), 'M' (1931).";
    for result in re.captures_iter(text) {
        let caps = result?;
        let title = str::from_utf8(&caps["title"]).unwrap();
        let year = str::from_utf8(&caps["year"]).unwrap();
        // `{}` prints the strings without surrounding quotes.
        println!("Movie: {}, Released: {}", title, year);
    }
    // Output:
    // Movie: Citizen Kane, Released: 1941
    // Movie: The Wizard of Oz, Released: 1939
    // Movie: M, Released: 1931
impl Regex
Advanced or "lower level" search methods.
pub fn is_match_at(&self, subject: &[u8], start: usize) -> Result<bool, Error>
Returns the same as is_match, but starts the search at the given offset.
The significance of the starting point is that it takes the surrounding context into consideration. For example, the \A anchor can only match when start == 0.
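Why the offset differs from simply slicing the subject can be shown with a toy model of a context-sensitive assertion. The sketch below is std-only and does not call this crate; `word_starts_at` is a hypothetical helper that mimics a word-boundary check by looking at the byte before `start`, which is exactly the context a slice would throw away.

```rust
/// ASCII word byte: letter, digit or underscore.
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Toy model of a word-boundary check tried at exactly `start`: true when a
/// word byte sits at `start` and the preceding byte (if any) is not a word
/// byte. The preceding byte is the "surrounding context".
fn word_starts_at(subject: &[u8], start: usize) -> bool {
    let at_word = subject.get(start).map_or(false, |&b| is_word_byte(b));
    let after_word =
        start > 0 && subject.get(start - 1).map_or(false, |&b| is_word_byte(b));
    at_word && !after_word
}
```

With the full subject, `word_starts_at(b"foobar", 3)` is false because byte 2 is a word byte, whereas `word_starts_at(&b"foobar"[3..], 0)` is true because the slice hides that byte. The offset-taking methods correspond to the first call.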
pub fn find_at<'s>(
&self,
subject: &'s [u8],
start: usize
) -> Result<Option<Match<'s>>, Error>
Returns the same as find, but starts the search at the given offset.
The significance of the starting point is that it takes the surrounding context into consideration. For example, the \A anchor can only match when start == 0.
pub fn captures_read<'s>(
&self,
locs: &mut CaptureLocations,
subject: &'s [u8]
) -> Result<Option<Match<'s>>, Error>
This is like captures, but uses CaptureLocations instead of Captures in order to amortize allocations.
To create a CaptureLocations value, use the Regex::capture_locations method.
This returns the overall match if this was successful, which is always equivalent to the 0th capture group.
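The amortization idea is the familiar reusable out-parameter pattern. The sketch below illustrates it with the standard library only; `Locations` and `read_pair` are hypothetical stand-ins (not this crate's types), and the "pattern" is a hard-coded key=value split rather than a regex.

```rust
/// Hypothetical stand-in for a capture-locations buffer: slot 0 is the
/// whole match, slots 1 and 2 are the key and value of a `key=value` pair.
type Locations = Vec<Option<(usize, usize)>>;

/// Fills `locs` in place and returns the overall match range, if any.
/// Because the caller owns `locs`, repeated calls reuse its allocation
/// instead of building a fresh container for every search.
fn read_pair(locs: &mut Locations, subject: &[u8]) -> Option<(usize, usize)> {
    locs.clear(); // drops old entries but keeps the allocated capacity
    let eq = subject.iter().position(|&b| b == b'=')?;
    locs.push(Some((0, subject.len())));      // group 0: entire match
    locs.push(Some((0, eq)));                 // group 1: key
    locs.push(Some((eq + 1, subject.len()))); // group 2: value
    locs[0]
}
```

A caller would create the buffer once, then pass `&mut` to it for every subject in a loop, reading the slots after each successful call.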
pub fn captures_read_at<'s>(
&self,
locs: &mut CaptureLocations,
subject: &'s [u8],
start: usize
) -> Result<Option<Match<'s>>, Error>
Returns the same as captures_read, but starts the search at the given offset and populates the capture locations given.
The significance of the starting point is that it takes the surrounding context into consideration. For example, the \A anchor can only match when start == 0.
impl Regex
Auxiliary methods.
pub fn as_str(&self) -> &str
Returns the original pattern string for this regex.
pub fn capture_names(&self) -> &[Option<String>]
Returns a sequence of all capturing groups and their names, if present.
The length of the slice returned is always equal to the result of captures_len, which is the number of capturing groups (including the capturing group for the entire pattern).
Each entry in the slice is the name of the corresponding capturing group, if one exists. The first capturing group (at index 0) is always unnamed.
Capturing groups are indexed by the order of the opening parenthesis.
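The numbering rule can be illustrated with a deliberately simplified scanner. This is a sketch under stated assumptions, not how PCRE2 or this crate derives group names: the hypothetical `sketch_capture_names` only understands backslash escapes, simple character classes, and the `(?P<name>` and `(?<name>` forms, and it ignores extended-mode comments, `\Q...\E` quoting, `(?'name'...)` groups and branch resets.

```rust
/// Extracts the name from the text following an opening parenthesis, if it
/// starts a named group written as `?P<name>` or `?<name>`.
fn group_name(rest: &str) -> Option<String> {
    let after = rest.strip_prefix("?P<").or_else(|| rest.strip_prefix("?<"))?;
    if after.starts_with('=') || after.starts_with('!') {
        return None; // `(?<=` and `(?<!` are lookbehinds, not groups
    }
    let end = after.find('>')?;
    Some(after[..end].to_string())
}

/// Simplified model of group numbering: index 0 is the entire match, and
/// every later index follows the order of the opening parentheses.
fn sketch_capture_names(pattern: &str) -> Vec<Option<String>> {
    let bytes = pattern.as_bytes();
    let mut names = vec![None]; // group 0 is always unnamed
    let mut in_class = false;
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'\\' => i += 1, // skip the escaped byte as well
            b'[' if !in_class => in_class = true,
            b']' if in_class => in_class = false,
            b'(' if !in_class => {
                let rest = &pattern[i + 1..];
                if let Some(name) = group_name(rest) {
                    names.push(Some(name));
                } else if !rest.starts_with('?') && !rest.starts_with('*') {
                    names.push(None); // plain capturing group
                }
            }
            _ => {}
        }
        i += 1;
    }
    names
}
```

For the movie pattern with named groups used earlier, the sketch returns three entries: None, then title, then year. The length is one more than the number of groups written in the pattern, which is the same relationship the slice length has to the group count here.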
pub fn captures_len(&self) -> usize
Returns the number of capturing groups in the pattern.
This is always 1 more than the number of syntactic groups in the pattern, since the first group always corresponds to the entire match.
pub fn capture_locations(&self) -> CaptureLocations
Returns an empty set of capture locations that can be reused in multiple calls to captures_read or captures_read_at.
Trait Implementations
Auto Trait Implementations
impl Sync for Regex
impl Send for Regex
impl Unpin for Regex
impl !RefUnwindSafe for Regex
impl UnwindSafe for Regex
Blanket Implementations
impl<T, U> Into<U> for T where
U: From<T>,
impl<T> From<T> for T
impl<T> ToOwned for T where
T: Clone,
type Owned = T
The resulting type after obtaining ownership.
fn to_owned(&self) -> T
fn clone_into(&self, target: &mut T)
impl<T, U> TryFrom<U> for T where
U: Into<T>,
type Error = Infallible
The type returned in the event of a conversion error.
fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
impl<T, U> TryInto<U> for T where
U: TryFrom<T>,
type Error = <U as TryFrom<T>>::Error
The type returned in the event of a conversion error.
fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Borrow<T> for T where
T: ?Sized,
impl<T> Any for T where
T: 'static + ?Sized,