`pub struct RegexSet { /* private fields */ }`

Match multiple, possibly overlapping, regexes in a single search.

A regex set corresponds to the union of zero or more regular expressions.
That is, a regex set will match a haystack when at least one of its
constituent regexes matches. A regex set as it's formulated here provides a
touch more power: it will also report *which* regular expressions in the
set match. Indeed, this is the key difference between regex sets and a
single `Regex` with many alternates, since only one alternate can match at
a time.

For example, consider regular expressions to match email addresses and
domains: `[a-z]+@[a-z]+\.(com|org|net)` and `[a-z]+\.(com|org|net)`. If a
regex set is constructed from those regexes, then searching the haystack
`foo@example.com` will report both regexes as matching. Of course, one
could accomplish this by compiling each regex on its own and doing two
searches over the haystack. The key advantage of using a regex set is
that it will report the matching regexes using a *single pass through the
haystack*. If one has hundreds or thousands of regexes to match repeatedly
(like a URL router for a complex web application or a user agent matcher),
then a regex set *can* realize huge performance gains.

## Limitations

Regex sets are limited to answering the following two questions:

1. Does any regex in the set match?
2. If so, which regexes in the set match?

As with the main `Regex` type, it is cheaper to ask (1)
instead of (2) since the matching engines can stop after the first match
is found.

You cannot directly extract `Match` or `Captures` objects from a regex set.
If you need these
operations, the recommended approach is to compile each pattern in the set
independently and scan the exact same haystack a second time with those
independently compiled patterns:

```
use regex::{Regex, RegexSet};
let patterns = ["foo", "bar"];
// Both patterns will match different ranges of this string.
let hay = "barfoo";
// Compile a set matching any of our patterns.
let set = RegexSet::new(patterns).unwrap();
// Compile each pattern independently.
let regexes: Vec<_> = set
    .patterns()
    .iter()
    .map(|pat| Regex::new(pat).unwrap())
    .collect();
// Match against the whole set first and identify the individual
// matching patterns.
let matches: Vec<&str> = set
    .matches(hay)
    .into_iter()
    // Use the match index to look up the corresponding
    // compiled pattern.
    .map(|index| &regexes[index])
    // To get match locations or any other info, we then have to search the
    // exact same haystack again, using our separately-compiled pattern.
    .map(|re| re.find(hay).unwrap().as_str())
    .collect();
// Matches arrive in the order the constituent patterns were declared,
// not the order they appear in the haystack.
assert_eq!(vec!["foo", "bar"], matches);
```

## Performance

A `RegexSet` has the same performance characteristics as `Regex`. Namely,
search takes `O(m * n)` time, where `m` is proportional to the size of the
regex set and `n` is proportional to the length of the haystack.

## Trait implementations

The `Default` trait is implemented for `RegexSet`. The default value
is an empty set. An empty set can also be explicitly constructed via
`RegexSet::empty`.

## Example

This shows how the above two regexes (for matching email addresses and domains) might work:

```
use regex::RegexSet;
let set = RegexSet::new(&[
    r"[a-z]+@[a-z]+\.(com|org|net)",
    r"[a-z]+\.(com|org|net)",
]).unwrap();
// Ask whether any regexes in the set match.
assert!(set.is_match("foo@example.com"));
// Identify which regexes in the set match.
let matches: Vec<_> = set.matches("foo@example.com").into_iter().collect();
assert_eq!(vec![0, 1], matches);
// Try again, but with a haystack that only matches one of the regexes.
let matches: Vec<_> = set.matches("example.com").into_iter().collect();
assert_eq!(vec![1], matches);
// Try again, but with a haystack that doesn't match any regex in the set.
let matches: Vec<_> = set.matches("example").into_iter().collect();
assert!(matches.is_empty());
```

Note that it would be possible to adapt the above example to use a single `Regex` with an expression like:

```
(?P<email>[a-z]+@(?P<email_domain>[a-z]+[.](com|org|net)))|(?P<domain>[a-z]+[.](com|org|net))
```

After a match, one could then inspect the capture groups to figure out which alternates matched. The problem is that it is hard to make this approach scale when there are many regexes since the overlap between each alternate isn’t always obvious to reason about.

## Implementations

### impl RegexSet

#### pub fn new<I, S>(exprs: I) -> Result<RegexSet, Error> where S: AsRef<str>, I: IntoIterator<Item = S>

Create a new regex set with the given regular expressions.

This takes an iterator of `S`, where `S` is something that can produce
a `&str`. If any of the strings in the iterator are not valid regular
expressions, then an error is returned.

##### Example

Create a new regex set from an iterator of strings:

```
use regex::RegexSet;
let set = RegexSet::new([r"\w+", r"\d+"]).unwrap();
assert!(set.is_match("foo"));
```

#### pub fn empty() -> RegexSet

Create a new empty regex set.

An empty regex set never matches anything.

This is a convenience function for `RegexSet::new([])`, but doesn’t
require one to specify the type of the input.

##### Example

```
use regex::RegexSet;
let set = RegexSet::empty();
assert!(set.is_empty());
// an empty set matches nothing
assert!(!set.is_match(""));
```

#### pub fn is_match(&self, haystack: &str) -> bool

Returns true if and only if one of the regexes in this set matches the haystack given.

This method should be preferred if you only need to test whether any
of the regexes in the set match, but don’t care about *which*
regexes matched. This is because the underlying matching engine will
quit immediately after seeing the first match instead of continuing to
find all matches.

Note that as with searches using `Regex`, the
expression is unanchored by default. That is, if the regex does not
start with `^` or `\A`, or end with `$` or `\z`, then it is permitted
to match anywhere in the haystack.

##### Example

Tests whether a set matches somewhere in a haystack:

```
use regex::RegexSet;
let set = RegexSet::new([r"\w+", r"\d+"]).unwrap();
assert!(set.is_match("foo"));
assert!(!set.is_match("☃"));
```

#### pub fn is_match_at(&self, haystack: &str, start: usize) -> bool

Returns true if and only if one of the regexes in this set matches the haystack given, with the search starting at the offset given.

The significance of the starting point is that it takes the surrounding
context into consideration. For example, the `\A` anchor can only
match when `start == 0`.

##### Panics

This panics when `start >= haystack.len() + 1`.

##### Example

This example shows the significance of `start`. Namely, consider a
haystack `foobar` and a desire to execute a search starting at offset
`3`. You could search a substring explicitly, but then the look-around
assertions won’t work correctly. Instead, you can use this method to
specify the start position of a search.

```
use regex::RegexSet;
let set = RegexSet::new([r"\bbar\b", r"(?m)^bar$"]).unwrap();
let hay = "foobar";
// We get a match here, but it's probably not intended.
assert!(set.is_match(&hay[3..]));
// No match because the assertions take the context into account.
assert!(!set.is_match_at(hay, 3));
```

#### pub fn matches(&self, haystack: &str) -> SetMatches

Returns the set of regexes that match in the given haystack.

The set returned contains the index of each regex that matches in
the given haystack. The index is in correspondence with the order of
regular expressions given to `RegexSet`’s constructor.

The set can also be used to iterate over the matched indices. The order of iteration is always ascending with respect to the matching indices.

Note that as with searches using `Regex`, the
expression is unanchored by default. That is, if the regex does not
start with `^` or `\A`, or end with `$` or `\z`, then it is permitted
to match anywhere in the haystack.

##### Example

Tests which regular expressions match the given haystack:

```
use regex::RegexSet;
let set = RegexSet::new([
    r"\w+",
    r"\d+",
    r"\pL+",
    r"foo",
    r"bar",
    r"barfoo",
    r"foobar",
]).unwrap();
let matches: Vec<_> = set.matches("foobar").into_iter().collect();
assert_eq!(matches, vec![0, 2, 3, 4, 6]);
// You can also test whether a particular regex matched:
let matches = set.matches("foobar");
assert!(!matches.matched(5));
assert!(matches.matched(6));
```

#### pub fn matches_at(&self, haystack: &str, start: usize) -> SetMatches

Returns the set of regexes that match in the given haystack, with the search starting at the offset given.

The set returned contains the index of each regex that matches in
the given haystack. The index is in correspondence with the order of
regular expressions given to `RegexSet`’s constructor.

The set can also be used to iterate over the matched indices. The order of iteration is always ascending with respect to the matching indices.

The significance of the starting point is that it takes the surrounding
context into consideration. For example, the `\A` anchor can only
match when `start == 0`.

##### Panics

This panics when `start >= haystack.len() + 1`.

##### Example

Tests which regular expressions match the given haystack:

```
use regex::RegexSet;
let set = RegexSet::new([r"\bbar\b", r"(?m)^bar$"]).unwrap();
let hay = "foobar";
// We get matches here, but it's probably not intended.
let matches: Vec<_> = set.matches(&hay[3..]).into_iter().collect();
assert_eq!(matches, vec![0, 1]);
// No matches because the assertions take the context into account.
let matches: Vec<_> = set.matches_at(hay, 3).into_iter().collect();
assert_eq!(matches, vec![]);
```

#### pub fn len(&self) -> usize

Returns the total number of regexes in this set.

##### Example

```
use regex::RegexSet;
assert_eq!(0, RegexSet::empty().len());
assert_eq!(1, RegexSet::new([r"[0-9]"]).unwrap().len());
assert_eq!(2, RegexSet::new([r"[0-9]", r"[a-z]"]).unwrap().len());
```

#### pub fn is_empty(&self) -> bool

Returns `true` if this set contains no regexes.

##### Example

```
use regex::RegexSet;
assert!(RegexSet::empty().is_empty());
assert!(!RegexSet::new([r"[0-9]"]).unwrap().is_empty());
```

#### pub fn patterns(&self) -> &[String]

Returns the regex patterns that this regex set was constructed from.

This function can be used to determine the pattern for a match. The slice returned has exactly as many patterns as were given to this regex set, and the order of the slice is the same as the order of the patterns provided to the set.

##### Example

```
use regex::RegexSet;
let set = RegexSet::new(&[
    r"\w+",
    r"\d+",
    r"\pL+",
    r"foo",
    r"bar",
    r"barfoo",
    r"foobar",
]).unwrap();
let matches: Vec<_> = set
    .matches("foobar")
    .into_iter()
    .map(|index| &set.patterns()[index])
    .collect();
assert_eq!(matches, vec![r"\w+", r"\pL+", r"foo", r"bar", r"foobar"]);
```