Struct tantivy_fst::Regex
A regular expression for searching FSTs with Unicode support.
Regular expressions are compiled down to a deterministic finite automaton that can efficiently search any finite state transducer. Notably, most regular expressions only need to explore a small portion of a finite state transducer without loading all of it into memory.
Syntax
Regex supports fully featured regular expressions. Namely, it supports all of the same constructs as the standard regex crate except for the following things:
1. Lazy quantifiers, since a regular expression automaton only reports whether a key matches at all, and not its location. Namely, lazy quantifiers such as +? only modify the location of a match, but never change a non-match into a match or a match into a non-match.
2. Word boundaries (i.e., \b). These are hard, though not impossible, to implement in a deterministic finite automaton, so they may be allowed some day.
3. Other zero-width assertions like ^ and $. These are easier to support than word boundaries, but are still tricky and usually aren't as useful when searching dictionaries.
Otherwise, the full syntax of the regex crate is supported. This includes all Unicode support and relevant flags. (The U and m flags are no-ops because of (1) and (3) above, respectively.)
Matching semantics
A regular expression matches a key in a finite state transducer if and only if it matches from the start of the key all the way to the end. Stated differently, every regular expression (re) is matched as if it were ^(re)$. This means that if you want to do a substring match, then you must use .*substring.* as the pattern.
Caution: Starting a regular expression with .* means that it could potentially match any key in a finite state transducer. This implies that all keys could be visited, which could be slow. It is possible that this crate will grow facilities for detecting regular expressions that will scan a large portion of a transducer and optionally disallow them.
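The whole-key rule can be illustrated without the crate itself. The following is a minimal sketch, not the crate's implementation: a hand-written two-state matcher for the pattern ab*, using an Option<usize> state in which None is the dead state. The names AbStar and matches_key are hypothetical and exist only for this illustration.

```rust
/// Hypothetical hand-written DFA for the pattern `ab*`.
/// `None` plays the role of the dead state.
struct AbStar;

impl AbStar {
    fn start(&self) -> Option<usize> {
        Some(0)
    }

    /// Only state 1 (an `a` followed by zero or more `b`) is accepting.
    fn is_match(&self, state: &Option<usize>) -> bool {
        *state == Some(1)
    }

    /// Step the automaton by one byte of the key.
    fn accept(&self, state: &Option<usize>, byte: u8) -> Option<usize> {
        match (*state, byte) {
            (Some(0), b'a') => Some(1),
            (Some(1), b'b') => Some(1),
            _ => None,
        }
    }
}

/// A key matches only if the state reached after its final byte is
/// accepting, which is the same as matching `^(ab*)$`.
fn matches_key(dfa: &AbStar, key: &str) -> bool {
    let mut state = dfa.start();
    for &byte in key.as_bytes() {
        state = dfa.accept(&state, byte);
    }
    dfa.is_match(&state)
}
```

With this matcher, matches_key accepts "a" and "abbb" but rejects "xab" and "abc", because the automaton starts at the first byte of the key and is only consulted for a match after the last one.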
Methods
impl Regex
pub fn new(re: &str) -> Result<Regex, Error>
Create a new regular expression query.
The query finds all terms matching the regular expression.
If the regular expression is malformed or if it results in an automaton that is too big, then an error is returned.
A Regex value satisfies the Automaton trait, which means it can be used with the search method of any finite state transducer.
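To give a rough idea of why an automaton-driven search often visits only a small part of the key set, the sketch below walks a plain byte trie and abandons a branch as soon as the automaton reports that no match is possible. It assumes nothing about the crate's internals: the Matcher trait, the Prefix matcher, the Trie type, and the search function are all hypothetical stand-ins written with the standard library only.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the automaton interface; `None` is a dead state.
trait Matcher {
    fn start(&self) -> Option<usize>;
    fn is_match(&self, state: &Option<usize>) -> bool;
    fn can_match(&self, state: &Option<usize>) -> bool;
    fn accept(&self, state: &Option<usize>, byte: u8) -> Option<usize>;
}

/// Matches exactly the keys that begin with the given bytes.
struct Prefix<'a>(&'a [u8]);

impl<'a> Matcher for Prefix<'a> {
    fn start(&self) -> Option<usize> { Some(0) }
    fn is_match(&self, state: &Option<usize>) -> bool { *state == Some(self.0.len()) }
    fn can_match(&self, state: &Option<usize>) -> bool { state.is_some() }
    fn accept(&self, state: &Option<usize>, byte: u8) -> Option<usize> {
        match *state {
            // Prefix fully consumed: every further byte keeps us accepting.
            Some(i) if i == self.0.len() => Some(i),
            Some(i) if self.0[i] == byte => Some(i + 1),
            _ => None,
        }
    }
}

/// A plain byte trie standing in for the transducer.
#[derive(Default)]
struct Trie {
    terminal: bool,
    children: BTreeMap<u8, Trie>,
}

impl Trie {
    fn insert(&mut self, key: &[u8]) {
        let mut node = self;
        for &b in key {
            node = node.children.entry(b).or_default();
        }
        node.terminal = true;
    }
}

/// Depth-first walk that skips any branch whose next state cannot match.
fn walk<M: Matcher>(
    node: &Trie,
    m: &M,
    state: Option<usize>,
    path: &mut Vec<u8>,
    out: &mut Vec<String>,
    visited: &mut usize,
) {
    *visited += 1;
    if node.terminal && m.is_match(&state) {
        out.push(String::from_utf8_lossy(path).into_owned());
    }
    for (&byte, child) in &node.children {
        let next = m.accept(&state, byte);
        if m.can_match(&next) {
            path.push(byte);
            walk(child, m, next, path, out, visited);
            path.pop();
        }
    }
}

/// Returns the matching keys and the number of trie nodes visited.
fn search<M: Matcher>(trie: &Trie, m: &M) -> (Vec<String>, usize) {
    let (mut out, mut visited) = (Vec::new(), 0);
    walk(trie, m, m.start(), &mut Vec::new(), &mut out, &mut visited);
    (out, visited)
}
```

For a trie holding apple, apply, banana, band, and can, searching with Prefix(b"ap") visits only the nodes on the two matching paths; the branches under b and c are rejected at their first byte.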
Trait Implementations
impl Automaton for Regex
type State = Option<usize>
The type of the state used in the automaton.
fn start(&self) -> Option<usize>
fn is_match(&self, state: &Option<usize>) -> bool
fn can_match(&self, state: &Option<usize>) -> bool
fn accept(&self, state: &Option<usize>, byte: u8) -> Option<usize>
fn will_always_match(&self, _state: &Self::State) -> bool
Returns true if and only if state matches and must match no matter what steps are taken.
fn starts_with(self) -> StartsWith<Self> where
Self: Sized,
Returns an automaton that matches the strings that start with something this automaton matches.
fn union<Rhs: Automaton>(self, rhs: Rhs) -> Union<Self, Rhs> where
Self: Sized,
Returns an automaton that matches the strings matched by either this or the other automaton.
fn intersection<Rhs: Automaton>(self, rhs: Rhs) -> Intersection<Self, Rhs> where
Self: Sized,
Returns an automaton that matches the strings matched by both this and the other automaton.
fn complement(self) -> Complement<Self> where
Self: Sized,
Returns an automaton that matches the strings not matched by this automaton.
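The semantics of the four combinators can be sketched with small wrapper types. This is an illustration under stated assumptions, not the crate's code: the Matcher trait is a simplified, hypothetical mirror of the interface above (only start, is_match, and accept), and Literal and accepts_key are helper names invented for the example.

```rust
/// Hypothetical, simplified mirror of the automaton interface.
trait Matcher {
    type State;
    fn start(&self) -> Self::State;
    fn is_match(&self, state: &Self::State) -> bool;
    fn accept(&self, state: &Self::State, byte: u8) -> Self::State;
}

/// Matches exactly one literal key; `None` is the dead state.
struct Literal<'a>(&'a [u8]);
impl<'a> Matcher for Literal<'a> {
    type State = Option<usize>;
    fn start(&self) -> Option<usize> { Some(0) }
    fn is_match(&self, s: &Option<usize>) -> bool { *s == Some(self.0.len()) }
    fn accept(&self, s: &Option<usize>, byte: u8) -> Option<usize> {
        match *s {
            Some(i) if i < self.0.len() && self.0[i] == byte => Some(i + 1),
            _ => None,
        }
    }
}

/// Runs both sides in lockstep; matches if either side matches.
struct Union<A, B>(A, B);
impl<A: Matcher, B: Matcher> Matcher for Union<A, B> {
    type State = (A::State, B::State);
    fn start(&self) -> Self::State { (self.0.start(), self.1.start()) }
    fn is_match(&self, s: &Self::State) -> bool { self.0.is_match(&s.0) || self.1.is_match(&s.1) }
    fn accept(&self, s: &Self::State, byte: u8) -> Self::State {
        (self.0.accept(&s.0, byte), self.1.accept(&s.1, byte))
    }
}

/// Runs both sides in lockstep; matches only if both sides match.
struct Intersection<A, B>(A, B);
impl<A: Matcher, B: Matcher> Matcher for Intersection<A, B> {
    type State = (A::State, B::State);
    fn start(&self) -> Self::State { (self.0.start(), self.1.start()) }
    fn is_match(&self, s: &Self::State) -> bool { self.0.is_match(&s.0) && self.1.is_match(&s.1) }
    fn accept(&self, s: &Self::State, byte: u8) -> Self::State {
        (self.0.accept(&s.0, byte), self.1.accept(&s.1, byte))
    }
}

/// Same transitions as the inner automaton, with acceptance inverted.
struct Complement<A>(A);
impl<A: Matcher> Matcher for Complement<A> {
    type State = A::State;
    fn start(&self) -> A::State { self.0.start() }
    fn is_match(&self, s: &A::State) -> bool { !self.0.is_match(s) }
    fn accept(&self, s: &A::State, byte: u8) -> A::State { self.0.accept(s, byte) }
}

/// Remembers whether the inner automaton has matched any prefix so far.
struct StartsWith<A>(A);
impl<A: Matcher> Matcher for StartsWith<A> {
    type State = (A::State, bool);
    fn start(&self) -> Self::State {
        let inner = self.0.start();
        let done = self.0.is_match(&inner);
        (inner, done)
    }
    fn is_match(&self, s: &Self::State) -> bool { s.1 }
    fn accept(&self, s: &Self::State, byte: u8) -> Self::State {
        let inner = self.0.accept(&s.0, byte);
        let done = s.1 || self.0.is_match(&inner);
        (inner, done)
    }
}

/// Feeds the whole key through the automaton and checks the final state.
fn accepts_key<M: Matcher>(m: &M, key: &str) -> bool {
    let mut state = m.start();
    for &byte in key.as_bytes() {
        state = m.accept(&state, byte);
    }
    m.is_match(&state)
}
```

For example, Intersection(StartsWith(Literal(b"fo")), Complement(Literal(b"foo"))) accepts "food" but rejects "foo". The pair-of-states construction is a standard product automaton, chosen here for brevity; the crate's own types may be implemented differently.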
impl Debug for Regex
Auto Trait Implementations
Blanket Implementations
impl<T> From<T> for T
impl<T, U> Into<U> for T where
U: From<T>,
impl<T, U> TryFrom<U> for T where
U: Into<T>,
type Error = !
The type returned in the event of a conversion error.
fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
impl<T> Borrow<T> for T where
T: ?Sized,
impl<T> Any for T where
T: 'static + ?Sized,
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T, U> TryInto<U> for T where
U: TryFrom<T>,