pub trait Automaton {
    type State;

    // Required methods
    fn start(&self) -> Self::State;
    fn is_match(&self, state: &Self::State) -> bool;
    fn accept(&self, state: &Self::State, byte: u8) -> Self::State;

    // Provided methods
    fn can_match(&self, _state: &Self::State) -> bool { ... }
    fn will_always_match(&self, _state: &Self::State) -> bool { ... }
    fn starts_with(self) -> StartsWith<Self>
       where Self: Sized { ... }
    fn union<Rhs: Automaton>(self, rhs: Rhs) -> Union<Self, Rhs>
       where Self: Sized { ... }
    fn intersection<Rhs: Automaton>(self, rhs: Rhs) -> Intersection<Self, Rhs>
       where Self: Sized { ... }
    fn complement(self) -> Complement<Self>
       where Self: Sized { ... }
}
Expand description

Automaton describes types that behave as a finite automaton.

All implementors of this trait are represented by byte based automata. Stated differently, all transitions in the automata correspond to a single byte in the input.

This implementation choice is important for a couple reasons:

  1. The set of possible transitions in each node is small, which may make efficient memory usage easier.
  2. The finite state transducers in this crate are all byte based, so any automata used on them must also be byte based.

In practice, this does present somewhat of a problem, for example, if you’re storing UTF-8 encoded strings in a finite state transducer. Consider using a Levenshtein automaton, which accepts a query string and an edit distance. The edit distance should apply to some notion of character, which could be represented by at least 1-4 bytes in a UTF-8 encoding (for some definition of “character”). Therefore, the automaton must have UTF-8 decoding built into it. This can be tricky to implement, so you may find the utf8-ranges crate useful.

Required Associated Types§

source

type State

The type of the state used in the automaton.

Required Methods§

source

fn start(&self) -> Self::State

Returns a single start state for this automaton.

This method should always return the same value for each implementation.

source

fn is_match(&self, state: &Self::State) -> bool

Returns true if and only if state is a match state.

source

fn accept(&self, state: &Self::State, byte: u8) -> Self::State

Return the next state given state and an input.

Provided Methods§

source

fn can_match(&self, _state: &Self::State) -> bool

Returns true if and only if state can lead to a match in zero or more steps.

If this returns false, then no sequence of inputs from this state should ever produce a match. If this does not follow, then those match states may never be reached. In other words, behavior may be incorrect.

If this returns true even when no match is possible, then behavior will be correct, but callers may be forced to do additional work.

source

fn will_always_match(&self, _state: &Self::State) -> bool

Returns true if and only if state matches and must match no matter what steps are taken.

If this returns true, then every sequence of inputs from this state produces a match. If this does not follow, then those match states may never be reached. In other words, behavior may be incorrect.

If this returns false even when every sequence of inputs will lead to a match, then behavior will be correct, but callers may be forced to do additional work.

source

fn starts_with(self) -> StartsWith<Self>where Self: Sized,

Returns an automaton that matches the strings that start with something this automaton matches.

source

fn union<Rhs: Automaton>(self, rhs: Rhs) -> Union<Self, Rhs>where Self: Sized,

Returns an automaton that matches the strings matched by either this or the other automaton.

source

fn intersection<Rhs: Automaton>(self, rhs: Rhs) -> Intersection<Self, Rhs>where Self: Sized,

Returns an automaton that matches the strings matched by both this and the other automaton.

source

fn complement(self) -> Complement<Self>where Self: Sized,

Returns an automaton that matches the strings not matched by this automaton.

Implementations on Foreign Types§

source§

impl<'a, T: Automaton> Automaton for &'a T

§

type State = <T as Automaton>::State

source§

fn start(&self) -> Self::State

source§

fn is_match(&self, state: &Self::State) -> bool

source§

fn can_match(&self, state: &Self::State) -> bool

source§

fn will_always_match(&self, state: &Self::State) -> bool

source§

fn accept(&self, state: &Self::State, byte: u8) -> Self::State

Implementors§