Struct pidgin::Pidgin
pub struct Pidgin { /* fields omitted */ }
This is a grammar builder. It keeps track of the rules already defined, the alternates participating in the rule currently being defined, whether these alternates should be bounded left and right by word, string, or line boundaries, and the set of regex flags (case sensitivity, for instance) that will govern the rule it produces.
Defined rules are used to process the new rule's alternates. If there is a "foo" rule, the alternate "foo foo" will be understood to require that this "foo" rule match twice with a space between the matches.
Because rule names can overlap, they are applied longest to shortest. If there is both a "foo" rule and an "f" rule, "f foo" will be understood to involve one match for each: the "foo" rule claims "foo" first, so the "f" rule only gets the single "f".
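The longest-to-shortest ordering can be illustrated with a minimal, standard-library-only sketch. It is not Pidgin's implementation: the Piece type, the split_alternate function, and the alphabetical tie-break between equally long names are all assumptions made for illustration.

```rust
/// A piece of an alternate after rule names have been located.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, PartialEq)]
enum Piece {
    /// A reference to a named rule.
    Rule(String),
    /// Text that matched no rule name.
    Literal(String),
}

/// Splits `alternate` into rule references and literal text.
/// Rule names are applied longest first, each one only to the
/// text that longer names left unclaimed.
fn split_alternate(alternate: &str, rule_names: &[&str]) -> Vec<Piece> {
    // Longest first; ties broken alphabetically (an assumption).
    let mut names: Vec<&str> = rule_names
        .iter()
        .copied()
        .filter(|n| !n.is_empty())
        .collect();
    names.sort_by(|a, b| b.len().cmp(&a.len()).then(a.cmp(b)));

    let mut pieces = vec![Piece::Literal(alternate.to_string())];
    for name in names {
        let mut next = Vec::new();
        for piece in pieces {
            match piece {
                Piece::Literal(text) => {
                    // Each occurrence of `name` becomes a rule reference;
                    // the text between occurrences stays literal.
                    for (i, part) in text.split(name).enumerate() {
                        if i > 0 {
                            next.push(Piece::Rule(name.to_string()));
                        }
                        if !part.is_empty() {
                            next.push(Piece::Literal(part.to_string()));
                        }
                    }
                }
                // Text already claimed by a longer name is left alone.
                rule => next.push(rule),
            }
        }
        pieces = next;
    }
    pieces
}
```

With the rule names "f" and "foo", split_alternate("f foo", &["f", "foo"]) yields a reference to "f", a literal space, and a reference to "foo", because "foo" is claimed before "f" gets a chance to match inside it.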
In addition to rules identified like this by a name, there are also regex rules. These are substituted into alternates wherever their definitional pattern matches. Regex rules are sought in alternates only in what is left over after ordinary rules are found. Regex rules are applied longest first by the length of their string representation, and then in alphabetical order. They may optionally also have names.
Pidgin has numerous configuration methods which consume and return their invocant.
let mut p = Pidgin::new()
    .enclosed(true)
    .word_bound()
    .case_insensitive(true);
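The difference between the two method styles on this page, configuration methods that take self by value and content methods such as add that take &mut self, can be sketched with a stand-in type. The Builder struct below and its fields are hypothetical and do not reflect Pidgin's omitted fields; only the shapes of the method signatures follow the ones documented here.

```rust
/// Hypothetical stand-in for a grammar builder; not Pidgin's real fields.
#[derive(Debug, Clone, Default, PartialEq)]
struct Builder {
    case_insensitive: bool,
    enclosed: bool,
    alternates: Vec<String>,
}

impl Builder {
    fn new() -> Builder {
        Builder::default()
    }

    /// Configuration method: consumes the builder and returns it,
    /// so calls can be chained directly off `new()`.
    fn case_insensitive(mut self, on: bool) -> Builder {
        self.case_insensitive = on;
        self
    }

    /// Configuration method with the same consume-and-return shape.
    fn enclosed(mut self, on: bool) -> Builder {
        self.enclosed = on;
        self
    }

    /// Content method: borrows mutably and returns the borrow,
    /// so it is chainable but needs a `let mut` binding to call.
    fn add(&mut self, phrases: &[&str]) -> &mut Builder {
        self.alternates.extend(phrases.iter().map(|p| p.to_string()));
        self
    }
}

/// Configure first by value, then add alternates through `&mut`.
fn build_example() -> Builder {
    let mut b = Builder::new().enclosed(true).case_insensitive(true);
    b.add(&["cat", "dog"]).add(&["camel"]);
    b
}
```

Because the configuration methods return an owned value, the chain can be bound with let mut in one statement, which is why the example above declares p as mutable before any alternates are added.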
Methods
impl Pidgin
pub fn new() -> Pidgin
Constructs a new Pidgin with the default state: no rules, no alternates for the current rule, case-sensitive, not multi-line, not dot-all (that is, . does not match a newline), Unicode-compliant, and not enclosed.
pub fn add(&mut self, phrases: &[&str]) -> &mut Pidgin
Adds the given list of alternates to the rule currently under construction.
This method is chainable.
pub fn add_str(&mut self, s: &str) -> &mut Pidgin
Adds the given alternate to the rule currently under construction.
This method is chainable.
pub fn compile(&mut self) -> Grammar
Compiles the current rule, clearing the alternate list in preparation for constructing the next rule.
pub fn grammar(&mut self, words: &[&str]) -> Grammar
A convenience method equivalent to add(&words).compile().
pub fn rule(&mut self, name: &str, g: &Grammar)
Defines a rule called name.
NOTE Multiple rules defined with the same name are treated as alternates. They are tried in the order in which they were added. See remove_rule.
pub fn rx_rule(
&mut self,
rx: &str,
g: &Grammar,
name: Option<&str>
) -> Result<(), Error>
Defines a rule that replaces the portions of the rule's alternates matched by the given regex.
The rx argument finds matched portions of an alternate. The g argument defines the rule. The name argument provides the optional name for the rule.
pidgin.rx_rule(r"\s+", &g, Some("whitespace_is_special"))?;
Errors
rx_rule returns an error if rx fails to compile.
pub fn foreign_rule(&mut self, name: &str, pattern: &str) -> Result<(), Error>
Defines a rule based on an ad hoc regular expression.
Currently foreign_rule is the only way to define a rule with unbounded repetition.
pidgin.foreign_rule("us_local_phone", r"\b[0-9]{3}-?[0-9]{4}\b")?;
Errors
foreign_rule returns an error if the foreign regex fails to compile.
pub fn rx_foreign_rule(
&mut self,
rx: &str,
pattern: &str,
name: Option<&str>
) -> Result<(), Error>
Defines a rule, optionally named, that replaces the portions of the rule's alternates matched by the given regex.
The rx argument finds matched portions of an alternate. The pattern argument defines the regular expression of the rule. The name argument provides the optional name for the rule.
pidgin.rx_foreign_rule(r"\s+", r"\t+", Some("whitespace_means_tabs"))?;
Errors
rx_foreign_rule returns an error if either rx or pattern fails to compile.
pub fn remove_rule(&mut self, name: &str)
Removes a rule from the list known to the Pidgin.
pub fn remove_rx_rule(&mut self, name: &str) -> Result<(), Error>
Like remove_rule, but the rule identifier is a regex rather than a rule name.
pub fn clear(&mut self)
Removes all alternates and rule definitions from the Pidgin. Flags controlling case sensitivity and such remain.
pub fn case_insensitive(self, case: bool) -> Pidgin
Toggles whether Pidgin creates case-insensitive rules.
By default this is false.
pub fn multi_line(self, case: bool) -> Pidgin
Toggles whether Pidgin creates multi-line rules. This governs the behavior of the ^ and $ anchors: whether they match only at string boundaries or also after and before newline characters.
By default this is false.
pub fn dot_all(self, case: bool) -> Pidgin
Toggles whether Pidgin creates rules wherein . can match newline characters. This is the so-called "single line" mode of Perl-compatible regular expressions.
By default this is false.
pub fn unicode(self, case: bool) -> Pidgin
Toggles whether Pidgin creates Unicode-compliant rules.
By default this is true.
pub fn enclosed(self, case: bool) -> Pidgin
Toggles whether Pidgin creates rules that can safely be modified by a repetition expression. (?:ab) is enclosed. ab is not.
This parameter is generally of interest only when using Pidgin to create elements of other regular expressions.
By default this is false.
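Why enclosure matters for repetition can be shown with a small string-level sketch. The repeat function below is hypothetical and only concatenates pattern text; it is not how Pidgin builds its output.

```rust
/// Appends a repetition suffix such as "+" or "{2}" to `pattern`.
/// When `enclosed` is true, the pattern is first wrapped in a
/// non-capturing group so the suffix applies to the whole pattern.
fn repeat(pattern: &str, suffix: &str, enclosed: bool) -> String {
    if enclosed {
        format!("(?:{}){}", pattern, suffix)
    } else {
        // Without the group, the suffix binds only to the last atom.
        format!("{}{}", pattern, suffix)
    }
}
```

Here repeat("ab", "+", true) gives (?:ab)+, which repeats the whole of ab, while repeat("ab", "+", false) gives ab+, which repeats only the b.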
pub fn reverse_greed(self, case: bool) -> Pidgin
Toggles the U flag of Rust regexen. Per the documentation, U "swap[s] the meaning of x* and x*?", thus turning a stingy match greedy and a greedy match stingy.
By default this is false.
pub fn normalize_whitespace(self, required: bool) -> Pidgin
Treats any white space found in an alternate as "some amount of white space". If the required parameter is true, it means "at least some white space". If it is false, it means "maybe some white space".
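One way to picture the two settings is as a rewrite of each run of white space into \s+ (required) or \s* (optional). The sketch below assumes exactly that mapping, which the text above does not spell out. The function name whitespace_to_pattern is hypothetical, and the sketch does not escape regex metacharacters in the rest of the alternate.

```rust
/// Rewrites every run of white space in `alternate` as a single
/// regex fragment: `\s+` when white space is required, `\s*` when
/// it is optional. Other characters are copied through unchanged
/// (no escaping is attempted in this sketch).
fn whitespace_to_pattern(alternate: &str, required: bool) -> String {
    let replacement = if required { r"\s+" } else { r"\s*" };
    let mut out = String::new();
    let mut in_space = false;
    for c in alternate.chars() {
        if c.is_whitespace() {
            // Emit the fragment once per run, not once per character.
            if !in_space {
                out.push_str(replacement);
                in_space = true;
            }
        } else {
            out.push(c);
            in_space = false;
        }
    }
    out
}
```

Under this assumption, the alternate "foo   bar" becomes foo\s+bar with required set to true and foo\s*bar with it set to false.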
pub fn word_bound(self) -> Pidgin
The left and right edges of all alternates, when applicable, should be word boundaries: \b. If the alternate has a non-word character at the boundary in question, such as "@" or "(", then it is left alone, but if it is a word character, it should be bounded by a \b in the regular expression generated.
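The "where applicable" condition can be sketched as a check on the first and last characters of an alternate. This is an illustration only: add_word_bounds and is_word_char are hypothetical helpers, the word-character test (alphanumeric or underscore) is an approximation of the regex \w class, and the alternate is assumed to be already escaped.

```rust
/// Approximates the regex notion of a word character.
fn is_word_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Adds `\b` to each edge of `alternate` that is a word character;
/// edges that are non-word characters are left alone.
fn add_word_bounds(alternate: &str) -> String {
    let mut out = String::new();
    if alternate.chars().next().map_or(false, is_word_char) {
        out.push_str(r"\b");
    }
    out.push_str(alternate);
    if alternate.chars().last().map_or(false, is_word_char) {
        out.push_str(r"\b");
    }
    out
}
```

For example, "cat" gains a \b on both sides, while "@home" gains one only on the right, since a \b before "@" would wrongly demand a word character to its left.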
pub fn left_word_bound(self) -> Pidgin
Alternates should have word boundaries, where applicable, on the left margin.
pub fn right_word_bound(self) -> Pidgin
Alternates should have word boundaries, where applicable, on the right margin.
pub fn line_bound(self) -> Pidgin
Alternates should match entire lines.
NOTE This turns multi-line matching on for the rule.
pub fn left_line_bound(self) -> Pidgin
Alternates should match at the beginning of the line on their left margin.
NOTE This turns multi-line matching on for the rule.
pub fn right_line_bound(self) -> Pidgin
Alternates should match at the end of the line on their right margin.
NOTE This turns multi-line matching on for the rule.
pub fn string_bound(self) -> Pidgin
The rule should match the entire string.
pub fn left_string_bound(self) -> Pidgin
The left margin of every alternate should be the beginning of the string.
pub fn right_string_bound(self) -> Pidgin
The right margin of every alternate should be the end of the string.
pub fn unbound(self) -> Pidgin
Clears any expectation that alternates have boundary anchors.
pub fn compile_non_capturing(&self) -> Grammar
pub fn rx(words: &[&str]) -> String
Convenience method for generating non-backtracking regular expressions.
Pidgin::rx(&vec!["cat", "camel", "aaaabbbbaaaabbbb"]); // (?:ca(?:t|mel)|(?:a{4}b{4}){2})
pub fn matcher(&self) -> Result<Matcher, Error>
Convenience method equivalent to compile().matcher().
Errors
matcher returns an error wherever Grammar::matcher returns an error.
Trait Implementations
impl Clone for Pidgin
fn clone(&self) -> Pidgin
Returns a copy of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for Pidgin