Struct regex_syntax::hir::Hir
pub struct Hir { /* fields omitted */ }
A high-level intermediate representation (HIR) for a regular expression.
The HIR of a regular expression represents an intermediate step between its
abstract syntax (a structured description of the concrete syntax) and
compiled byte codes. The purpose of HIR is to make regular expressions
easier to analyze. In particular, the AST is much more complex than the
HIR. For example, while an AST supports arbitrarily nested character
classes, the HIR will flatten all nested classes into a single set. The HIR
will also "compile away" every flag present in the concrete syntax. For
example, users of HIR expressions never need to worry about case folding;
it is handled automatically by the translator (e.g., by translating (?i)A
to [aA]).
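The two simplifications mentioned above (flattening nested classes and compiling away the case-insensitivity flag) can be sketched with a toy model. This is not the translator's actual code: the ClassItem type and the flatten and translate_literal functions are hypothetical names, a set of individual characters stands in for whatever class representation the real HIR uses, and only ASCII case folding is handled.

```rust
use std::collections::BTreeSet;

/// Hypothetical AST-level class item: classes may nest arbitrarily.
enum ClassItem {
    Char(char),
    Range(char, char),
    Nested(Vec<ClassItem>),
}

/// Flatten a (possibly nested) class into one sorted set of characters.
fn flatten(items: &[ClassItem], out: &mut BTreeSet<char>) {
    for item in items {
        match item {
            ClassItem::Char(c) => {
                out.insert(*c);
            }
            ClassItem::Range(lo, hi) => out.extend(*lo..=*hi),
            ClassItem::Nested(inner) => flatten(inner, out),
        }
    }
}

/// Translate a literal, compiling the case-insensitive flag away:
/// with the flag set, the literal becomes a small class such as [aA].
/// Assumption: ASCII-only folding, for brevity.
fn translate_literal(c: char, case_insensitive: bool) -> BTreeSet<char> {
    let mut set = BTreeSet::new();
    set.insert(c);
    if case_insensitive {
        set.insert(c.to_ascii_lowercase());
        set.insert(c.to_ascii_uppercase());
    }
    set
}
```

After translation, a consumer sees only a flat set and never has to inspect flags or recurse into nested classes.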
If the HIR was produced by a translator that disallows invalid UTF-8, then the HIR is guaranteed to match UTF-8 exclusively.
This type defines its own destructor that uses constant stack space and heap space proportional to the size of the HIR.
The specific type of an HIR expression can be accessed via its kind
or into_kind methods. This extra level of indirection exists for two
reasons:
- Construction of an HIR expression must use the constructor methods on this Hir type instead of building the HirKind values directly. This permits construction to enforce invariants like "concatenations always consist of two or more sub-expressions."
- Every HIR expression contains attributes that are defined inductively, and can be computed cheaply during the construction process. For example, one such attribute is whether the expression must match at the beginning of the text.
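Both reasons can be illustrated with a toy model. The sketch below is not the regex-syntax implementation: the Expr and Kind types are hypothetical, and the anchoring rule is deliberately simplified (a concatenation counts as anchored when its first element is). It shows how constructors can enforce the "two or more sub-expressions" invariant and cache an inductively defined attribute at construction time.

```rust
// Hypothetical stand-ins for Hir and HirKind.
#[derive(Debug, Clone, PartialEq)]
enum Kind {
    Empty,
    Literal(char),
    StartText,
    Concat(Vec<Expr>),
    Alternation(Vec<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
struct Expr {
    kind: Kind,           // only reachable through constructors and kind()
    anchored_start: bool, // cached attribute, computed once at construction
}

impl Expr {
    fn empty() -> Expr {
        Expr { kind: Kind::Empty, anchored_start: false }
    }

    fn literal(c: char) -> Expr {
        Expr { kind: Kind::Literal(c), anchored_start: false }
    }

    fn start_text() -> Expr {
        Expr { kind: Kind::StartText, anchored_start: true }
    }

    /// Invariant: a Concat always holds two or more sub-expressions.
    fn concat(mut exprs: Vec<Expr>) -> Expr {
        match exprs.len() {
            0 => Expr::empty(),
            1 => exprs.pop().unwrap(),
            _ => {
                // Simplified rule: anchored if the first element is anchored.
                let anchored_start = exprs[0].anchored_start;
                Expr { kind: Kind::Concat(exprs), anchored_start }
            }
        }
    }

    /// Invariant: an Alternation always holds two or more branches.
    fn alternation(mut exprs: Vec<Expr>) -> Expr {
        match exprs.len() {
            0 => Expr::empty(),
            1 => exprs.pop().unwrap(),
            _ => {
                // Anchored only if every branch is anchored.
                let anchored_start = exprs.iter().all(|e| e.anchored_start);
                Expr { kind: Kind::Alternation(exprs), anchored_start }
            }
        }
    }

    fn kind(&self) -> &Kind {
        &self.kind
    }

    fn is_anchored_start(&self) -> bool {
        self.anchored_start
    }
}
```

In this model, the equivalent of ^a|^b reports is_anchored_start() as true while ^a|b reports false, and neither query walks the tree because the answer was stored when the node was built.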
Also, an Hir's fmt::Display implementation prints an HIR as a regular
expression pattern string, and uses constant stack space and heap space
proportional to the size of the Hir.
Methods
impl Hir
pub fn kind(&self) -> &HirKind
Returns a reference to the underlying HIR kind.
pub fn into_kind(self) -> HirKind
Consumes ownership of this HIR expression and returns its underlying
HirKind.
pub fn empty() -> Hir
Returns an empty HIR expression.
An empty HIR expression always matches, including the empty string.
pub fn literal(lit: Literal) -> Hir
Creates a literal HIR expression.
If the given literal has a Byte variant with an ASCII byte, then this
method panics. This enforces the invariant that Byte variants are
only used to express matching of invalid UTF-8.
pub fn class(class: Class) -> Hir
Creates a class HIR expression.
pub fn anchor(anchor: Anchor) -> Hir
Creates an anchor assertion HIR expression.
pub fn word_boundary(word_boundary: WordBoundary) -> Hir
Creates a word boundary assertion HIR expression.
pub fn repetition(rep: Repetition) -> Hir
Creates a repetition HIR expression.
pub fn group(group: Group) -> Hir
Creates a group HIR expression.
pub fn concat(exprs: Vec<Hir>) -> Hir
Returns the concatenation of the given expressions.
This flattens the concatenation as appropriate.
pub fn alternation(exprs: Vec<Hir>) -> Hir
Returns the alternation of the given expressions.
This flattens the alternation as appropriate.
pub fn dot(bytes: bool) -> Hir
Builds an HIR expression for the dot pattern (.).
A . expression matches any character except for \n. To build an
expression that matches any character, including \n, use the any
method.
If bytes is true, then this assumes characters are limited to a
single byte.
pub fn any(bytes: bool) -> Hir
Builds an HIR expression for (?s)., that is, a dot with the s flag enabled.
A (?s). expression matches any character, including \n. To build an
expression that matches any character except for \n, use the
dot method.
If bytes is true, then this assumes characters are limited to a
single byte.
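The difference between dot and any, and the effect of the bytes flag, can be pictured as character classes made of inclusive ranges. The sketch below is a toy model under that assumption: ClassRanges is a hypothetical type, not the crate's Class, and the functions only mirror the documented behavior (everything except \n versus everything, over Unicode scalar values or single bytes).

```rust
/// Hypothetical class representation: a list of inclusive ranges.
#[derive(Debug, PartialEq)]
enum ClassRanges {
    Unicode(Vec<(char, char)>),
    Bytes(Vec<(u8, u8)>),
}

/// Everything except \n (0x0A).
fn dot(bytes: bool) -> ClassRanges {
    if bytes {
        ClassRanges::Bytes(vec![(0x00, 0x09), (0x0B, 0xFF)])
    } else {
        ClassRanges::Unicode(vec![('\0', '\x09'), ('\x0B', '\u{10FFFF}')])
    }
}

/// Everything, including \n.
fn any(bytes: bool) -> ClassRanges {
    if bytes {
        ClassRanges::Bytes(vec![(0x00, 0xFF)])
    } else {
        ClassRanges::Unicode(vec![('\0', '\u{10FFFF}')])
    }
}
```

With bytes set to true, the class covers single byte values only, which is why such an expression may match invalid UTF-8.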
pub fn is_always_utf8(&self) -> bool
Return true if and only if this HIR will always match valid UTF-8.
When this returns false, then it is possible for this HIR expression to match invalid UTF-8.
pub fn is_all_assertions(&self) -> bool
Returns true if and only if this entire HIR expression is made up of zero-width assertions.
This includes expressions like ^$\b\A\z and even ((\b)+())*^, but
not ^a.
pub fn is_anchored_start(&self) -> bool
Return true if and only if this HIR is required to match from the
beginning of text. This includes expressions like ^foo, ^(foo|bar),
^foo|^bar but not ^foo|bar.
pub fn is_anchored_end(&self) -> bool
Return true if and only if this HIR is required to match at the end
of text. This includes expressions like foo$, (foo|bar)$,
foo$|bar$ but not foo$|bar.
pub fn is_any_anchored_start(&self) -> bool
Return true if and only if this HIR contains any sub-expression that
is required to match at the beginning of text. Specifically, this
returns true if the ^ symbol (when multiline mode is disabled) or the
\A escape appear anywhere in the regex.
pub fn is_any_anchored_end(&self) -> bool
Return true if and only if this HIR contains any sub-expression that is
required to match at the end of text. Specifically, this returns true
if the $ symbol (when multiline mode is disabled) or the \z escape
appear anywhere in the regex.
pub fn is_match_empty(&self) -> bool
Return true if and only if the empty string is part of the language matched by this regular expression.
This includes a*, a?b*, a{0}, (), ()+, ^$, a|b?, \B,
but not a, a+ or \b.
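The examples above follow from an inductive definition over the shape of the expression. The sketch below works through that definition on a toy tree; the Node type is hypothetical, and the function recurses for brevity, whereas the overview describes such attributes as being computed during construction. The treatment of \b and \B simply mirrors the examples listed above.

```rust
/// Hypothetical expression tree, much smaller than the real HirKind.
enum Node {
    Empty,
    Literal(char),
    Anchor,          // ^ or $
    WordBoundary,    // \b
    NotWordBoundary, // \B
    Repeat { min: u32, inner: Box<Node> },
    Concat(Vec<Node>),
    Alternation(Vec<Node>),
}

/// Does the language of this expression contain the empty string?
fn is_match_empty(node: &Node) -> bool {
    match node {
        Node::Empty | Node::Anchor | Node::NotWordBoundary => true,
        Node::Literal(_) | Node::WordBoundary => false,
        // a{0}, a* and a? may match zero times; ()+ matches empty via its body.
        Node::Repeat { min, inner } => *min == 0 || is_match_empty(inner),
        // Every part of a concatenation must be able to match empty.
        Node::Concat(nodes) => nodes.iter().all(is_match_empty),
        // One branch matching empty is enough for an alternation.
        Node::Alternation(nodes) => nodes.iter().any(is_match_empty),
    }
}
```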
Trait Implementations
impl Clone for Hir
fn clone(&self) -> Hir
Returns a copy of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for Hir
fn fmt(&self, f: &mut Formatter) -> Result
Formats the value using the given formatter.
impl Eq for Hir
impl PartialEq for Hir
fn eq(&self, other: &Hir) -> bool
This method tests for self and other values to be equal, and is used by ==.
fn ne(&self, other: &Hir) -> bool
This method tests for !=.
impl Display for Hir
Print a display representation of this Hir.
The result of this is a valid regular expression pattern string.
This implementation uses constant stack space and heap space proportional
to the size of the Hir.
fn fmt(&self, f: &mut Formatter) -> Result
Formats the value using the given formatter.
impl Drop for Hir
A custom Drop impl is used for HirKind such that it uses constant stack
space but heap space proportional to the depth of the total Hir.
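A destructor with constant stack usage is typically written by moving children onto an explicit heap-allocated stack instead of letting nested values drop recursively. The sketch below shows that general technique on a hypothetical Tree type; it is not the crate's Drop impl, and the nested helper exists only to build a deep test value.

```rust
use std::mem;

/// Hypothetical recursive type standing in for a nested expression.
enum Tree {
    Leaf,
    Node(Vec<Tree>),
}

impl Drop for Tree {
    fn drop(&mut self) {
        // Take the children out; leaves and empty nodes need no extra work.
        let mut stack: Vec<Tree> = match self {
            Tree::Node(children) if !children.is_empty() => mem::take(children),
            _ => return,
        };
        // Explicit work list on the heap replaces recursion on the call stack.
        while let Some(mut tree) = stack.pop() {
            if let Tree::Node(children) = &mut tree {
                // Move grandchildren onto the work list, leaving `tree` childless.
                stack.append(children);
            }
            // `tree` is dropped here; its own drop returns immediately
            // because it has no children left.
        }
    }
}

/// Build a chain of `depth` nested nodes without recursion.
fn nested(depth: usize) -> Tree {
    let mut tree = Tree::Leaf;
    for _ in 0..depth {
        tree = Tree::Node(vec![tree]);
    }
    tree
}
```

With the compiler-generated recursive drop, a value such as nested(100_000) risks overflowing the call stack; with the explicit work list, the call depth stays constant and the extra memory lives on the heap.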