Struct Structex

pub struct Structex<R>
where R: RegexEngine,
{ /* private fields */ }

A compiled structural regular expression backed by an underlying regular expression engine.

A Structex searches for tagged substrings within any haystack type supported by the regular expression engine that backs it. The primary API is the Structex::iter_tagged_captures method, which iterates over the TaggedCaptures within a given haystack as it is searched.

Implementations§

impl<R> Structex<R>
where R: RegexEngine,

pub fn new(se: &str) -> Result<Self, Error>

Compiles a structural regular expression. Once compiled, it may be used repeatedly and cloned cheaply. Compilation itself can be expensive, so Structex instances should be reused wherever possible.

To configure how a given Structex is compiled, see StructexBuilder.
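One common way to follow the advice about reuse is to compile once and share the result for the lifetime of the program. The sketch below shows that pattern with std::sync::OnceLock. To stay dependency-free it does not use Structex itself: Compiled, compile, shared and COMPILE_COUNT are hypothetical stand-ins introduced only for illustration, and the same shape should apply to any value that is expensive to build.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Hypothetical stand-in for a value that is expensive to compile
// (for example a Structex). It is not part of the structex crate.
struct Compiled {
    source: String,
}

// Counts how many times the (pretend) expensive compilation has run.
static COMPILE_COUNT: AtomicUsize = AtomicUsize::new(0);

// Hypothetical "expensive" constructor.
fn compile(source: &str) -> Compiled {
    COMPILE_COUNT.fetch_add(1, Ordering::SeqCst);
    Compiled {
        source: source.to_string(),
    }
}

// Returns a shared instance, compiling it on first use only.
fn shared() -> &'static Compiled {
    static CELL: OnceLock<Compiled> = OnceLock::new();
    CELL.get_or_init(|| compile("x/foo.*bar/ p"))
}

fn main() {
    let a = shared();
    let b = shared();

    // Both calls hand back the same instance and compilation ran once.
    assert!(std::ptr::eq(a, b));
    assert_eq!(COMPILE_COUNT.load(Ordering::SeqCst), 1);
    println!("compiled from: {}", a.source);
}
```

Whether a global like this is appropriate depends on the application; passing a reference or a clone to the code that needs it works just as well.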

§Error

If an invalid expression is given then an error is returned. The exact expressions that are valid to compile will depend on the underlying regular expression engine being used.

§Example
// A Structex backed by the regex crate
type Structex = structex::Structex<regex::Regex>;

// An empty expression is always invalid
assert!(Structex::new("").is_err());

// The top level expression must not be a bare action
assert!(Structex::new("P/I am invalid/").is_err());

// A valid expression with a named action
assert!(Structex::new("x/hello, (world|sailor)!/ p").is_ok());

pub fn as_str(&self) -> &str

Returns the original string of this structex.

§Example
type Structex = structex::Structex<regex::Regex>;

let se = Structex::new("x/foo.*bar/ p").unwrap();
assert_eq!(se.as_str(), "x/foo.*bar/ p");

pub fn actions(&self) -> &[Action]

Returns the registered actions that were parsed from the compiled expression.

§Example
use structex::Action;

type Structex = structex::Structex<regex::Regex>;

let se = Structex::new("x/foo.*bar/ { p; a/baz/; }").unwrap();
let actions = se.actions();

assert_eq!(actions.len(), 2);

assert_eq!(actions[0].id(), 0);
assert_eq!(actions[0].tag(), 'p');
assert_eq!(actions[0].arg(), None);

assert_eq!(actions[1].id(), 1);
assert_eq!(actions[1].tag(), 'a');
assert_eq!(actions[1].arg(), Some("baz"));

pub fn tags(&self) -> &[char]

Returns the registered tags that were parsed from the compiled expression.

§Example
type Structex = structex::Structex<regex::Regex>;

let se = Structex::new("x/foo.*bar/ { p; a/baz/; }").unwrap();
assert_eq!(se.tags(), &['a', 'p']);

pub fn iter_tagged_captures<'s, 'h, H>(
    &'s self,
    haystack: &'h H,
) -> TaggedCapturesIter<'s, 'h, R, H>
where H: Haystack<R> + ?Sized,

Iterate over all TaggedCaptures within the given haystack in order.

§Examples

By default, matches will be emitted without an associated action attached to them, allowing you to write simple expressions that filter and refine regions of the haystack to locate the structure you are looking for.

type Structex = structex::Structex<regex::Regex>;

let se = Structex::new(r#"
  x/(.|\n)*?\./   # split into sentences
  g/Alice/        # if the sentence contains "Alice"
  n/(\w+)\./      # extract the last word of the sentence
"#).unwrap();

let haystack = r#"This is a multi-line
string that mentions peoples names.
People like Alice and Bob. People
like Claire and David, but really
we're here to talk about Alice.
Alice is everyone's friend."#;

let last_words: Vec<String> = se
    .iter_tagged_captures(haystack)
    .map(|m| m.submatch_text(1).unwrap().to_string())
    .collect();

assert_eq!(&last_words, &["Bob", "Alice", "friend"]);

When writing more complex expressions, you will want to assign a tagged action to each matching branch in order to distinguish them:

use structex::TaggedCaptures;

type Structex = structex::Structex<regex::Regex>;

let se = Structex::new(r#"
  # split into sentences
  x/(.|\n)*?\./ {
    # if the sentence contains "Alice" extract the last word of the sentence
    g/Alice/ n/(\w+)\./ A;
    # if it doesn't, extract the first word of the sentence
    v/Alice/ n/(\w+)/ B;
  }
"#).unwrap();

let haystack = r#"This is a multi-line
string that mentions peoples names.
People like Alice and Bob. People
like Claire and David, but really
we're here to talk about Alice.
Alice is everyone's friend."#;

let captures: Vec<TaggedCaptures<str>> = se
    .iter_tagged_captures(haystack)
    .collect();

let words: Vec<(char, &str)> = captures
    .iter()
    .map(|m| (m.tag().unwrap(), m.submatch_text(1).unwrap()))
    .collect();

assert_eq!(
    &words,
    &[('B', "This"), ('A', "Bob"), ('A', "Alice"), ('A', "friend")]
);
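Once each match carries a tag, downstream code typically branches or groups on it. As a minimal sketch, the function below groups plain (char, &str) pairs of the same shape as the words vector in the example above. It deliberately works on those pairs rather than on TaggedCaptures so that it needs only the standard library; group_by_tag is a hypothetical helper, not part of the structex API.

```rust
use std::collections::BTreeMap;

/// Groups captured words by the tag of the action that produced them,
/// preserving the order in which the words were emitted within each group.
fn group_by_tag<'a>(words: &[(char, &'a str)]) -> BTreeMap<char, Vec<&'a str>> {
    let mut groups: BTreeMap<char, Vec<&'a str>> = BTreeMap::new();
    for &(tag, word) in words {
        groups.entry(tag).or_default().push(word);
    }
    groups
}

fn main() {
    // Same (tag, word) pairs as asserted in the example above.
    let words = [('B', "This"), ('A', "Bob"), ('A', "Alice"), ('A', "friend")];
    let groups = group_by_tag(&words);

    assert_eq!(groups[&'A'], vec!["Bob", "Alice", "friend"]);
    assert_eq!(groups[&'B'], vec!["This"]);
}
```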

pub fn iter_tagged_captures_between<'s, 'h, H>(
    &'s self,
    byte_from: usize,
    byte_to: usize,
    haystack: &'h H,
) -> TaggedCapturesIter<'s, 'h, R, H>
where H: Haystack<R> + ?Sized,

Iterate, in order, over all TaggedCaptures within the given haystack that lie between the given byte offsets.

See iter_tagged_captures for details of semantics.

§Example
type Structex = structex::Structex<regex::Regex>;

let se = Structex::new(r#"
  x/(.|\n)*?\./   # split into sentences
  g/Alice/        # if the sentence contains "Alice"
  n/(\w+)\./      # extract the last word of the sentence
"#).unwrap();

let haystack = r#"This is a multi-line
string that mentions peoples names.
People like Alice and Bob. People
like Claire and David, but really
we're here to talk about Alice.
Alice is everyone's friend."#;

// The byte range 57..156 removes the first and last sentences from the initial haystack.
assert_eq!(
    &haystack[57..156],
    r"People like Alice and Bob. People
like Claire and David, but really
we're here to talk about Alice."
);

let last_words: Vec<String> = se
    .iter_tagged_captures_between(57, 156, haystack)
    .map(|m| m.submatch_text(1).unwrap().to_string())
    .collect();

assert_eq!(&last_words, &["Bob", "Alice"]);
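The range is given in bytes, not characters, so in practice the offsets are usually computed rather than hard-coded. The sketch below shows one way to derive the 57..156 range for this haystack using only standard str methods. byte_range is a hypothetical helper introduced for illustration and is not part of the structex API.

```rust
/// Returns the byte range running from the start of the first occurrence of
/// `start_marker` to the end of the last occurrence of `end_marker`.
fn byte_range(haystack: &str, start_marker: &str, end_marker: &str) -> Option<(usize, usize)> {
    let from = haystack.find(start_marker)?;
    let to = haystack.rfind(end_marker)? + end_marker.len();
    // Reject the case where the markers appear in the wrong order.
    (from <= to).then_some((from, to))
}

// Same text as the haystack in the example above, with explicit newlines.
const HAYSTACK: &str = "This is a multi-line\nstring that mentions peoples names.\nPeople like Alice and Bob. People\nlike Claire and David, but really\nwe're here to talk about Alice.\nAlice is everyone's friend.";

fn main() {
    let (from, to) = byte_range(HAYSTACK, "People like", "Alice.").unwrap();
    assert_eq!((from, to), (57, 156));

    // Offsets returned by find/rfind always fall on UTF-8 character
    // boundaries, so slicing with them cannot panic.
    assert!(HAYSTACK.is_char_boundary(from) && HAYSTACK.is_char_boundary(to));
    println!("{}", &HAYSTACK[from..to]);
}
```

Offsets obtained this way can then be passed as byte_from and byte_to.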

Trait Implementations§

impl<R> Clone for Structex<R>
where R: RegexEngine + Clone,

fn clone(&self) -> Structex<R>

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl<R> Debug for Structex<R>
where R: RegexEngine,

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<R> Display for Structex<R>
where R: RegexEngine,

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

impl<R> Freeze for Structex<R>

impl<R> RefUnwindSafe for Structex<R>
where R: RefUnwindSafe,

impl<R> Send for Structex<R>
where R: Sync + Send,

impl<R> Sync for Structex<R>
where R: Sync + Send,

impl<R> Unpin for Structex<R>

impl<R> UnwindSafe for Structex<R>
where R: RefUnwindSafe,

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T> ToString for T
where T: Display + ?Sized,

fn to_string(&self) -> String

Converts the given value to a String.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.