Crate string_iter


An overly designed &str iterator made with zero-copy parsing in mind, with an emphasis on ergonomics.

§Usage

StringIter offers iteration and pattern matching methods as well as methods normally found in string types that would make sense for an iterator.

The standard StringIter yields each char in both its char and &str representations, allowing easy storage in &str or Cow<str> form.
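To make the dual representation concrete, here is a minimal std-only sketch. It is not the crate's implementation, and `char_and_str` is a hypothetical helper introduced only for illustration. It pairs each char with the subslice of the input that encodes it, so the &str half borrows from the original string and allocates nothing.

```rust
/// Hypothetical helper, not part of string_iter: pairs every char with the
/// subslice of `s` that encodes it. The `&str` half borrows from `s`.
fn char_and_str<'a>(s: &'a str) -> impl Iterator<Item = (char, &'a str)> + 'a {
    s.char_indices()
        // `i` is the byte offset of `c`; `len_utf8` gives its encoded width.
        .map(move |(i, c)| (c, &s[i..i + c.len_utf8()]))
}
```

Because each yielded slice shares the input's lifetime, it can be stored directly or wrapped in `Cow::Borrowed` without copying.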

  • Trimming
let mut iter = "  !#$@!foo&*  ".str_iter();
iter.trim();
assert_eq!(iter.as_str(), "!#$@!foo&*");
iter.trim_start_by(|x: char| !x.is_alphabetic());
assert_eq!(iter.as_str(), "foo&*");
iter.trim_end_by(|x: char| !x.is_alphabetic());
assert_eq!(iter.as_str(), "foo");
  • Peeking
let mut iter = "bar".str_iter();
assert_eq!(iter.peek(), Some(('b', "b")));
assert_eq!(iter.peek_back(), Some(('r', "r")));
assert_eq!(iter.peekn(2), Ok("ba"));
assert_eq!(iter.peekn_back(2), Ok("ar"));
assert_eq!(iter.peekn(4), Err("bar"));
assert_eq!(iter.peekn_back(4), Err("bar"));
  • Iterating
let chars = [('😀', "😀"), ('🙁', "🙁"), ('😡', "😡"), ('😱', "😱")];
for (a, b) in "😀🙁😡😱".str_iter().zip(chars.into_iter()) {
    assert_eq!(a, b);
}
  • Look-ahead
let mut iter = "蟹🦀a🚀𓄇ë".str_iter().look_ahead(2).strs();
assert_eq!(iter.next(), Some("蟹🦀"));
assert_eq!(iter.next(), Some("🦀a"));
assert_eq!(iter.next(), Some("a🚀"));
assert_eq!(iter.next(), Some("🚀𓄇"));
assert_eq!(iter.next(), Some("𓄇ë"));
assert_eq!(iter.next(), Some("ë"));
assert_eq!(iter.next(), None);
  • Slice by pattern
let mut iter = "{{foo}bar}baz".str_iter();
let mut count = 0;
let s = iter.next_slice((|x| {
    match x {
        '{' => count += 1,
        '}' => count -= 1,
        _ => (),
    };
    count == 0
}).sep_with(Sep::Yield));
assert_eq!(s, Some("{{foo}bar}"));
assert_eq!(iter.as_str(), "baz");
  • Splitting
let mut iter = "thisIsCamelCase"
    .str_iter()
    .into_substrs(|c: char| c.is_uppercase());
assert_eq!(iter.next(), Some("this"));
assert_eq!(iter.next(), Some("Is"));
assert_eq!(iter.next(), Some("Camel"));
assert_eq!(iter.next(), Some("Case"));
assert_eq!(iter.next(), None);

§Patterns

We use Patterns in trim, slice and split.

In trim, the pattern matches until a false value is found.

In slice and split, the pattern matches until a true value is found.

See Sep and sep_with() for controlling what happens to the matched char itself when a string is separated.
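The difference between the two matching modes can be sketched with std alone. The helpers `trim_start_while` and `slice_until` below are hypothetical and only approximate the semantics described above. In particular, leaving the matched char at the head of the remainder and returning the whole input when nothing matches are assumptions of this sketch, not documented behaviour of the crate.

```rust
/// Trim-style matching (hypothetical helper): drop leading chars while `pat`
/// returns true, stopping at the first char for which it returns false.
fn trim_start_while<'a>(s: &'a str, mut pat: impl FnMut(char) -> bool) -> &'a str {
    let end = s.find(|c: char| !pat(c)).unwrap_or(s.len());
    &s[end..]
}

/// Slice-style matching (hypothetical helper): scan until `pat` first returns
/// true and split there. Returns (slice before the match, remainder).
/// Assumption: the matched char stays at the head of the remainder.
fn slice_until<'a>(s: &'a str, pat: impl FnMut(char) -> bool) -> (&'a str, &'a str) {
    let at = s.find(pat).unwrap_or(s.len());
    s.split_at(at)
}
```

With these helpers, `trim_start_while("  !foo", |c| !c.is_alphabetic())` leaves `"foo"`, while `slice_until("ferris123@crab.io", |c| !c.is_ascii_alphanumeric())` splits into `"ferris123"` and `"@crab.io"`.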

§Supported Patterns

  • isize

Matches once on the nth char.

  • ..isize

Matches the first n chars. This is useful with trim.

  • char

Matches a char.

  • &str

Matches an &str by looking ahead.

  • &[char] or [char;N]

Matches any char in the set.

  • char..=char

Matches a char in the range. Only inclusive ranges are supported, to avoid errors.

  • FnMut(char) -> FallibleBool

Matches any char that makes the function return true.

FallibleBool can be bool, Option<bool> or Result<bool, E: Debug>

  • (FnMut(&str) -> FallibleBool).expecting(n)

Matches any &str that makes the function return true by looking ahead for n chars.

  • (FnMut(char, &str) -> FallibleBool).expecting(n)

Matches any &str that makes the function return true by looking ahead for n chars. The char argument is the first char of that &str.

  • interval!

Matches repeatedly at an interval.

  • pat!

A macro that turns match patterns into Patterns.

  • Custom implementations of Pattern

You can write your own pattern types!
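Several of the patterns above work by looking ahead for n chars. As a rough illustration of what such a window looks like, here is a std-only sketch. The function `look_ahead` is a hypothetical helper rather than the crate's API, and rescanning from each position is chosen here for clarity, not speed.

```rust
/// Hypothetical helper: for each char position, yield the slice covering that
/// char and up to `n - 1` following chars. Windows near the end are shorter.
fn look_ahead<'a>(s: &'a str, n: usize) -> impl Iterator<Item = &'a str> + 'a {
    s.char_indices().map(move |(start, _)| {
        // Byte offset of the char `n` positions ahead, or the end of `s`.
        let end = s[start..]
            .char_indices()
            .nth(n)
            .map_or(s.len(), |(off, _)| start + off);
        &s[start..end]
    })
}
```

A predicate over &str can then be applied to each window, for example `look_ahead(s, 2).position(|w| w == "->")` to find the char index at which a two-char token begins.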

§Examples

Getting an ASCII identifier from a string:

let foo = r#"  ferris123@crab.io "#;
let mut iter = foo.str_iter();
iter.trim_start();
let slice = match iter.peek() {
    Some(('a'..='z'|'A'..='Z'|'_', _)) => {
        iter.next_slice(pat!(!'a'..='z'|'A'..='Z'|'0'..='9'|'_'))
    }
    _ => panic!("expected ident")
};
assert_eq!(slice, Some("ferris123"));
 
// note @ is still in the iterator
assert_eq!(iter.as_str(), "@crab.io ");

Getting a string literal “foo” from a string:

let foo = r#"    "foo"  bar "#;
let mut iter = foo.str_iter();
iter.trim_start();
let mut quotes = 0;
let slice = iter.next_slice((|c| match c {
    '"' =>  {
        quotes += 1;
        quotes == 2
    }
    _ => false,
}).sep_with(Sep::Yield));
assert_eq!(slice, Some("\"foo\""));
assert_eq!(iter.as_str(), "  bar ");

§Performance

This crate is comparable in speed to str::chars().

If operating on chars alone, str::chars() is faster.

But StringIter can be faster than str::chars() if you need to convert the char back into UTF-8.
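The two ways of getting a char's UTF-8 form can be contrasted with std alone. This is a sketch of the general idea, not a benchmark and not the crate's code, and both function names are hypothetical. The first re-encodes a char into a buffer, as one would after `str::chars()`. The second slices the bytes that are already present in the source string.

```rust
/// Re-encodes a char into a caller-provided buffer, as you would when
/// starting from `str::chars()`.
fn reencode(c: char, buf: &mut [u8; 4]) -> &str {
    c.encode_utf8(buf)
}

/// Slices the original string instead: the UTF-8 bytes already exist,
/// so only the char's encoded width is needed.
fn slice_first(s: &str) -> Option<(char, &str)> {
    let c = s.chars().next()?;
    Some((c, &s[..c.len_utf8()]))
}
```

Note that the slice returned by `slice_first` borrows from the input, whereas the result of `reencode` lives only as long as the scratch buffer.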

§Safety

This crate uses a lot of unsafe code to take advantage of the UTF-8 invariant and bypass some bounds checks and UTF-8 checks.

In addition, we do not guarantee memory safety if given invalid UTF-8 input.

Please file an issue if you find any soundness problem.
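Since invalid UTF-8 input is outside the crate's guarantees, callers starting from raw bytes may want to validate them first. A minimal sketch using only the standard library follows. The wrapper name `checked_str` is hypothetical, and the point is simply to prefer `std::str::from_utf8` over `from_utf8_unchecked` before handing a &str to any iterator.

```rust
use std::str::Utf8Error;

/// Checked conversion: returns an error instead of producing an invalid `&str`.
fn checked_str(bytes: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(bytes)
}
```

A &str obtained this way, or from any safe Rust API, already upholds the UTF-8 invariant.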

Modules§

iter
Miscellaneous iterators used in this crate.
patterns
Miscellaneous patterns used in this crate.
prelude
Convenience re-export of common members

Macros§

interval
Defines a repeating pattern Interval
pat
Convert a char or &str match pattern into a Pattern

Structs§

StringIter
A double ended, UTF-8 char based Iterator for &strs that supports iterating, looking ahead, trimming, pattern matching, splitting and other common string operations.

Enums§

Never
A never type that cannot be instantiated.
Sep
Determines what to do with a matched char on string separation. Defaults to Retain.

Traits§

CharStrPredicate
Convert FnMut(char, &str) -> FallibleBool into a pattern by specifying a look-ahead length.
FallibleBool
A generalized fallible boolean result.
Merge
Iterators for merging substrings.
Pattern
A pattern for use in slice, split and trim functions.
SetSep
Allows a pattern to edit its sep_method.
StrPredicate
Convert FnMut(&str) -> FallibleBool into a pattern by specifying a look-ahead length.
StringExt
Extension methods for strings
StringIndex
A usize or a range representing a slice of chars in a string.
StringIterable
A type that can be iterated with a StringIter.