pub struct StrTokenizer<'t> {
    pub keep_whitespace: bool,
    pub keep_newline: bool,
    pub keep_comment: bool,
    pub line_positions: Vec<usize>,
    /* private fields */
}

General-purpose, zero-copy lexical analyzer that produces RawTokens from a str. The tokenizer uses regular expressions for most token types, but not for everything: string literals, for example, are scanned with a direct loop so that they may contain escaped quotation marks. Newlines, whitespace (with a count) and comments can optionally be returned as special tokens. The tokenizer recognizes multi-line string literals as well as multi-line and single-line comments, and returns the starting line and column position of each token.

Example:

  let mut scanner = StrTokenizer::from_str("while (1) fork();//run at your own risk");
  scanner.set_line_comment("//");
  scanner.keep_comment = true; // return comments as Verbatim tokens
  scanner.add_single(';'); // separates ; from following symbols
  while let Some(token) = scanner.next() {
    println!("Token,line,column: {:?}", &token);
  }

This code produces the following output:

  Token,line,column: (Alphanum("while"), 1, 1)
  Token,line,column: (Symbol("("), 1, 7)
  Token,line,column: (Num(1), 1, 8)
  Token,line,column: (Symbol(")"), 1, 9)
  Token,line,column: (Alphanum("fork"), 1, 11)
  Token,line,column: (Symbol("("), 1, 15)
  Token,line,column: (Symbol(")"), 1, 16)
  Token,line,column: (Symbol(";"), 1, 17)
  Token,line,column: (Verbatim("//run at your own risk"), 1, 18)
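
The description above mentions that string literals are scanned with a direct loop rather than a regex, so that escaped quotation marks are handled. The following is a minimal sketch of that idea, not the crate's actual implementation: the function name scan_string_literal and its signature are hypothetical, and it only assumes double-quoted literals with backslash escapes. It returns a borrowed slice of the input, in keeping with the zero-copy design.

```rust
/// Hypothetical helper (not part of the crate's API): scans a double-quoted
/// string literal that starts at byte offset `start` of `input`.
/// Returns the literal as a borrowed slice (quotes included) together with
/// the offset just past the closing quote, or `None` if no literal starts
/// at `start` or the literal is never closed.
fn scan_string_literal(input: &str, start: usize) -> Option<(&str, usize)> {
    let bytes = input.as_bytes();
    if bytes.get(start) != Some(&b'"') {
        return None;
    }
    let mut i = start + 1;
    while i < bytes.len() {
        match bytes[i] {
            // A backslash escapes the next byte, so skip both.
            b'\\' => i += 2,
            // An unescaped quote closes the literal.
            b'"' => return Some((&input[start..=i], i + 1)),
            // Anything else, including '\n', belongs to the literal.
            _ => i += 1,
        }
    }
    None // reached the end of the input without a closing quote
}
```

Because newlines are treated like any other byte inside the loop, the same sketch also accepts multi-line string literals.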

Fields

keep_whitespace: bool

flag to toggle whether whitespace is returned as Whitespace tokens. Default is false.

keep_newline: bool

flag to toggle whether newline characters (‘\n’) are returned as Newline tokens. Default is false. Note that if this flag is set to true, newline characters are treated differently from other whitespace. For example, when parsing languages like Python, where indentation and line breaks are significant, both keep_whitespace and keep_newline should be set to true.
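
To illustrate how the two flags interact, here is a small self-contained sketch. It does not use the crate: the Tok enum and the layout_tokens function are hypothetical stand-ins, and the scanner is deliberately simplified (only spaces, tabs and ‘\n’ count as whitespace). It shows newlines being reported separately from runs of other whitespace, which is what an indentation-sensitive parser needs.

```rust
/// Hypothetical token type for this sketch only.
#[derive(Debug, PartialEq)]
enum Tok<'t> {
    Word(&'t str),
    Whitespace(usize), // a run of spaces or tabs, with its length
    Newline,
}

/// Splits `input` into words, optionally keeping whitespace runs and
/// newlines as tokens of their own.
fn layout_tokens(input: &str, keep_whitespace: bool, keep_newline: bool) -> Vec<Tok<'_>> {
    let bytes = input.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        let start = i;
        if bytes[i] == b'\n' {
            // Newlines are handled separately from other whitespace.
            i += 1;
            if keep_newline {
                out.push(Tok::Newline);
            }
        } else if bytes[i] == b' ' || bytes[i] == b'\t' {
            while i < bytes.len() && (bytes[i] == b' ' || bytes[i] == b'\t') {
                i += 1;
            }
            if keep_whitespace {
                out.push(Tok::Whitespace(i - start));
            }
        } else {
            while i < bytes.len() && !matches!(bytes[i], b' ' | b'\t' | b'\n') {
                i += 1;
            }
            out.push(Tok::Word(&input[start..i]));
        }
    }
    out
}
```

With both flags set, the input "if x:\n    y" yields a Newline followed by Whitespace(4), from which a parser can recover the indentation level of the second line.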

keep_comment: bool

flag to determine whether comments are kept and returned as Verbatim tokens. Default is false.

line_positions: Vec<usize>

vector of the starting byte position of each line. Index 0 is not used, so that line numbers starting from 1 can index the vector directly.
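
As an illustration of this layout, the sketch below builds such a vector for a whole input in one pass. The function line_starts is hypothetical and is not how the tokenizer fills the field (the tokenizer records positions as it scans); it only shows the 1-based indexing convention described above.

```rust
/// Hypothetical helper: returns the starting byte offset of every line of
/// `input`. Index 0 is a placeholder so that line 1 is found at index 1.
fn line_starts(input: &str) -> Vec<usize> {
    let mut positions = vec![0, 0]; // [unused, start of line 1]
    for (i, b) in input.bytes().enumerate() {
        if b == b'\n' {
            // The next line starts right after the newline byte.
            positions.push(i + 1);
        }
    }
    positions
}
```

For the input "ab\ncd\n\ne" this gives [0, 0, 3, 6, 7], so slicing the input from positions[2] yields text that begins with "cd".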

Implementations

creates a new tokenizer with defaults, does not set input.

adds a symbol of exactly length two. If the length is not two, the function has no effect. Note that these symbols override all other types except for leading whitespace and comment markers, e.g. “//” will have precedence over “/” and “==” will have precedence over “=”.

adds a single-character symbol. The type of the symbol overrides other types except for whitespace, comments and double-character symbols.

adds a 3-character symbol.
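
The precedence rules above (double-character symbols are tried before single-character ones) can be sketched without the crate as follows. SymbolTable and match_at are hypothetical names, add_double is assumed here by analogy with add_single, and measuring the length in characters rather than bytes is also an assumption of this sketch.

```rust
/// Hypothetical symbol table for this sketch only.
#[derive(Default)]
struct SymbolTable {
    singles: Vec<char>,
    doubles: Vec<String>,
}

impl SymbolTable {
    fn add_single(&mut self, c: char) {
        self.singles.push(c);
    }

    /// Has no effect unless `s` is exactly two characters long.
    fn add_double(&mut self, s: &str) {
        if s.chars().count() == 2 {
            self.doubles.push(s.to_string());
        }
    }

    /// Returns the symbol at the start of `rest`, preferring
    /// double-character symbols over single-character ones.
    fn match_at<'t>(&self, rest: &'t str) -> Option<&'t str> {
        for d in &self.doubles {
            if rest.starts_with(d.as_str()) {
                return Some(&rest[..d.len()]);
            }
        }
        let first = rest.chars().next()?;
        if self.singles.contains(&first) {
            Some(&rest[..first.len_utf8()])
        } else {
            None
        }
    }
}
```

With both '=' and "==" registered, match_at("== 1") returns "==" while match_at("= 1") returns "=", which mirrors the “==” over “=” example in the text.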

adds a custom-defined regex, which will correspond to the RawToken::Custom variant. Custom regular expressions should not start with whitespace and will override all others. Multiple Custom types will be matched in the order in which they were declared in the grammar file.

sets the input str to be parsed, resets position information. Note: trailing whitespace is always trimmed from the input.

sets the symbol that begins a single-line comment. The default is “//”. If this is set to the empty string then no line-comments are recognized.

sets the symbols used to delineate multi-line comments using a whitespace separated string such as “/* */”. These symbols are also the default. Set this to the empty string to disable multi-line comments.
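
The whitespace-separated format for the multi-line comment markers can be illustrated with a short sketch. Both functions below are hypothetical and independent of the crate; they only show one way such a string could be split into an opening and a closing marker, with the empty string disabling multi-line comments.

```rust
/// Hypothetical helper: splits a spec such as "/* */" into its opening and
/// closing markers. Returns `None` for the empty string (comments disabled)
/// or for any spec that does not contain exactly two markers.
fn parse_multiline_markers(spec: &str) -> Option<(&str, &str)> {
    let mut parts = spec.split_whitespace();
    match (parts.next(), parts.next(), parts.next()) {
        (Some(open), Some(close), None) => Some((open, close)),
        _ => None,
    }
}

/// Hypothetical helper: if `rest` begins with a complete multi-line comment
/// under the given spec, returns the comment text including both markers.
fn multiline_comment_at<'t>(rest: &'t str, spec: &str) -> Option<&'t str> {
    let (open, close) = parse_multiline_markers(spec)?;
    let body = rest.strip_prefix(open)?;
    let end = body.find(close)?;
    Some(&rest[..open.len() + end + close.len()])
}
```

The returned slice borrows from the input rather than copying it, so a caller could hand it on as the text of a comment token without allocating.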

the current line that the tokenizer is on

the current column of the tokenizer

returns the current absolute byte position of the tokenizer

returns the previous absolute byte position of the tokenizer

returns the source of the tokenizer such as URL or filename

gets the current line of the source input

Retrieves the ith line of the raw input, if line index i is valid. This function is intended to be called once the tokenizer has completed its task of scanning and tokenizing the entire input. Otherwise, it may return None if the tokenizer has not yet scanned up to the line indicated. That is, it is intended for error message generation when evaluating the AST post-parsing.
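
The reason a line may be unavailable before scanning has finished can be seen in the following sketch. LineIndex, scan_to and this get_line are hypothetical and only model the behavior described above: line starts are recorded as the scan advances, so a line that has not been reached yet has no recorded position.

```rust
/// Hypothetical model of a tokenizer's line bookkeeping.
struct LineIndex<'t> {
    input: &'t str,
    line_positions: Vec<usize>, // index 0 unused, line 1 at index 1
    scanned: usize,             // number of bytes scanned so far
}

impl<'t> LineIndex<'t> {
    fn new(input: &'t str) -> Self {
        LineIndex { input, line_positions: vec![0, 0], scanned: 0 }
    }

    /// Simulates the tokenizer advancing up to byte offset `upto`,
    /// recording the start of every line it passes.
    fn scan_to(&mut self, upto: usize) {
        let upto = upto.min(self.input.len());
        for i in self.scanned..upto {
            if self.input.as_bytes()[i] == b'\n' {
                self.line_positions.push(i + 1);
            }
        }
        self.scanned = self.scanned.max(upto);
    }

    /// Returns line `i` (1-based) without its newline, or `None` if the
    /// index is invalid or the scan has not reached that line yet.
    fn get_line(&self, i: usize) -> Option<&'t str> {
        if i == 0 || i >= self.line_positions.len() {
            return None;
        }
        let rest = &self.input[self.line_positions[i]..];
        let end = rest.find('\n').unwrap_or(rest.len());
        Some(&rest[..end])
    }
}
```

In this model, asking for line 2 of "ab\ncd\nef" before any scanning returns None, while the same call after the whole input has been scanned returns "cd".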

Retrieves the source string slice at the indicated indices; returns the empty string if indices are invalid. The default implementation returns the empty string.

reset tokenizer to parse from beginning of input

returns the next token, along with its starting line and column numbers. This function will return None at end of stream, or LexError along with a message printed to stderr if a tokenizer error occurred.

creates a StrTokenizer from a LexSource structure that contains a string representing the contents of the source, and calls StrTokenizer::set_input to reference that string. For example, to create a tokenizer that reads from a file:

let source = LexSource::new(source_path).unwrap();
let mut tokenizer = StrTokenizer::from_source(&source);

creates a string tokenizer and sets input to the given str.

Trait Implementations

The type of the elements being iterated over.

Advances the iterator and returns the next value.
