Crate libreda_stream_parser


A simple library for parsing data streams.

Parsing is split into two tasks:

  • Splitting an input iterator into tokens. This is done by a `Lexer`.
  • Processing the tokens: the `Tokenized` struct provides helper functions for consuming the token stream.

§Example

use itertools::{Itertools, PeekingNext};
use libreda_stream_parser::*;

struct ArrayLexer {}

impl Lexer for ArrayLexer {
    type Char = char;

    fn consume_next_token(
        &mut self,
        input: &mut (impl Iterator<Item = Self::Char> + PeekingNext),
        mut output: impl FnMut(Self::Char),
    ) -> Result<(), ParserError<char>> {
        // Skip whitespace.
        let _n = input.peeking_take_while(|c| c.is_whitespace()).count();

        let is_terminal_char = |c: char| -> bool {
            let terminals = "[],";
            c.is_whitespace() || terminals.contains(c)
        };

        if let Some(c) = input.next() {
            output(c);
            // Continue reading the token if `c` was not a terminal character.
            if !is_terminal_char(c) {
                input
                    .peeking_take_while(|&c| !is_terminal_char(c))
                    .for_each(output);
            }
        }

        Ok(())
    }
}

/// Parse an array of the form `[1.0, 2.0, 3.1324,]`.
/// Each element must be followed by a comma (the loop below relies on a trailing comma).
fn parse_array(data: &str) -> Result<Vec<f64>, ParserError<char>> {
    let mut tk = tokenize(data.chars(), ArrayLexer {});

    tk.advance()?;

    let mut arr: Vec<f64> = vec![];

    tk.expect_str("[")?;

    loop {
        if tk.test_str("]")? {
            break;
        }

        let num = tk.take_and_parse()?;
        arr.push(num);

        tk.expect_str(",")?;
    }

    Ok(arr)
}

let data = r#"
    [
        1.23,
        2.34,
        3.456,
    ]
"#;

let arr = parse_array(data).expect("parsing failed");

assert_eq!(arr, vec![1.23, 2.34, 3.456]);
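
Malformed input does not panic; it surfaces as a ParserError through the Result. A minimal illustration (not part of the crate's own example; it assumes take_and_parse returns an error when the current token cannot be parsed as f64):

// A token that is not a valid number makes `take_and_parse` fail,
// so `parse_array` returns an error instead of panicking.
assert!(parse_array("[1.0, not_a_number,]").is_err());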

Structs§

Tokenized
Provide sequential access to tokens that are created on the fly by splitting characters at whitespace.

Enums§

ParserError
Error type issued from lexer and parser.

Traits§

Lexer
Partition an input stream into tokens. The lexer consumes one token from an input stream in each call of consume_next_token.

Functions§

tokenize
Split a stream of characters into tokens separated by whitespace. Comments are ignored.