Trait CharStream 

Source
pub trait CharStream<P> {
    // Required methods
    fn range_as_bytes(&self, ofs: usize, n: usize) -> &[u8];
    fn matches_bytes(&self, state: &P, s: &[u8]) -> bool;
    fn get_text_span(&self, span: &StreamCharSpan<P>) -> &str
       where P: PosnInCharStream;
    fn get_text(&self, start: P, end: P) -> &str;
    fn matches_str(&self, pos: &P, pat: &str) -> bool;
    fn peek_at(&self, state: &P) -> Option<char>;
    fn consumed(&self, state: P, num_chars: usize) -> P;

    // Provided methods
    fn do_while<F: Fn(usize, char) -> bool>(
        &self,
        state: P,
        ch: char,
        f: &F,
    ) -> (P, Option<(P, usize)>) { ... }
    fn fold<T, F: Fn(&Self, T, &P, usize, char) -> (T, Option<P>)>(
        &self,
        state: P,
        ch: char,
        acc: T,
        f: &F,
    ) -> (P, Option<(P, usize, T)>) { ... }
    fn consumed_char(&self, state: P, ch: char) -> P
       where P: PosnInCharStream { ... }
    unsafe fn consumed_newline(&self, state: P, num_bytes: usize) -> P
       where P: PosnInCharStream { ... }
    fn consumed_ascii_str(&self, state: P, s: &str) -> P
       where P: PosnInCharStream { ... }
    unsafe fn consumed_chars(
        &self,
        state: P,
        num_bytes: usize,
        num_chars: usize,
    ) -> P
       where P: PosnInCharStream { ... }
    fn commit_consumed(&self, _up_to: &P) { ... }
}

The CharStream trait provides extra methods for a stream of char

Requires P : PosnInCharStream

Required Methods§

Source

fn range_as_bytes(&self, ofs: usize, n: usize) -> &[u8]

Retrieve a range of bytes from the stream

Source

fn matches_bytes(&self, state: &P, s: &[u8]) -> bool

Return true if the content of the stream at ‘state’ matches the byte slice

Source

fn get_text_span(&self, span: &StreamCharSpan<P>) -> &str

Get the text between the start of a span (inclusive) and the end of the span (exclusive).

Source

fn get_text(&self, start: P, end: P) -> &str

Get the text between the start (inclusive) and the end (exclusive).

Source

fn matches_str(&self, pos: &P, pat: &str) -> bool

Match the text at the offset with a str; return true if it matches, else false

Source

fn peek_at(&self, state: &P) -> Option<char>

Peek at the next character in the stream, returning None if the state is the end of the stream
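As an illustration, the semantics of matches_str and peek_at can be sketched for a string-backed stream, under the assumption that the position P is a plain byte offset (the crate's PosnInCharStream types carry more than this):

```rust
// Hypothetical string-backed versions of matches_str and peek_at, with the
// position taken to be a byte offset into the text (an assumption).
fn matches_str(text: &str, pos: usize, pat: &str) -> bool {
    // True if the bytes at `pos` start with the pattern's bytes
    text.as_bytes()[pos..].starts_with(pat.as_bytes())
}

fn peek_at(text: &str, pos: usize) -> Option<char> {
    // None once `pos` reaches the end of the stream
    text[pos..].chars().next()
}

fn main() {
    let text = "let x = 1";
    assert!(matches_str(text, 0, "let"));
    assert_eq!(peek_at(text, 4), Some('x'));
    assert_eq!(peek_at(text, text.len()), None); // end of stream
    println!("ok");
}
```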

Source

fn consumed(&self, state: P, num_chars: usize) -> P

Move the stream state forward by the specified number of characters

The characters MUST NOT include newlines
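The no-newline rule exists because advancing by characters only needs to update the column, not the line. A sketch, assuming a hypothetical position type carrying byte offset, line, and column (the crate's actual PosnInCharStream types may differ):

```rust
// Hypothetical position type: byte offset plus line/column.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Pos { byte: usize, line: usize, col: usize }

// Advance by `num_chars` characters, none of which may be a newline.
// The column advances by the character count, but the byte offset must
// account for multi-byte UTF-8 characters.
fn consumed(text: &str, p: Pos, num_chars: usize) -> Pos {
    let bytes: usize = text[p.byte..]
        .chars()
        .take(num_chars)
        .map(|c| c.len_utf8())
        .sum();
    Pos { byte: p.byte + bytes, line: p.line, col: p.col + num_chars }
}

fn main() {
    let text = "abé"; // 3 chars, 4 bytes ('é' is 2 bytes)
    let p = Pos { byte: 0, line: 1, col: 1 };
    let q = consumed(text, p, 3);
    assert_eq!(q, Pos { byte: 4, line: 1, col: 4 });
    println!("ok");
}
```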

Provided Methods§

Source

fn do_while<F: Fn(usize, char) -> bool>( &self, state: P, ch: char, f: &F, ) -> (P, Option<(P, usize)>)

Steps along the stream starting at the provided state (and character) while the provided function returns true; the function is provided with the index and character (starting at 0 / ch), and it returns true if the token continues, otherwise false

If the first invocation of ‘f’ returns false then the token is said to not match, and ‘do_while’ returns the stream state and None.

If the first N (more than zero) invocations match then the result is the stream state after the matched characters, and Some((initial state, N))

This can be used to match whitespace (where N is probably discarded), or user ‘id’ values in a language. The text can be retrieved with the ‘get_text’ method

Examples found in repository?
examples/calc.rs (line 95)
93fn parse_value_fn(stream: &TextStream, state: TextPos, ch: char) -> CalcLexResult {
94    let is_digit = |_, ch| ('0'..='9').contains(&ch);
95    let (state, opt_x) = stream.do_while(state, ch, &is_digit);
96    if let Some((start, _n)) = opt_x {
97        let s = stream.get_text(start, state);
98        let value: f64 = s.parse().unwrap();
99        Ok(Some((state, CalcToken::Value(value))))
100    } else {
101        Ok(None)
102    }
103}
104
105//fi parse_whitespace_fn
106/// Parser function to return a Token if whitespace
107fn parse_whitespace_fn(stream: &TextStream, state: TextPos, ch: char) -> CalcLexResult {
108    let is_whitespace = |_n, ch| ch == ' ' || ch == '\t' || ch == '\n';
109    let (state, opt_x) = stream.do_while(state, ch, &is_whitespace);
110    if let Some((_start, _n)) = opt_x {
111        Ok(Some((state, CalcToken::Whitespace)))
112    } else {
113        Ok(None)
114    }
115}
More examples
examples/simple.rs (lines 70-72)
61    pub fn parse_comment_line<L>(
62        stream: &L,
63        state: L::State,
64        ch: char,
65    ) -> LexerParseResult<P, Self, L::Error>
66    where
67        L: CharStream<P>,
68        L: Lexer<Token = Self, State = P>,
69    {
70        match stream.do_while(state, ch, &|n, ch| {
71            ((n < 2) && (ch == '/')) || ((n >= 2) && ch != '\n')
72        }) {
73            (state, Some((start, _n))) => {
74                let span = StreamCharSpan::new(start, state);
75                Ok(Some((state, SimpleToken::CommentLine(span))))
76            }
77            (_, None) => Ok(None),
78        }
79    }
80
81    //fp parse_digits
82    pub fn parse_digits<L>(
83        stream: &L,
84        state: L::State,
85        ch: char,
86    ) -> LexerParseResult<P, Self, L::Error>
87    where
88        L: CharStream<P>,
89        L: Lexer<Token = Self, State = P>,
90    {
91        match stream.do_while(state, ch, &|_, ch| ch.is_ascii_digit()) {
92            (state, Some((start, _n))) => {
93                let span = StreamCharSpan::new(start, state);
94                Ok(Some((state, SimpleToken::Digits(span))))
95            }
96            (_, None) => Ok(None),
97        }
98    }
99
100    //fp parse_whitespace
101    pub fn parse_whitespace<L>(
102        stream: &L,
103        state: L::State,
104        ch: char,
105    ) -> LexerParseResult<P, Self, L::Error>
106    where
107        L: CharStream<P>,
108        L: Lexer<Token = Self, State = P>,
109    {
110        match stream.do_while(state, ch, &|_, ch| (ch == ' ' || ch == '\t')) {
111            (state, Some((start, _))) => {
112                let span = StreamCharSpan::new(start, state);
113                Ok(Some((state, SimpleToken::Whitespace(span))))
114            }
115            (_, None) => Ok(None),
116        }
117    }
118
119    //fp parse_id
120    pub fn parse_id<L, F1, F2>(
121        stream: &L,
122        state: L::State,
123        ch: char,
124        is_id_start: F1,
125        is_id: F2,
126    ) -> LexerParseResult<P, Self, L::Error>
127    where
128        L: CharStream<P>,
129        L: Lexer<Token = Self, State = P>,
130        F1: Fn(char) -> bool,
131        F2: Fn(char) -> bool,
132    {
133        match stream.do_while(state, ch, &|n, ch| {
134            (n == 0 && is_id_start(ch)) || ((n > 0) && is_id(ch))
135        }) {
136            (state, Some((start, _))) => {
137                let span = StreamCharSpan::new(start, state);
138                Ok(Some((state, SimpleToken::Id(span))))
139            }
140            (_, None) => Ok(None),
141        }
142    }
Source

fn fold<T, F: Fn(&Self, T, &P, usize, char) -> (T, Option<P>)>( &self, state: P, ch: char, acc: T, f: &F, ) -> (P, Option<(P, usize, T)>)

Steps along the stream starting at the provided state, character, and accumulator value while the provided function indicates that the token continues; the function is given the latest accumulator, the index, and the character (starting at 0 / ch), and it returns the updated accumulator together with an indication of whether the token continues or has reached its final value

If the first invocation of ‘f’ indicates no match then the token is said to not match, and ‘fold’ returns the stream state and None.

If the first N (more than zero) invocations match then the result is the stream state after the matched characters, and Some((initial state, N, final accumulator))

This can be used to accumulate significant state about a token as it is parsed, in excess of the simple number of characters.
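No repository example is shown for fold, so the pattern can be sketched over a plain &str with byte-offset positions (an assumption, not the crate's API): the accumulator builds a numeric value as digits are consumed.

```rust
// Hypothetical fold over a &str: positions are byte offsets, and the
// accumulator builds the numeric value of a run of digits.
// Returns (end position, Some((start, chars matched, accumulator))) on a
// match, or (start, None) if the first character fails.
fn fold_digits(s: &str, start: usize, acc: u64) -> (usize, Option<(usize, usize, u64)>) {
    let mut pos = start;
    let mut acc = acc;
    let mut n = 0;
    for ch in s[start..].chars() {
        match ch.to_digit(10) {
            Some(d) => {
                acc = acc * 10 + d as u64; // accumulate the value
                pos += ch.len_utf8();
                n += 1;
            }
            None => break, // token ends at the first non-digit
        }
    }
    if n == 0 { (start, None) } else { (pos, Some((start, n, acc))) }
}

fn main() {
    // "123+": three digits match and the accumulator holds their value
    assert_eq!(fold_digits("123+", 0, 0), (3, Some((0, 3, 123))));
    // "+12": the first character fails, so there is no match
    assert_eq!(fold_digits("+12", 0, 0), (0, None));
    println!("ok");
}
```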

Source

fn consumed_char(&self, state: P, ch: char) -> P

Get a stream state after consuming the specified character at its current state

Examples found in repository?
examples/calc.rs (line 85)
73fn parse_char_fn(stream: &TextStream, state: TextPos, ch: char) -> CalcLexResult {
74    if let Some(t) = {
75        match ch {
76            '+' => Some(CalcToken::Op(CalcOp::Plus)),
77            '-' => Some(CalcToken::Op(CalcOp::Minus)),
78            '*' => Some(CalcToken::Op(CalcOp::Times)),
79            '/' => Some(CalcToken::Op(CalcOp::Divide)),
80            '(' => Some(CalcToken::Open),
81            ')' => Some(CalcToken::Close),
82            _ => None,
83        }
84    } {
85        Ok(Some((stream.consumed_char(state, ch), t)))
86    } else {
87        Ok(None)
88    }
89}
More examples
examples/simple.rs (line 56)
42    pub fn parse_char<L>(
43        stream: &L,
44        state: L::State,
45        ch: char,
46    ) -> LexerParseResult<P, Self, L::Error>
47    where
48        L: CharStream<P>,
49        L: Lexer<Token = Self, State = P>,
50    {
51        let pos = state;
52        match ch {
53            '\n' => Ok(Some((stream.consumed(state, 1), Self::Newline(pos)))),
54            '(' | '[' | '{' => Ok(Some((stream.consumed(state, 1), Self::OpenBra(pos, ch)))),
55            ')' | ']' | '}' => Ok(Some((stream.consumed(state, 1), Self::CloseBra(pos, ch)))),
56            ch => Ok(Some((stream.consumed_char(state, ch), Self::Char(pos, ch)))),
57        }
58    }
Source

unsafe fn consumed_newline(&self, state: P, num_bytes: usize) -> P

Get a stream state after consuming a newline at its current state

§Safety

num_bytes must correspond to the number of bytes that the newline character consists of, and state must point to the bytes offset of that character

Source

fn consumed_ascii_str(&self, state: P, s: &str) -> P

Get the state after consuming a particular ASCII string containing no newlines

This is safe because there is no unsafe handling of byte offsets within state; however, there is no check that the provided string is ASCII or that it contains no newlines. If these API rules are broken then the line and column held by state may be incorrect (which is not unsafe, but is potentially a bug)
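The reason this method can be cheap is that an ASCII string with no newlines advances the byte offset and the column by the same amount and leaves the line untouched. A sketch with a hypothetical position type (the crate's own types may differ):

```rust
// Hypothetical position type: byte offset plus line/column.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Pos { byte: usize, line: usize, col: usize }

// For an ASCII string with no newlines, bytes consumed == columns advanced,
// and the line number is unchanged. The caller must uphold those rules.
fn consumed_ascii_str(p: Pos, s: &str) -> Pos {
    Pos { byte: p.byte + s.len(), line: p.line, col: p.col + s.len() }
}

fn main() {
    let p = Pos { byte: 10, line: 2, col: 5 };
    let q = consumed_ascii_str(p, "return");
    assert_eq!(q, Pos { byte: 16, line: 2, col: 11 });
    println!("ok");
}
```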

Source

unsafe fn consumed_chars( &self, state: P, num_bytes: usize, num_chars: usize, ) -> P

Get the stream state after consuming a particular string of known character length

§Safety

num_bytes must be the number of bytes occupied by the num_chars characters starting at state. If this constraint is not met then the byte offset indicated by the returned value may not correspond to a UTF-8 character boundary within the stream.
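Both counts are needed because byte count and character count diverge for multi-byte UTF-8 characters, and a wrong byte count can leave the position off a character boundary. A small standalone illustration:

```rust
// Why num_bytes and num_chars must agree: multi-byte UTF-8 characters make
// the two counts diverge, and a wrong byte offset lands mid-character.
fn main() {
    let s = "héllo";          // 'é' occupies 2 bytes in UTF-8
    let num_chars = 2;         // consume "hé"
    let num_bytes: usize = s.chars().take(num_chars).map(|c| c.len_utf8()).sum();
    assert_eq!(num_bytes, 3);  // 1 byte for 'h' + 2 bytes for 'é'
    assert!(s.is_char_boundary(num_bytes)); // a valid offset to resume at
    assert!(!s.is_char_boundary(2));        // inside 'é': not a boundary
    println!("ok");
}
```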

Source

fn commit_consumed(&self, _up_to: &P)

Invoked by the Lexer to indicate that the stream has been consumed up to a certain point, and that (for parsing) no state earlier in the stream will be requested in the future

A truly streaming source can drop earlier data in the stream if this fits the application

Dyn Compatibility§

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.

Implementors§

Source§

impl<'a, P, T, E> CharStream<P> for LexerOfStr<'a, P, T, E>