pub struct TokenStream<'a> { /* private fields */ }
Token stream that wraps perl-lexer or a pre-lexed token buffer.
Provides three-token lookahead, transparent trivia skipping (in lexer mode), and statement-boundary state management used by the recursive-descent parser.
Implementations§
impl<'a> TokenStream<'a>
pub fn new(input: &'a str) -> TokenStream<'a>
Create a new token stream from source code.
pub fn from_vec(tokens: Vec<Token>) -> TokenStream<'a>
Create a token stream from a pre-lexed token list.
This constructor skips lexing entirely and feeds tokens directly from the
provided Vec. It is intended for the incremental parsing pipeline where
tokens from a prior parse run can be reused for unchanged regions.
§Behaviour differences from TokenStream::new
on_stmt_boundary: clears the lookahead cache only; no lexer mode reset (tokens are already classified).
relex_as_term: clears the lookahead cache only; no re-lexing (token kinds are fixed from the original lex pass).
enter_format_mode: no-op.
§Arguments
tokens — Pre-lexed tokens. An Eof token does not need to be included; the stream synthesises one when the buffer is exhausted.
§Examples
use perl_tokenizer::{Token, TokenKind, TokenStream};
let tokens = vec![
Token::new(TokenKind::My, "my", 0, 2),
Token::new(TokenKind::Eof, "", 2, 2),
];
let mut stream = TokenStream::from_vec(tokens);
assert!(matches!(stream.peek(), Ok(t) if t.kind == TokenKind::My));
pub fn lexer_tokens_to_parser_tokens(tokens: Vec<Token>) -> Vec<Token>
Convert a buffer of raw lexer tokens to parser Tokens, filtering out trivia.
This is a convenience method for the incremental parsing pipeline where the
token cache stores raw lexer tokens (including whitespace and comments) and
needs to convert them to parser tokens before feeding to Self::from_vec.
Trivia token types (whitespace, newlines, comments, EOF) are discarded.
All other token types are converted using the same mapping as the live
TokenStream would apply.
§Examples
use perl_tokenizer::{TokenKind, TokenStream};
use perl_lexer::{PerlLexer, TokenType};
// Collect raw lexer tokens
let mut lexer = PerlLexer::new("my $x = 1;");
let mut raw = Vec::new();
while let Some(t) = lexer.next_token() {
if matches!(t.token_type, TokenType::EOF) { break; }
raw.push(t);
}
// Convert to parser tokens and build a stream
let parser_tokens = TokenStream::lexer_tokens_to_parser_tokens(raw);
let mut stream = TokenStream::from_vec(parser_tokens);
assert!(matches!(stream.peek(), Ok(t) if t.kind == TokenKind::My));
pub fn peek(&mut self) -> Result<&Token, ParseError>
Peek at the next token without consuming it.
pub fn next(&mut self) -> Result<Token, ParseError>
Consume and return the next token.
pub fn peek_second(&mut self) -> Result<&Token, ParseError>
Peek at the second token (two tokens ahead).
pub fn peek_third(&mut self) -> Result<&Token, ParseError>
Peek at the third token (three tokens ahead).
pub fn enter_format_mode(&mut self)
Enter format body parsing mode in the lexer.
No-op when operating in buffered (pre-lexed) mode — the tokens are already fully classified.
pub fn on_stmt_boundary(&mut self)
Called at statement boundaries to reset lexer state and clear cached lookahead.
In buffered mode only the lookahead cache is cleared; no lexer mode reset is performed because the tokens are already fully classified.
pub fn relex_as_term(&mut self)
Re-lex the current peeked token in ExpectTerm mode.
This is needed for context-sensitive constructs like split /regex/
where the / was lexed as division (Slash) but should be a regex
delimiter. Rolls the lexer back to the peeked token’s start position,
switches to ExpectTerm mode, and clears the peek cache so the next
peek() or next() re-lexes it as a regex.
In buffered mode the peek cache is cleared but no re-lexing occurs — token kinds are fixed from the original lex pass.
pub fn invalidate_peek(&mut self)
Invalidate the cached lookahead without changing the lexer mode.
pub fn peek_fresh_kind(&mut self) -> Option<TokenKind>
Convenience method for a one-shot fresh peek.