pub struct Lexer<'a, 'i> { /* private fields */ }
The Lexer struct is responsible for tokenizing input source code into discrete tokens
based on PHP language syntax. It is designed to work with PHP code from version 7.0 up to 8.4.
The lexer scans the provided input and identifies PHP-specific tokens, including operators, keywords, comments, strings, and other syntax elements.
It produces a sequence of Token objects that are consumed by later stages of compilation or interpretation.
The lexer is designed to be used in a streaming fashion, where it reads the input source code in chunks and produces tokens incrementally. This allows for efficient processing of large source files and minimizes memory usage.
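The incremental, pull-based style described above can be illustrated with a toy lexer. This is a minimal sketch, not mago's implementation: `ToyLexer` and `ToyToken` are hypothetical names introduced only for illustration, and the token rules (ASCII words, digit runs, single-character symbols) are assumptions chosen to keep the example short. Each call to `advance` scans just enough input to produce one token, so a caller can stop early without the rest of the input ever being tokenized.

```rust
/// Hypothetical token type for illustration; unrelated to mago's `Token`.
#[derive(Debug, PartialEq)]
pub enum ToyToken<'a> {
    Word(&'a str),
    Number(&'a str),
    Symbol(char),
}

/// Hypothetical pull-based lexer over a borrowed string.
pub struct ToyLexer<'a> {
    input: &'a str,
    offset: usize,
}

impl<'a> ToyLexer<'a> {
    pub fn new(input: &'a str) -> Self {
        Self { input, offset: 0 }
    }

    /// Byte offset of the next unread character.
    pub fn position(&self) -> usize {
        self.offset
    }

    /// True once every byte of the input has been consumed.
    pub fn has_reached_eof(&self) -> bool {
        self.offset >= self.input.len()
    }

    /// Produces at most one token per call; `None` means the input is exhausted.
    pub fn advance(&mut self) -> Option<ToyToken<'a>> {
        let input: &'a str = self.input;
        let rest = &input[self.offset..];

        // Skip leading whitespace and account for the skipped bytes.
        let trimmed = rest.trim_start();
        self.offset += rest.len() - trimmed.len();

        let first = trimmed.chars().next()?;
        let token = if first.is_ascii_alphabetic() {
            let len = trimmed
                .find(|c: char| !c.is_ascii_alphanumeric())
                .unwrap_or(trimmed.len());
            self.offset += len;
            ToyToken::Word(&trimmed[..len])
        } else if first.is_ascii_digit() {
            let len = trimmed
                .find(|c: char| !c.is_ascii_digit())
                .unwrap_or(trimmed.len());
            self.offset += len;
            ToyToken::Number(&trimmed[..len])
        } else {
            self.offset += first.len_utf8();
            ToyToken::Symbol(first)
        };
        Some(token)
    }
}
```

The tokens borrow from the input rather than copying it, which is one common way to keep memory usage low when lexing large files.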
Implementations§
impl<'a, 'i> Lexer<'a, 'i>
pub fn new(interner: &'i ThreadedInterner, input: Input<'a>) -> Lexer<'a, 'i>
pub fn scripting(
    interner: &'i ThreadedInterner,
    input: Input<'a>,
) -> Lexer<'a, 'i>
pub fn has_reached_eof(&self) -> bool
Check if the lexer has reached the end of the input.
If this method returns true, the lexer will not produce any more tokens.
pub fn get_position(&self) -> Position
Get the current position of the lexer in the input source code.
pub fn advance(&mut self) -> Option<Result<Token, SyntaxError>>
Tokenizes the next input from the source code.
This method reads from the input and produces the next Token based on the current LexerMode.
It handles various lexical elements such as inline text, script code, strings with interpolation,
comments, and different PHP-specific constructs.
§Returns
- Some(Ok(Token)) if a token was successfully parsed.
- Some(Err(SyntaxError)) if a syntax error occurred while parsing the next token.
- None if the end of the input has been reached.
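Because the return type is an Option wrapping a Result, a call to advance can be adapted into a standard iterator and collected. The following is a minimal sketch that assumes only that return shape: `collect_tokens` is a hypothetical helper, not part of mago_syntax, and it is generic over the token and error types so that it does not depend on the real Token or SyntaxError.

```rust
/// Hypothetical helper: repeatedly calls `advance` until it returns `None`,
/// stopping early at the first `Err`.
///
/// `std::iter::from_fn` turns the closure into an iterator of `Result<T, E>`,
/// and collecting into `Result<Vec<T>, E>` short-circuits on the first error.
pub fn collect_tokens<T, E, F>(advance: F) -> Result<Vec<T>, E>
where
    F: FnMut() -> Option<Result<T, E>>,
{
    std::iter::from_fn(advance).collect()
}
```

With a lexer in scope, the call would presumably look like `collect_tokens(|| lexer.advance())`. Tokens produced before the first error are discarded by this approach, which suits callers that treat any syntax error as fatal.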
§Examples
use mago_interner::ThreadedInterner;
use mago_syntax::lexer::Lexer;
use mago_source::SourceIdentifier;
use mago_syntax_core::input::Input;
let interner = ThreadedInterner::new();
let source = SourceIdentifier::dummy();
let input = Input::new(source, b"<?php echo 'Hello, World!'; ?>");
let mut lexer = Lexer::new(&interner, input);
while let Some(result) = lexer.advance() {
    match result {
        Ok(token) => println!("Token: {:?}", token),
        Err(error) => eprintln!("Syntax error: {:?}", error),
    }
}
§Notes
- It efficiently handles tokenization by consuming input based on patterns specific to PHP syntax.
- The lexer supports complex features like string interpolation and different numeric formats.
§Errors
Returns Some(Err(SyntaxError)) in cases such as:
- Unrecognized tokens that do not match any known PHP syntax.
- Unexpected tokens in a given context, such as an unexpected end of string.
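A caller that prefers to report every error rather than stop at the first one can keep advancing after an Err, as the loop in the example above does. The sketch below is one way to do that under the assumption that the lexer remains usable after reporting an error; `drain_with_errors` is a hypothetical helper, not part of mago_syntax, and is generic so it does not rely on the real Token or SyntaxError types.

```rust
/// Hypothetical helper: calls `advance` until it returns `None`,
/// separating successful tokens from errors instead of stopping early.
pub fn drain_with_errors<T, E, F>(mut advance: F) -> (Vec<T>, Vec<E>)
where
    F: FnMut() -> Option<Result<T, E>>,
{
    let mut tokens = Vec::new();
    let mut errors = Vec::new();
    while let Some(result) = advance() {
        match result {
            Ok(token) => tokens.push(token),
            Err(error) => errors.push(error),
        }
    }
    (tokens, errors)
}
```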
§Panics
This method should not panic under normal operation. If it does, it indicates a bug in the lexer implementation.
§See Also
- Token: Represents a lexical token with its kind, value, and span.
- SyntaxError: Represents errors that can occur during lexing.
Trait Implementations§
Auto Trait Implementations§
impl<'a, 'i> Freeze for Lexer<'a, 'i>
impl<'a, 'i> !RefUnwindSafe for Lexer<'a, 'i>
impl<'a, 'i> Send for Lexer<'a, 'i>
impl<'a, 'i> Sync for Lexer<'a, 'i>
impl<'a, 'i> Unpin for Lexer<'a, 'i>
impl<'a, 'i> !UnwindSafe for Lexer<'a, 'i>
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.