Struct Lexer

pub struct Lexer<'a, 'i> { /* private fields */ }

The Lexer struct is responsible for tokenizing input source code into discrete tokens based on PHP language syntax. It is designed to work with PHP code from version 7.0 up to 8.4.

The lexer scans the provided input from start to end, producing one token at a time.

It identifies PHP-specific tokens, including operators, keywords, comments, strings, and other syntax elements, and produces a sequence of Token objects that are used in further stages of compilation or interpretation.

The lexer is designed to be used in a streaming fashion: it reads the input source code in chunks and produces tokens incrementally, one per call to advance. This allows large source files to be processed efficiently and keeps memory usage low.
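As a rough illustration of incremental consumption, the sketch below uses a hypothetical ToyLexer (a plain whitespace splitter, not this crate's Lexer). Its advance, has_reached_eof, and get_position methods are assumptions that only mirror the shape of the methods documented on this page. The caller stops pulling tokens as soon as the wanted one appears, so the rest of the input is never scanned.

```rust
/// Hypothetical stand-in for a streaming lexer: yields one
/// whitespace-separated word per call and tracks the bytes consumed.
struct ToyLexer<'a> {
    input: &'a str,
    offset: usize,
}

impl<'a> ToyLexer<'a> {
    fn new(input: &'a str) -> Self {
        ToyLexer { input, offset: 0 }
    }

    /// Returns the next word, or `None` at the end of the input.
    fn advance(&mut self) -> Option<&'a str> {
        let rest = &self.input[self.offset..];
        let trimmed = rest.trim_start();
        if trimmed.is_empty() {
            self.offset = self.input.len();
            return None;
        }
        // Byte index where the next word starts in the full input.
        let start = self.input.len() - trimmed.len();
        let len = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
        self.offset = start + len;
        Some(&self.input[start..self.offset])
    }

    /// True once every byte of the input has been consumed.
    fn has_reached_eof(&self) -> bool {
        self.offset >= self.input.len()
    }

    /// Byte offset just past the most recently produced word.
    fn get_position(&self) -> usize {
        self.offset
    }
}

/// Pulls words one at a time and stops at the first match,
/// leaving the remainder of the input unread.
fn find_first<'a>(lexer: &mut ToyLexer<'a>, wanted: &str) -> Option<&'a str> {
    while let Some(word) = lexer.advance() {
        if word == wanted {
            return Some(word);
        }
    }
    None
}

fn main() {
    let mut lexer = ToyLexer::new("<?php echo 1 ;");
    let found = find_first(&mut lexer, "echo");
    println!("found: {:?}", found);
    println!("position: {}", lexer.get_position());
    println!("reached eof: {}", lexer.has_reached_eof());
}
```

Because the caller drives the loop, it decides how much of the input is tokenized; nothing beyond the matched word is touched here.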

Implementations§

impl<'a, 'i> Lexer<'a, 'i>

pub fn new(interner: &'i ThreadedInterner, input: Input<'a>) -> Lexer<'a, 'i>

Creates a new Lexer instance.

§Parameters
  • interner: The interner to use for string interning.
  • input: The input source code to tokenize.
§Returns

A new Lexer instance that reads from the provided input.

pub fn scripting( interner: &'i ThreadedInterner, input: Input<'a>, ) -> Lexer<'a, 'i>

Creates a new Lexer instance for parsing a script block.

§Parameters
  • interner: The interner to use for string interning.
  • input: The input source code to tokenize.
§Returns

A new Lexer instance that reads from the provided input.

pub fn has_reached_eof(&self) -> bool

Checks whether the lexer has reached the end of the input.

If this method returns true, the lexer will not produce any more tokens.

pub fn get_position(&self) -> Position

Returns the current position of the lexer in the input source code.

pub fn advance(&mut self) -> Option<Result<Token, SyntaxError>>

Produces the next token from the input source code.

This method reads from the input and produces the next Token based on the current LexerMode. It handles various lexical elements such as inline text, script code, strings with interpolation, comments, and different PHP-specific constructs.

§Returns
  • Some(Ok(Token)) if a token was successfully parsed.
  • Some(Err(SyntaxError)) if a syntax error occurred while parsing the next token.
  • None if the end of the input has been reached.
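The three outcomes above fit the shape that std::iter::from_fn expects, so a pull-style advance can be adapted into an ordinary iterator. The sketch below is a minimal illustration under stated assumptions: ToyLexer, Token, and SyntaxError are local stand-ins defined here (not this crate's types), and the toy rule that rejects words containing `#` exists only to exercise the error path.

```rust
/// Hypothetical stand-ins for the crate's `Token` and `SyntaxError`.
#[derive(Debug, PartialEq)]
struct Token(String);

#[derive(Debug, PartialEq)]
struct SyntaxError(String);

/// Toy lexer (NOT the real `Lexer`): splits on whitespace and rejects
/// any word containing `#`.
struct ToyLexer<'a> {
    words: std::str::SplitWhitespace<'a>,
}

impl<'a> ToyLexer<'a> {
    fn new(input: &'a str) -> Self {
        ToyLexer { words: input.split_whitespace() }
    }

    /// Same return shape as the `advance` method documented above.
    fn advance(&mut self) -> Option<Result<Token, SyntaxError>> {
        let word = self.words.next()?;
        if word.contains('#') {
            Some(Err(SyntaxError(format!("unexpected `{word}`"))))
        } else {
            Some(Ok(Token(word.to_string())))
        }
    }
}

/// Drains the lexer, stopping at the first error.
fn tokenize(input: &str) -> Result<Vec<Token>, SyntaxError> {
    let mut lexer = ToyLexer::new(input);
    // `from_fn` ends the iterator when `advance` returns `None`;
    // collecting into `Result` short-circuits on the first `Err`.
    std::iter::from_fn(|| lexer.advance()).collect()
}

fn main() {
    println!("{:?}", tokenize("echo 1 ;"));
    println!("{:?}", tokenize("echo #oops ;"));
}
```

Collecting into a Result suits callers that treat any syntax error as fatal; callers that want every error should keep the explicit loop instead.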
§Examples
use mago_interner::ThreadedInterner;
use mago_syntax::lexer::Lexer;
use mago_source::SourceIdentifier;
use mago_syntax_core::input::Input;

// The interner handles string interning for the lexer.
let interner = ThreadedInterner::new();

// Wrap the raw source bytes in an Input.
let source = SourceIdentifier::dummy();
let input = Input::new(source, b"<?php echo 'Hello, World!'; ?>");

let mut lexer = Lexer::new(&interner, input);

// advance() returns None once the end of the input has been reached.
while let Some(result) = lexer.advance() {
    match result {
        Ok(token) => println!("Token: {:?}", token),
        Err(error) => eprintln!("Syntax error: {:?}", error),
    }
}
§Notes
  • The lexer tokenizes efficiently by consuming input according to patterns specific to PHP syntax.
  • The lexer supports complex features like string interpolation and different numeric formats.
§Errors

Returns Some(Err(SyntaxError)) in cases such as:

  • Unrecognized tokens that do not match any known PHP syntax.
  • Unexpected tokens in a given context, such as an unexpected end of string.
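One way to handle these errors without aborting is to keep pulling results and sort them into two lists. The helper below is a hypothetical sketch, not part of this crate: drain accepts any closure with the same return shape as advance, so a real call site would presumably look like `drain(|| lexer.advance())`. It assumes the lexer keeps making progress after returning an error, which this page does not state explicitly.

```rust
/// Hypothetical helper: calls `next` until it returns `None`,
/// separating successful items from errors.
fn drain<T, E>(mut next: impl FnMut() -> Option<Result<T, E>>) -> (Vec<T>, Vec<E>) {
    let mut tokens = Vec::new();
    let mut errors = Vec::new();
    while let Some(result) = next() {
        match result {
            Ok(token) => tokens.push(token),
            Err(error) => errors.push(error),
        }
    }
    (tokens, errors)
}

fn main() {
    // Canned results standing in for successive `lexer.advance()` calls.
    let mut canned = vec![Ok("echo"), Err("unterminated string"), Ok(";")].into_iter();
    let (tokens, errors) = drain(|| canned.next());
    println!("tokens: {tokens:?}");
    println!("errors: {errors:?}");
}
```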
§Panics

This method should not panic under normal operation. If it does, it indicates a bug in the lexer implementation.

§See Also
  • Token: Represents a lexical token with its kind, value, and span.
  • SyntaxError: Represents errors that can occur during lexing.

Trait Implementations§

impl<'a, 'i> Debug for Lexer<'a, 'i>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

impl<'a, 'i> Freeze for Lexer<'a, 'i>

impl<'a, 'i> !RefUnwindSafe for Lexer<'a, 'i>

impl<'a, 'i> Send for Lexer<'a, 'i>

impl<'a, 'i> Sync for Lexer<'a, 'i>

impl<'a, 'i> Unpin for Lexer<'a, 'i>

impl<'a, 'i> !UnwindSafe for Lexer<'a, 'i>

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.