Struct Input

pub struct Input<'a> { /* private fields */ }

A struct representing the input code being lexed.

The Input struct provides methods to read, peek, consume, and skip characters from the input bytes while keeping track of the current position (line, column, offset).

Implementations§

impl<'a> Input<'a>

pub fn new(source: SourceIdentifier, bytes: &'a [u8]) -> Self

Creates a new Input instance from the given source identifier and bytes.

§Arguments
  • source - The identifier of the source that the input code belongs to.
  • bytes - A byte slice representing the input code to be processed.
§Returns

A new Input instance initialized at the beginning of the input.

pub fn anchored_at(bytes: &'a [u8], anchor_position: Position) -> Self

Creates a new Input instance representing a byte slice that is “anchored” at a specific absolute position within a larger source file.

This is useful when lexing a subset (slice) of a source file, as it allows generated tokens to retain accurate absolute positions and spans relative to the original file.

The internal cursor (offset) starts at 0 relative to the bytes slice, but the absolute position is calculated relative to the anchor_position.

§Arguments
  • bytes - A byte slice representing the input code subset to be lexed.
  • anchor_position - The absolute Position in the original source file where the provided bytes slice begins.
§Returns

A new Input instance ready to lex the bytes, maintaining positions relative to anchor_position.
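The relationship between the slice-relative offset and the absolute position can be sketched with a small standalone model. This is not the crate's implementation: `AbsoluteOffset` and `AnchoredCursor` are hypothetical stand-ins, and only a byte offset is tracked (no line or column).

```rust
/// Hypothetical stand-in for the crate's `Position`; only a byte offset is modeled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AbsoluteOffset(pub usize);

/// Hypothetical cursor over a sub-slice of a larger source file.
pub struct AnchoredCursor<'a> {
    bytes: &'a [u8],
    /// Progress within `bytes`; always starts at 0.
    local_offset: usize,
    /// Where `bytes` begins in the original file.
    anchor: AbsoluteOffset,
}

impl<'a> AnchoredCursor<'a> {
    pub fn anchored_at(bytes: &'a [u8], anchor: AbsoluteOffset) -> Self {
        Self { bytes, local_offset: 0, anchor }
    }

    /// Moves forward by `count` bytes, stopping at the end of the slice.
    pub fn advance(&mut self, count: usize) {
        self.local_offset = (self.local_offset + count).min(self.bytes.len());
    }

    pub fn local_offset(&self) -> usize {
        self.local_offset
    }

    /// Absolute position = anchor + progress within the slice.
    pub fn absolute(&self) -> AbsoluteOffset {
        AbsoluteOffset(self.anchor.0 + self.local_offset)
    }
}

/// Walks 4 bytes into a slice that starts at byte 6 of the file.
pub fn demo_anchor() -> (usize, AbsoluteOffset) {
    let file = b"<?php echo 1;";
    let slice = &file[6..];
    let mut cursor = AnchoredCursor::anchored_at(slice, AbsoluteOffset(6));
    cursor.advance(4);
    (cursor.local_offset(), cursor.absolute())
}
```

In this model the local offset after the call is 4 while the absolute offset is 10, which is the distinction drawn between current_offset() and current_position().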

pub const fn source_identifier(&self) -> SourceIdentifier

Returns the source identifier of the input code.

pub const fn current_position(&self) -> Position

Returns the absolute current Position of the lexer within the original source file.

It calculates this by adding the internal offset (progress within the current byte slice) to the starting position the Input was initialized with, for example the anchor_position passed to anchored_at.

pub const fn current_offset(&self) -> usize

Returns the current internal byte offset relative to the start of the input slice.

This indicates how many bytes have been consumed from the current bytes slice. To get the absolute position in the original source file, use current_position().

pub const fn is_empty(&self) -> bool

Returns true if the input slice is empty (length is zero).

pub const fn len(&self) -> usize

Returns the total length in bytes of the input slice being processed.

pub const fn has_reached_eof(&self) -> bool

Checks if the current position is at the end of the input.

§Returns

true if the current offset is greater than or equal to the input length; false otherwise.

pub fn next(&mut self)

Advances the current position by one character, updating line and column numbers.

Handles different line endings (\n, \r, \r\n) and updates line and column counters accordingly.

If the end of input is reached, no action is taken.
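One way to picture the line-ending handling is the standalone sketch below. It is an illustration only: `LineCol`, `step`, and `walk` are hypothetical names, lines and columns are zero-based here, and treating `\r\n` as a single step is an assumption rather than confirmed behavior of `next`.

```rust
/// Hypothetical position record; the field names are illustrative, not the crate's.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LineCol {
    pub offset: usize,
    pub line: usize,
    pub column: usize,
}

/// Advances `pos` past one logical character of `bytes`.
/// Assumption: `\r\n` is consumed as a single line break in one step.
pub fn step(bytes: &[u8], pos: &mut LineCol) {
    let byte = match bytes.get(pos.offset) {
        Some(&byte) => byte,
        None => return, // already at EOF: no action is taken
    };

    match byte {
        b'\n' => {
            pos.offset += 1;
            pos.line += 1;
            pos.column = 0;
        }
        b'\r' => {
            // A `\r` directly followed by `\n` counts as one line break.
            let width = if bytes.get(pos.offset + 1) == Some(&b'\n') { 2 } else { 1 };
            pos.offset += width;
            pos.line += 1;
            pos.column = 0;
        }
        _ => {
            pos.offset += 1;
            pos.column += 1;
        }
    }
}

/// Steps through the whole input and returns the final position.
pub fn walk(bytes: &[u8]) -> LineCol {
    let mut pos = LineCol { offset: 0, line: 0, column: 0 };
    while pos.offset < bytes.len() {
        step(bytes, &mut pos);
    }
    pos
}
```

Walking `b"a\r\nb\rc\nd"` with this sketch crosses three line breaks (one of each style) and ends one column into the fourth line.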

pub fn skip(&mut self, count: usize)

Skips the next count characters, advancing the position accordingly.

Updates line and column numbers as it advances.

§Arguments
  • count - The number of characters to skip.

pub fn consume(&mut self, count: usize) -> &'a [u8]

Consumes the next count characters and returns them as a slice.

Advances the position by count characters.

§Arguments
  • count - The number of characters to consume.
§Returns

A byte slice containing the consumed characters.

pub fn consume_remaining(&mut self) -> &'a [u8]

Consumes all remaining characters from the current position to the end of input.

Advances the position to EOF.

§Returns

A byte slice containing the remaining characters.
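The `&'a [u8]` return type on the consuming methods means a returned slice borrows from the underlying input, not from the `&mut self` call, so it stays usable while the cursor keeps moving. The sketch below models that with a hypothetical `ByteCursor`; clamping `count` at the end of input is an assumption made here for safety, not documented behavior of `consume`.

```rust
/// Hypothetical minimal cursor; not the crate's `Input`.
pub struct ByteCursor<'a> {
    bytes: &'a [u8],
    offset: usize,
}

impl<'a> ByteCursor<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        Self { bytes, offset: 0 }
    }

    /// Returns the next `count` bytes and advances past them.
    /// Assumption: requests past the end are clamped to the end of input.
    pub fn consume(&mut self, count: usize) -> &'a [u8] {
        let start = self.offset;
        let end = start.saturating_add(count).min(self.bytes.len());
        self.offset = end;
        &self.bytes[start..end]
    }

    /// Returns everything from the current offset to the end of input.
    pub fn consume_remaining(&mut self) -> &'a [u8] {
        let remaining = self.bytes.len() - self.offset;
        self.consume(remaining)
    }
}

/// Splits the first four bytes from the rest using two consuming calls.
pub fn split_keyword_and_rest(source: &[u8]) -> (&[u8], &[u8]) {
    let mut cursor = ByteCursor::new(source);
    let keyword = cursor.consume(4);
    // `keyword` is still valid here because it borrows from `source`,
    // not from the mutable borrow of `cursor`.
    let rest = cursor.consume_remaining();
    (keyword, rest)
}
```

Both returned slices outlive the cursor itself, which is what lets a lexer hand out token text without copying.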

pub fn consume_until( &mut self, search: &[u8], ignore_ascii_case: bool, ) -> &'a [u8]

Consumes characters until the given byte slice is found.

Advances the position to the start of the search slice if found, or to EOF if not found.

§Arguments
  • search - The byte slice to search for.
  • ignore_ascii_case - Whether to ignore ASCII case when comparing characters.
§Returns

A byte slice containing the consumed characters.
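The search semantics described above (stop at the start of the match, or at EOF when nothing matches) can be sketched with standard slice methods. `split_until` is a hypothetical free function that returns the consumed part and the untouched remainder instead of mutating a cursor; how the real method treats an empty `search` is not documented, so the empty case here is an assumption.

```rust
/// Splits `bytes` at the first occurrence of `search`.
/// Returns (consumed, rest); `rest` starts with the match, or is empty if no match exists.
pub fn split_until<'a>(
    bytes: &'a [u8],
    search: &[u8],
    ignore_ascii_case: bool,
) -> (&'a [u8], &'a [u8]) {
    // Assumption: an empty needle matches immediately and consumes nothing.
    // (`windows(0)` would panic, so this case must be handled up front.)
    if search.is_empty() {
        return bytes.split_at(0);
    }

    let found = bytes.windows(search.len()).position(|window| {
        if ignore_ascii_case {
            window.eq_ignore_ascii_case(search)
        } else {
            window == search
        }
    });

    // Not found: everything up to EOF counts as consumed.
    bytes.split_at(found.unwrap_or(bytes.len()))
}
```

For example, splitting `b"echo 1; ?> tail"` on `b"?>"` yields `b"echo 1; "` as the consumed part, leaving the cursor conceptually at the `?>`.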

pub fn consume_through(&mut self, search: u8) -> &'a [u8]

pub fn consume_whitespaces(&mut self) -> &'a [u8]

Consumes whitespace characters until a non-whitespace character is found.

§Returns

A byte slice containing the consumed whitespaces.

pub fn read(&self, n: usize) -> &'a [u8]

Reads the next n characters without advancing the position.

§Arguments
  • n - The number of characters to read.
§Returns

A byte slice containing the next n characters.

pub fn read_at(&self, at: usize) -> &'a u8

Reads a single byte at a specific byte offset within the input slice, without advancing the internal cursor.

This provides direct, low-level access to the underlying byte data.

§Arguments
  • at - The zero-based byte offset within the input slice (self.bytes) from which to read the byte.
§Returns

A reference to the byte located at the specified offset at.

§Panics

This method panics if the provided at offset is out of bounds for the input byte slice (i.e., if at >= self.bytes.len()).
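When the offset comes from untrusted arithmetic, a caller may prefer to check bounds instead of risking the panic. The helper below is a hypothetical wrapper over a plain byte slice (it is not part of `Input`) and relies only on the standard `slice::get`.

```rust
/// Hypothetical non-panicking counterpart to a `read_at`-style lookup.
/// Returns `None` instead of panicking when `at` is out of bounds.
pub fn try_read_at(bytes: &[u8], at: usize) -> Option<&u8> {
    bytes.get(at)
}
```

With `Input` itself, the same guard can be expressed by comparing `at` against `len()` before calling `read_at`.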

pub fn is_at(&self, search: &[u8], ignore_ascii_case: bool) -> bool

Checks if the input at the current position matches the given byte slice.

§Arguments
  • search - The byte slice to compare against the input.
  • ignore_ascii_case - Whether to ignore ASCII case when comparing.
§Returns

true if the next bytes match search; false otherwise.

pub const fn match_sequence_ignore_whitespace( &self, search: &[u8], ignore_ascii_case: bool, ) -> Option<usize>

Attempts to match the given byte sequence at the current position, ignoring whitespace in the input.

This method tries to match the provided byte slice search against the input starting from the current position, possibly ignoring ASCII case. Whitespace characters in the input are skipped during matching, but their length is included in the returned length.

Importantly, the method does not include any trailing whitespace after the matched sequence in the returned length.

For example, to match the sequence (string), the input could be (string), ( string ), etc., and this method would return the total length of the input consumed to match (string), including any whitespace within the matched sequence, but excluding any whitespace after it.

§Arguments
  • search - The byte slice to match against the input.
  • ignore_ascii_case - If true, ASCII case is ignored during comparison.
§Returns
  • Some(length) - If the input matches search (ignoring whitespace within the sequence), returns the total length of the input consumed to match search, including any skipped whitespace within the matched sequence.
  • None - If the input does not match search.
§Examples
use mago_syntax_core::input::Input;
use mago_source::SourceIdentifier;

let source = SourceIdentifier::dummy();

// Given input "( string ) x", starting at offset 0:
let input = Input::new(source.clone(), b"( string ) x");
assert_eq!(input.match_sequence_ignore_whitespace(b"(string)", true), Some(10)); // 10 bytes consumed up to ')'

// Given input "(int)", with no whitespace:
let input = Input::new(source.clone(), b"(int)");
assert_eq!(input.match_sequence_ignore_whitespace(b"(int)", true), Some(5)); // 5 bytes consumed

// Given input "(  InT   )abc", ignoring ASCII case:
let input = Input::new(source.clone(), b"(  InT   )abc");
assert_eq!(input.match_sequence_ignore_whitespace(b"(int)", true), Some(10)); // 10 bytes consumed up to ')'

// Given input "(integer)", attempting to match "(int)":
let input = Input::new(source.clone(), b"(integer)");
assert_eq!(input.match_sequence_ignore_whitespace(b"(int)", false), None); // Does not match

// Trailing whitespace after ')':
let input = Input::new(source.clone(), b"(int)   x");
assert_eq!(input.match_sequence_ignore_whitespace(b"(int)", true), Some(5)); // Length up to ')', excludes spaces after ')'
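The matching rule exercised by the examples above can be approximated by the standalone function below. It is a sketch under stated assumptions, not the crate's code: `match_ignoring_whitespace` is a hypothetical name, "whitespace" is taken to mean ASCII whitespace, `search` is assumed to contain no whitespace itself, and whitespace before the first expected byte is skipped as well.

```rust
/// Matches `search` against the start of `input`, skipping ASCII whitespace
/// in `input` before each expected byte. Returns the number of input bytes
/// covered by the match, up to and including the last matched byte.
pub fn match_ignoring_whitespace(
    input: &[u8],
    search: &[u8],
    ignore_ascii_case: bool,
) -> Option<usize> {
    let mut consumed = 0;

    for &expected in search {
        // Skipped whitespace still counts toward the returned length.
        while input.get(consumed).map_or(false, |b| b.is_ascii_whitespace()) {
            consumed += 1;
        }

        // Running out of input before `search` is exhausted is a mismatch.
        let actual = *input.get(consumed)?;
        let same = if ignore_ascii_case {
            actual.eq_ignore_ascii_case(&expected)
        } else {
            actual == expected
        };
        if !same {
            return None;
        }
        consumed += 1;
    }

    // Whitespace after the final matched byte is never inspected,
    // so trailing whitespace is excluded from the length.
    Some(consumed)
}
```

Because whitespace is only skipped immediately before an expected byte, the loop ends right after the closing `)` and trailing spaces are left for the caller.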

pub fn peek(&self, offset: usize, n: usize) -> &'a [u8]

Peeks ahead offset characters and reads the next n characters without advancing the position.

§Arguments
  • offset - The number of characters to skip before reading.
  • n - The number of characters to read after skipping.
§Returns

A byte slice containing the peeked characters.
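The offset-then-length lookup can be sketched over a plain slice as follows. `peek_at` is a hypothetical helper that takes the cursor position explicitly; clamping both ends to the input length (so an oversized request yields a shorter or empty slice) is an assumption, since the behavior of `peek` near EOF is not stated here.

```rust
/// Returns up to `n` bytes starting `offset` bytes past `cursor`,
/// without modifying any state.
/// Assumption: out-of-range requests are clamped rather than panicking.
pub fn peek_at(bytes: &[u8], cursor: usize, offset: usize, n: usize) -> &[u8] {
    let start = cursor.saturating_add(offset).min(bytes.len());
    let end = start.saturating_add(n).min(bytes.len());
    &bytes[start..end]
}
```

For instance, peeking 3 bytes at offset 2 from the start of `b"<?php echo"` yields `b"php"` in this sketch.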

Trait Implementations§

impl<'a> Clone for Input<'a>

fn clone(&self) -> Input<'a>

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl<'a> Debug for Input<'a>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'a> Hash for Input<'a>

fn hash<__H: Hasher>(&self, state: &mut __H)

Feeds this value into the given Hasher.

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher.

impl<'a> Ord for Input<'a>

fn cmp(&self, other: &Input<'a>) -> Ordering

This method returns an Ordering between self and other.

fn max(self, other: Self) -> Self
where Self: Sized,

Compares and returns the maximum of two values.

fn min(self, other: Self) -> Self
where Self: Sized,

Compares and returns the minimum of two values.

fn clamp(self, min: Self, max: Self) -> Self
where Self: Sized,

Restrict a value to a certain interval.

impl<'a> PartialEq for Input<'a>

fn eq(&self, other: &Input<'a>) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl<'a> PartialOrd for Input<'a>

fn partial_cmp(&self, other: &Input<'a>) -> Option<Ordering>

This method returns an ordering between self and other values if one exists.

fn lt(&self, other: &Rhs) -> bool

Tests less than (for self and other) and is used by the < operator.

fn le(&self, other: &Rhs) -> bool

Tests less than or equal to (for self and other) and is used by the <= operator.

fn gt(&self, other: &Rhs) -> bool

Tests greater than (for self and other) and is used by the > operator.

fn ge(&self, other: &Rhs) -> bool

Tests greater than or equal to (for self and other) and is used by the >= operator.

impl<'a> Copy for Input<'a>

impl<'a> Eq for Input<'a>

impl<'a> StructuralPartialEq for Input<'a>

Auto Trait Implementations§

impl<'a> Freeze for Input<'a>

impl<'a> RefUnwindSafe for Input<'a>

impl<'a> Send for Input<'a>

impl<'a> Sync for Input<'a>

impl<'a> Unpin for Input<'a>

impl<'a> UnwindSafe for Input<'a>

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

fn equivalent(&self, key: &K) -> bool

Checks if this value is equivalent to the given key.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.