Struct WordLexer

pub struct WordLexer<'a, 'b> {
    pub lexer: &'a mut Lexer<'b>,
    pub context: WordContext,
}

Lexer with additional information for parsing texts and words.

Fields

lexer: &'a mut Lexer<'b>

context: WordContext

Implementations

impl WordLexer<'_, '_>

pub async fn backquote(&mut self) -> Result<Option<TextUnit>>

Parses a command substitution of the form `...`.

If the next character is a backquote, the command substitution is parsed up to the closing backquote (inclusive). It is a syntax error if there is no closing backquote.

Between the backquotes, only backslashes can have special meanings. A backslash is an escape character if it precedes a dollar, backquote, or another backslash. If self.context is Text, double quotes can also be backslash-escaped.
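
As a rough illustration of the escaping rule above, the following sketch classifies the characters found between a pair of backquotes. It is not the yash_syntax implementation: the Context and Unit types and the classify_backquote_content function are hypothetical names introduced for this example only.

```rust
/// Hypothetical stand-in for the word context (not the real `WordContext` type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Context {
    Word,
    Text,
}

/// Hypothetical classification of one character inside backquotes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Unit {
    /// A character taken as is (including a backslash that escapes nothing).
    Literal(char),
    /// A character that was preceded by an escaping backslash.
    Backslashed(char),
}

/// Returns true if a backslash before `c` acts as an escape inside backquotes.
fn is_escapable(c: char, context: Context) -> bool {
    match c {
        '$' | '`' | '\\' => true,
        // Double quotes are escapable only in the Text context.
        '"' => context == Context::Text,
        _ => false,
    }
}

/// Classifies the characters found between two backquotes.
fn classify_backquote_content(content: &str, context: Context) -> Vec<Unit> {
    let mut units = Vec::new();
    let mut chars = content.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\\' {
            if let Some(&next) = chars.peek() {
                if is_escapable(next, context) {
                    chars.next(); // consume the escaped character
                    units.push(Unit::Backslashed(next));
                    continue;
                }
            }
        }
        units.push(Unit::Literal(c));
    }
    units
}
```

In this model, a backslash that does not precede an escapable character stays in the result as a literal backslash.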

impl WordLexer<'_, '_>

pub async fn braced_param( &mut self, start_index: usize, ) -> Result<Option<BracedParam>>

Parses a parameter expansion that is enclosed in braces.

The initial $ must have been consumed before calling this function. This function checks if the next character is an opening brace. If so, the following characters are parsed as a parameter expansion up to and including the closing brace. Otherwise, no characters are consumed and the return value is Ok(None).

The start_index parameter should be the index for the initial $. It is used to construct the result, but this function does not check if it actually points to the $.

impl WordLexer<'_, '_>

pub async fn dollar_unit(&mut self) -> Result<Option<TextUnit>>

Parses a text unit that starts with $.

If the next character is $, a parameter expansion, command substitution, or arithmetic expansion is parsed. Otherwise, no characters are consumed and the return value is Ok(None).

This function does not parse dollar-single-quotes. They are handled in word_unit.
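
The dispatch described above can be pictured with a small classifier that only looks at the characters following the $. This is a simplified sketch, not the library's code: DollarKind and classify_dollar_unit are hypothetical names, the set of accepted parameter characters is an assumption based on POSIX names and special parameters, and the real parser goes on to parse the whole expansion rather than just its first characters.

```rust
/// Hypothetical labels for what a `$` may introduce.
#[derive(Debug, PartialEq, Eq)]
enum DollarKind {
    BracedParam,
    Arithmetic,
    CommandSubstitution,
    RawParam,
}

/// Looks at the characters after a leading `$` and guesses the kind of unit.
/// Returns `None` if the input does not start a dollar unit.
fn classify_dollar_unit(input: &str) -> Option<DollarKind> {
    let rest = input.strip_prefix('$')?;
    if rest.starts_with('{') {
        Some(DollarKind::BracedParam)
    } else if rest.starts_with("((") {
        // Simplification: a real parser must also find the closing `))`.
        Some(DollarKind::Arithmetic)
    } else if rest.starts_with('(') {
        Some(DollarKind::CommandSubstitution)
    } else {
        match rest.chars().next() {
            // Assumption: ASCII name characters, digits, and POSIX special parameters.
            Some(c) if c.is_ascii_alphanumeric() || c == '_' || "@*#?-$!".contains(c) => {
                Some(DollarKind::RawParam)
            }
            // Anything else, including a dollar-single-quote, is not a dollar unit.
            _ => None,
        }
    }
}
```

Note that the input $'a' yields None in this sketch, mirroring the statement that dollar-single-quotes are handled elsewhere.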

impl WordLexer<'_, '_>

pub async fn suffix_modifier(&mut self) -> Result<Modifier>

Parses a suffix modifier, i.e., a modifier other than the length prefix.

If there is a switch, self.context affects how the word of the switch is parsed: If the context is Word, a tilde expansion is recognized at the beginning of the word and any character can be escaped by a backslash. If the context is Text, only $, ", `, \ and } can be escaped and single quotes are not recognized in the word.

impl WordLexer<'_, '_>

pub async fn text_unit<F, G>( &mut self, is_delimiter: F, is_escapable: G, ) -> Result<Option<TextUnit>>
where F: FnMut(char) -> bool, G: FnMut(char) -> bool,

Parses a TextUnit.

This function parses a literal character, backslash-escaped character, dollar unit, or backquote.

is_delimiter is a function that decides if a character is a delimiter. An unquoted character is parsed only if is_delimiter returns false for it.

is_escapable decides if a character can be escaped by a backslash. When is_escapable returns false, the preceding backslash is considered literal.

If the text unit is a backquote, treatment of \" inside the backquote depends on self.context. If it is Text, \" is an escaped double-quote. If Word, \" is treated literally.
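
To make the roles of is_delimiter and is_escapable concrete, here is a minimal synchronous sketch that parses one unit from a character slice. It is only a model under stated assumptions: Unit and next_unit are hypothetical names, dollar units and backquotes are left out, and the backslash is examined before the delimiter test, which may differ from the real function.

```rust
/// Hypothetical, reduced counterpart of a text unit.
#[derive(Debug, PartialEq, Eq)]
enum Unit {
    Literal(char),
    Backslashed(char),
}

/// Parses one unit starting at `pos`.
/// Returns the unit and the position after it, or `None` at a delimiter or end of input.
fn next_unit<F, G>(
    input: &[char],
    pos: usize,
    mut is_delimiter: F,
    mut is_escapable: G,
) -> Option<(Unit, usize)>
where
    F: FnMut(char) -> bool,
    G: FnMut(char) -> bool,
{
    let c = *input.get(pos)?;
    if c == '\\' {
        if let Some(&next) = input.get(pos + 1) {
            if is_escapable(next) {
                // A valid escape consumes the backslash and the escaped character.
                return Some((Unit::Backslashed(next), pos + 2));
            }
        }
        // The backslash escapes nothing, so it is a literal on its own.
        return Some((Unit::Literal('\\'), pos + 1));
    }
    if is_delimiter(c) {
        // An unquoted delimiter is not consumed.
        return None;
    }
    Some((Unit::Literal(c), pos + 1))
}
```

Calling next_unit repeatedly until it returns None collects the units of a text up to the first unquoted delimiter.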

impl WordLexer<'_, '_>

pub async fn word_unit<F>( &mut self, is_delimiter: F, ) -> Result<Option<WordUnit>>
where F: Fn(char) -> bool,

Parses a word unit.

is_delimiter is a function that decides whether a character is a delimiter. An unquoted character is parsed only if is_delimiter returns false for it.

The word context defines what characters can be escaped by a backslash. If self.context is Word, any character can be escaped. If Text, then $, ", ` and \ can be escaped as well as delimiters.

This function does not parse tilde expansion. See word.
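
The context-dependent escaping rule for word units can be written as a single predicate. The sketch below is an illustration rather than the library's code: Context and can_be_escaped are hypothetical names, and the predicate covers only the rule stated above.

```rust
/// Hypothetical stand-in for the word context (not the real `WordContext` type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Context {
    Word,
    Text,
}

/// Returns true if a backslash may escape `c` in the given context.
fn can_be_escaped<F>(c: char, context: Context, is_delimiter: F) -> bool
where
    F: Fn(char) -> bool,
{
    match context {
        // In a word, any character can be escaped.
        Context::Word => true,
        // In a text, only a few special characters and the delimiters can be escaped.
        Context::Text => matches!(c, '$' | '"' | '`' | '\\') || is_delimiter(c),
    }
}
```

For example, with a closing brace as the only delimiter, the letter a can be escaped in the Word context but not in the Text context, while the brace itself can be escaped in both.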

pub async fn word<F>(&mut self, is_delimiter: F) -> Result<Word>
where F: Fn(char) -> bool,

Parses a word token.

is_delimiter is a function that decides which character is a delimiter. The word ends when an unquoted delimiter is found. To parse a normal word token, you should pass is_token_delimiter_char as is_delimiter. Other functions can be passed to parse a word that ends with different delimiters.

This function does not parse any tilde expansions in the word. To parse them, you need to call Word::parse_tilde_front or Word::parse_tilde_everywhere on the resultant word.

Methods from Deref<Target = Lexer<'b>>

pub fn disable_line_continuation<'b>(&'b mut self) -> PlainLexer<'b, 'a>

Disables line continuation recognition onward.

By default, peek_char silently skips line continuation sequences. When line continuation is disabled, however, peek_char returns characters literally.

Call enable_line_continuation to switch line continuation recognition back on.

This function will panic if line continuation has already been disabled.

pub async fn peek_char(&mut self) -> Result<Option<char>>

Peeks the next character.

If the end of input is reached, Ok(None) is returned. On error, Err(_) is returned.

If line continuation recognition is enabled, combinations of a backslash and a newline are silently skipped before returning the next character. Call disable_line_continuation to switch off line continuation recognition.

This function requires a mutable reference to self because it may need to read the next line from the input.
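
The effect of line continuation on peeking can be modeled with a few lines of synchronous code. This is a sketch under simplifying assumptions, not the real lexer: MiniLexer is a hypothetical type that holds the whole input in memory and uses a plain boolean flag instead of the disable_line_continuation and enable_line_continuation calls.

```rust
/// Hypothetical, synchronous model of a lexer with switchable line continuation.
struct MiniLexer {
    chars: Vec<char>,
    index: usize,
    line_continuation: bool,
}

impl MiniLexer {
    fn with_code(code: &str) -> Self {
        MiniLexer {
            chars: code.chars().collect(),
            index: 0,
            line_continuation: true,
        }
    }

    /// Returns the next character without consuming it.
    fn peek_char(&mut self) -> Option<char> {
        if self.line_continuation {
            // Silently skip every backslash-newline pair in front of the next character.
            while self.chars[self.index..].starts_with(&['\\', '\n']) {
                self.index += 2;
            }
        }
        self.chars.get(self.index).copied()
    }

    /// Advances past the next character, if any.
    fn consume_char(&mut self) {
        if self.index < self.chars.len() {
            self.index += 1;
        }
    }
}
```

With the flag set, the input a, backslash, newline, b is seen as a followed by b. With the flag cleared, the backslash is returned as an ordinary character.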

pub async fn location(&mut self) -> Result<&Location>

Returns the location of the next character.

If there are no more characters (that is, the end of input has been reached), an imaginary location is returned: the one the next character would have if it existed.

This function requires a mutable reference to self since it needs to peek the next character.

pub fn consume_char(&mut self)

Consumes the next character.

This function must be called after peek_char has successfully returned the character. Consuming a character that has not yet been peeked results in a panic.

pub fn index(&self) -> usize

Returns the position of the next character, counted from zero.

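// Assumption: the futures-executor crate is available to drive the async calls.
use yash_syntax::parser::lex::Lexer;

futures_executor::block_on(async {
    let mut lexer = Lexer::with_code("abc");
    assert_eq!(lexer.index(), 0); // nothing consumed yet
    let _ = lexer.peek_char().await;
    assert_eq!(lexer.index(), 0); // peeking does not advance the position
    lexer.consume_char();
    assert_eq!(lexer.index(), 1); // consuming advances by one character
});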

pub fn rewind(&mut self, index: usize)

Moves the current position back to the given index so that characters that have been consumed can be read again.

The given index must not be larger than the current index; otherwise, this function panics.

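// Assumption: the futures-executor crate is available to drive the async calls.
use yash_syntax::parser::lex::Lexer;

futures_executor::block_on(async {
    let mut lexer = Lexer::with_code("abc");
    let saved_index = lexer.index(); // remember the starting position
    assert_eq!(lexer.peek_char().await, Ok(Some('a')));
    lexer.consume_char();
    assert_eq!(lexer.peek_char().await, Ok(Some('b')));
    lexer.rewind(saved_index); // go back so that 'a' can be read again
    assert_eq!(lexer.peek_char().await, Ok(Some('a')));
});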

pub fn pending(&self) -> bool

Checks if there is any character that has been read from the input source but not yet consumed.
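
The contract between peek_char, consume_char, index, rewind, and pending can be summarized with a small synchronous model. This is a sketch, not the real Lexer: MiniLexer is a hypothetical type that reads one character at a time from an in-memory source, whereas the description above suggests the real lexer reads input line by line.

```rust
/// Hypothetical model: characters move from `source` into `buffer` when peeked.
struct MiniLexer {
    source: std::vec::IntoIter<char>,
    buffer: Vec<char>,
    index: usize,
}

impl MiniLexer {
    fn with_code(code: &str) -> Self {
        MiniLexer {
            source: code.chars().collect::<Vec<_>>().into_iter(),
            buffer: Vec::new(),
            index: 0,
        }
    }

    /// Reads the next character into the buffer if needed and returns it.
    fn peek_char(&mut self) -> Option<char> {
        if self.index == self.buffer.len() {
            let c = self.source.next()?;
            self.buffer.push(c);
        }
        Some(self.buffer[self.index])
    }

    /// Advances past a character that is already in the buffer.
    fn consume_char(&mut self) {
        assert!(
            self.index < self.buffer.len(),
            "consume_char called before peek_char"
        );
        self.index += 1;
    }

    /// Position of the next character, counted from zero.
    fn index(&self) -> usize {
        self.index
    }

    /// Moves back so that consumed characters can be read again.
    fn rewind(&mut self, index: usize) {
        assert!(index <= self.index, "cannot rewind to a later position");
        self.index = index;
    }

    /// True if a character has been read from the source but not yet consumed.
    fn pending(&self) -> bool {
        self.index < self.buffer.len()
    }
}
```

In this model, rewinding never discards buffered characters, so everything between the new position and the end of the buffer counts as pending again.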

pub fn flush(&mut self)

Clears the internal buffer of the lexer.

Locations returned from location share a single code instance that is also retained by the lexer. The code grows as the lexer reads more input. To prevent the code from getting too large, you can call this function, which replaces the retained code with a new empty one. The new code’s start_line_number will be incremented by the number of lines in the previous code.

pub fn reset(&mut self)

Clears an end-of-input or error status so that the lexer can resume parsing.

This function is useful only in an interactive shell, where the user can continue entering commands even after sending an end-of-input or being interrupted by a syntax error.

pub async fn consume_char_if<F>(&mut self, f: F) -> Result<Option<&SourceChar>>
where F: FnMut(char) -> bool,

Peeks the next character and, if the given decider function returns true for it, advances the position.

Returns the consumed character if the function returned true. Returns Ok(None) if it returned false or there are no more characters.

pub fn source_string(&self, range: Range<usize>) -> String

Extracts a string from the source code range.

This function returns the source code string for the range specified by the argument. The range must specify valid indices. If an index points to a character that has not yet been read, this function will panic.

Panics

If the argument range is out of bounds, i.e., pointing to an unread character.

pub fn location_range(&self, range: Range<usize>) -> Location

Returns a location for a given range of the source code.

All the characters in the range must have been consumed. If the range refers to an unconsumed character, this function will panic.

If the characters are from more than one Code fragment, the location will only cover the initial portion of the range sharing the same Code.

Panics

This function will panic if the range refers to an unconsumed character.

If the start index of the range is the end of input, it must have been peeked and the range must be empty, or the function will panic.

pub fn substitute_alias(&mut self, begin: usize, alias: &Rc<Alias>)

Performs alias substitution right before the current position.

This function must be called just after a word has been parsed that matches the name of the argument alias. No check is done in this function that there is a matching word before the current position. The characters starting from the begin index up to the current position are silently replaced with the alias value.

The resulting part of code will be characters with a Source::Alias origin.

After the substitution, the position will be set before the replaced string.

Panics

If the replaced part is empty, i.e., begin >= self.index().
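
The buffer manipulation described for substitute_alias can be sketched on a plain character vector. This is an illustrative model only: Buffer is a hypothetical type, the alias is reduced to its replacement string, and the Source::Alias origin tracking of the real lexer is not modeled.

```rust
/// Hypothetical model: the source as a character buffer plus the current position.
struct Buffer {
    chars: Vec<char>,
    index: usize,
}

impl Buffer {
    /// Replaces the characters from `begin` up to the current position with
    /// `replacement`, then moves the position to the start of the replacement.
    fn substitute_alias(&mut self, begin: usize, replacement: &str) {
        assert!(begin < self.index, "the replaced part must not be empty");
        // Swap out chars[begin..index] for the alias value.
        drop(self.chars.splice(begin..self.index, replacement.chars()));
        // The replacement text is read again from its first character.
        self.index = begin;
    }
}
```

Setting the position back to begin is what allows the substituted text to be parsed like ordinary input.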

pub fn is_after_blank_ending_alias(&self, index: usize) -> bool

Tests if the given index is after the replacement string of alias substitution that ends with a blank.

Panics

If index is larger than the currently read index.

pub async fn inner_program(&mut self) -> Result<String>

Parses an optional compound list that is the content of a command substitution.

This function consumes characters until a token that cannot be the beginning of an and-or list is found and returns the string that was consumed.

pub fn inner_program_boxed( &mut self, ) -> Pin<Box<dyn Future<Output = Result<String>> + '_>>

Like Lexer::inner_program, but returns the future in a pinning box.

pub async fn arithmetic_expansion( &mut self, start_index: usize, ) -> Result<Option<TextUnit>>

Parses an arithmetic expansion.

The initial $ must have been consumed before calling this function. In this function, the next two characters are examined to see if they begin an arithmetic expansion. If the characters are ((, then the arithmetic expansion is parsed, in which case this function consumes up to the closing )) (inclusive). Otherwise, no characters are consumed and the return value is Ok(None).

The start_index parameter should be the index for the initial $. It is used to construct the result, but this function does not check if it actually points to the $.

pub async fn command_substitution( &mut self, start_index: usize, ) -> Result<Option<TextUnit>>

Parses a command substitution of the form $(...).

The initial $ must have been consumed before calling this function. In this function, the next character is examined to see if it begins a command substitution. If it is (, the following characters are parsed as commands to find a matching ), which will be consumed before this function returns. Otherwise, no characters are consumed and the return value is Ok(None).

The start_index parameter should be the index for the initial $. It is used to construct the result, but this function does not check if it actually points to the $.

pub async fn escape_unit(&mut self) -> Result<Option<EscapeUnit>>

Parses an escape unit.

This function tests if the next characters form an escape sequence and returns it if they do. If the next character does not start an escape sequence, it is returned as EscapeUnit::Literal. If there is no next character, the function returns Ok(None). It returns an error if an invalid escape sequence is found.

This function should be called in a context where line continuations are disabled, so that backslash-newline pairs are not removed before they are parsed as escape sequences.

pub async fn escaped_string<F>( &mut self, is_delimiter: F, ) -> Result<EscapedString>
where F: FnMut(char) -> bool,

Parses an escaped string.

The is_delimiter function is called with each character in the string to determine if it is a delimiter. If is_delimiter returns true, the character is not consumed and the function returns the string up to that point. Otherwise, the character is consumed and the function continues.

The string may contain escape sequences as defined in EscapeUnit.

Escaped strings typically appear as the content of dollar-single-quotes, so is_delimiter is usually |c| c == '\''.
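
The delimiter handling can be illustrated with a reduced, synchronous version that works on a string slice. This is a sketch under stated assumptions: the function below is a hypothetical free function, it decodes only a handful of escapes (\n, \t, \\, \' and \") chosen for the example, and it returns the decoded string together with the byte offset of the unconsumed delimiter instead of an EscapedString.

```rust
/// Decodes `input` up to (but not including) the first delimiter.
/// Returns the decoded text and the byte offset where parsing stopped.
fn escaped_string<F>(input: &str, mut is_delimiter: F) -> Result<(String, usize), String>
where
    F: FnMut(char) -> bool,
{
    let mut decoded = String::new();
    let mut chars = input.char_indices();
    while let Some((offset, c)) = chars.next() {
        if is_delimiter(c) {
            // The delimiter is left unconsumed for the caller.
            return Ok((decoded, offset));
        }
        if c != '\\' {
            decoded.push(c);
            continue;
        }
        // Assumption: only this small subset of escape sequences is supported.
        let escaped = match chars.next() {
            Some((_, 'n')) => '\n',
            Some((_, 't')) => '\t',
            Some((_, '\\')) => '\\',
            Some((_, '\'')) => '\'',
            Some((_, '"')) => '"',
            Some((_, other)) => return Err(format!("invalid escape sequence: \\{}", other)),
            None => return Err("incomplete escape sequence".to_string()),
        };
        decoded.push(escaped);
    }
    Ok((decoded, input.len()))
}
```

Because the backslash and the character after it are handled together, an escaped single quote does not end the string even when the delimiter test is |c| c == '\''.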

pub async fn line(&mut self) -> Result<String>

Reads a line literally.

This function recognizes no quotes or expansions. Starting from the current position, the line is read up to (but not including) the terminating newline.

pub async fn here_doc_content(&mut self, here_doc: &HereDoc) -> Result<()>

Parses the content of a here-document.

This function reads here-document content corresponding to the here-document operator represented by the argument and fills here_doc.content with the results. The argument does not have to be mutable because here_doc.content is a RefCell. Note that this function will panic if here_doc.content has been borrowed, and that this function keeps a borrow from here_doc.content until the returned future resolves to the final result.

In case of an error, partial results may be left in here_doc.content.
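
The interior mutability mentioned above can be demonstrated with the standard RefCell type alone. The sketch below is not the real HereDoc: HereDocSketch and read_content are hypothetical, the content is a plain String rather than parsed text, and the input lines are passed in directly.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for a here-document operator (not the real `HereDoc` type).
struct HereDocSketch {
    delimiter: String,
    content: RefCell<String>,
}

/// Fills `here_doc.content` through a shared reference.
fn read_content(here_doc: &HereDocSketch, lines: &[&str]) {
    // `borrow_mut` panics if the cell is already borrowed elsewhere.
    let mut content = here_doc.content.borrow_mut();
    for line in lines {
        if *line == here_doc.delimiter {
            // The delimiter line ends the here-document and is not part of it.
            break;
        }
        content.push_str(line);
        content.push('\n');
    }
    // The mutable borrow ends here, when `content` goes out of scope.
}
```

Since the borrow lasts for the whole call, any outstanding borrow of the content at the time of the call makes borrow_mut panic, which is the situation the description above warns about.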

pub async fn skip_if<F>(&mut self, f: F) -> Result<bool>
where F: FnMut(char) -> bool,

Skips a character if the given function returns true for it.

Returns Ok(true) if the character was skipped, Ok(false) if the function returned false, and Err(_) if an error occurred.

skip_if is a simpler version of consume_char_if.

pub async fn skip_blanks(&mut self) -> Result<()>

Skips blank characters until reaching a non-blank.

pub async fn skip_comment(&mut self) -> Result<()>

Skips a comment, if any.

A comment ends just before a newline. The newline is not part of the comment.

This function does not recognize line continuation inside the comment.

pub async fn skip_blanks_and_comment(&mut self) -> Result<()>

Skips blank characters and a comment, if any.

This function is the same as skip_blanks followed by skip_comment.
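
The relationship between consume_char_if, skip_if, and the skip_* helpers can be sketched as follows. This is a synchronous model, not the library's code: Cursor is a hypothetical type over in-memory text, and treating only space and tab as blanks is an assumption of this example.

```rust
/// Hypothetical, synchronous cursor over already-loaded source text.
struct Cursor {
    chars: Vec<char>,
    index: usize,
}

impl Cursor {
    fn new(code: &str) -> Self {
        Cursor { chars: code.chars().collect(), index: 0 }
    }

    /// Consumes and returns the next character if `f` accepts it.
    fn consume_char_if<F>(&mut self, mut f: F) -> Option<char>
    where
        F: FnMut(char) -> bool,
    {
        let c = *self.chars.get(self.index)?;
        if f(c) {
            self.index += 1;
            Some(c)
        } else {
            None
        }
    }

    /// Simpler variant that only reports whether a character was skipped.
    fn skip_if<F>(&mut self, f: F) -> bool
    where
        F: FnMut(char) -> bool,
    {
        self.consume_char_if(f).is_some()
    }

    fn skip_blanks(&mut self) {
        // Assumption: "blank" means space or tab in this sketch.
        while self.skip_if(|c| c == ' ' || c == '\t') {}
    }

    fn skip_comment(&mut self) {
        if self.skip_if(|c| c == '#') {
            // The comment ends just before the newline, which is left unconsumed.
            while self.skip_if(|c| c != '\n') {}
        }
    }

    fn skip_blanks_and_comment(&mut self) {
        self.skip_blanks();
        self.skip_comment();
    }
}
```

After skip_blanks_and_comment, the cursor rests either on the newline that terminates the comment or on the first character that is neither a blank nor the start of a comment.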

pub async fn operator(&mut self) -> Result<Option<Token>>

Parses an operator token.

pub async fn raw_param( &mut self, start_index: usize, ) -> Result<Option<TextUnit>>

Parses a parameter expansion that is not enclosed in braces.

The initial $ must have been consumed before calling this function. This function checks if the next character starts a valid POSIXly-portable parameter name. If so, the name is consumed and returned. Otherwise, no characters are consumed and the return value is Ok(None).

The start_index parameter should be the index for the initial $. It is used to construct the result, but this function does not check if it actually points to the $.

pub async fn text<F, G>( &mut self, is_delimiter: F, is_escapable: G, ) -> Result<Text>
where F: FnMut(char) -> bool, G: FnMut(char) -> bool,

Parses a text, i.e., a (possibly empty) sequence of TextUnits.

is_delimiter tests if an unquoted character is a delimiter. When is_delimiter returns true, the parser stops parsing and returns the text up to the delimiter.

is_escapable tests if a backslash can escape a character. When the parser finds an unquoted backslash, the next character is passed to is_escapable. If is_escapable returns true, the backslash is treated as a valid escape (TextUnit::Backslashed). Otherwise, it is a literal (TextUnit::Literal).

is_escapable also affects escaping of double-quotes inside backquotes. See text_unit for details. Note that this function calls text_unit with WordContext::Text.

pub async fn text_with_parentheses<F, G>( &mut self, is_delimiter: F, is_escapable: G, ) -> Result<Text>
where F: FnMut(char) -> bool, G: FnMut(char) -> bool,

Parses a text that may contain nested parentheses.

This function works similarly to text. However, if an unquoted ( is found in the text, all text units are parsed up to the next matching unquoted ). Inside the parentheses, the is_delimiter function is ignored and all non-special characters are parsed as literal word units. After finding the ), this function continues parsing to find a delimiter (as per is_delimiter) or another parenthesis.

Nested parentheses are supported: the number of (s and )s must match. In other words, the final delimiter is recognized only outside outermost parentheses.
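
The nesting rule amounts to tracking a parenthesis depth and consulting is_delimiter only at depth zero. The following sketch shows just that idea: text_end_with_parentheses is a hypothetical function, it ignores quoting, escapes, and expansions, and it silently accepts unbalanced parentheses, which a real parser would presumably need to handle.

```rust
/// Returns the byte offset at which a text ends: the first delimiter found
/// outside all parentheses, or the end of the input if there is none.
fn text_end_with_parentheses<F>(input: &str, mut is_delimiter: F) -> usize
where
    F: FnMut(char) -> bool,
{
    let mut depth = 0usize;
    for (offset, c) in input.char_indices() {
        match c {
            '(' => depth += 1,
            ')' if depth > 0 => depth -= 1,
            // Delimiters count only outside the outermost parentheses.
            _ if depth == 0 && is_delimiter(c) => return offset,
            _ => {}
        }
    }
    input.len()
}
```

With a space as the delimiter, the spaces inside a(b c(d) e) are skipped over and the text ends at the first space after the final closing parenthesis.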

pub async fn token(&mut self) -> Result<Token>

Parses a token.

If there are no more tokens to parse, the result is a token with an empty word and the EndOfInput token identifier.

Trait Implementations

impl<'a, 'b> Debug for WordLexer<'a, 'b>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'b> Deref for WordLexer<'_, 'b>

type Target = Lexer<'b>

The resulting type after dereferencing.

fn deref(&self) -> &Lexer<'b>

Dereferences the value.

impl<'b> DerefMut for WordLexer<'_, 'b>

fn deref_mut(&mut self) -> &mut Lexer<'b>

Mutably dereferences the value.

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<P, T> Receiver for P
where P: Deref<Target = T> + ?Sized, T: ?Sized,

type Target = T

🔬This is a nightly-only experimental API. (arbitrary_self_types)

The target type on which the method may be called.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

Auto Trait Implementations

impl<'a, 'b> Freeze for WordLexer<'a, 'b>

impl<'a, 'b> !RefUnwindSafe for WordLexer<'a, 'b>

impl<'a, 'b> !Send for WordLexer<'a, 'b>

impl<'a, 'b> !Sync for WordLexer<'a, 'b>

impl<'a, 'b> Unpin for WordLexer<'a, 'b>

impl<'a, 'b> !UnwindSafe for WordLexer<'a, 'b>
