Module lexer

Tokenization module for VB6 source code.

Converts VB6 source text into a stream of tokens, each pairing a slice of the source with its Token variant.

§Example

use vb6parse::language::Token;
use vb6parse::lexer::tokenize;
use vb6parse::io::SourceStream;

let mut input = SourceStream::new("test.bas", "Dim x As Integer");
// unpack() separates the optional token stream from any failures.
let (result_opt, failures) = tokenize(&mut input).unpack();

// Report every failure before giving up.
if !failures.is_empty() {
    eprintln!("Errors during tokenization:");
    for failure in failures {
        failure.print();
    }
    panic!("Failed to tokenize VB6 code.");
}

let tokens = result_opt.expect("Tokens should be present.");

// Whitespace is kept as tokens, so the statement yields seven tokens.
assert_eq!(tokens.len(), 7);
assert_eq!(tokens[0], ("Dim", Token::DimKeyword));
assert_eq!(tokens[1], (" ", Token::Whitespace));
assert_eq!(tokens[2], ("x", Token::Identifier));
assert_eq!(tokens[3], (" ", Token::Whitespace));
assert_eq!(tokens[4], ("As", Token::AsKeyword));
assert_eq!(tokens[5], (" ", Token::Whitespace));
assert_eq!(tokens[6], ("Integer", Token::IntegerKeyword));

§Overview

The lexer module splits VB6 source code into a stream of tokens. This is the first step of parsing: it breaks the source text into small, classified pieces that later stages can analyze without re-reading raw characters.

The main function in this module is tokenize, which takes a SourceStream as input and returns a ParseResult containing a TokenStream, a list of errors, or both.

The module uses lookup tables to efficiently identify keywords and symbols in the VB6 language. These tables map strings to their corresponding Token enum variants, allowing for quick identification during the tokenization process.
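
The lookup idea can be sketched as follows. This is a minimal illustration, not the module's actual tables: the trimmed-down Token enum, the KEYWORDS table layout, and the classify_word helper are all assumptions made for the example. It relies only on the fact that VB6 keywords are case-insensitive.

```rust
/// Hypothetical, trimmed-down token type; the real `Token` enum has many more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Token {
    DimKeyword,
    AsKeyword,
    IntegerKeyword,
    Identifier,
}

/// Assumed table layout: keyword text paired with its token.
const KEYWORDS: &[(&str, Token)] = &[
    ("As", Token::AsKeyword),
    ("Dim", Token::DimKeyword),
    ("Integer", Token::IntegerKeyword),
];

/// Classifies a complete word as a keyword or a plain identifier.
/// VB6 keywords are case-insensitive, so the comparison ignores ASCII case.
fn classify_word(word: &str) -> Token {
    KEYWORDS
        .iter()
        .find(|(text, _)| text.eq_ignore_ascii_case(word))
        .map(|&(_, token)| token)
        .unwrap_or(Token::Identifier)
}
```

With this sketch, classify_word("dim") and classify_word("DIM") both yield Token::DimKeyword, while classify_word("x") falls back to Token::Identifier. A linear scan is used here only for brevity; a sorted table or hash map would serve the same purpose for a full keyword set.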

The tokenization process handles various types of tokens, including keywords, symbols, identifiers, literals (string, numeric, date), comments, and whitespace.

§See Also

  • SourceStream: Low-level character stream with offset tracking and line/column info
  • TokenStream: Tokenized stream of VB6 tokens
  • ParseResult: Result type for parsing operations, including errors
  • Token: Enum representing VB6 tokens
  • ErrorDetails: Detailed error information for parsing operations

Re-exports§

pub use crate::language::Token;
pub use token_stream::TokenStream;

Modules§

token_stream
Defines the TokenStream structure for managing a stream of tokens with positional tracking.

Functions§

take_matching_text
Attempts to take matching text from the input stream, ensuring that the match is not part of a larger identifier.
tokenize
Parses VB6 code into a token stream.
tokenize_without_whitespaces
Parses VB6 code into a token stream, excluding whitespace tokens.
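
The "not part of a larger identifier" rule described for take_matching_text can be illustrated with a small stand-alone sketch. The take_whole_word function below is hypothetical and works on a plain &str rather than a SourceStream; it is not the crate's signature, and the case-insensitive comparison is an assumption based on VB6's case-insensitive keywords.

```rust
/// Hypothetical stand-in for the boundary check; not the crate's
/// `take_matching_text` signature.
/// Returns the matched slice of `input` only if `input` starts with `keyword`
/// (ignoring ASCII case) and the next character cannot continue an identifier.
fn take_whole_word<'a>(input: &'a str, keyword: &str) -> Option<&'a str> {
    // `get` returns None if the input is too short or the cut is not a char boundary.
    let candidate = input.get(..keyword.len())?;
    if !candidate.eq_ignore_ascii_case(keyword) {
        return None;
    }
    // Reject the match when an identifier character follows, e.g. "Dimension".
    match input[keyword.len()..].chars().next() {
        Some(c) if c.is_alphanumeric() || c == '_' => None,
        _ => Some(candidate),
    }
}
```

Here "Dim x" matches the keyword "Dim", but "Dimension" does not, because the character after the candidate match would extend it into a longer identifier.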

Type Aliases§

LineCommentTuple
Type alias for a line comment paired with an optional trailing newline. The first element is a tuple of the comment text and its token. The second element is an optional tuple of the newline text and its token.
TextTokenTuple
Type alias for a tuple representing text and its corresponding token.
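
The shapes described by these two aliases can be written out as a sketch. The definitions below are inferred from the descriptions above and may not match the crate's exact declarations; the Token variants and the split_line_comment helper are hypothetical and exist only to show the tuples in use. The helper handles only "\n" line endings.

```rust
/// Hypothetical, trimmed-down token type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Token {
    EndOfLineComment,
    Newline,
}

/// Text paired with its token (shape inferred from the description).
type TextTokenTuple<'a> = (&'a str, Token);

/// A comment plus the optional newline that terminates it.
type LineCommentTuple<'a> = (TextTokenTuple<'a>, Option<TextTokenTuple<'a>>);

/// Splits text that is already known to start with a `'` comment.
/// The newline is absent when the comment ends the input.
fn split_line_comment(line: &str) -> LineCommentTuple<'_> {
    match line.find('\n') {
        Some(idx) => (
            (&line[..idx], Token::EndOfLineComment),
            Some((&line[idx..idx + 1], Token::Newline)),
        ),
        None => ((line, Token::EndOfLineComment), None),
    }
}
```

Keeping the newline as a separate optional element lets a caller emit the comment and the line break as two distinct tokens, and covers the case where a comment is the last thing in the file.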