Crate parser

This crate can be used to parse Python source code into an Abstract Syntax Tree.

§Overview

The process by which source code is parsed into an AST can be broken down into two general stages: lexical analysis and parsing.

During lexical analysis, the source code is converted into a stream of lexical tokens that represent the smallest meaningful units of the language. For example, the source code print("Hello world") would roughly be converted into the following stream of tokens:

Name("print"), LeftParen, String("Hello world"), RightParen
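The lexing stage described above can be sketched with a toy lexer. This is a minimal illustration using only the standard library, not the crate's lexer: the `Token` enum and the `tokenize` function are hypothetical names invented for this example, and they handle only names, parentheses, and double-quoted strings without escapes.

```rust
// Toy token type for illustration only; the crate's real tokens differ.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Name(String),
    String(String),
    LeftParen,
    RightParen,
}

// Hypothetical helper: converts source text into a vector of toy tokens.
// Returns an error message for input this sketch does not understand.
fn tokenize(source: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();

    while let Some(&c) = chars.peek() {
        match c {
            // Whitespace separates tokens but produces none.
            c if c.is_whitespace() => {
                chars.next();
            }
            '(' => {
                chars.next();
                tokens.push(Token::LeftParen);
            }
            ')' => {
                chars.next();
                tokens.push(Token::RightParen);
            }
            '"' => {
                chars.next(); // consume the opening quote
                let mut text = String::new();
                loop {
                    match chars.next() {
                        Some('"') => break, // closing quote ends the literal
                        Some(ch) => text.push(ch),
                        None => return Err("unterminated string literal".to_string()),
                    }
                }
                tokens.push(Token::String(text));
            }
            c if c.is_alphabetic() || c == '_' => {
                // Collect an identifier: letters, digits, and underscores.
                let mut name = String::new();
                while let Some(&ch) = chars.peek() {
                    if ch.is_alphanumeric() || ch == '_' {
                        name.push(ch);
                        chars.next();
                    } else {
                        break;
                    }
                }
                tokens.push(Token::Name(name));
            }
            other => return Err(format!("unexpected character: {other:?}")),
        }
    }

    Ok(tokens)
}
```

Calling `tokenize("print(\"Hello world\")")` on this sketch yields the four tokens listed above, while an unterminated string such as `print("oops` produces an error instead of a token stream.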

These tokens are then consumed by the ruff_python_parser, which matches them against a set of grammar rules to verify that the source code is syntactically valid and to construct an AST that represents the source code.

During parsing, the ruff_python_parser builds a tree representation of the source code from the tokens generated by the lexer. The tree is made up of nodes that represent the different syntactic constructs of the language. If the source code is syntactically invalid, parsing fails and an error is returned. After a successful parse, the AST can be used to perform further analysis on the source code. Continuing with the example above, the AST generated by the ruff_python_parser would look roughly like this:

node: Expr {
    value: {
        node: Call {
            func: {
                node: Name {
                    id: "print",
                    ctx: Load,
                },
            },
            args: [
                node: Constant {
                    value: Str("Hello world"),
                    kind: None,
                },
            ],
            keywords: [],
        },
    },
},

Note: The Tokens/ASTs shown above are not the exact tokens/ASTs generated by the ruff_python_parser. Refer to the playground for the correct representation.
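To make the parsing stage concrete, here is a minimal sketch of a recursive-descent parser over a toy token stream. Everything in it is an assumption made for illustration: `Token`, `Expr`, `ToyParser`, and `parse_tokens` are hypothetical names, not part of this crate, and the grammar covers only names, strings, and calls with at most one argument.

```rust
// Toy token and AST types for illustration only; the crate's real types differ.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Name(String),
    String(String),
    LeftParen,
    RightParen,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Name(String),
    Str(String),
    Call { func: Box<Expr>, args: Vec<Expr> },
}

// Toy grammar handled by this sketch:
//   expr := atom ( "(" [ expr ] ")" )*
//   atom := NAME | STRING
struct ToyParser {
    tokens: Vec<Token>,
    pos: usize,
}

impl ToyParser {
    fn new(tokens: Vec<Token>) -> Self {
        Self { tokens, pos: 0 }
    }

    // Look at the current token without consuming it.
    fn peek(&self) -> Option<&Token> {
        self.tokens.get(self.pos)
    }

    // Consume and return the current token, if any.
    fn advance(&mut self) -> Option<Token> {
        let token = self.tokens.get(self.pos).cloned();
        self.pos += 1;
        token
    }

    fn parse_expr(&mut self) -> Result<Expr, String> {
        let mut expr = match self.advance() {
            Some(Token::Name(id)) => Expr::Name(id),
            Some(Token::String(value)) => Expr::Str(value),
            other => return Err(format!("expected a name or string, found {other:?}")),
        };

        // Each "(" after an expression turns it into the callee of a call.
        while self.peek() == Some(&Token::LeftParen) {
            self.advance(); // consume "("
            let mut args = Vec::new();
            if self.peek() != Some(&Token::RightParen) {
                args.push(self.parse_expr()?);
            }
            match self.advance() {
                Some(Token::RightParen) => {}
                other => return Err(format!("expected `)`, found {other:?}")),
            }
            expr = Expr::Call { func: Box::new(expr), args };
        }

        Ok(expr)
    }
}

// Hypothetical entry point: parses exactly one expression from the tokens.
fn parse_tokens(tokens: Vec<Token>) -> Result<Expr, String> {
    let mut parser = ToyParser::new(tokens);
    let expr = parser.parse_expr()?;
    if parser.peek().is_some() {
        return Err("unexpected tokens after the expression".to_string());
    }
    Ok(expr)
}
```

Feeding the four tokens for print("Hello world") into `parse_tokens` produces a `Call` node whose `func` is `Name("print")` and whose single argument is `Str("Hello world")`, mirroring the shape of the tree shown above. A token stream that stops after the opening parenthesis returns an error, which matches the rule that syntactically invalid source fails to parse.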

§Source code layout

The functionality of this crate is split into several modules:

  • token: This module contains the definition of the tokens that are generated by the lexer.
  • lexer: This module contains the lexer and is responsible for generating the tokens.
  • parser: This module contains an interface to the Parsed type and is responsible for generating the AST.
  • mode: This module contains the definition of the different modes that the ruff_python_parser can be in.

Modules§

lexer
This module takes care of lexing Python source text.
semantic_errors
SemanticSyntaxChecker for AST-based syntax errors.
typing
This module takes care of parsing a type annotation.

Structs§

ModeParseError
Returned when a given mode is not valid.
ParseError
Represents errors that occur during parsing and are returned by the parse_* functions.
ParseOptions
Options for controlling how a source file is parsed.
Parsed
Represents the parsed source code.
UnsupportedSyntaxError
Represents a version-related syntax error detected during parsing.

Enums§

InterpolatedStringErrorType
Represents the different types of errors that can occur during parsing of an f-string or t-string.
LexicalErrorType
Represents the different types of errors that can occur during lexing.
Mode
Controls the mode in which a source file is parsed.
ParseErrorType
Represents the different types of errors that can occur during parsing.
UnsupportedSyntaxErrorKind

Traits§

AsMode
A type that can be represented as Mode.

Functions§

parse
Parse the given Python source code using the specified ParseOptions.
parse_expression
Parses a single Python expression.
parse_expression_range
Parses a Python expression for the given range in the source.
parse_module
Parse a full Python module usually consisting of multiple lines.
parse_parenthesized_expression_range
Parses a Python expression as if it is parenthesized.
parse_string_annotation
Parses a Python expression from a string annotation.
parse_unchecked
Parse the given Python source code using the specified ParseOptions.
parse_unchecked_source
Parse the given Python source code using the specified PySourceType.