This crate can be used to parse Python source code into an Abstract Syntax Tree.
Overview
The process by which source code is parsed into an AST can be broken down into two general stages: lexical analysis and parsing.
During lexical analysis, the source code is converted into a stream of lexical
tokens that represent the smallest meaningful units of the language. For example,
the source code print("Hello world") would roughly be converted into the following
stream of tokens:
Name("print"), LeftParen, String("Hello world"), RightParen
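The lexical analysis step described above can be sketched with a toy lexer. This is a minimal illustration, not the crate's lexer: the `Token` enum and the `lex` function are hypothetical names introduced here, and only names, double-quoted strings, and parentheses are handled.

```rust
/// Hypothetical token type for illustration only; the crate's real tokens differ.
#[derive(Debug, PartialEq)]
enum Token {
    Name(String),
    String(String),
    LeftParen,
    RightParen,
}

/// Converts a tiny subset of Python (names, double-quoted strings, parentheses)
/// into a flat list of tokens. Returns an error message on unsupported input.
fn lex(source: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();

    while let Some(&c) = chars.peek() {
        match c {
            // Whitespace separates tokens but produces none in this sketch.
            c if c.is_whitespace() => {
                chars.next();
            }
            '(' => {
                chars.next();
                tokens.push(Token::LeftParen);
            }
            ')' => {
                chars.next();
                tokens.push(Token::RightParen);
            }
            '"' => {
                chars.next(); // skip the opening quote
                let mut value = String::new();
                loop {
                    match chars.next() {
                        Some('"') => break,
                        Some(ch) => value.push(ch),
                        None => return Err("unterminated string literal".to_string()),
                    }
                }
                tokens.push(Token::String(value));
            }
            c if c.is_alphabetic() || c == '_' => {
                // Collect the longest run of identifier characters.
                let mut name = String::new();
                while let Some(&ch) = chars.peek() {
                    if ch.is_alphanumeric() || ch == '_' {
                        name.push(ch);
                        chars.next();
                    } else {
                        break;
                    }
                }
                tokens.push(Token::Name(name));
            }
            other => return Err(format!("unexpected character: {:?}", other)),
        }
    }

    Ok(tokens)
}
```

Calling `lex` on the example source yields the four tokens listed above, while an unterminated string produces an error instead of a token stream.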
These tokens are then consumed by the ruff_python_parser, which matches them against a set of
grammar rules to verify that the source code is syntactically valid and to construct
an AST that represents the source code.
The resulting tree is made up of nodes that represent the different syntactic
constructs of the language. If the source code is syntactically invalid, parsing
fails and an error is returned. After a successful parse, the AST can be used to
perform further analysis on the source code. Continuing with the example above,
the AST generated by the ruff_python_parser would look roughly like this:
node: Expr {
    value: {
        node: Call {
            func: {
                node: Name {
                    id: "print",
                    ctx: Load,
                },
            },
            args: [
                node: Constant {
                    value: Str("Hello world"),
                    kind: None,
                },
            ],
            keywords: [],
        },
    },
},
Note: The tokens and ASTs shown above are approximations, not the exact output of the ruff_python_parser.
Refer to the playground for the correct representation.
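The parsing stage described above can be sketched in the same spirit. The block below is a minimal recursive-descent illustration under stated assumptions: the `Token` and `Expr` types and the `parse_expr` and `parse` functions are hypothetical and are not the crate's API, and the grammar covers only a name, a string, or a call with at most one argument.

```rust
/// Hypothetical token and AST types for illustration only; they are not the
/// crate's real definitions.
#[derive(Debug, PartialEq)]
enum Token {
    Name(String),
    String(String),
    LeftParen,
    RightParen,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Name { id: String },
    Constant { value: String },
    Call { func: Box<Expr>, args: Vec<Expr> },
}

/// Parses `NAME` | `STRING` | `NAME "(" [expr] ")"` starting at `pos` and
/// returns the expression together with the position just past it.
fn parse_expr(tokens: &[Token], pos: usize) -> Result<(Expr, usize), String> {
    match tokens.get(pos) {
        Some(Token::String(value)) => Ok((Expr::Constant { value: value.clone() }, pos + 1)),
        Some(Token::Name(id)) => {
            let name = Expr::Name { id: id.clone() };
            // A name followed by `(` is a call; otherwise it is a plain name.
            if tokens.get(pos + 1) != Some(&Token::LeftParen) {
                return Ok((name, pos + 1));
            }
            let mut next = pos + 2;
            let mut args = Vec::new();
            // Parse the optional single argument.
            if tokens.get(next) != Some(&Token::RightParen) {
                let (arg, after_arg) = parse_expr(tokens, next)?;
                args.push(arg);
                next = after_arg;
            }
            match tokens.get(next) {
                Some(Token::RightParen) => {
                    Ok((Expr::Call { func: Box::new(name), args }, next + 1))
                }
                _ => Err("expected `)`".to_string()),
            }
        }
        Some(other) => Err(format!("unexpected token: {:?}", other)),
        None => Err("unexpected end of input".to_string()),
    }
}

/// Parses a complete token stream, failing if the input is syntactically
/// invalid or if tokens are left over.
fn parse(tokens: &[Token]) -> Result<Expr, String> {
    let (expr, end) = parse_expr(tokens, 0)?;
    if end == tokens.len() {
        Ok(expr)
    } else {
        Err("unexpected trailing tokens".to_string())
    }
}
```

Given the four tokens from the example, `parse` returns a `Call` node whose `func` is the name `print` and whose single argument is the string constant. Dropping the closing parenthesis makes it return an error, mirroring the behavior described for syntactically invalid input.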
Source code layout
The functionality of this crate is split into several modules:
- token: This module contains the definition of the tokens that are generated by the lexer.
- lexer: This module contains the lexer and is responsible for generating the tokens.
- parser: This module contains an interface to the [Parsed] type and is responsible for generating the AST.
- mode: This module contains the definition of the different modes that the
ruff_python_parser can be in.