rustpython_parser/lib.rs
//! This crate can be used to parse Python source code into an Abstract
//! Syntax Tree.
//!
//! ## Overview:
//!
//! The process by which source code is parsed into an AST can be broken down
//! into two general stages: [lexical analysis] and [parsing].
//!
//! During lexical analysis, the source code is converted into a stream of lexical
//! tokens that represent the smallest meaningful units of the language. For example,
//! the source code `print("Hello world")` would _roughly_ be converted into the following
//! stream of tokens:
//!
//! ```text
//! Name("print"), LeftParen, String("Hello world"), RightParen
//! ```
//!
//! During parsing, these tokens are consumed by the parser, which matches them against
//! a set of grammar rules to verify that the source code is syntactically valid and to
//! construct a tree representation of it: the AST. The tree is made up of nodes that
//! represent the different syntactic constructs of the language. If the source code is
//! syntactically invalid, parsing fails and an error is returned. After a successful
//! parse, the AST can be used to perform further analysis on the source code.
//! Continuing with the example above, the AST generated by the parser would _roughly_
//! look something like this:
//!
//! ```text
//! node: Expr {
//!     value: {
//!         node: Call {
//!             func: {
//!                 node: Name {
//!                     id: "print",
//!                     ctx: Load,
//!                 },
//!             },
//!             args: [
//!                 node: Constant {
//!                     value: Str("Hello world"),
//!                     kind: None,
//!                 },
//!             ],
//!             keywords: [],
//!         },
//!     },
//! },
//! ```
//!
//! Note: The Tokens/ASTs shown above are not the exact tokens/ASTs generated by the parser.
//!
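//! For instance, a minimal sketch of how a syntax error surfaces (a hypothetical
//! snippet using the `parse` function and `Mode` type exported by this crate; the
//! exact error value is not shown):
//!
//! ```
//! use rustpython_parser::{parse, Mode};
//!
//! // `1 +` is not a complete expression, so parsing fails.
//! let result = parse("1 +", Mode::Expression, "<embedded>");
//! assert!(result.is_err());
//! ```
//!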
//! ## Source code layout:
//!
//! The functionality of this crate is split into several modules:
//!
//! - token: This module contains the definition of the tokens that are generated by the lexer.
//! - [lexer]: This module contains the lexer and is responsible for generating the tokens.
//! - parser: This module contains an interface to the parser and is responsible for generating the AST.
//!   - Functions and strings have special parsing requirements that are handled in additional files.
//! - mode: This module contains the definition of the different modes that the parser can be in.
//!
//! # Examples
//!
//! For example, to get a stream of tokens from a given string, one could do this:
//!
//! ```
//! use rustpython_parser::{lexer::lex, Mode};
//!
//! let python_source = r#"
//! def is_odd(i):
//!     return bool(i & 1)
//! "#;
//! let mut tokens = lex(python_source, Mode::Module);
//! assert!(tokens.all(|t| t.is_ok()));
//! ```
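//!
//! Each item yielded by the lexer is a `Result`, so the tokens themselves can also be
//! inspected. A short sketch (assuming the `(token, range)` item shape produced by the
//! [lexer] module):
//!
//! ```
//! use rustpython_parser::{lexer::lex, Mode, Tok};
//!
//! // Grab the first successfully lexed token of `x = 1`.
//! let first = lex("x = 1", Mode::Module)
//!     .filter_map(|result| result.ok())
//!     .map(|(token, _range)| token)
//!     .next();
//! assert!(matches!(first, Some(Tok::Name { .. })));
//! ```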
//!
//! These tokens can be directly fed into the parser to generate an AST:
//!
//! ```
//! use rustpython_parser::{lexer::lex, Mode, parse_tokens};
//!
//! let python_source = r#"
//! def is_odd(i):
//!     return bool(i & 1)
//! "#;
//! let tokens = lex(python_source, Mode::Module);
//! let ast = parse_tokens(tokens, Mode::Module, "<embedded>");
//!
//! assert!(ast.is_ok());
//! ```
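//!
//! A successful parse yields an ordinary Rust value that can be inspected directly. A
//! sketch, assuming the `Mod::Module` variant layout of the re-exported `ast` types:
//!
//! ```
//! use rustpython_parser::{ast, lexer::lex, parse_tokens, Mode};
//!
//! let tokens = lex("x = 1", Mode::Module);
//! let parsed = parse_tokens(tokens, Mode::Module, "<embedded>").unwrap();
//! if let ast::Mod::Module(module) = parsed {
//!     // `x = 1` is a single assignment statement.
//!     assert_eq!(module.body.len(), 1);
//! }
//! ```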
//!
//! Alternatively, you can use the [`Parse`] trait to parse a string directly, without
//! tokenizing the source beforehand or specifying a mode:
//!
//! ```
//! use rustpython_parser::{Parse, ast};
//!
//! let python_source = r#"
//! def is_odd(i):
//!     return bool(i & 1)
//! "#;
//! let ast = ast::Suite::parse(python_source, "<embedded>");
//!
//! assert!(ast.is_ok());
//! ```
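//!
//! The [`Parse`] trait is implemented for several node types, not just `ast::Suite`,
//! so a single expression can be parsed on its own (a sketch, assuming the `ast::Expr`
//! implementation):
//!
//! ```
//! use rustpython_parser::{Parse, ast};
//!
//! let expr = ast::Expr::parse("1 + 2", "<embedded>");
//! assert!(expr.is_ok());
//! ```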
//!
//! [lexical analysis]: https://en.wikipedia.org/wiki/Lexical_analysis
//! [parsing]: https://en.wikipedia.org/wiki/Parsing
//! [lexer]: crate::lexer

#![doc(html_logo_url = "https://raw.githubusercontent.com/RustPython/RustPython/main/logo.png")]
#![doc(html_root_url = "https://docs.rs/rustpython-parser/")]

pub use rustpython_ast as ast;
#[cfg(feature = "location")]
pub use rustpython_parser_core::source_code;
pub use rustpython_parser_core::{text_size, Mode};

mod function;
// Skip flattening lexer to distinguish from full parser
mod context;
pub mod lexer;
mod parser;
mod soft_keywords;
mod string;
mod token;

pub use parser::{parse, parse_starts_at, parse_tokens, Parse, ParseError, ParseErrorType};
pub use string::FStringErrorType;
pub use token::{StringKind, Tok};

#[allow(deprecated)]
pub use parser::{parse_expression, parse_expression_starts_at, parse_program};

#[rustfmt::skip]
mod python {
    #![allow(clippy::all)]
    #![allow(unused)]

    #[cfg(feature = "lalrpop")]
    include!(concat!(env!("OUT_DIR"), "/src/python.rs"));

    #[cfg(not(feature = "lalrpop"))]
    include!("python.rs");
}