§Carbon Parser
A high-performance parser for Google’s Carbon programming language, built with Rust using the Pest parsing framework.
§Overview
This library provides a complete parsing solution for Carbon source code, enabling developers to build tools, compilers, analyzers, and other applications that work with Carbon programs. The parser is built on top of Pest, a PEG (Parsing Expression Grammar) parser that provides excellent performance and clear error messages.
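To make the PEG idea concrete, the sketch below shows ordered choice, the property that separates PEGs from context-free grammars: alternatives are tried in order and the first match wins. It uses only the standard library. The names `Parser`, `choice`, `eq`, and `eq_eq` are hypothetical helpers invented for this illustration. They are not part of carbon-parser or Pest, and this is not how either is implemented.

```rust
/// A parser takes the remaining input and, on success, returns what is left.
/// (Hypothetical type for illustration only.)
type Parser = fn(&str) -> Option<&str>;

/// Matches a single `=`.
fn eq(input: &str) -> Option<&str> {
    input.strip_prefix("=")
}

/// Matches the two-character operator `==`.
fn eq_eq(input: &str) -> Option<&str> {
    input.strip_prefix("==")
}

/// Ordered choice: try each alternative in order, commit to the first match.
fn choice<'a>(alternatives: &[Parser], input: &'a str) -> Option<&'a str> {
    alternatives.iter().find_map(|parser| parser(input))
}

fn main() {
    // Longer operator listed first: the whole `==` is consumed.
    let longest_first: [Parser; 2] = [eq_eq, eq];
    println!("{:?}", choice(&longest_first, "== y")); // Some(" y")

    // Shorter operator listed first: only one `=` is consumed,
    // and the second alternative is never tried.
    let shortest_first: [Parser; 2] = [eq, eq_eq];
    println!("{:?}", choice(&shortest_first, "== y")); // Some("= y")
}
```

Because the first successful alternative wins, the order of alternatives in a PEG grammar is significant, which is why longer operators are usually listed before their prefixes.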
§Features
- Fast and reliable parsing using the Pest PEG parser
- Complete support for Carbon language constructs
- Type-safe AST manipulation through Rust’s type system
- Detailed error messages with line and column information
- Zero-copy parsing for efficient memory usage
- Command-line interface for file parsing
- Comprehensive test coverage
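As a rough illustration of the line and column reporting listed above, the following standard-library sketch converts a byte offset into a 1-based line and column. The helper `line_col` is hypothetical and written only for this example. The library's own error positions come from Pest, so this shows the idea rather than the actual implementation.

```rust
/// Converts a byte offset into a 1-based (line, column) pair.
/// Columns count characters, not bytes. Returns `None` when `offset`
/// is past the end of `input` or does not fall on a character boundary.
/// (Hypothetical helper for illustration only.)
fn line_col(input: &str, offset: usize) -> Option<(usize, usize)> {
    if offset > input.len() || !input.is_char_boundary(offset) {
        return None;
    }
    let before = &input[..offset];
    // Every newline before the offset starts a new line.
    let line = before.matches('\n').count() + 1;
    // The current line starts right after the last newline, or at 0.
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let col = before[line_start..].chars().count() + 1;
    Some((line, col))
}

fn main() {
    // The `var` declaration is missing its semicolon, so a parser would
    // typically report the problem at the closing brace.
    let code = "fn main() -> i32 {\n    var x: i32 = 42\n}\n";
    let offset = code.rfind('}').unwrap();
    println!("{:?}", line_col(code, offset)); // Some((3, 1))
}
```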
§Installation
Add this to your Cargo.toml:
[dependencies]
carbon-parser = "0.1.2"
pest = "2.7"
§Quick Start
use carbon_parser::parse_carbon;
let code = r#"
fn main() -> i32 {
    var x: i32 = 42;
    return x;
}
"#;
match parse_carbon(code) {
    Ok(pairs) => {
        println!("Parsing successful");
        for pair in pairs {
            println!("Rule: {:?}", pair.as_rule());
        }
    }
    Err(e) => eprintln!("Parse error: {}", e),
}
§Supported Language Constructs
§Functions
The parser supports function declarations with parameters and return types:
use carbon_parser::parse_function_decl;
// Simple function without parameters
let code = "fn test() -> i32 { return 42; }";
assert!(parse_function_decl(code).is_ok());
// Function with single parameter
let code = "fn square(x: i32) -> i32 { return x; }";
assert!(parse_function_decl(code).is_ok());
// Function with multiple parameters
let code = "fn add(x: i32, y: i32, z: i32) -> i32 { return x; }";
assert!(parse_function_decl(code).is_ok());
// Function without return type
let code = "fn print_hello() { return 0; }";
assert!(parse_function_decl(code).is_ok());
// Function with different parameter types
let code = "fn process(name: String, age: i32, active: bool) -> bool { return active; }";
assert!(parse_function_decl(code).is_ok());
§Variables
Variable declarations with type annotations and optional initialization:
use carbon_parser::parse_var_decl;
// Variable with initialization
let code = "var x: i32 = 42;";
assert!(parse_var_decl(code).is_ok());
// Variable without initialization
let code = "var y: bool;";
assert!(parse_var_decl(code).is_ok());
// String variable
let code = r#"var name: String = "John";"#;
assert!(parse_var_decl(code).is_ok());
// Float variable
let code = "var pi: f64 = 3.14;";
assert!(parse_var_decl(code).is_ok());
// Variable with expression
let code = "var sum: i32 = 10 + 20;";
assert!(parse_var_decl(code).is_ok());
§Expressions
The parser handles various expression types including literals, identifiers, binary operations, and function calls:
use carbon_parser::parse_expression;
// Integer literal
assert!(parse_expression("42").is_ok());
// Float literal
assert!(parse_expression("3.14").is_ok());
// Boolean literals
assert!(parse_expression("true").is_ok());
assert!(parse_expression("false").is_ok());
// String literal
assert!(parse_expression(r#""Hello, World!""#).is_ok());
// Identifier
assert!(parse_expression("variable_name").is_ok());
// Binary operations
assert!(parse_expression("10 + 20").is_ok());
assert!(parse_expression("10 + 20 * 30 - 5").is_ok());
// Function call
assert!(parse_expression("calculate(x, y)").is_ok());
// Comparison operators
assert!(parse_expression("x == y").is_ok());
assert!(parse_expression("x != y").is_ok());
assert!(parse_expression("x < y").is_ok());
assert!(parse_expression("x > y").is_ok());
§Type System
The parser supports both primitive and custom types:
use carbon_parser::parse_type_name;
// Integer types
assert!(parse_type_name("i32").is_ok());
assert!(parse_type_name("i64").is_ok());
// Float types
assert!(parse_type_name("f32").is_ok());
assert!(parse_type_name("f64").is_ok());
// Boolean type
assert!(parse_type_name("bool").is_ok());
// String type
assert!(parse_type_name("String").is_ok());
// Custom types
assert!(parse_type_name("CustomType").is_ok());
§Complete Programs
The main parsing function handles complete Carbon programs:
use carbon_parser::parse_carbon;
// Empty program
let code = "";
assert!(parse_carbon(code).is_ok());
// Single function program
let code = r#"
fn main() -> i32 {
return 0;
}
"#;
assert!(parse_carbon(code).is_ok());
// Multiple functions
let code = r#"
fn add(x: i32, y: i32) -> i32 {
return x;
}
fn main() -> i32 {
return 0;
}
"#;
assert!(parse_carbon(code).is_ok());
// Program with variables
let code = r#"
var global_x: i32 = 100;
fn main() -> i32 {
var local_y: i32 = 200;
return 0;
}
"#;
assert!(parse_carbon(code).is_ok());
// Program with comments
let code = r#"
// This is a comment
fn main() -> i32 {
/* Multi-line
comment */
return 0;
}
"#;
assert!(parse_carbon(code).is_ok());
§Working with Parse Trees
After parsing, you can traverse and inspect the resulting parse tree:
use carbon_parser::{parse_carbon, Rule};
let code = r#"
fn calculate(x: i32) -> i32 {
var result: i32 = 42;
return result;
}
"#;
let pairs = parse_carbon(code).unwrap();
for pair in pairs {
    println!("Top-level rule: {:?}", pair.as_rule());
    println!("Text: {}", pair.as_str());
    // Traverse nested pairs
    for inner_pair in pair.into_inner() {
        println!(" Nested rule: {:?}", inner_pair.as_rule());
        println!(" Nested text: {}", inner_pair.as_str());
    }
}
§Error Handling
The parser provides detailed error messages indicating the exact location and nature of syntax errors:
use carbon_parser::parse_carbon;
// Invalid syntax - missing closing parenthesis
let code = "fn main( { }";
assert!(parse_carbon(code).is_err());
// Missing semicolon
let code = "var x: i32 = 42";
assert!(parse_carbon(code).is_err());
// Invalid identifier starting with number
let code = "var 123invalid: i32 = 0;";
assert!(parse_carbon(code).is_err());
// Proper error handling
match parse_carbon("fn broken( { }") {
    Ok(_) => println!("Success"),
    Err(e) => {
        eprintln!("Parse error occurred:");
        eprintln!("{}", e);
        // The error will show the exact line and column where parsing failed
    }
}
§Command Line Interface
The parser includes a CLI tool for parsing Carbon files:
# Parse a Carbon file
cargo run -- parse example.carbon
# Parse with verbose output showing the parse tree
cargo run -- parse example.carbon --verbose
# Show author information
cargo run -- authors
§Advanced Usage
§Building a Syntax Highlighter
use carbon_parser::{parse_carbon, Rule};
fn highlight_code(code: &str) -> Result<String, Box<dyn std::error::Error>> {
    let pairs = parse_carbon(code)?;
    let mut highlighted = String::new();
    for pair in pairs {
        match pair.as_rule() {
            Rule::function_decl => {
                highlighted.push_str(&format!("<span class='function'>{}</span>",
                    pair.as_str()));
            }
            Rule::var_decl => {
                highlighted.push_str(&format!("<span class='variable'>{}</span>",
                    pair.as_str()));
            }
            _ => highlighted.push_str(pair.as_str()),
        }
    }
    Ok(highlighted)
}
§Code Analysis
use carbon_parser::{parse_carbon, Rule};
fn count_functions(code: &str) -> Result<usize, Box<dyn std::error::Error>> {
    let pairs = parse_carbon(code)?;
    let mut count = 0;
    for pair in pairs {
        if matches!(pair.as_rule(), Rule::function_decl) {
            count += 1;
        }
    }
    Ok(count)
}
§Performance Considerations
The parser is designed with performance in mind:
- Zero-copy parsing: The parse tree references the original input string rather than copying data, minimizing memory allocations.
- Lazy evaluation: Parse tree nodes are created on-demand as you traverse the tree.
- Efficient grammar: The PEG grammar is optimized to minimize backtracking.
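The zero-copy point above can be illustrated with a small standard-library sketch in which every token is a `&str` slice borrowed from the input rather than an owned `String`. The `Token` type and the `tokenize` and `borrows_from` helpers are hypothetical names made up for this example. The real parse tree is produced by Pest, and this whitespace splitter only demonstrates the borrowing pattern.

```rust
/// A token that borrows its text from the input instead of owning a copy.
/// (Hypothetical type for illustration only.)
#[derive(Debug, PartialEq)]
struct Token<'a> {
    text: &'a str,
    /// Byte offset of the token within the input.
    start: usize,
}

/// Splits `input` on whitespace and returns borrowed tokens.
/// No token text is copied; each `text` is a slice of `input`.
fn tokenize(input: &str) -> Vec<Token<'_>> {
    let mut tokens = Vec::new();
    let mut start = None;
    for (i, c) in input.char_indices() {
        if c.is_whitespace() {
            if let Some(s) = start.take() {
                tokens.push(Token { text: &input[s..i], start: s });
            }
        } else if start.is_none() {
            start = Some(i);
        }
    }
    // Flush a token that runs to the end of the input.
    if let Some(s) = start {
        tokens.push(Token { text: &input[s..], start: s });
    }
    tokens
}

/// Returns true when `slice` points into the memory occupied by `input`.
fn borrows_from(input: &str, slice: &str) -> bool {
    let base = input.as_ptr() as usize;
    let ptr = slice.as_ptr() as usize;
    ptr >= base && ptr + slice.len() <= base + input.len()
}

fn main() {
    let code = "var x: i32 = 42;";
    for token in tokenize(code) {
        println!(
            "{:>2}: {:?} (borrowed: {})",
            token.start,
            token.text,
            borrows_from(code, token.text)
        );
    }
}
```

The lifetime parameter on `Token<'a>` is what ties each token to the input: the compiler rejects any use of a token after the source string has been dropped, which is the price of avoiding the copies.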
For large files (>1MB), consider:
- Using streaming or incremental parsing if available
- Processing the parse tree in chunks
- Using the --verbose flag judiciously in the CLI tool
§Error Recovery
When parsing fails, the error type provides detailed information:
use carbon_parser::{parse_carbon, ParseError};
match parse_carbon("invalid code") {
    Err(ParseError::PestError(e)) => {
        // Pest error with line/column information
        eprintln!("Syntax error at: {}", e);
    }
    Err(ParseError::SyntaxError(msg)) => {
        // Custom syntax error
        eprintln!("Error: {}", msg);
    }
    Ok(_) => {}
}
§Testing
The library includes comprehensive integration tests covering:
- Function declarations with various parameter combinations
- Variable declarations with different types
- Expression parsing including operators and precedence
- Type system validation
- Complete program parsing
- Error detection and reporting
Run tests with:
cargo test
§Grammar Reference
The parser is based on a formal grammar defined in carbon.pest. Key grammar rules include:
- program: Top-level rule matching complete Carbon programs
- function_decl: Function declarations
- var_decl: Variable declarations
- expression: All expression types
- type_name: Type annotations
- statement: Individual statements
§Contributing
Contributions are welcome. Please ensure all tests pass and add tests for new features.
§Author
Daniil Cherniavskyi
§License
This project is available under standard open source licenses.
Structs§
- CarbonParser - Carbon parser implementation using Pest.
Enums§
- ParseError - Errors that can occur during parsing.
- Rule
Functions§
- parse_carbon - Parses a complete Carbon program.
- parse_expression - Parses an expression.
- parse_function_decl - Parses a single function declaration.
- parse_type_name - Parses a type name.
- parse_var_decl - Parses a variable declaration statement.
Type Aliases§
- ParseResult - Result type for parsing operations.