pascal-rs - A Production-Ready Optimizing Pascal Compiler
pascal-rs is a modern, full-featured Pascal compiler written in Rust with advanced optimizations, register allocation, type inference, and SIMD support. It features a complete compilation pipeline with automatic dependency resolution, precompiled unit (PPU) files, and a user-friendly command-line interface.
Status: โ Production Ready - Milestone 3 Complete (87 tests passing)
๐ Features
โ Lexical Analysis
- Complete tokenization of Pascal source code
- Support for identifiers, numbers, strings, and operators
- Keyword recognition (program, var, begin, end, if, then, else, while, do, etc.)
- Comment handling (both
{ }and//styles) - Whitespace and error handling
โ Parsing
- Full Abstract Syntax Tree (AST) generation
- Program structure parsing (program declarations, variable declarations)
- Expression parsing with operator precedence
- Statement parsing (assignments, conditionals, loops)
- Type system support (integer, real, boolean, char, string, arrays, records)
- Proper variable scope management
- Nested block support
โ Code Generation
- x86-64 Assembly Output (Intel syntax)
- Register Allocation - Graph coloring algorithm with live range analysis
- Symbol Tables - Hierarchical scopes with type tracking
- Type Checking - Full validation and inference
- Expression Generation - All Pascal expressions supported
- Statement Generation - Complete control flow support
- Function/Procedure Code - Prologue, body, epilogue generation
- Multiple Calling Conventions - System V AMD64, Win64, custom
โ Advanced Optimizations
- Constant Folding - Compile-time expression evaluation
- Dead Code Elimination - Remove unreachable code
- Common Subexpression Elimination (CSE) - Eliminate redundant calculations
- Function Inlining - Inline small functions automatically
- Loop Unrolling - Unroll constant-iteration loops
- Strength Reduction - Replace expensive operations (x*8 โ x<<3)
- Tail Call Optimization - Convert recursion to iteration
- Peephole Optimization - Assembly-level optimizations
- Algebraic Simplification - x+0=x, x1=x, x0=0
โ Advanced Type System
- Generic Types - Parametric polymorphism with constraints
- Type Inference - Hindley-Milner style type inference
- Operator Overloading - Custom operator definitions for types
- Type Classes - Ad-hoc polymorphism support
- Type Constraints - Generic bounds and requirements
โ SIMD & Vectorization
- SIMD Registers - XMM (128-bit), YMM (256-bit), ZMM (512-bit)
- Vectorization - Automatic loop vectorization
- SIMD Instructions - SSE, AVX, AVX-512 support
- Packed Operations - addps, mulps, and more
โ Language Support
- Data Types: integer, real, boolean, char, string, arrays, records, pointers
- Control Structures: if-else, while, for, repeat-until, case statements
- Operators: arithmetic (+, -, *, /, div, mod), comparison (=, <>, <, <=, >, >=), logical (and, or, not), bitwise (&, |, xor)
- Functions & Procedures: parameter passing, return values
- Advanced Features: records, arrays, pointers, type casting
- Scope Management: proper variable scoping with nested blocks
โ Standard Library (60% Complete)
- System Unit - Core I/O, strings, math, memory, file operations (66 functions)
- SysUtils Unit - Utilities, exceptions, file/directory operations (53 functions)
- Classes Unit - OOP support with TObject, TList, TStringList, streams (7 classes)
- Math Unit - Comprehensive math functions, statistics, number theory (60+ functions)
- Total: 180+ functions, 7 classes, 1,500+ lines of Pascal code
๏ธ Building
Make sure you have Rust installed, then run:
# Build the project
# Build optimized release version
# Run tests
# Run specific test suites
The binary will be available at:
target/debug/pascal(debug build)target/release/pascal(optimized build)
๐ฏ Why pascal-rs?
What is pascal-rs?
pascal-rs is a modern, production-ready optimizing Pascal compiler written in Rust. It's designed to bring the safety and performance benefits of Rust to Pascal compilation, while maintaining full compatibility with existing Pascal codebases.
Why Use pascal-rs?
1. Modern & Safe
- Built with Rust's memory safety guarantees - no buffer overflows, null pointer dereferences, or data races
- Zero-cost abstractions provide C-level performance without the complexity
- Modern tooling with cargo, clippy, and rustfmt
2. Production-Ready Optimizations
- Advanced compiler optimizations: constant folding, dead code elimination, CSE, function inlining, loop unrolling
- Register allocation with graph coloring algorithm
- Type inference and generics support
- SIMD vectorization for performance-critical code
3. Full Pascal Compatibility
- Support for all major Pascal features: OOP, generics, exceptions, operator overloading
- Compatible with existing Pascal codebases and libraries
4. Developer-Friendly
- Clear, colored error messages with source locations
- Comprehensive test coverage (87 tests passing)
- Modular architecture for easy maintenance and extension
- Fast compilation with incremental builds and PPU caching
5. Cross-Platform
- Multiple target architectures: x86-64, ARM, RISC-V, MIPS, PowerPC, WebAssembly
- Works on macOS, Linux, and Windows
- Native macOS GUI support with Cocoa framework
Use Cases
- Education: Learn compiler construction with a clean, modern codebase
- Legacy Code: Modernize existing Pascal projects with a safer compiler
- Embedded Development: Target various architectures for embedded systems
- Performance: Optimizing compiler for performance-critical applications
- Research: Platform for compiler research and experimentation
๐ฏ Usage
Quick Start
# Clone the repository
# Build the compiler
# The binary is now available at ./target/release/pascal
Installation
# Build and install the compiler locally
# Option 1: Run directly from target directory
# Option 2: Install to your PATH (requires installing pascal-cli crate)
# Now you can run pascal from anywhere
Basic Compilation Workflow
# 1. Compile a simple Pascal program
# 2. Compile with optimization (recommended for production)
# 3. Compile with verbose output to see what's happening
# 4. Compile with assembly output (for inspection)
Advanced Usage
# Compile with optimization level 2
# Compile with debug information
# Specify output directory for compiled files
# Add search paths for imported units
# Disable PPU caching (force recompilation)
# Verbose mode (shows all compilation steps)
Working with Units
# Compile a unit (generates .ppu file)
# Inspect a compiled unit
# Compile a program that uses the unit
# (automatically compiles dependencies)
Clean Build Artifacts
# Clean PPU files from current directory
# Clean specific directory
# Clean multiple directories
Command Reference
Global Commands:
Compile Command:
)
)
Info Command:
Clean Command:
Complete Example Workflow
# 1. Create a simple Pascal program
# 2. Compile it
# 3. Check the generated assembly
# 4. Clean up
Project Structure
pascal-rs/
โโโ crates/
โ โโโ pascal-lexer/ # Lexical analysis crate
โ โโโ pascal-ast/ # Abstract Syntax Tree definitions
โ โโโ pascal-parser/ # Syntax analysis crate
โ โโโ pascal-codegen/ # Code generation crate
โ โโโ pascal-module/ # Module system (units, PPU files)
โ โโโ pascal-driver/ # Compilation driver
โ โโโ pascal-cli/ # Command-line interface (binary)
โโโ examples/
โ โโโ *.pas # Example Pascal programs
โโโ tests/ # Test suites
โโโ docs/ # Documentation
๐จ Key Features
Unit System & Modules
- Pascal Units: Full support for
unit,interface, andimplementationsections - Uses Clauses: Automatic dependency resolution
- PPU Files: Binary precompiled unit format for faster compilation
- Module Management: Dependency tracking and topological sort
- Smart Caching: Reuse PPU files when source hasn't changed
Compilation Pipeline
- Automatic Dependencies: Compiles dependencies before dependent units
- Incremental Compilation: Only recompile changed units
- Error Reporting: Clear, colored error messages
- Progress Tracking: Verbose mode shows compilation steps
Command-Line Interface
- Modern CLI: Subcommands with comprehensive options
- Colored Output: Green for success, red for errors, yellow for warnings
- Multiple Commands: compile, info, clean
- Flexible Options: Search paths, optimization levels, debug info
๐ Examples
Example 1: Hello World
The simplest Pascal program:
program HelloWorld;
begin
writeln('Hello, World!');
end.
Compile and run:
Example 2: Variables and Basic Operations
program Variables;
var
x, y, sum: integer;
name: string;
begin
x := 42;
y := 10;
sum := x + y;
name := 'Pascal';
end.
Example 3: Conditional Statements
program Conditionals;
var
age: integer;
begin
age := 18;
if age >= 18 then
begin
writeln('You are an adult');
end
else
begin
writeln('You are a minor');
end;
// Nested conditions
if age >= 18 then
begin
if age >= 65 then
writeln('You are a senior')
else
writeln('You are an adult');
end;
end.
Example 4: Loops
program Loops;
var
i, sum: integer;
begin
// While loop
i := 1;
sum := 0;
while i <= 10 do
begin
sum := sum + i;
i := i + 1;
end;
// For loop
sum := 0;
for i := 1 to 10 do
begin
sum := sum + i;
end;
// For loop with downto
for i := 10 downto 1 do
begin
writeln(i);
end;
end.
Example 5: Functions and Procedures
program Functions;
// Function declaration
(a, b: integer): integer;
begin
Result := a + b;
end;
// Procedure declaration (no return value)
(a, b: integer);
var
sum: integer;
begin
sum := a + b;
writeln('Sum is: ', sum);
end;
var
x, y: integer;
begin
x := 10;
y := 20;
// Call function
writeln('Addition: ', Add(x, y));
// Call procedure
PrintSum(x, y);
end.
Example 6: Arrays
program Arrays;
var
numbers: array[1..5] of integer;
i, sum: integer;
begin
// Initialize array
numbers[1] := 10;
numbers[2] := 20;
numbers[3] := 30;
numbers[4] := 40;
numbers[5] := 50;
// Sum array elements
sum := 0;
for i := 1 to 5 do
begin
sum := sum + numbers[i];
end;
writeln('Sum: ', sum);
end.
Example 7: Creating and Using Units
Step 1: Create a unit (MathUtils.pas)
unit MathUtils;
interface
(a, b: integer): integer;
(a, b: integer): integer;
(n: integer): integer;
implementation
(a, b: integer): integer;
begin
Result := a + b;
end;
(a, b: integer): integer;
begin
Result := a * b;
end;
(n: integer): integer;
begin
if n <= 1 then
Result := 1
else
Result := n * Factorial(n - 1);
end;
end.
Step 2: Compile the unit
# Output: Success: Compiled module: MathUtils
# PPU file: mathutils.ppu
Step 3: Use the unit in a program
program Calculator;
uses MathUtils;
var
x, y: integer;
begin
x := 5;
y := 3;
writeln('Addition: ', Add(x, y));
writeln('Multiplication: ', Multiply(x, y));
writeln('Factorial of 5: ', Factorial(5));
end.
Step 4: Compile and run
# Automatically uses mathutils.ppu if available
Example 8: Records (Structured Types)
program Records;
type
Person = record
name: string;
age: integer;
salary: real;
end;
var
employee: Person;
begin
employee.name := 'John Doe';
employee.age := 30;
employee.salary := 50000.00;
writeln('Name: ', employee.name);
writeln('Age: ', employee.age);
writeln('Salary: ', employee.salary);
end.
Example 9: Fibonacci Sequence
program Fibonacci;
var
n, i, a, b, temp: integer;
begin
n := 10;
a := 0;
b := 1;
writeln('Fibonacci sequence (first ', n, ' numbers):');
for i := 1 to n do
begin
writeln(a);
temp := a + b;
a := b;
b := temp;
end;
end.
Example 10: Generated Assembly Output
When you compile with -S flag, you can see the generated assembly:
Input (simple.pas):
program Simple;
var
x: integer;
begin
x := 42;
x := x + 1;
end.
Compile with assembly output:
Generated assembly (simple.s):
.intel_syntax noprefix
.section .text
main:
push rbp
mov rbp, rsp
# x := 42
mov eax, 42
mov [rbp - 8], eax
# x := x + 1
mov eax, [rbp - 8]
add eax, 1
mov [rbp - 8], eax
pop rbp
ret
Available Example Programs
The examples/ directory contains several ready-to-compile Pascal programs:
| Example | Description |
|---|---|
hello.pas |
Basic conditional and loop example |
simple_math.pas |
Arithmetic operations |
conditional.pas |
Complex if-else statements |
boolean_logic.pas |
Boolean operations (AND, OR, NOT) |
fibonacci.pas |
Fibonacci sequence calculation |
calculator.pas |
Calculator with multiple operations |
loops.pas |
Complex loop structures |
advanced_features.pas |
Advanced Pascal features |
comprehensive_features.pas |
Comprehensive feature demonstration |
Try compiling an example:
๐งช Testing
The project includes comprehensive test coverage:
# Run all tests
# Run specific test categories
Test Results: โ All tests passing (87/87)
Complex Examples: โ All 7 complex examples compile successfully
๐๏ธ Architecture
Modular Design
The pascal-rs compiler is built using a modular architecture with separate crates for each major component:
pascal-lexer: Lexical analysis and tokenizationpascal-ast: Abstract Syntax Tree definitions and typespascal-parser: Syntax analysis and parsingpascal-codegen: Code generation and assembly outputpascal-cli: Command-line interface and user interaction
This modular approach provides several benefits:
- Separation of Concerns: Each crate has a single responsibility
- Reusability: Individual crates can be used independently
- Testability: Each component can be tested in isolation
- Maintainability: Changes to one component don't affect others
- Performance: Only necessary dependencies are compiled
๐๏ธ Architecture
Compiler Pipeline
graph TD
A[Pascal Source Code<br/>.pas file] --> B[Lexical Analysis<br/>pascal-lexer]
B --> C[Token Stream<br/>Keywords, Identifiers, Literals, Operators]
C --> D[Syntax Analysis<br/>pascal-parser]
D --> E[Abstract Syntax Tree<br/>pascal-ast]
E --> F[Code Generation<br/>pascal-codegen]
F --> G[x86-64 Assembly<br/>.s file]
B --> H[Error Handling<br/>LexerError]
D --> I[Error Handling<br/>ParserError]
F --> J[Error Handling<br/>CodegenError]
style A fill:#e1f5fe
style G fill:#e8f5e8
style H fill:#ffebee
style I fill:#ffebee
style J fill:#ffebee
Detailed Component Architecture
graph TB
subgraph "Input Layer"
A[Pascal Source Code]
B[Command Line Interface<br/>pascal-cli]
end
subgraph "Lexical Analysis Layer"
C[Lexer<br/>pascal-lexer]
D[Token Definitions<br/>pascal-lexer/tokens.rs]
E[Token Stream]
end
subgraph "Syntax Analysis Layer"
F[Parser<br/>pascal-parser]
G[Grammar Rules]
H[AST Builder]
end
subgraph "AST Layer"
I[Type Definitions<br/>pascal-ast]
J[Expressions<br/>Literals, BinaryOp, UnaryOp]
K[Statements<br/>Assignment, If, While, For]
L[Types<br/>Integer, Real, Boolean, String, Array, Record]
M[Advanced Types<br/>Variant, Dynamic Array, Set, Enum]
N[OOP Features<br/>Class, Interface, Method, Property]
O[Generic Features<br/>GenericType, TypeParameter]
P[Exception Handling<br/>Try, Except, Finally, Raise]
Q[Memory Management<br/>New, Dispose, GetMem, FreeMem]
R[Inline Assembly<br/>InlineAssembly expressions]
end
subgraph "Code Generation Layer"
S[Code Generator<br/>pascal-codegen]
T[Variable Management<br/>Stack-based allocation]
U[Register Allocation<br/>Optimized usage]
V[Assembly Output<br/>x86-64 Intel syntax]
end
subgraph "Output Layer"
W[Assembly File<br/>.s output]
X[Executable<br/>After linking]
end
subgraph "Test Suite"
Y[Unit Tests<br/>tests/]
Z[Integration Tests<br/>tests/]
AA[Feature Tests<br/>OOP, Generics, Exceptions, etc.]
end
A --> B
B --> C
C --> D
D --> E
E --> F
F --> G
G --> H
H --> I
I --> J
I --> K
I --> L
I --> M
I --> N
I --> O
I --> P
I --> Q
I --> R
I --> S
S --> T
S --> U
S --> V
V --> W
W --> X
Y --> C
Y --> F
Y --> S
Z --> B
AA --> I
style A fill:#e1f5fe
style X fill:#e8f5e8
style I fill:#fff3e0
style Y fill:#f3e5f5
style Z fill:#f3e5f5
style AA fill:#f3e5f5
Language Feature Support Architecture
graph LR
subgraph "Core Language Features"
A[Basic Types<br/>Integer, Real, Boolean, Char, String]
B[Control Structures<br/>If-Else, While, For, Case]
C[Operators<br/>Arithmetic, Comparison, Logical, Bitwise]
D[Functions & Procedures<br/>Parameters, Return values]
end
subgraph "Advanced Type System"
E[Arrays<br/>Static, Dynamic, Open]
F[Records<br/>Packed, Variant]
G[Sets & Enums<br/>Type definitions]
H[Pointers<br/>Memory management]
I[File Types<br/>I/O operations]
end
subgraph "Object-Oriented Programming"
J[Classes<br/>Inheritance, Polymorphism]
K[Interfaces<br/>Contract definitions]
L[Methods & Properties<br/>Encapsulation]
M[Visibility Modifiers<br/>Public, Private, Protected]
end
subgraph "Advanced Features"
N[Generics<br/>Type parameters, Constraints]
O[Exception Handling<br/>Try-Except-Finally]
P[Memory Management<br/>Manual allocation]
Q[Operator Overloading<br/>Custom operators]
R[Inline Assembly<br/>Low-level code]
S[Units & Modules<br/>Modular programming]
end
A --> E
B --> J
C --> Q
D --> L
E --> N
F --> O
G --> P
H --> R
I --> S
style A fill:#e3f2fd
style E fill:#e8f5e8
style J fill:#fff3e0
style N fill:#f3e5f5
Key Components
- AST (
crates/pascal-ast/): Complete type definitions for Pascal language constructs - Error Handling: Comprehensive error reporting throughout the pipeline
- Memory Management: Stack-based variable allocation
- Type System: Support for all major Pascal data types
๐ Documentation
- ARCHITECTURE.md - Detailed project architecture and design
- TODO.md - Development roadmap and current tasks
- THREADING.md - Multi-threading and parallel compilation
- ACKNOWLEDGMENTS.md - Learning from the Pascal compiler ecosystem
- API Documentation - Complete API reference
- Language Reference - Supported Pascal features
- Examples - Sample Pascal programs
๐ Recent Improvements
โ Fixed Issues (Latest Update)
- Variable Scope Management: Fixed critical issue where variables declared in
varsections weren't accessible inbeginblocks - Missing Operators: Added support for
IntDivide(div) andBitwiseAnd(&) operators - Scope-Aware Code Generation: Improved scope management to prevent variable lookup failures
- Complex Example Support: All 7 complex Pascal examples now compile successfully
โ Enhanced Features
- Better Error Handling: Improved error conversion between parser and code generator
- Optimized Assembly: Cleaner x86-64 Intel syntax assembly output
- Comprehensive Testing: All complex examples verified to compile and generate correct assembly
๐ฏ Roadmap
- Enhanced optimization passes
- More Pascal language features
- Better error messages
- Debug information generation
- Cross-platform support
- Real number support improvements
- String literal handling enhancements