the Qala compiler and bytecode VM, as a single crate.
the pipeline is lexer -> parser -> type and effect checker -> codegen -> VM.
this file declares the modules in dependency order. the wasm-bindgen
bridge that exposes the pipeline to JavaScript lives in wasm.rs.
Modules
- arm64
- the ARM64 backend: a typed AST lowered to AArch64 assembly text in the CPSC 355 hosted-Linux dialect.
- ast
- the untyped AST: what the parser produces and the type checker consumes. every node carries a Span (so Phase 3’s diagnostics underline the source as written); the tree is boxed at recursive positions; nothing is desugared – Expr::Pipeline, Expr::Interpolation, Stmt::For, and Expr::Match are real nodes, not lowered to calls / + chains / while.
- chunk
- the bytecode chunk + the whole program: instruction bytes, a constant pool, and a parallel source-line map; plus the disassembler that turns them back into the human-readable listing the playground will render in its bytecode panel.
- codegen
- the bytecode codegen: lower a TypedAst to a Program of bytecode chunks, ready for the Phase 5 stack VM to execute.
- diagnostics
- the diagnostics layer: one Diagnostic data model with two rendering paths – Diagnostic::render (Task 2) produces a Rust-style ASCII-underlined source block for CLI output and snapshot tests; Diagnostic::to_monaco produces a serde-Serialize MonacoDiagnostic for the playground’s inline editor underlines. errors and warnings share the model; Severity distinguishes them; warnings carry their snake_case category as Diagnostic::category.
- effects
- the effect lattice the type checker infers each function against, stored as a u8 bitfield.
- errors
- the compiler’s error type. one enum, QalaError, returned everywhere as Result<T, QalaError>. every variant carries a Span so a diagnostic can point at the exact source text; rich rendering (the source line plus an underline) is built in a later phase on top of that span.
- lexer
- the hand-written scanner: Lexer::tokenize turns Qala source text into a Vec<Token> ending in TokenKind::Eof, or the first lex error.
- opcode
- the bytecode opcode enum: every byte the codegen emits is one of these variants encoded as Opcode::Foo as u8. dense discriminants 0..=45 for the real opcodes plus Opcode::Halt at 0xFF as a sentinel for the decoder so an out-of-bounds byte is detectable rather than silently reinterpretable.
- optimizer
- the peephole optimizer: a single-pass walk over a crate::chunk::Chunk removing wasteful instruction patterns and folding constant-condition jumps. eight locked patterns – the v1 set in 04-CONTEXT.md “Optimizations” – each matched with a 2-3 instruction sliding window.
- parser
- the hand-written parser: Parser::parse turns the lexer’s Vec<Token> into an untyped Ast, or the first syntax error. recursive descent for items and statements, a Pratt (binding-power) loop for expressions.
- span
- source locations: a Span is a byte offset plus a length into the source string, and LineIndex turns a byte offset into a 1-based line and column.
- stdlib
- the native-Rust standard library.
- token
- tokens: the units the lexer produces and the parser consumes. TokenKind is the classification (with a payload for literals and identifiers), Token pairs a kind with a Span. the stream always ends in an explicit TokenKind::Eof.
- typechecker
- the type and effect checker: check_program turns the parser’s untyped AST + the source text into a typed AST plus a vector of errors and a vector of warnings. it does NOT abort on the first error – it accumulates and recovers locally via QalaType::Unknown poison-propagation, so a user’s typo doesn’t cascade to twenty errors.
- typed_ast
- the typed AST: what the type checker produces and codegen consumes.
- types
- the type lattice the type checker resolves AST type expressions and inferred local types into.
- value
- the runtime value representation, in two types.
- vm
- the stack-based bytecode interpreter.
- wasm
- the wasm-bindgen bridge: a Qala session struct that exposes the finished compile-and-VM pipeline to JavaScript.
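
the span entry above says a Span is a byte offset plus a length, and that LineIndex turns a byte offset into a 1-based line and column. a minimal sketch of how that lookup could work is below. it is not the crate's actual code: the field names (start, len, line_starts), the constructor LineIndex::new, the method line_col, and the choice to count columns in bytes are all assumptions made for illustration.

```rust
/// a source location: a byte offset plus a length into the source string.
/// the field names `start` and `len` are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub len: usize,
}

/// hypothetical line index: the byte offset at which each line begins.
pub struct LineIndex {
    line_starts: Vec<usize>,
}

impl LineIndex {
    /// scan the source once and record where every line starts.
    pub fn new(src: &str) -> Self {
        // line 1 always starts at byte 0.
        let mut line_starts = vec![0];
        for (i, b) in src.bytes().enumerate() {
            if b == b'\n' {
                // the next line starts just past the newline byte.
                line_starts.push(i + 1);
            }
        }
        LineIndex { line_starts }
    }

    /// map a byte offset to a 1-based (line, column) pair.
    /// assumption: the column is counted in bytes, not characters.
    pub fn line_col(&self, offset: usize) -> (usize, usize) {
        // binary search: the number of line starts at or before `offset`
        // is the 1-based line number. it is never 0 because
        // line_starts[0] == 0.
        let line = self.line_starts.partition_point(|&s| s <= offset);
        let col = offset - self.line_starts[line - 1] + 1;
        (line, col)
    }
}
```

with the source "let x = 1\nlet y = 2\n", the second line starts at byte 10, so a Span { start: 14, len: 1 } covering the y resolves through line_col(14) to line 2, column 5. building the table once keeps each lookup to a binary search, which matters when a single compile reports many diagnostics against the same source.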