# CLAUDE.md — ArchScript Project Reference
## Quick Start
```bash
cargo build # build
cargo test # run all 123 tests
cargo run -- eval "2 + 3 * 4" # eval expression (prints: 14)
cargo run -- run examples/hello.as # run a script
cargo run -- run examples/archlinux.as # run stdlib demo
cargo run -- parse examples/hello.as # dump AST
cargo run -- repl # start interactive REPL
just check # fmt + lint + test
just test-stdlib # run stdlib tests only
just repl # start REPL (shorthand)
```
File extension: `.as`
## Project Overview
ArchScript is a programming language built for Arch Linux. It combines Python-like syntax, Haskell-inspired functional features, and first-class integration with the Arch ecosystem. The implementation is written in Rust and uses a Pest PEG parser.
**Status**: Working interpreter for core subset with Arch Linux standard library (pacman, systemd, AUR, file I/O). No compiler/bytecode yet.
**Domain**: archscript.org (AWS Route53, zone Z06425983KLFNTEL8SI3D)
## Architecture
```
source.as → Pest PEG parser → AST → Tree-walking interpreter → output
           (archscript.pest) (ast.rs)   (interpreter.rs)
                                                ↓
                                         stdlib modules
                                   (pacman, systemd, aur, fs)
```
Pipeline: `parser::parse(source) -> Node::Module(Vec<Node>)` then `Interpreter::new().run(&ast) -> Value`
Stdlib modules are registered as Dict values in the global environment at interpreter creation. Member access (`pacman.install`) resolves to `BuiltinFn` values which are dispatched through `stdlib::call()`.
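The registration and lookup described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the trimmed-down `Value` enum and the helpers `make_module`, `member_access`, and `split_builtin_name` are hypothetical names chosen for the example, and the real `stdlib::call()` signature is not shown here.

```rust
use std::collections::HashMap;

// Simplified stand-in for the interpreter's Value enum (assumption: only
// the variants needed for this illustration are included).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Dict(Vec<(Value, Value)>),
    BuiltinFn(String),
}

// Build a module Dict: each entry maps a function name to a BuiltinFn
// that carries the dotted name, e.g. "pacman.install".
fn make_module(module: &str, functions: &[&str]) -> Value {
    let entries = functions
        .iter()
        .map(|f| {
            (
                Value::Str(f.to_string()),
                Value::BuiltinFn(format!("{module}.{f}")),
            )
        })
        .collect();
    Value::Dict(entries)
}

// Member access on a Dict: find the entry whose key is the field name.
fn member_access(target: &Value, field: &str) -> Option<Value> {
    match target {
        Value::Dict(entries) => entries
            .iter()
            .find(|(k, _)| matches!(k, Value::Str(s) if s.as_str() == field))
            .map(|(_, v)| v.clone()),
        _ => None,
    }
}

// A dispatcher can split the dotted name into (module, function).
fn split_builtin_name(dotted: &str) -> Option<(&str, &str)> {
    dotted.split_once('.')
}

fn main() {
    let mut globals: HashMap<String, Value> = HashMap::new();
    globals.insert(
        "pacman".to_string(),
        make_module("pacman", &["install", "remove"]),
    );
    let resolved = globals
        .get("pacman")
        .and_then(|module| member_access(module, "install"));
    if let Some(Value::BuiltinFn(name)) = resolved {
        println!("{name} -> {:?}", split_builtin_name(&name));
    }
}
```

Carrying the full dotted name inside the `BuiltinFn` value means the call site needs no reference back to the module Dict it came from.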
## File Map
| File | Purpose | Notes |
|------|---------|-------|
| `src/archscript.pest` | PEG grammar (Pest) | 193 lines, top rule: `archscript` |
| `src/ast.rs` | AST node definitions | `Node`, `Expr`, `BinOp`, `Pattern`, `FunctionDecl` |
| `src/parser.rs` | Pest tree → AST | `parse(source) -> Result<Node, ParseError>` |
| `src/interpreter.rs` | Tree-walking interpreter | `Interpreter::run(&Node) -> Result<Value, RuntimeError>` |
| `src/main.rs` | CLI (clap) | Subcommands: `run`, `parse`, `eval`, `repl` |
| `src/repl.rs` | Interactive REPL | `run_repl()`, `run_repl_with_io()`, REPL commands |
| `src/lib.rs` | Module exports | `pub mod ast, interpreter, parser, repl, stdlib` |
| `src/stdlib/mod.rs` | Stdlib registry + dispatch | `register_modules()`, `call()`, `command_result()` |
| `src/stdlib/pacman.rs` | Pacman package management | `install`, `remove`, `update`, `search`, `list`, `info`, `clean` |
| `src/stdlib/systemd.rs` | Systemd service management | `start`, `stop`, `restart`, `enable`, `disable`, `status` |
| `src/stdlib/aur.rs` | AUR helper wrapper | `install`, `search`, `update`, `info` |
| `src/stdlib/fs.rs` | File system operations | `read`, `write`, `exists`, `ls`, `mkdir`, `remove` |
| `tests/integration.rs` | 48 integration tests | Uses `parser::parse` + `Interpreter` directly |
| `examples/hello.as` | Hello world | Variables, arithmetic, lists, println |
| `examples/functions.as` | Functions demo | def, recursion, lambda |
| `examples/archlinux.as` | Arch Linux stdlib demo | pacman, systemd, aur, fs usage |
| `justfile` | Task runner recipes | build, test, test-stdlib, lint, run, ci |
## Dependencies
```toml
pest = "2.7" # PEG parser
pest_derive = "2.7" # derive macro for grammar
thiserror = "1" # error types
clap = "4" # CLI argument parsing
```
## Grammar Summary (archscript.pest)
**Top-level items** (separated by newlines or `;`):
- `import_statement` — `import path`, `import {a,b} from path`, `import path as alias`
- `variable_declaration` — `var name: Type = expr` (type annotation optional)
- `function_declaration` — `def name(params) -> RetType: body` (return type optional, `:` or `=` before body)
- `type_definition` — `type Name = TypeExpr`
- `data_definition` — `data Name = Con1(fields) | Con2`
- `trait_definition` — `trait Name { def ... }`
- `instance_definition` — `instance TraitName for Type { def ... }`
- `expression_statement` — any expression
**Expression precedence** (lowest to highest):
1. Pipe: `|>`
2. Assignment: `= += -= *= /= %= **=`
3. Logical OR: `|| or`
4. Logical AND: `&& and`
5. Equality: `== !=`
6. Relational: `< <= > >=`
7. Additive: `+ -`
8. Multiplicative: `* / // %`
9. Power: `**` (right-associative)
10. Unary: `- + ! not`
11. Postfix: `f()` `a[i]` `a.b`
12. Primary: literals, identifiers, parenthesized, if/for/while/match/lambda/comprehension
**Literals**: `int_literal` (42), `float_literal` (3.14, 1e5), `string_literal` ("..." or '...'), `boolean_literal` (True/False/true/false)
**Collections**: `[1, 2, 3]` (list), `{"k": "v"}` (dict), `(1, 2)` (tuple), `[x*2 for x in list if cond]` (comprehension)
**Whitespace**: `WHITESPACE = _{ " " | "\t" }` (implicit, auto-consumed). `NEWLINE` is separate and used as statement separator.
**Block indentation**: Uses Pest `PUSH/PEEK_ALL/DROP` for indent-based blocks (`block` rule: 4 spaces or tab).
**Keywords**: import, from, as, var, def, if, elif, else, for, while, in, match, type, data, trait, instance, and, or, not, is, return, True, true, False, false, lambda
**Lambda**: `lambda x, y: expr` (uses `lambda_params` not `param_list` to avoid `:` conflict with type annotations)
## AST Node Types
```
Node::Module(Vec<Node>)
Node::Import(Import::{Simple, Selective, Aliased})
Node::VariableDeclaration(name, Option<type>, Box<Expr>)
Node::FunctionDeclaration(FunctionDecl{name, params, return_type, body})
Node::TypeDefinition(name, TypeExpr)
Node::DataDefinition(name, Vec<DataConstructor>)
Node::TraitDefinition(name, Vec<FunctionDecl>)
Node::InstanceDefinition(name, TypeExpr, Vec<FunctionDecl>)
Node::Expression(Expr)
Node::Return(Option<Expr>)
```
**Expr variants** (26 total):
- Literals: `Integer(i64)`, `Float(f64)`, `StringLit(String)`, `Boolean(bool)`, `Identifier(String)`
- Collections: `List(Vec<Expr>)`, `Dict(Vec<(Expr,Expr)>)`, `Tuple(Vec<Expr>)`
- Operations: `BinaryOp(BinOp, lhs, rhs)`, `UnaryOp(UnaryOp, expr)`, `Pipe(lhs, rhs)`
- Access: `FunctionCall(callee, args)`, `Index(expr, idx)`, `MemberAccess(expr, field)`
- Control: `If(cond, then, elifs, else)`, `For(var, iter, body)`, `While(cond, body)`, `Match(subject, arms)`
- Functions: `Lambda(params, body)`, `ListComprehension(expr, var, iter, filter)`
- Blocks: `Block(Vec<Node>)`
- Assignment: `Assign(target, value)`, `CompoundAssign(op, target, value)`
**BinOp**: Add, Sub, Mul, Div, IntDiv, Mod, Pow, Eq, Neq, Lt, Lte, Gt, Gte, And, Or
**UnaryOp**: Neg, Pos, Not
**Pattern**: Wildcard, Literal(Expr), Identifier(String), Tuple(Vec), List(Vec, Option<rest>), Constructor(name, Vec)
## Interpreter Runtime
**Value enum**: Integer(i64), Float(f64), String, Boolean, List(Vec<Value>), Dict(Vec<(Value,Value)>), Tuple(Vec<Value>), Function(FuncValue), BuiltinFn(String), None
**Environment**: `Vec<HashMap<String, Value>>` — stack of scopes. Methods: `get(name)`, `set(name, val)` (updates existing or creates in current), `define(name, val)` (always current scope), `push_scope()`, `pop_scope()`.
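A scope stack with these methods might look like the sketch below. It stores `i64` instead of the interpreter's `Value` to stay short, and the choice to never pop the outermost (global) scope is an assumption of this example rather than documented behavior.

```rust
use std::collections::HashMap;

// Sketch of a scope stack; values are i64 here instead of Value.
struct Environment {
    scopes: Vec<HashMap<String, i64>>,
}

impl Environment {
    fn new() -> Self {
        Environment { scopes: vec![HashMap::new()] }
    }

    fn push_scope(&mut self) {
        self.scopes.push(HashMap::new());
    }

    // Assumption: the global scope is never popped.
    fn pop_scope(&mut self) {
        if self.scopes.len() > 1 {
            self.scopes.pop();
        }
    }

    // Always binds in the current (innermost) scope.
    fn define(&mut self, name: &str, val: i64) {
        if let Some(scope) = self.scopes.last_mut() {
            scope.insert(name.to_string(), val);
        }
    }

    // Searches from the innermost scope outward.
    fn get(&self, name: &str) -> Option<i64> {
        self.scopes.iter().rev().find_map(|s| s.get(name).copied())
    }

    // Updates the nearest existing binding, or creates one in the current scope.
    fn set(&mut self, name: &str, val: i64) {
        for scope in self.scopes.iter_mut().rev() {
            if let Some(slot) = scope.get_mut(name) {
                *slot = val;
                return;
            }
        }
        self.define(name, val);
    }
}

fn main() {
    let mut env = Environment::new();
    env.define("x", 1);
    env.push_scope();
    env.set("x", 2); // updates the outer binding
    env.define("y", 10); // local to the inner scope
    env.pop_scope();
    println!("x = {:?}, y = {:?}", env.get("x"), env.get("y"));
}
```

The difference between `set` and `define` is what lets an inner block mutate an outer variable while still allowing shadowing.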
**Function calls**: Closure-based. `FuncValue` stores `name: Option<String>`, params, body expr, closure env. On call: clone closure, push scope, bind function name for recursion, bind params, swap env, eval body, restore env.
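The call sequence, and in particular binding the function's own name so recursion works, can be illustrated with a heavily simplified sketch. Several things here are assumptions made for brevity: a flat `HashMap` replaces the scope stack, a Rust function pointer stands in for the AST body, and `make_factorial` and `factorial_body` are hypothetical helpers.

```rust
use std::collections::HashMap;
use std::rc::Rc;

type Env = HashMap<String, Value>;

#[derive(Clone)]
enum Value {
    Int(i64),
    Func(Rc<FuncValue>),
}

struct FuncValue {
    name: Option<String>,
    params: Vec<String>,
    // A Rust fn pointer stands in for the AST body (assumption for brevity).
    body: fn(&Env) -> Result<i64, String>,
    closure: Env,
}

fn call(func: &Rc<FuncValue>, args: &[i64]) -> Result<i64, String> {
    if args.len() != func.params.len() {
        return Err(format!(
            "expected {} args, got {}",
            func.params.len(),
            args.len()
        ));
    }
    // Start from a copy of the captured closure environment.
    let mut call_env = func.closure.clone();
    // Bind the function under its own name so the body can recurse.
    if let Some(name) = &func.name {
        call_env.insert(name.clone(), Value::Func(Rc::clone(func)));
    }
    // Bind parameters to arguments, then evaluate the body.
    for (param, arg) in func.params.iter().zip(args) {
        call_env.insert(param.clone(), Value::Int(*arg));
    }
    (func.body)(&call_env)
}

// Body of `def factorial(n) = if n <= 1: 1 else: n * factorial(n - 1)`.
fn factorial_body(env: &Env) -> Result<i64, String> {
    let n = match env.get("n") {
        Some(Value::Int(n)) => *n,
        _ => return Err("n is not bound to an integer".to_string()),
    };
    if n <= 1 {
        return Ok(1);
    }
    match env.get("factorial") {
        Some(Value::Func(f)) => n
            .checked_mul(call(f, &[n - 1])?)
            .ok_or_else(|| "integer overflow".to_string()),
        _ => Err("factorial is undefined".to_string()),
    }
}

fn make_factorial(name: Option<&str>) -> Rc<FuncValue> {
    Rc::new(FuncValue {
        name: name.map(String::from),
        params: vec!["n".to_string()],
        body: factorial_body,
        closure: Env::new(),
    })
}

fn main() {
    println!("{:?}", call(&make_factorial(Some("factorial")), &[5]));
    // Without a name, the recursive lookup fails inside the body.
    println!("{:?}", call(&make_factorial(None), &[5]));
}
```

The second call in `main` shows the failure mode that arises when the closure is captured before the function is defined and the name is not re-bound at call time.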
**Built-in functions** (12):
| Function | Signature | Description |
|----------|-----------|-------------|
| `print` | `print(args...)` | Output to stdout (captured in `interpreter.output`) |
| `println` | `println(args...)` | Same as `print`; in capture mode the two produce identical output (no newline difference) |
| `len` | `len(collection)` | Length of list, string, or dict |
| `range` | `range(end)` / `range(start, end)` / `range(start, end, step)` | Generate integer list |
| `str` | `str(value)` | Convert to string |
| `int` | `int(value)` | Convert to integer |
| `float` | `float(value)` | Convert to float |
| `type` | `type(value)` | Return type name as string |
| `map` | `map(fn, list)` | Apply function to each element |
| `filter` | `filter(fn, list)` | Keep elements where fn returns truthy |
| `sum` | `sum(list)` | Sum numeric list |
| `append` | `append(list, item)` | Return new list with item appended |
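As an illustration of the three `range` call shapes in the table, here is a small sketch. The handling of negative steps (counting down, Python-style) and the error on a zero step are assumptions of this example; the table does not specify them.

```rust
// Sketch of range(end) / range(start, end) / range(start, end, step).
// Assumptions: negative steps count down; a zero step is an error.
fn range(args: &[i64]) -> Result<Vec<i64>, String> {
    let (start, end, step) = match args {
        [end] => (0, *end, 1),
        [start, end] => (*start, *end, 1),
        [start, end, step] => (*start, *end, *step),
        _ => return Err(format!("range expects 1 to 3 arguments, got {}", args.len())),
    };
    if step == 0 {
        return Err("range step must not be zero".to_string());
    }
    let mut out = Vec::new();
    let mut i = start;
    while (step > 0 && i < end) || (step < 0 && i > end) {
        out.push(i);
        // Stop cleanly instead of overflowing near the i64 limits.
        match i.checked_add(step) {
            Some(next) => i = next,
            None => break,
        }
    }
    Ok(out)
}

fn main() {
    println!("{:?}", range(&[3]));
    println!("{:?}", range(&[2, 5]));
    println!("{:?}", range(&[0, 10, 3]));
}
```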
**Truthiness**: False=falsy, 0=falsy, 0.0=falsy, ""=falsy, []=falsy, None=falsy, everything else truthy.
**Type coercion**: Int+Float -> Float, Int/Int -> Float (division always produces float), String+String -> concat, List+List -> concat, String*Int -> repeat.
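The truthiness and coercion rules can be expressed compactly over a reduced `Value` enum. The sketch below covers only `+` and `/`. The error cases (overflow, division by zero, mismatched types) are assumptions, since the rules above do not say how the interpreter reports them.

```rust
// Reduced Value enum for illustration (Dict, Tuple, and functions omitted).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
    Float(f64),
    Str(String),
    Boolean(bool),
    List(Vec<Value>),
    None,
}

fn is_truthy(v: &Value) -> bool {
    match v {
        Value::Boolean(b) => *b,
        Value::Integer(n) => *n != 0,
        Value::Float(f) => *f != 0.0,
        Value::Str(s) => !s.is_empty(),
        Value::List(items) => !items.is_empty(),
        Value::None => false,
    }
}

fn as_f64(v: &Value) -> Option<f64> {
    match v {
        Value::Integer(n) => Some(*n as f64),
        Value::Float(f) => Some(*f),
        _ => None,
    }
}

// `+`: Int+Int stays Int, mixed numerics widen to Float,
// strings and lists concatenate.
fn add(l: &Value, r: &Value) -> Result<Value, String> {
    match (l, r) {
        (Value::Integer(a), Value::Integer(b)) => a
            .checked_add(*b)
            .map(Value::Integer)
            .ok_or_else(|| "integer overflow".to_string()),
        (Value::Str(a), Value::Str(b)) => Ok(Value::Str(format!("{a}{b}"))),
        (Value::List(a), Value::List(b)) => {
            Ok(Value::List(a.iter().chain(b.iter()).cloned().collect()))
        }
        _ => match (as_f64(l), as_f64(r)) {
            (Some(a), Some(b)) => Ok(Value::Float(a + b)),
            _ => Err(format!("cannot add {l:?} and {r:?}")),
        },
    }
}

// `/`: always produces a Float, even for Int/Int.
fn div(l: &Value, r: &Value) -> Result<Value, String> {
    match (as_f64(l), as_f64(r)) {
        (Some(_), Some(b)) if b == 0.0 => Err("division by zero".to_string()),
        (Some(a), Some(b)) => Ok(Value::Float(a / b)),
        _ => Err(format!("cannot divide {l:?} by {r:?}")),
    }
}

fn main() {
    println!("{:?}", add(&Value::Integer(1), &Value::Float(2.5)));
    println!("{:?}", div(&Value::Integer(7), &Value::Integer(2)));
    println!("{}", is_truthy(&Value::List(vec![])));
}
```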
## Known Bugs Fixed
1. **`var y = 13.1` parse failure** — Fixed by ensuring `float_literal` is in `primary_expression` and reachable via expression chain from `variable_declaration`. Root cause was the original grammar's `expression` not reaching `literal` via `primary_expression`.
2. **Lambda `:` ambiguity** — `lambda x: expr` conflicted with param type annotation `param: Type`. Fixed by using separate `lambda_params = { identifier ~ ("," ~ identifier)* }` instead of `param_list`.
3. **Recursive function undefined** — `FuncValue.closure` captured env before function was defined. Fixed by adding `name: Option<String>` to `FuncValue` and binding the function itself in call env: `call_env.define(name, func.clone())`.
4. **Postfix `call_args` not reaching parser** — `postfix_op` was non-silent, wrapping `call_args`/`index_access`/`member_access` inside a `Rule::postfix_op`. Parser code matched on `Rule::call_args` directly. Fixed by making `postfix_op` silent: `postfix_op = _{ ... }`.
## Test Coverage
- **20 unit tests** in `src/parser.rs` and `src/interpreter.rs`
- **26 unit tests** in `src/repl.rs` (REPL session tests, continuation detection)
- **29 unit tests** in `src/stdlib/` (pacman: 9, systemd: 8, aur: 6, fs: 6)
- **48 integration tests** in `tests/integration.rs`
- **123 total tests**
- Tests cover: all literal types, variable declarations (including the float bug), arithmetic with precedence, string concat, comparisons, logical operators, all built-in functions, function def/call, recursion, lambdas, imports, multi-statement programs, stdlib module access, pacman/systemd/aur command generation, fs read/write/exists/ls/mkdir/remove, stdlib error handling, REPL session persistence, REPL commands, multiline continuation detection
## Design Goals (from original spec)
1. Python-like syntax, Haskell-inspired functional features
2. First-class Arch Linux integration (pacman, systemctl, AUR wrappers in stdlib)
3. Consumer-driven contract testing (CDCT) support
4. Microservices and distributed systems first-class support
5. Actor-based concurrency model
6. WebAssembly compile target
7. LSP support
8. ArchAgent integration (GPT-4 agent that generates/runs ArchScript)
## Standard Library — Arch Linux Integration
The stdlib provides four modules registered as Dict values in the global scope. Each module's functions are accessible via member access (e.g., `pacman.install("vim")`). Functions that wrap system commands execute them via `std::process::Command` and return a result Dict with `{success, output, code, command}` fields.
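A command wrapper producing those four fields might be sketched as follows. The `CommandResult` struct and `run_command` function are hypothetical names for this example, and both the `-1` exit code used when a process cannot be started and the choice to capture only stdout are assumptions.

```rust
use std::process::Command;

// Mirrors the {success, output, code, command} result fields.
#[derive(Debug)]
struct CommandResult {
    success: bool,
    output: String,
    code: i32,
    command: String,
}

fn run_command(program: &str, args: &[&str]) -> CommandResult {
    // Human-readable form of the invocation, e.g. "pacman -Ss vim".
    let command = std::iter::once(program)
        .chain(args.iter().copied())
        .collect::<Vec<_>>()
        .join(" ");

    // Arguments are passed as an argv list; no shell is involved.
    match Command::new(program).args(args).output() {
        Ok(out) => CommandResult {
            success: out.status.success(),
            output: String::from_utf8_lossy(&out.stdout).into_owned(),
            // Assumption: -1 when the process was terminated by a signal.
            code: out.status.code().unwrap_or(-1),
            command,
        },
        // Assumption: a program that cannot be started reports code -1.
        Err(err) => CommandResult {
            success: false,
            output: err.to_string(),
            code: -1,
            command,
        },
    }
}

fn main() {
    let result = run_command("echo", &["hello"]);
    println!("{result:?}");
}
```

Recording the `command` string alongside the outcome makes it possible to test command generation without asserting on what the system command actually did.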
### `pacman` — Package Management
| Function | Command | Description |
|----------|---------|-------------|
| `pacman.install(name)` | `sudo pacman -S --noconfirm <name>` | Install a package |
| `pacman.remove(name)` | `sudo pacman -R --noconfirm <name>` | Remove a package |
| `pacman.update()` | `sudo pacman -Syu --noconfirm` | Full system update |
| `pacman.search(query)` | `pacman -Ss <query>` | Search packages |
| `pacman.list()` | `pacman -Q` | List installed packages |
| `pacman.info(name)` | `pacman -Qi <name>` | Get package info |
| `pacman.clean()` | `sudo pacman -Scc --noconfirm` | Clean package cache |
### `systemd` — Service Management
| Function | Command | Description |
|----------|---------|-------------|
| `systemd.start(service)` | `sudo systemctl start <service>` | Start a service |
| `systemd.stop(service)` | `sudo systemctl stop <service>` | Stop a service |
| `systemd.restart(service)` | `sudo systemctl restart <service>` | Restart a service |
| `systemd.enable(service)` | `sudo systemctl enable <service>` | Enable at boot |
| `systemd.disable(service)` | `sudo systemctl disable <service>` | Disable at boot |
| `systemd.status(service)` | `systemctl status <service>` | Check status |
### `aur` — AUR Package Management
Uses `yay` as the default AUR helper.
| Function | Command | Description |
|----------|---------|-------------|
| `aur.install(name)` | `yay -S --noconfirm <name>` | Install AUR package |
| `aur.search(query)` | `yay -Ss <query>` | Search AUR |
| `aur.update()` | `yay -Sua --noconfirm` | Update AUR packages |
| `aur.info(name)` | `yay -Qi <name>` | Get AUR package info |
### `fs` — File System Operations
Uses Rust's `std::fs` for safe, cross-platform file I/O. Returns values directly (not command result Dicts).
| Function | Returns | Description |
|----------|---------|-------------|
| `fs.read(path)` | `String` | Read file contents |
| `fs.write(path, content)` | `Boolean` (true) | Write string to file |
| `fs.exists(path)` | `Boolean` | Check if path exists |
| `fs.ls(path)` | `List[String]` | List directory entries |
| `fs.mkdir(path)` | `Boolean` (true) | Create directory (with parents) |
| `fs.remove(path)` | `Boolean` (true) | Remove file or directory |
### Stdlib Implementation Notes
- Modules are registered in `Environment::new()` via `stdlib::register_modules()`
- Each module is a `Value::Dict` containing `Value::BuiltinFn` entries
- Member access (`pacman.install`) resolves through the interpreter's `MemberAccess` → `Dict` lookup
- System command functions use `std::process::Command`, which passes arguments directly to the program without invoking a shell, so script-supplied strings cannot inject shell syntax
- The `call_builtin` method delegates to `stdlib::call()` for dotted names (e.g., `"pacman.install"`)
### Stdlib Usage Example
```
// Package management
var result = pacman.install("vim")
println(result.command) // "sudo pacman -S --noconfirm vim"
println(result.success) // True/False
// Service management
systemd.enable("sshd")
systemd.start("sshd")
var status = systemd.status("sshd")
// AUR packages
aur.install("visual-studio-code-bin")
// File system
fs.write("/tmp/hello.txt", "Hello from ArchScript!")
var content = fs.read("/tmp/hello.txt")
var exists = fs.exists("/tmp/hello.txt")
var files = fs.ls("/tmp")
fs.mkdir("/tmp/mydir")
fs.remove("/tmp/hello.txt")
```
## Interactive REPL
The REPL (Read-Eval-Print Loop) provides an interactive interpreter session via `archscript repl` or `just repl`.
### REPL Commands
| Command | Description |
|---------|-------------|
| `:help`, `:h`, `:?` | Show help message |
| `:quit`, `:exit`, `:q` | Exit the REPL |
| `:env` | Show all user-defined variables |
| `:reset` | Reset the interpreter environment |
| `:ast <expr>` | Show the AST for an expression |
### REPL Features
- **Persistent state**: Variables, functions, and values persist across lines within a session
- **Multiline input**: Lines ending with `:`, `(`, `[`, `{`, or `\` automatically continue to the next line. An empty line in multiline mode submits the accumulated input.
- **Error resilience**: Parse errors and runtime errors are displayed inline without terminating the session
- **Result display**: Non-None expression results are automatically printed
- **Testable I/O**: `run_repl_with_io<R: BufRead, W: Write>()` accepts generic I/O for testing
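The generic I/O pattern behind `run_repl_with_io` can be shown with a toy loop. The `echo_repl` and `run_session` functions below are hypothetical and only echo input back. They illustrate how `BufRead` and `Write` bounds let a test drive a session from an in-memory buffer, not how the real REPL evaluates code.

```rust
use std::io::{self, BufRead, Write};

// Toy loop: echoes each non-empty line until a quit command is seen.
fn echo_repl<R: BufRead, W: Write>(input: R, mut output: W) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        match line.trim() {
            ":quit" | ":exit" | ":q" => break,
            "" => continue,
            other => writeln!(output, "=> {other}")?,
        }
    }
    Ok(())
}

// Test helper: feed a scripted session and collect everything written.
fn run_session(script: &str) -> io::Result<String> {
    let mut out = Vec::new();
    echo_repl(script.as_bytes(), &mut out)?;
    Ok(String::from_utf8_lossy(&out).into_owned())
}

fn main() -> io::Result<()> {
    // The same function works against in-memory buffers...
    print!("{}", run_session("1 + 2\n:quit\n")?);
    // ...and against the real terminal.
    let stdin = io::stdin();
    echo_repl(stdin.lock(), io::stdout())
}
```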
### REPL Implementation Notes
- `src/repl.rs` contains the full REPL implementation
- `run_repl()` is the stdin/stdout entry point; `run_repl_with_io()` is the testable generic version
- The interpreter instance is reused across inputs for state persistence
- `needs_continuation()` detects incomplete expressions by tracking bracket balance and trailing tokens
- REPL commands (`:` prefix) are parsed before attempting expression evaluation
- The `:env` command filters out built-in functions and stdlib modules, showing only user-defined names
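Continuation detection based on bracket balance and trailing tokens could look roughly like this. It is a simplified sketch, not the project's `needs_continuation()`: in particular it does not skip brackets inside string literals, which a real implementation would likely need to handle.

```rust
// Returns true when the input looks incomplete and the REPL should
// keep reading. Simplification: brackets inside strings are counted too.
fn needs_continuation(input: &str) -> bool {
    let mut depth: i32 = 0;
    for ch in input.chars() {
        match ch {
            '(' | '[' | '{' => depth += 1,
            ')' | ']' | '}' => depth -= 1,
            _ => {}
        }
    }
    if depth > 0 {
        return true; // an opening bracket has not been closed yet
    }
    // A trailing ':' opens a block; a trailing '\' is an explicit continuation.
    let trimmed = input.trim_end();
    trimmed.ends_with(':') || trimmed.ends_with('\\')
}

fn main() {
    for src in ["def f(x):", "[1, 2,", "1 + 2", "x = 1 \\"] {
        println!("{src:?} -> {}", needs_continuation(src));
    }
}
```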
## Planned Features (Not Yet Implemented)
### Language
- Indentation-based block scoping (partially: `block` rule exists but not fully wired)
- Class definitions (`class` keyword per BNF spec)
- Async/await
- Generator expressions
- Set literals
- Decorators/macros
- Type checking / gradual typing
- `try`/`catch`/`throw` error handling
### Standard Library (Planned Extensions)
- `postgresql.setup({...})` — service configuration
- `ssh.configure({...})` — SSH hardening
- `disk.optimize({...})` — disk operations
- `backup.create({...})` — backup management
- Networking, crypto modules
### Tooling
- LSP server
- WASM compilation target
- Package manager for ArchScript modules
- CI/CD templates (GitHub Actions)
### ArchAgent Integration
- ArchAgent is a GPT-4 agent that outputs ArchScript
- Contract: ArchAgent generates calls like `postgresql.setup({...})`, `pacman.install({...})`
- Stdlib APIs should match those intents
- Vanilla variant uses raw Arch/bash commands instead
## Coding Conventions
- Rust edition 2021, formatted with `rustfmt`
- `cargo clippy -- -D warnings` must pass
- Grammar changes in `archscript.pest` require rebuild (Pest derive macro)
- Parser functions named `build_<rule>` matching grammar rules
- Tests inline in modules (`#[cfg(test)] mod tests`) + separate `tests/integration.rs`
- No `unwrap()` in production code paths; use `?` or proper error handling
- `ParseError(String)` and `RuntimeError(String)` for error types
## Operator Reference
| Operator | Category | Notes |
|----------|----------|-------|
| `+` `-` `*` `/` `%` | Arithmetic | `/` always returns float |
| `//` | Integer division | Integers only |
| `**` | Power | Right-associative |
| `==` `!=` | Equality | Cross-type int/float comparison works |
| `<` `<=` `>` `>=` | Relational | Numeric and string comparison |
| `and` `&&` | Logical AND | Short-circuit |
| `or` `\|\|` | Logical OR | Short-circuit |
| `not` `!` | Logical NOT | |
| `\|>` | Pipe | `expr \|> fn` calls `fn(expr)` |
| `=` | Assignment | |
| `+=` `-=` `*=` `/=` `%=` `**=` | Compound assignment | |
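One way to read the pipe rule (`expr |> fn` calls `fn(expr)`) is as a rewrite from a `Pipe` node to a `FunctionCall` node. The sketch below uses a reduced `Expr` enum and a hypothetical `desugar_pipe` helper. Whether the interpreter rewrites the tree or evaluates `Pipe` directly is not stated here, and the left-associative nesting used in `main` is an assumption.

```rust
// Reduced Expr enum for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Integer(i64),
    Identifier(String),
    FunctionCall(Box<Expr>, Vec<Expr>),
    Pipe(Box<Expr>, Box<Expr>),
}

// Rewrite every Pipe(lhs, rhs) into FunctionCall(rhs, [lhs]).
fn desugar_pipe(expr: Expr) -> Expr {
    match expr {
        Expr::Pipe(lhs, rhs) => {
            Expr::FunctionCall(Box::new(desugar_pipe(*rhs)), vec![desugar_pipe(*lhs)])
        }
        Expr::FunctionCall(callee, args) => Expr::FunctionCall(
            Box::new(desugar_pipe(*callee)),
            args.into_iter().map(desugar_pipe).collect(),
        ),
        other => other,
    }
}

fn ident(name: &str) -> Expr {
    Expr::Identifier(name.to_string())
}

fn main() {
    // `1 |> f |> g`, assumed to nest as Pipe(Pipe(1, f), g), becomes g(f(1)).
    let piped = Expr::Pipe(
        Box::new(Expr::Pipe(Box::new(Expr::Integer(1)), Box::new(ident("f")))),
        Box::new(ident("g")),
    );
    println!("{:?}", desugar_pipe(piped));
}
```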
## Example ArchScript Code
```
// Variables and arithmetic
var x = 42
var pi = 3.14159
var greeting = "Hello, " + "World!"
// Functions
def add(a, b) = a + b
def factorial(n) = if n <= 1: 1
else: n * factorial(n - 1)
// Lambda and higher-order
var double = lambda x: x * 2
var nums = range(10)
var evens = filter(lambda x: x % 2 == 0, nums)
// Pattern matching
match value {
0 => println("zero"),
x if x > 0 => println("positive"),
_ => println("other")
}
// Pipe operator: expr |> fn calls fn(expr)
var total = nums |> sum
// Type definitions
data Color = Red | Green | Blue | Custom(r, g, b)
type Point = Tuple(Float, Float)
// Imports
import math
import { sqrt, pi } from math
import math as m
// Arch Linux stdlib
var result = pacman.install("vim")
println(result.command)
systemd.enable("sshd")
var content = fs.read("/etc/hostname")
```