poshtree 0.4.2

PowerShell syntax tree: tokenizer, parser, AST, and unparser

poshtree

Lossless PowerShell parsing for Rust. Tokenize or parse a script, walk or rewrite the result, and get the exact source back.

poshtree keeps every byte of the input. The lexer attaches whitespace, newlines, and comments to the tokens as trivia, so reconstructing the token stream returns the original source byte-for-byte, malformed input included. A native recursive-descent parser sits on top and builds a tree whose every node carries a byte span and a token range. Broken input becomes error nodes instead of a failed parse, so there is always a tree to work with. That combination makes it a practical base for formatters, linters, codemods, and editor tooling. It has no dependencies.

Install

[dependencies]
poshtree = "0.4.2"

Or point at a local checkout:

[dependencies]
poshtree = { path = "../poshtree" }

Items live under the v2 module and are used path-qualified; nothing is re-exported at the crate root.

Lossless tokens

Whitespace, newlines, and comments ride along as trivia on the tokens, and reconstruct glues them back into the original source.

use poshtree::v2::{lex, reconstruct, apply_edits, TextEdit, TokenKind};

let src = "get-wmiobject Win32_BIOS   # keep this comment\n";
let out = lex(src);
assert_eq!(reconstruct(&out.tokens), src); // byte-for-byte

// Minimal-diff rewriting: patch one token, leave the rest alone.
let edits: Vec<TextEdit> = out.tokens.iter()
    .filter(|t| t.kind == TokenKind::Generic && t.value_eq_ci("Get-WmiObject"))
    .map(|t| TextEdit::replace(t.span, "Get-CimInstance"))
    .collect();
let fixed = apply_edits(src, &edits).unwrap();
assert_eq!(fixed, "Get-CimInstance Win32_BIOS   # keep this comment\n");
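
To make minimal-diff rewriting concrete, here is a dependency-free sketch of how span-based edits can be applied to a source string. The Edit struct and apply_sorted_edits function are hypothetical names invented for this illustration; they are not poshtree's TextEdit or apply_edits, and the real functions may validate or order edits differently.

```rust
/// Hypothetical edit: replace the bytes in `start..end` with `text`.
#[derive(Debug, Clone)]
pub struct Edit {
    pub start: usize,
    pub end: usize,
    pub text: String,
}

/// Apply non-overlapping edits to `src`. Returns an error message if an edit
/// is out of bounds, splits a UTF-8 character, or overlaps another edit.
pub fn apply_sorted_edits(src: &str, edits: &[Edit]) -> Result<String, String> {
    let mut sorted: Vec<&Edit> = edits.iter().collect();
    sorted.sort_by_key(|e| (e.start, e.end));

    let mut out = String::with_capacity(src.len());
    let mut cursor = 0; // first byte of `src` not yet copied to `out`
    for e in sorted {
        if e.start > e.end || e.end > src.len() {
            return Err(format!("edit {}..{} is out of bounds", e.start, e.end));
        }
        if !src.is_char_boundary(e.start) || !src.is_char_boundary(e.end) {
            return Err(format!("edit {}..{} splits a character", e.start, e.end));
        }
        if e.start < cursor {
            return Err(format!("edit {}..{} overlaps an earlier edit", e.start, e.end));
        }
        out.push_str(&src[cursor..e.start]); // untouched bytes pass through as-is
        out.push_str(&e.text);
        cursor = e.end;
    }
    out.push_str(&src[cursor..]);
    Ok(out)
}
```

Because everything outside the edited spans is copied verbatim, comments and spacing survive without the rewriter having to know about them.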

Every token and piece of trivia carries a byte Span, and a LineIndex maps an offset to line and column. --% is handled in the lexer: the rest of the line becomes one raw VerbatimArgs token. A few constructs lex as larger units than you might expect: a path like C:\tmp or a dotted run like a.b.c stays a single token. The module docs spell those out.
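
As an illustration of the offset-to-position mapping, the following sketch records where each line starts and binary-searches that table. SketchLineIndex is a hypothetical stand-in, not poshtree's LineIndex; in particular, the zero-based numbering and byte-counted columns are assumptions of this sketch.

```rust
/// Hypothetical line index: the byte offset at which each line starts.
pub struct SketchLineIndex {
    line_starts: Vec<usize>,
}

impl SketchLineIndex {
    pub fn new(src: &str) -> Self {
        let mut line_starts = vec![0];
        for (i, b) in src.bytes().enumerate() {
            if b == b'\n' {
                line_starts.push(i + 1); // the next line begins after the newline
            }
        }
        Self { line_starts }
    }

    /// Zero-based (line, column) for a byte offset; the column counts bytes.
    pub fn line_col(&self, offset: usize) -> (usize, usize) {
        // partition_point returns how many line starts are <= offset;
        // that count is at least 1 because the first start is 0.
        let line = self.line_starts.partition_point(|&s| s <= offset) - 1;
        (line, offset - self.line_starts[line])
    }
}
```

Building the table is a single pass over the source, and each lookup afterwards is a binary search, which suits tools that report many diagnostics per file.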

Parse and walk

parse returns the script tree plus any recoverable errors. Each node carries a byte Span and a TokenRange, so a node can be sliced straight back to its source.

use poshtree::v2::{parse, NodeKind};

let out = parse("get-process | sort-object CPU\n");
assert!(out.errors.is_empty());

out.script.walk(&mut |n| {
    if let NodeKind::Command { name, .. } = &n.kind {
        if let NodeKind::BareWord(s) = &name.kind {
            println!("command: {s}");
        }
    }
});
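
The way a span lets a node be sliced back to its source can be shown without the crate. In this sketch SketchNode, its labels, and the hand-built sample_tree are all hypothetical; the tree stands in for parser output for the same script as above and does not mirror poshtree's node types.

```rust
use std::ops::Range;

/// Hypothetical tree node: a label, a byte span into the source, and children.
pub struct SketchNode {
    pub label: &'static str,
    pub span: Range<usize>,
    pub children: Vec<SketchNode>,
}

impl SketchNode {
    /// Pre-order traversal: visit this node, then each child left to right.
    pub fn walk(&self, visit: &mut dyn FnMut(&SketchNode)) {
        visit(self);
        for child in &self.children {
            child.walk(visit);
        }
    }

    /// Slice this node's text back out of the source it was built from.
    pub fn text<'a>(&self, src: &'a str) -> &'a str {
        &src[self.span.clone()]
    }
}

fn leaf(label: &'static str, span: Range<usize>) -> SketchNode {
    SketchNode { label, span, children: Vec::new() }
}

/// Hand-built tree for "get-process | sort-object CPU\n".
pub fn sample_tree() -> SketchNode {
    SketchNode {
        label: "Pipeline",
        span: 0..29,
        children: vec![
            SketchNode {
                label: "Command",
                span: 0..11,
                children: vec![leaf("Name", 0..11)],
            },
            SketchNode {
                label: "Command",
                span: 14..29,
                children: vec![leaf("Name", 14..25), leaf("Argument", 26..29)],
            },
        ],
    }
}
```

Walking sample_tree() and calling text(src) on each "Name" node yields the two command names exactly as written in the source, with no re-rendering step involved.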

The grammar covers pipelines and &&/|| chains, commands with parameter-argument binding and redirections, every control-flow statement, function/filter/workflow, class, enum, using, trap/data/dynamicparam, the full expression layer, double-quoted string interpolation parts, and Add-Type C# extraction (it pulls [DllImport] signatures out of the inline C#, following a string through a variable assignment when it has to). It runs against a broad corpus and is fuzzed, so adversarial input recovers into error nodes rather than panicking.

C# in Add-Type

Add-Type embeds C# inside a PowerShell string, and a type or method defined there is used back in PowerShell as ordinary syntax: [Win32], [Win32]::Beep(...), New-Object Win32. The optional csharp feature parses that C# into its own lossless tree, resolves it (scopes, shadowing, and references), and connects the two languages, so a rename moves both sides at once.

[dependencies]
poshtree = { version = "0.4.2", features = ["csharp"] }

use poshtree::v2::{parse, apply_edits};
use poshtree::v2::csharp::rename_type;

let src = "Add-Type -TypeDefinition @'\npublic class Win32 { }\n'@\n[Win32]::Beep(800, 200)\n$h = New-Object Win32\n";
let out = parse(src);

// Renames the C# declaration and every PowerShell use in one pass.
let edits = rename_type(&out.script, src, "Win32", "NativeMethods");
let fixed = apply_edits(src, &edits).unwrap();
// class NativeMethods ... [NativeMethods]::Beep(800, 200) ... New-Object NativeMethods

rename_member does the same for a member and its [Type]::Member call sites, and rename_csharp_field, rename_csharp_method, rename_csharp_local, and rename_csharp_parameter rename within the C# alone. Resolution is single-file and respects each language's case rules: C# is case-sensitive and PowerShell is not. A member access is renamed only when its receiver can mean that member: this.Length, or a static Type.Length, but never an unrelated other.Length whose type is unknown. With the feature on, the [DllImport] extraction described under Parse and walk also reads from this parse rather than the fallback scanner. The feature adds no dependencies.
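
The case rule is the part most easily gotten wrong in a cross-language rename, so here is a small sketch of it. Lang and refers_to are hypothetical names for this illustration and are not part of the csharp feature's API; the ASCII-only folding is a simplifying assumption.

```rust
/// Which language a reference was written in.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Lang {
    CSharp,
    PowerShell,
}

/// Does `written` refer to the type declared in C# as `declared`?
pub fn refers_to(declared: &str, written: &str, lang: Lang) -> bool {
    match lang {
        // C# identifiers must match exactly.
        Lang::CSharp => declared == written,
        // PowerShell type references ignore case. ASCII folding keeps the
        // sketch simple; non-ASCII names are out of scope here.
        Lang::PowerShell => declared.eq_ignore_ascii_case(written),
    }
}
```

Under this rule a PowerShell use written as [win32] still counts as a reference to the C# class Win32, while a C# identifier win32 in the same block does not.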

Formatting

format_source is a width-aware formatter built on the lossless tokens.

let pretty = poshtree::v2::format_source("if($x){\nls\n}\n")?;
// "if ($x) {\n    ls\n}\n"

It normalizes indentation, spacing, blank lines, backtick continuations, and over-long lines, breaking them at pipes, chain operators, commas, and brackets. Comments, here-strings, --% arguments, and token adjacency stay byte-for-byte. It refuses input that has syntax errors, and before returning it re-lexes and re-parses its own output to confirm the program is unchanged. If that check fails you get an error instead of altered code.
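
The self-check can be pictured as comparing what the program means before and after formatting, and refusing to return output that differs. The sketch below uses a deliberately crude notion of "unchanged" (the same sequence of non-whitespace tokens) and a toy tokenizer; significant_tokens and checked_format are hypothetical, and format_source itself re-lexes and re-parses with the real lexer and parser rather than anything like this.

```rust
/// Toy tokenizer: words (letters, digits, `_`, `$`, `-`) and single
/// punctuation characters, with all whitespace dropped. It knows nothing
/// about strings or comments, so it is only good for illustration.
pub fn significant_tokens(src: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    for ch in src.chars() {
        if ch.is_alphanumeric() || matches!(ch, '_' | '$' | '-') {
            word.push(ch);
            continue;
        }
        if !word.is_empty() {
            tokens.push(std::mem::take(&mut word)); // flush the pending word
        }
        if !ch.is_whitespace() {
            tokens.push(ch.to_string());
        }
    }
    if !word.is_empty() {
        tokens.push(word);
    }
    tokens
}

/// Run `format` and return its output only if the significant tokens match.
pub fn checked_format(
    src: &str,
    format: impl Fn(&str) -> String,
) -> Result<String, String> {
    let out = format(src);
    if significant_tokens(src) == significant_tokens(&out) {
        Ok(out)
    } else {
        Err("formatter changed the token sequence; output discarded".to_string())
    }
}
```

The design point is that a formatter bug surfaces as an error the caller can handle, never as silently altered code.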

Examples

The examples/ directory has three runnable programs.

pascalize is a small codemod on the v2 layer. It parses with parse, finds command names, and rewrites each to PascalCase through apply_edits, touching only the name tokens and leaving comments, strings, arguments, and layout intact.

$ cargo run --example pascalize           # built-in demo
$ cargo run --example pascalize -- file.ps1
$ cat file.ps1 | cargo run --example pascalize -- -
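
For a feel of the per-name transformation such a codemod performs, here is one plausible casing rule written against the standard library only. pascalize_name is a hypothetical helper, not code from the example; how the bundled example treats word boundaries inside a noun or names that are already mixed-case is not described here, so this sketch simply uppercases the first letter of each hyphen-separated segment and leaves the rest alone.

```rust
/// Uppercase the first character of each `-`-separated segment and keep the
/// remaining characters unchanged, e.g. "get-process" -> "Get-Process".
pub fn pascalize_name(name: &str) -> String {
    name.split('-')
        .map(|segment| {
            let mut chars = segment.chars();
            match chars.next() {
                // to_uppercase yields an iterator because some characters
                // uppercase to more than one char.
                Some(first) => first.to_uppercase().chain(chars).collect::<String>(),
                None => String::new(),
            }
        })
        .collect::<Vec<_>>()
        .join("-")
}
```

Paired with span-based edits, a function like this only ever produces the replacement text for a name token; the surrounding arguments and layout are never regenerated.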

pinvoke-report uses the csharp feature to read the C# in each Add-Type block and print its [DllImport] signatures and declared types, methods, and fields. It only reads the script.

$ cargo run --features csharp --example pinvoke-report           # built-in demo
$ cargo run --features csharp --example pinvoke-report -- file.ps1

rename-native uses the csharp feature to rename a C# type or member across both the Add-Type block and its PowerShell call sites in one pass.

$ cargo run --features csharp --example rename-native            # built-in demo
$ cat file.ps1 | cargo run --features csharp --example rename-native -- type Win32 NativeApi
$ cat file.ps1 | cargo run --features csharp --example rename-native -- member Win32 MessageBox ShowMessage

Versioning

Breaking changes to the token or tree types ship as a new sibling version module rather than mutating what is already published, so pinned code keeps compiling. The current module is v2.
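
The sibling-module policy can be sketched in a few lines. The two modules below are toys written for this illustration (their Token types and lex functions are hypothetical and unrelated to poshtree's real ones); they show a breaking change, an added public field, landing in a new module while the old one stays as published.

```rust
/// Older module: frozen once published, so code written against it keeps compiling.
pub mod v1 {
    #[derive(Debug, PartialEq)]
    pub struct Token {
        pub text: String,
    }

    pub fn lex(src: &str) -> Vec<Token> {
        src.split_whitespace()
            .map(|w| Token { text: w.to_string() })
            .collect()
    }
}

/// Newer sibling: the breaking change (an added `start` field) lives here.
pub mod v2 {
    #[derive(Debug, PartialEq)]
    pub struct Token {
        pub text: String,
        pub start: usize, // byte offset of the token in the source
    }

    pub fn lex(src: &str) -> Vec<Token> {
        let mut tokens = Vec::new();
        let mut start = None; // offset where the current word began, if any
        for (i, ch) in src.char_indices() {
            match (ch.is_whitespace(), start) {
                (false, None) => start = Some(i),
                (true, Some(s)) => {
                    tokens.push(Token { text: src[s..i].to_string(), start: s });
                    start = None;
                }
                _ => {}
            }
        }
        if let Some(s) = start {
            tokens.push(Token { text: src[s..].to_string(), start: s });
        }
        tokens
    }
}
```

A caller that constructs or destructures v1::Token is unaffected by the new field, and can move to v2 on its own schedule.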

License

MIT. See LICENSE.