# panproto-parse
[crates.io](https://crates.io/crates/panproto-parse)
[docs.rs](https://docs.rs/panproto-parse)
[License: MIT](../../LICENSE)
Parses source code in 248 programming languages into panproto schema graphs using tree-sitter grammars.
## What it does
Tree-sitter parses source code into an abstract syntax tree (AST): a tree of named node types (`function_definition`, `class_declaration`, `import_statement`) connected by named fields (`name`, `body`, `parameters`). Panproto converts this AST structure into a schema graph where each node type becomes a vertex kind and each field name becomes an edge kind. The schema graph represents the full structure of the source file as panproto data.
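
The node-to-vertex and field-to-edge conversion described above can be sketched with plain data types. This is a minimal illustration, not the crate's implementation: `AstNode`, `SchemaGraph`, `to_schema`, and `sample_function` are hypothetical names invented for this example, and the real walker works on tree-sitter nodes rather than a hand-built tree.

```rust
/// Hypothetical stand-in for a tree-sitter named node (not a panproto type).
#[derive(Debug, Clone)]
struct AstNode {
    /// Node type name, e.g. "function_definition".
    kind: &'static str,
    /// (field name, child node) pairs, e.g. ("name", identifier).
    fields: Vec<(&'static str, AstNode)>,
}

/// Hypothetical stand-in for a schema graph (not a panproto type).
#[derive(Debug, Default, PartialEq)]
struct SchemaGraph {
    /// (vertex id, vertex kind)
    vertices: Vec<(usize, String)>,
    /// (source vertex id, target vertex id, edge kind)
    edges: Vec<(usize, usize, String)>,
}

/// Adds `node` and its descendants to `graph` and returns the id of `node`.
fn add_node(node: &AstNode, graph: &mut SchemaGraph) -> usize {
    // Each node type becomes a vertex kind.
    let id = graph.vertices.len();
    graph.vertices.push((id, node.kind.to_string()));
    for (field, child) in &node.fields {
        let child_id = add_node(child, graph);
        // Each field name becomes an edge kind.
        graph.edges.push((id, child_id, field.to_string()));
    }
    id
}

/// Converts a whole tree into a schema graph.
fn to_schema(root: &AstNode) -> SchemaGraph {
    let mut graph = SchemaGraph::default();
    add_node(root, &mut graph);
    graph
}

/// A tiny hand-built tree: a function definition with a name and a body.
fn sample_function() -> AstNode {
    AstNode {
        kind: "function_definition",
        fields: vec![
            ("name", AstNode { kind: "identifier", fields: vec![] }),
            ("body", AstNode { kind: "block", fields: vec![] }),
        ],
    }
}
```

Calling `to_schema(&sample_function())` yields three vertices (one per node) and two edges labelled `name` and `body`, both leaving the `function_definition` vertex.
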
The theory for each language (the formal description of what the schema graph for that language looks like) is extracted automatically from the grammar's `node-types.json` file. Because the theory is derived from the grammar itself, it stays in sync as grammars are updated. One `AstWalker` implementation handles all 248 languages; there is no per-language parsing code.
Alongside each schema vertex, the walker records interstitial text: the keywords, punctuation, and whitespace that appear between named AST children. The emitter collects these fragments by byte position and concatenates them to reproduce the original source exactly. `emit(parse(source)) == source` for any file the grammar can parse.
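
The round-trip property can be illustrated with a small, self-contained sketch. It assumes the named children are leaf tokens given as sorted, non-overlapping byte ranges; `interstitials` and `emit` are hypothetical helpers written for this example and are not the crate's API.

```rust
use std::ops::Range;

/// Collects the text between (and around) the named children, tagged with
/// the byte offset where each fragment starts. `children` must be sorted and
/// non-overlapping.
fn interstitials<'a>(source: &'a str, children: &[Range<usize>]) -> Vec<(usize, &'a str)> {
    let mut fragments = Vec::new();
    let mut cursor = 0;
    for range in children {
        if range.start > cursor {
            fragments.push((cursor, &source[cursor..range.start]));
        }
        cursor = range.end;
    }
    // Trailing text after the last named child.
    if cursor < source.len() {
        fragments.push((cursor, &source[cursor..]));
    }
    fragments
}

/// Merges interstitial fragments and leaf texts by byte position and
/// concatenates them.
fn emit<'a>(gaps: &[(usize, &'a str)], leaves: &[(usize, &'a str)]) -> String {
    let mut all: Vec<(usize, &'a str)> = gaps
        .iter()
        .copied()
        .chain(leaves.iter().copied())
        .collect();
    all.sort_by_key(|&(pos, _)| pos);
    all.into_iter().map(|(_, text)| text).collect()
}
```

For the source `fn add(a, b) { a + b }` with the five identifiers as named children, the fragments are `fn `, `(`, `, `, `) { `, ` + `, and ` }`. Interleaving them with the identifier texts by position reproduces the input byte for byte.
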
## Quick example
```rust,ignore
use panproto_parse::registry;
// All 248 languages are registered automatically with the default feature set.
let reg = registry::global();
// Parse a Rust source file into a schema graph.
let schema = reg.parse_file("src/main.rs")?;
// Emit the schema back to source code.
let source = reg.emit_file("src/main.rs", &schema)?;
assert_eq!(source, std::fs::read("src/main.rs")?);
// Extract the theory for the Rust language.
let parser = reg.get("rust").unwrap();
let theory_meta = parser.theory_meta();
```
## API overview
| Item | Description |
|------|-------------|
| `ParserRegistry` | Holds all language parsers; dispatches by protocol name or file extension |
| `registry::global()` | Returns the global registry populated from `panproto-grammars` |
| `AstParser` | Trait for a single-language parser and emitter (implement to add a language) |
| `AstWalker` | Generic tree-sitter walker that works for all languages |
| `WalkerConfig` | Per-language customization: scope hints, formatting constraints |
| `extract_theory_from_node_types` | Derive a panproto theory from a grammar's `node-types.json` |
| `ExtractedTheoryMeta` | The derived theory plus sort counts and field statistics |
| `IdGenerator` | Scope-aware vertex ID generation for full-AST schemas |
| `ParseError` | Error type for parse and emit failures |
## Theory extraction mapping
| `node-types.json` construct | Theory element |
|-----------------------------|----------------|
| Named node type | Sort (vertex kind) |
| Required field | Mandatory operation (edge kind) |
| Optional field | Partial operation |
| Multiple field | Ordered operation |
| Supertype | Abstract sort with subtype inclusions |
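
The field rows of this mapping can be sketched as a small classification function. In tree-sitter's `node-types.json`, each field entry carries `required` and `multiple` booleans; the `FieldInfo` struct, the `OperationKind` enum, and `classify_field` below are hypothetical names for illustration. Checking `multiple` before `required` is an assumption of this sketch, since the table does not say which flag takes precedence when both are set.

```rust
/// The three operation flavours from the mapping table (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OperationKind {
    Mandatory,
    Partial,
    Ordered,
}

/// The two flags a `node-types.json` field entry provides.
#[derive(Debug, Clone, Copy)]
struct FieldInfo {
    required: bool,
    multiple: bool,
}

/// Maps a field's flags to an operation kind.
/// Assumption: `multiple` is checked first, so a required repeated field
/// is classified as Ordered.
fn classify_field(field: FieldInfo) -> OperationKind {
    if field.multiple {
        OperationKind::Ordered
    } else if field.required {
        OperationKind::Mandatory
    } else {
        OperationKind::Partial
    }
}
```

Under these assumptions, a field with `required: true, multiple: false` maps to `Mandatory`, an optional single field maps to `Partial`, and any repeated field maps to `Ordered`.
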
## License
[MIT](../../LICENSE)