normalize-languages 0.3.1

Tree-sitter language support and dynamic grammar loading
# normalize-languages/src

Source for the normalize-languages crate.

One file per language (e.g., `python.rs`, `rust.rs`, `go.rs`, `typescript.rs`, ...) plus shared infrastructure:

- `traits.rs`: the `Language` trait and its capability sub-traits. `LanguageSymbols` covers programming languages with code symbols; `LanguageEmbedded` covers multi-language formats like Vue/HTML/Svelte. Data and markup formats (JSON, TOML, YAML, CSS, SCSS, HTML, XML) get symbol extraction via `tags.scm` plus trait methods. Also provides `ContainerBody`, `EmbeddedBlock`, `Symbol`, `Import`, `Visibility`, `simple_symbol`, and `simple_function_symbol`. `extract_module_doc()` returns the module-level doc comment for the file view preamble and is implemented for Rust `//!`, Python docstrings, Go package comments, JS/TS JSDoc, and Ruby `#` comment blocks.
- `grammar_loader.rs`: `GrammarLoader`, which loads `.so`/`.dylib` grammars dynamically with ABI version checking. Compiled queries are cached via `get_compiled_query()` to avoid recompilation across calls. TSX imports use the TypeScript query.
- `query_predicates.rs`: `satisfies_predicates()` evaluates the standard tree-sitter predicates `#match?`, `#not-match?`, `#eq?`, and `#not-eq?` against a `QueryMatch`. Unknown predicates pass, so future predicates don't break existing queries. Used by `collect_captures` in tests and by `decoration_extended_start` in normalize-refactor.
- `registry.rs`: global language registry.
- `parsers.rs`: global `GrammarLoader` singleton. `try_get_grammar`/`parse_with_grammar`/`parser_for` emit a one-shot stderr warning and record missing grammars in a process-wide tracker; drain it via `take_missing_grammars()` to summarise affected files. Direct `GrammarLoader::get` callers can use the bare `report_missing_grammar(name, err)`.
- `body.rs`: shared container-body utilities. `analyze_brace_body`/`analyze_paren_body` delegate to `analyze_delimited_body`.
- `docstring.rs`: shared `extract_preceding_prefix_comments` helper, used by Go, Lua, Ada, R, and Dart.
- `ecmascript.rs`: shared JS/TS extraction logic, including `extract_js_module_doc()` for file-top JSDoc.
- `component.rs`: Vue/Svelte component support.
- `ast_grep.rs`: ast-grep integration.
- `ffi.rs`: C FFI helpers.
- `external_packages.rs`: external package index.
- `queries/`: tree-sitter `.scm` query files.

Language implementations provide `refine_kind()` for struct/enum/interface/trait refinement (Rust, Go, C#, C++, Swift, Kotlin, Scala, Java, Dart, PHP), `get_visibility()` for access modifiers, `extract_docstring()`, `extract_attributes()`, and `extract_implements()` where the grammar supports these concepts.
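
The missing-grammar tracking described for `parsers.rs` can be pictured with a minimal, standard-library-only sketch. Only the names `report_missing_grammar` and `take_missing_grammars` come from the description above; the signatures, the `bool` return value, the `Tracker` struct, and the assumption that the warning fires once per grammar name are all illustrative guesses, not the crate's actual code.

```rust
use std::collections::BTreeSet;
use std::sync::{Mutex, OnceLock};

// Hypothetical process-wide state; the real tracker may be laid out differently.
#[derive(Default)]
struct Tracker {
    warned: BTreeSet<String>,  // names that already produced a stderr warning
    pending: BTreeSet<String>, // names recorded since the last drain
}

fn tracker() -> &'static Mutex<Tracker> {
    static TRACKER: OnceLock<Mutex<Tracker>> = OnceLock::new();
    TRACKER.get_or_init(|| Mutex::new(Tracker::default()))
}

/// Record a missing grammar and warn on stderr the first time a name is seen.
/// Returns true if this call emitted the warning (an assumption made for testability).
pub fn report_missing_grammar(name: &str, err: &str) -> bool {
    let mut t = tracker().lock().unwrap();
    t.pending.insert(name.to_string());
    let first = t.warned.insert(name.to_string());
    if first {
        eprintln!("warning: grammar `{}` unavailable: {}", name, err);
    }
    first
}

/// Drain the names recorded since the last call, in sorted order.
pub fn take_missing_grammars() -> Vec<String> {
    let mut t = tracker().lock().unwrap();
    std::mem::take(&mut t.pending).into_iter().collect()
}
```

A caller would typically process many files, then call `take_missing_grammars()` once at the end to print a single summary instead of one warning per file.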
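
The "unknown predicates pass" policy described for `query_predicates.rs` can be illustrated with a simplified sketch. It does not use the tree-sitter crate: the `Predicate` struct and the capture map are hypothetical stand-ins for tree-sitter's predicate and `QueryMatch` types, each predicate compares one capture against a string literal, and a missing capture is treated as empty text (an assumption). `#match?` and `#not-match?` are left out because they need a regex engine, so in this sketch they fall through to the unknown branch.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified predicate: operator, capture name, literal operand.
pub struct Predicate<'a> {
    pub op: &'a str,      // operator name without the leading '#', e.g. "eq?"
    pub capture: &'a str, // capture the predicate refers to
    pub arg: &'a str,     // string literal to compare against
}

/// Returns true when every predicate holds for the captured text.
/// Operators this function does not recognise are treated as satisfied,
/// so a query using a newer predicate still produces matches.
pub fn satisfies_predicates(
    preds: &[Predicate<'_>],
    captures: &HashMap<&str, &str>,
) -> bool {
    preds.iter().all(|p| {
        // Assumption: a capture absent from the match has empty text.
        let text = captures.get(p.capture).copied().unwrap_or("");
        match p.op {
            "eq?" => text == p.arg,
            "not-eq?" => text != p.arg,
            // Unknown (or, in this sketch, unimplemented) predicates pass.
            _ => true,
        }
    })
}
```

Passing unknown predicates trades strictness for forward compatibility: a query may over-match under an older evaluator, but it never silently stops matching.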