normalize-languages 0.3.2

Tree-sitter language support and dynamic grammar loading
# normalize-languages/src

Source for the normalize-languages crate.

One file per language (e.g., `python.rs`, `rust.rs`, `go.rs`, `typescript.rs`, ...) plus shared infrastructure:

- `traits.rs`: the `Language` trait and capability sub-traits (`LanguageSymbols` for programming languages with code symbols, `LanguageEmbedded` for multi-language formats like Vue/HTML/Svelte); Phase 0 module resolution types (`ModuleResolver` trait, `ResolverConfig`, `ImportSpec`, `ModuleId`, `Resolution`) for per-language module resolution logic, with default `module_resolver() -> None`; data/markup format symbol extraction for JSON, TOML, YAML, CSS, SCSS, HTML, XML via tags.scm + trait methods; also `ContainerBody`, `EmbeddedBlock`, `Symbol`, `Import`, `Visibility`, `simple_symbol`, `simple_function_symbol`; and `extract_module_doc()`, which returns the module-level doc comment for the file view preamble and is implemented for Rust `//!`, Python docstrings, Go package comments, JS/TS JSDoc, and Ruby `#` comment blocks.
- `grammar_loader.rs`: `GrammarLoader`, dynamic `.so`/`.dylib` loading with ABI version checking. Compiled query caching via `get_compiled_query()` avoids recompilation across calls; TSX imports use the TypeScript query; `get_cfg(name)` loads `.cfg.scm` CFG queries, parallel to `get_complexity`.
- `query_predicates.rs`: `satisfies_predicates()` evaluates the standard tree-sitter predicates `#match?`, `#not-match?`, `#eq?`, `#not-eq?` against a `QueryMatch`. Unknown predicates pass, so future predicates don't break existing queries. Used by `collect_captures` in tests and `decoration_extended_start` in normalize-refactor.
- `registry.rs`: global language registry.
- `parsers.rs`: global `GrammarLoader` singleton. `try_get_grammar`/`parse_with_grammar`/`parser_for` emit a one-shot stderr warning and record missing grammars in a process-wide tracker; drain it via `take_missing_grammars()` to summarise affected files. A bare `report_missing_grammar(name, err)` is available for direct `GrammarLoader::get` callers.
- `body.rs`: shared container-body utilities; `analyze_brace_body`/`analyze_paren_body` delegate to `analyze_delimited_body`.
- `docstring.rs`: shared `extract_preceding_prefix_comments` helper, used by Go, Lua, Ada, R, Dart.
- `ecmascript.rs`: shared JS/TS extraction logic, including `extract_js_module_doc()` for file-top JSDoc.
- `component.rs`: Vue/Svelte component support.
- `ast_grep.rs`: ast-grep integration.
- `ffi.rs`: C FFI helpers.
- `external_packages.rs`: external package index.
- `queries/`: tree-sitter `.scm` query files.

Language implementations provide `refine_kind()` for struct/enum/interface/trait refinement (Rust, Go, C#, C++, Swift, Kotlin, Scala, Java, Dart, PHP), `get_visibility()` for access modifiers, `extract_docstring()`, `extract_attributes()`, and `extract_implements()` where the grammar supports these concepts.

Languages with module systems also provide `module_resolver()`:

- Rust: `RustModuleResolver`
- TypeScript/TSX: `TsModuleResolver`
- Python: `PythonModuleResolver`
- Go: `GoModuleResolver`
- JavaScript: `JsModuleResolver`
- Ruby: `RubyModuleResolver`
- Java/Kotlin/Groovy/Scala: JVM Maven/Gradle path conventions
- C#/VB/F#: .NET namespace-to-file mapping
- Swift: `SwiftModuleResolver` (SPM target resolution)
- Dart: `DartModuleResolver` (pubspec.yaml package: imports)
- Zig: `ZigModuleResolver` (@import relative paths)
- Elixir: `ElixirModuleResolver` (Mix lib/ CamelCase↔snake_case)
- Erlang: `ErlangModuleResolver` (1:1 file=module)
- Haskell: `HaskellModuleResolver` (Cabal hs-source-dirs)
- OCaml: `OCamlModuleResolver` (capitalized stem)
- Lua: `LuaModuleResolver` (require dot-path)
- PHP: `PhpModuleResolver` (composer.json PSR-4)
- Perl: `PerlModuleResolver` (lib/ :: path)
- Clojure: `ClojureModuleResolver` (src/ dot-namespace)
- Common Lisp: `CommonLispModuleResolver` (workspace stem)
- Scheme: `SchemeModuleResolver` (R7RS sld/scm)
- Gleam: `GleamModuleResolver` (gleam.toml src/)
- ReScript: `ReScriptModuleResolver` (bsconfig/rescript.json sources)
- Elm: `ElmModuleResolver` (elm.json source-directories, dot-module to file path)
- Nix: `NixModuleResolver` (relative ./path.nix resolution; angle-bracket imports return NotFound)
- R: `RModuleResolver` (source("./file.R") relative load; library(pkg) returns NotFound)
- Julia: `JuliaModuleResolver` (include("file.jl") relative + Project.toml package lookup)
- MATLAB: `MatlabModuleResolver` (filename stem = function name; searches workspace + src/ + lib/)
- Prolog: `PrologModuleResolver` (relative use_module + bare name search; library(...) returns NotFound)
- D: `DModuleResolver` (dub.json sourcePaths, dot-module to file path)
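The missing-grammar tracker described for `parsers.rs` can be pictured roughly as follows. This is a minimal sketch, not the crate's code: the function names `report_missing_grammar` and `take_missing_grammars` come from the description above, but their signatures, the `MISSING` static, the "one warning per grammar name" reading of "one-shot", and the return values are all assumptions made for illustration.

```rust
use std::collections::BTreeSet;
use std::sync::Mutex;

// Hypothetical process-wide tracker: names of grammars that failed to load.
static MISSING: Mutex<BTreeSet<String>> = Mutex::new(BTreeSet::new());

/// Records a missing grammar and warns on stderr the first time a name is seen.
/// Returns true if this call emitted the warning (assumed signature).
pub fn report_missing_grammar(name: &str, err: &str) -> bool {
    let mut missing = MISSING.lock().unwrap();
    let first_time = missing.insert(name.to_string());
    if first_time {
        eprintln!("warning: grammar `{name}` unavailable: {err}");
    }
    first_time
}

/// Drains the tracker and returns the recorded names in sorted order.
/// In this sketch, draining also resets the "already warned" state.
pub fn take_missing_grammars() -> Vec<String> {
    let mut missing = MISSING.lock().unwrap();
    std::mem::take(&mut *missing).into_iter().collect()
}

fn main() {
    report_missing_grammar("zig", "shared library not found");
    report_missing_grammar("zig", "shared library not found"); // no second warning
    report_missing_grammar("elm", "ABI version mismatch");

    // A caller would typically drain once at the end of a run to print a summary.
    let missing = take_missing_grammars();
    println!("missing grammars: {missing:?}");
    assert!(take_missing_grammars().is_empty());
}
```

Keeping the set behind a single `Mutex` makes the "have we warned yet" check and the insertion one atomic step, so concurrent parsers cannot both print the warning for the same grammar.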
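To illustrate the opt-in shape of `module_resolver()` (default `None`) and the "dot-module to file path" convention mentioned for Elm and D, here is a simplified sketch. The trait and type names `Language`, `ModuleResolver`, and `Resolution` echo the description above, but every signature, the `DotPathResolver` type, the `Elm` and `Json` structs, and the injected `exists` predicate are hypothetical stand-ins; the real crate's types (`ResolverConfig`, `ImportSpec`, `ModuleId`) are not modelled here.

```rust
use std::path::{Path, PathBuf};

/// Simplified stand-in for the crate's `Resolution` type.
#[derive(Debug, PartialEq)]
pub enum Resolution {
    Resolved(PathBuf),
    NotFound,
}

/// Hypothetical resolver interface; `exists` is injected so tests need no real files.
pub trait ModuleResolver {
    fn resolve(&self, module: &str, source_dirs: &[PathBuf], exists: &dyn Fn(&Path) -> bool) -> Resolution;
}

pub trait Language {
    fn name(&self) -> &'static str;
    /// Languages without a module system keep this default.
    fn module_resolver(&self) -> Option<&dyn ModuleResolver> {
        None
    }
}

/// Maps `Data.List.Extra` to `Data/List/Extra.<extension>` under each source directory.
pub struct DotPathResolver {
    pub extension: &'static str,
}

impl ModuleResolver for DotPathResolver {
    fn resolve(&self, module: &str, source_dirs: &[PathBuf], exists: &dyn Fn(&Path) -> bool) -> Resolution {
        // Each dot-separated segment becomes one path component.
        let mut relative: PathBuf = module.split('.').collect();
        relative.set_extension(self.extension);
        for dir in source_dirs {
            let candidate = dir.join(&relative);
            if exists(&candidate) {
                return Resolution::Resolved(candidate);
            }
        }
        Resolution::NotFound
    }
}

pub struct Elm {
    pub resolver: DotPathResolver,
}

impl Language for Elm {
    fn name(&self) -> &'static str {
        "elm"
    }
    fn module_resolver(&self) -> Option<&dyn ModuleResolver> {
        Some(&self.resolver as &dyn ModuleResolver)
    }
}

/// A data format: relies on the default `module_resolver()`.
pub struct Json;

impl Language for Json {
    fn name(&self) -> &'static str {
        "json"
    }
}

fn main() {
    let elm = Elm { resolver: DotPathResolver { extension: "elm" } };
    let dirs = [PathBuf::from("src")];
    let on_disk = |p: &Path| p.exists(); // real filesystem check
    if let Some(resolver) = elm.module_resolver() {
        println!("{}: {:?}", elm.name(), resolver.resolve("Data.List.Extra", &dirs, &on_disk));
    }
    println!("{} has resolver: {}", Json.name(), Json.module_resolver().is_some());
}
```

Returning `Option` from a defaulted trait method lets data and markup formats implement `Language` without stubbing out resolution, while callers handle both cases with a single `if let`.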