Function compute_dominant_language 

pub fn compute_dominant_language(root: &Path) -> Option<String>

Walk root and return the canonical extension tag of the dominant source language by file count (e.g. rs, py, ts, go). Returns None when the project contains fewer than 3 source files in total, or when no single language holds a clear plurality.
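The selection rule can be illustrated with a minimal sketch. This is not the crate's implementation: `pick_dominant` is a hypothetical helper, and it assumes that "clear plurality" means the top language has strictly more files than the runner-up. The real threshold may be stricter.

```rust
use std::collections::HashMap;

/// Hypothetical helper: choose the dominant tag from per-extension file counts.
fn pick_dominant(counts: &HashMap<String, usize>) -> Option<String> {
    // Fewer than 3 source files in total: too little evidence to classify.
    let total: usize = counts.values().sum();
    if total < 3 {
        return None;
    }

    // Rank by count (descending); break ties by tag so the order is deterministic.
    let mut ranked: Vec<(&String, &usize)> = counts.iter().collect();
    ranked.sort_by(|a, b| b.1.cmp(a.1).then_with(|| a.0.cmp(b.0)));

    // `total >= 3` guarantees at least one entry.
    let (top_tag, top_count) = ranked[0];

    // Assumption: a tie with the runner-up means no clear plurality.
    if let Some((_, second_count)) = ranked.get(1) {
        if *second_count == top_count {
            return None;
        }
    }
    Some(top_tag.clone())
}
```

Under these assumptions, counts of `{rs: 5, py: 2}` yield `Some("rs")`, while `{rs: 5, py: 5}` and `{go: 2}` both yield `None`.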

v1.5 Phase 2j MCP follow-up. The engine helper walks the project once at activation time and hands the result to the MCP tool layer, which then exports CODELENS_EMBED_HINT_AUTO_LANG=<lang> so the engine’s auto_hint_should_enable gate can consult language_supports_nl_stack on subsequent embedding calls.

Walk scope is capped (16k files) to avoid pathological cases on very large monorepos; the goal is to classify the project by dominant language, not to enumerate every file. Directories in EXCLUDED_DIRS are skipped (same filter as collect_files). Only files with an extension recognised by the language registry are counted; build artefacts, README files, and other Markdown are ignored.
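A capped, filtered walk of this kind might look like the sketch below, which uses only the standard library. The constant values are placeholders rather than the crate's real `EXCLUDED_DIRS` contents or language registry, `count_by_extension` is a hypothetical name, and the cap is assumed to apply to every regular file visited, not only to recognised ones.

```rust
use std::collections::HashMap;
use std::fs;
use std::path::{Path, PathBuf};

// Illustrative values only; the real cap, exclusion list, and registry may differ.
const MAX_FILES: usize = 16_000;
const EXCLUDED_DIRS: &[&str] = &[".git", "target", "node_modules"];
const KNOWN_EXTS: &[&str] = &["rs", "py", "ts", "go"];

/// Hypothetical helper: count recognised source files per extension under `root`.
fn count_by_extension(root: &Path) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    let mut visited = 0usize;
    // Explicit stack instead of recursion, so deep trees cannot overflow the call stack.
    let mut stack: Vec<PathBuf> = vec![root.to_path_buf()];

    while let Some(dir) = stack.pop() {
        // Unreadable directories are skipped rather than treated as errors.
        let Ok(entries) = fs::read_dir(&dir) else { continue };
        for entry in entries.flatten() {
            let path = entry.path();
            // `DirEntry::file_type` does not follow symlinks, which avoids link cycles.
            let Ok(file_type) = entry.file_type() else { continue };

            if file_type.is_dir() {
                let name = entry.file_name();
                if !EXCLUDED_DIRS.iter().any(|d| name.to_str() == Some(*d)) {
                    stack.push(path);
                }
                continue;
            }
            if !file_type.is_file() {
                continue;
            }

            // Stop once the cap is exceeded; the partial counts are still usable.
            visited += 1;
            if visited > MAX_FILES {
                return counts;
            }

            // Count only extensions the (placeholder) registry recognises.
            if let Some(ext) = path.extension().and_then(|e| e.to_str()) {
                if KNOWN_EXTS.contains(&ext) {
                    *counts.entry(ext.to_string()).or_insert(0) += 1;
                }
            }
        }
    }
    counts
}
```

Returning the partial counts when the cap is hit matches the stated goal: a sample of that size is normally enough to classify the project without enumerating every file.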

The returned tag is the canonical extension string (e.g. rs, py) — exactly what CODELENS_EMBED_HINT_AUTO_LANG expects and what crate::embedding::language_supports_nl_stack accepts.