// oxilean_parse/lib.rs
#![allow(unused_imports)]
#![allow(dead_code)]

//! # OxiLean Parser — Surface Syntax to AST
//!
//! This crate converts concrete OxiLean source syntax (text) into abstract syntax trees (ASTs)
//! that the elaborator can process into kernel-checkable terms.
//!
//! ## Quick Start
//!
//! ### Parsing an Expression
//!
//! ```ignore
//! use oxilean_parse::{Lexer, Parser, SurfaceExpr};
//!
//! let source = "fun (x : Nat) => x + 1";
//! let lexer = Lexer::new(source);
//! let tokens = lexer.tokenize();
//! let mut parser = Parser::new(tokens);
//! let expr = parser.parse_expr()?;
//! ```
//!
//! ### Parsing a Module
//!
//! ```ignore
//! use oxilean_parse::Module;
//!
//! let source = "def double (n : Nat) : Nat := n + n";
//! let module = Module::parse_source(source)?;
//! ```
//!
//! ## Architecture Overview
//!
//! The parser is a three-stage pipeline:
//!
//! ```text
//! Source Text (.oxilean file)
//!     │
//!     ▼
//! ┌──────────────────────┐
//! │  Lexer               │  → Tokenizes text
//! │  (lexer.rs)          │  → Handles UTF-8, comments, strings
//! └──────────────────────┘
//!     │
//!     ▼
//! Token Stream
//!     │
//!     ▼
//! ┌──────────────────────┐
//! │  Parser              │  → Builds AST from tokens
//! │  (parser_impl.rs)    │  → Handles precedence, associativity
//! │  + Helpers           │  → Pattern, macro, tactic parsers
//! └──────────────────────┘
//!     │
//!     ▼
//! Abstract Syntax Tree (AST)
//!     │
//!     └─→ SurfaceExpr, SurfaceDecl, Module, etc.
//!     └─→ Diagnostic information (errors, warnings)
//!     └─→ Source mapping (for IDE integration)
//! ```
//!
//! ## Key Concepts & Terminology
//!
//! ### Tokens
//!
//! Basic lexical elements:
//! - **Identifiers**: `x`, `Nat`, `add_comm`, etc.
//! - **Keywords**: `fun`, `def`, `theorem`, `inductive`, etc.
//! - **Operators**: `+`, `-`, `:`, `=>`, `|`, etc.
//! - **Literals**: numbers, strings, characters
//! - **Delimiters**: `(`, `)`, `[`, `]`, `{`, `}`
//!
//! ### Surface Syntax (SurfaceExpr)
//!
//! Represents OxiLean code before elaboration:
//! - **Applications**: `f x y` (function calls)
//! - **Lambdas**: `fun x => body`
//! - **Pi types**: `(x : A) -> B`
//! - **Matches**: `match x with | nil => ... | cons h t => ...`
//! - **Tactics**: `by (tac1; tac2)`
//! - **Attributes**: `@[simp] def foo := ...`
//!
//! ### AST vs Kernel Expr
//!
//! - **AST (this crate)**: surface syntax with implicit information
//!   - Contains `?` (implicit args) and `_` (placeholders)
//!   - Type annotations are optional
//!   - Represents the code as the user wrote it
//! - **Kernel Expr (oxilean-kernel)**: type-checked terms
//!   - All types explicit
//!   - All implicit args resolved
//!   - Fully elaborated
//! ## Module Organization
//!
//! ### Core Parsing Modules
//!
//! | Module | Purpose |
//! |--------|---------|
//! | `lexer` | Tokenization: text → tokens |
//! | `tokens` | `Token` and `TokenKind` definitions |
//! | `parser_impl` | Main parser: tokens → AST |
//! | `command` | Command parsing (`def`, `theorem`, etc.) |
//!
//! ### AST Definition
//!
//! | Module | Purpose |
//! |--------|---------|
//! | `ast_impl` | Core AST types: `SurfaceExpr`, `SurfaceDecl`, `Literal`, etc. |
//! | `pattern` | Pattern-matching syntax |
//!
//! ### Specialized Parsers
//!
//! | Module | Purpose |
//! |--------|---------|
//! | `tactic_parser` | Tactic syntax: `intro`, `apply`, `rw`, etc. |
//! | `macro_parser` | Macro definition and expansion |
//! | `notation_system` | Operator precedence and associativity |
//!
//! ### Diagnostics & Source Mapping
//!
//! | Module | Purpose |
//! |--------|---------|
//! | `diagnostic` | Error and warning collection |
//! | `error_impl` | Parse error types and messages |
//! | `sourcemap` | Source position tracking for IDEs |
//! | `span_util` | Source span utilities |
//!
//! ### Utilities
//!
//! | Module | Purpose |
//! |--------|---------|
//! | `prettyprint` | AST pretty-printing |
//! | `module` | Module system and imports |
//! | `repl_parser` | REPL command parsing |
//!
//! ## Parsing Pipeline Details
//!
//! ### Phase 1: Lexical Analysis (Lexer)
//!
//! Transforms the character stream into a token stream:
//! - Handles Unicode identifiers (UTF-8)
//! - Distinguishes keywords from identifiers
//! - Tracks line/column positions (for error reporting)
//! - Supports:
//!   - Single-line comments: `-- comment`
//!   - Multi-line comments: `/- ... -/`
//!   - String literals: `"hello"`
//!   - Number literals: `42`, `0xFF`, `3.14`
//!
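//! The keyword/identifier split at the heart of this phase can be sketched
//! independently of this crate (a simplified, hypothetical illustration — not
//! the actual `Lexer`):
//!
//! ```
//! // Scan one maximal identifier-or-keyword word from the front of the input.
//! fn scan_word(input: &str) -> (&str, &str) {
//!     let end = input
//!         .char_indices()
//!         .find(|(_, c)| !(c.is_alphanumeric() || *c == '_' || *c == '\''))
//!         .map(|(i, _)| i)
//!         .unwrap_or(input.len());
//!     input.split_at(end)
//! }
//!
//! let (word, rest) = scan_word("fun x => x");
//! assert_eq!(word, "fun"); // the lexer would then classify this as a keyword
//! assert_eq!(rest, " x => x");
//! ```
//!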
//! ### Phase 2: Syntactic Analysis (Parser)
//!
//! Transforms the token stream into an AST:
//! - **Recursive descent**: for statements and declarations
//! - **Pratt parsing**: for expressions (handles precedence and associativity)
//! - **Single-token lookahead**: a largely LL(1) grammar enables predictive parsing
//! - **Error recovery**: continues parsing after errors
//!
//! ### Phase 3: Post-Processing
//!
//! - **Notation expansion**: apply infix/prefix operators
//! - **Macro expansion**: expand syntactic sugar
//! - **Span assignment**: map AST nodes to source positions
//!
//! ## Usage Examples
//!
//! ### Example 1: Parse and Pretty-Print
//!
//! ```ignore
//! use oxilean_parse::{Parser, print_expr};
//!
//! let source = "(x : Nat) -> Nat";
//! let mut parser = Parser::from_source(source)?;
//! let expr = parser.parse_expr()?;
//! println!("{}", print_expr(&expr.value));
//! ```
//!
//! ### Example 2: Parse a Definition
//!
//! ```ignore
//! use oxilean_parse::{Decl, Parser};
//!
//! let source = "def double (n : Nat) : Nat := n + n";
//! let mut parser = Parser::from_source(source)?;
//! let decl = parser.parse_decl()?;
//! assert!(matches!(decl.value, Decl::Def { .. }));
//! ```
//!
//! ### Example 3: Collect Diagnostics
//!
//! ```ignore
//! use oxilean_parse::DiagnosticCollector;
//!
//! let mut collector = DiagnosticCollector::new();
//! // ... parse code ...
//! for diag in collector.diagnostics() {
//!     println!("{:?}", diag);
//! }
//! ```
//!
//! ## Operator Precedence
//!
//! Operators are parsed by ascending precedence level; a higher level binds
//! more tightly (the exact constants live in the [`prec`] module):
//! - **Application** (highest): function application and projections
//! - **Level 75**: unary prefix operators (`¬`, `-`)
//! - **Level 70**: multiplication, division
//! - **Level 65**: addition, subtraction
//! - **Level 50**: comparison (`<`, `>`, `=`, etc.)
//! - **Level 35**: conjunction (`∧`)
//! - **Level 30**: disjunction (`∨`)
//! - **Level 25**: implication / function arrow (`->`)
//! - **Level 20**: iff (`↔`)
//! - **Level 0** (lowest): binders (`fun`, `:`, etc.)
//!
//! Associativity (left/right/non-associative) is defined per operator.
//!
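//! The precedence-climbing core of Pratt parsing can be sketched in a few
//! lines (a toy over single-digit numbers with only `+` and `*` — a
//! hypothetical illustration, not this crate's real expression parser):
//!
//! ```
//! // Precedence climbing over single-char tokens.
//! fn parse(tokens: &[char], pos: &mut usize, min_prec: u32) -> u32 {
//!     let mut lhs = (tokens[*pos] as u32) - ('0' as u32); // atom: a digit
//!     *pos += 1;
//!     while *pos < tokens.len() {
//!         let (prec, op) = match tokens[*pos] {
//!             '+' => (65, '+'), // mirrors prec::ADD
//!             '*' => (70, '*'), // mirrors prec::MUL
//!             _ => break,
//!         };
//!         if prec < min_prec {
//!             break;
//!         }
//!         *pos += 1;
//!         // Left-associative: parse the right-hand side at prec + 1.
//!         let rhs = parse(tokens, pos, prec + 1);
//!         lhs = if op == '+' { lhs + rhs } else { lhs * rhs };
//!     }
//!     lhs
//! }
//!
//! let tokens: Vec<char> = "1+2*3".chars().collect();
//! assert_eq!(parse(&tokens, &mut 0, 0), 7); // `*` binds tighter than `+`
//! ```
//!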
//! ## Error Handling
//!
//! Parser errors include:
//! - **Unexpected token**: the parser encountered a token it could not use
//! - **Expected token**: a specific token kind was expected but not found
//! - **Unclosed delimiter**: missing closing bracket/paren
//! - **Undeclared operator**: unknown infix operator
//! - **Invalid pattern**: malformed pattern in a `match` or `fun`
//!
//! All errors carry:
//! - **Source location** (span): file, line, column
//! - **Error message**: human-readable description
//! - **Context**: surrounding code snippet (for IDE tooltips)
//!
//! ## Source Mapping & IDE Integration
//!
//! The parser builds a **source map** tracking:
//! - AST node → source location
//! - Hover information (for IDE hover tooltips)
//! - Semantic tokens (for syntax highlighting)
//! - Reference locations (for "go to definition")
//!
//! This enables:
//! - Accurate error reporting
//! - Language Server Protocol (LSP) support
//! - Refactoring tools
//!
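//! Turning a byte offset into the line/column pair reported in diagnostics can
//! be sketched as follows (a simplified, hypothetical stand-in for the real
//! `sourcemap` machinery):
//!
//! ```
//! // 0-indexed line and byte-column from a byte offset.
//! fn line_col(source: &str, offset: usize) -> (usize, usize) {
//!     let prefix = &source[..offset];
//!     let line = prefix.matches('\n').count();
//!     let col = offset - prefix.rfind('\n').map(|i| i + 1).unwrap_or(0);
//!     (line, col)
//! }
//!
//! let src = "def foo := 0\ndef bar := 1";
//! assert_eq!(line_col(src, 0), (0, 0));
//! assert_eq!(line_col(src, 13), (1, 0)); // start of the second line
//! ```
//!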
//! ## Extensibility
//!
//! ### Adding New Operators
//!
//! Operators are registered in `notation_system`:
//! ```ignore
//! let notation = Notation {
//!     name: "my_op",
//!     kind: NotationKind::Infix,
//!     level: 60,
//!     associativity: Associativity::Left,
//! };
//! notation_table.insert(notation);
//! ```
//!
//! ### Adding New Keywords
//!
//! Keywords are hardcoded in `lexer::keyword_of_string()`.
//! Add the new keyword there, then handle it in the parser.
//!
//! ### Custom Macros
//!
//! Macros are parsed by `macro_parser` and expanded during parsing:
//! ```text
//! syntax "list" ["[", expr, (",", expr)*, "]"] => ...
//! macro list_to_cons : list => (...)
//! ```
//!
//! ## Integration with Other Crates
//!
//! ### With oxilean-elab
//!
//! The elaborator consumes this crate's AST:
//! ```text
//! Parser: Source → SurfaceExpr
//! Elaborator: SurfaceExpr → Kernel Expr (with type checking)
//! ```
//!
//! ### With oxilean-kernel
//!
//! Kernel types (`Name`, `Level`, `Literal`) are re-exported by this crate for convenience.
//!
//! ## Performance Considerations
//!
//! - **Linear parsing**: O(n) in the length of the source
//! - **Minimal allocations**: the AST is typically more compact than the source text
//! - **Two simple passes**: tokenization and parsing each make a single pass over their input
//!
//! ## Further Reading
//!
//! - [ARCHITECTURE.md](../../ARCHITECTURE.md) — System architecture
//! - [BLUEPRINT.md](../../BLUEPRINT.md) — Formal syntax specification
//! - Module documentation for specific subcomponents

#![allow(missing_docs)]
#![warn(clippy::all)]
#![allow(clippy::collapsible_if)]

// Module stubs for future implementation
pub mod ast;
pub mod error;
pub mod parser;
pub mod token;

// Full implementations
pub mod ast_impl;
pub mod command;
pub mod diagnostic;
pub mod error_impl;
pub mod expr_cache;
/// Advanced formatter with Wadler-Lindig optimal layout.
pub mod formatter_adv;
pub mod incremental;
pub mod indent_tracker;
pub mod lexer;
pub mod macro_parser;
pub mod module;
pub mod notation_system;
pub mod parser_impl;
pub mod pattern;
pub mod prettyprint;
pub mod repl_parser;
pub mod roundtrip;
pub mod sourcemap;
pub mod span_util;
pub mod tactic_parser;
pub mod tokens;
pub mod wasm_source_map;

pub use ast::SurfaceExpr as OldSurfaceExpr;
pub use ast_impl::{
    AstNotationKind, AttributeKind, Binder, BinderKind, CalcStep as AstCalcStep, Constructor, Decl,
    DoAction, FieldDecl, Literal, Located, MatchArm, Pattern, SortKind, SurfaceExpr, TacticRef,
    WhereClause,
};
pub use command::{Command, CommandParser, NotationKind, OpenItem};
pub use diagnostic::{Diagnostic, DiagnosticCollector, DiagnosticLabel, Severity};
pub use error::ParseError as OldParseError;
pub use error_impl::{ParseError, ParseErrorKind};
pub use lexer::Lexer;
pub use macro_parser::{
    HygieneInfo, MacroDef, MacroError, MacroErrorKind, MacroExpander, MacroParser, MacroRule,
    MacroToken, SyntaxDef, SyntaxItem, SyntaxKind,
};
pub use module::{Module, ModuleRegistry};
pub use notation_system::{
    Fixity, NotationEntry, NotationKind as NotationSystemKind, NotationPart, NotationTable,
    OperatorEntry,
};
pub use parser_impl::Parser;
pub use pattern::{MatchClause, PatternCompiler};
pub use prettyprint::{print_decl, print_expr, ParensMode, PrettyConfig, PrettyPrinter};
pub use repl_parser::{is_complete, ReplCommand, ReplParser};
pub use sourcemap::{
    EntryKind, HoverInfo, SemanticToken, SourceEntry, SourceMap, SourceMapBuilder,
};
pub use span_util::{dummy_span, merge_spans, span_contains, span_len};
pub use tactic_parser::{
    CalcStep, CaseArm, ConvSide, RewriteRule, SimpArgs, TacticExpr, TacticParser,
};
pub use tokens::{Span, StringPart, Token, TokenKind};

// ============================================================
// Extended Parse Utilities
// ============================================================

/// The current version of the OxiLean parser.
///
/// This follows semantic versioning: MAJOR.MINOR.PATCH.
pub const PARSER_VERSION: &str = "0.1.1";

/// Configuration options for the parser.
///
/// These influence how the parser handles ambiguity, error recovery,
/// and diagnostic reporting.
#[derive(Debug, Clone)]
pub struct ParserConfig {
    /// Maximum expression nesting depth before the parser gives up.
    ///
    /// Default: 512.
    pub max_depth: usize,
    /// Whether to enable experimental Unicode operator support.
    ///
    /// Default: `true`.
    pub unicode_ops: bool,
    /// Whether to emit "did you mean?" suggestions on parse errors.
    ///
    /// Default: `true`.
    pub suggestions: bool,
    /// Whether to allow recovery from parse errors and continue parsing.
    ///
    /// Default: `false` (strict mode).
    pub error_recovery: bool,
    /// Whether `#check`, `#eval`, and similar commands are permitted.
    ///
    /// Default: `true`.
    pub allow_commands: bool,
}

impl Default for ParserConfig {
    fn default() -> Self {
        ParserConfig {
            max_depth: 512,
            unicode_ops: true,
            suggestions: true,
            error_recovery: false,
            allow_commands: true,
        }
    }
}

impl ParserConfig {
    /// Create a new parser config with default settings.
    pub fn new() -> Self {
        Self::default()
    }

    /// Set the maximum nesting depth.
    pub fn with_max_depth(mut self, depth: usize) -> Self {
        self.max_depth = depth;
        self
    }

    /// Enable or disable Unicode operator support.
    pub fn with_unicode_ops(mut self, enabled: bool) -> Self {
        self.unicode_ops = enabled;
        self
    }

    /// Enable or disable error recovery mode.
    pub fn with_error_recovery(mut self, enabled: bool) -> Self {
        self.error_recovery = enabled;
        self
    }

    /// Disable all optional features for strict minimal parsing.
    pub fn strict() -> Self {
        ParserConfig {
            max_depth: 256,
            unicode_ops: false,
            suggestions: false,
            error_recovery: false,
            allow_commands: false,
        }
    }
}
460
461/// A parsed source file with its path and content.
462///
463/// Wraps the raw source text together with metadata that is useful
464/// for diagnostics and IDE integration.
465#[derive(Debug, Clone)]
466#[allow(missing_docs)]
467pub struct SourceFile {
468    /// Path to the source file (may be virtual, e.g., `<repl>`).
469    pub path: String,
470    /// The raw source text.
471    pub source: String,
472    /// Length of the source in bytes.
473    #[allow(missing_docs)]
474    pub byte_len: usize,
475}
476
477impl SourceFile {
478    /// Create a new source file from a path and source string.
479    #[allow(missing_docs)]
480    pub fn new(path: impl Into<String>, source: impl Into<String>) -> Self {
481        let source = source.into();
482        let byte_len = source.len();
483        SourceFile {
484            path: path.into(),
485            source,
486            byte_len,
487        }
488    }
489
490    /// Create a virtual source file (e.g., for REPL input).
491    #[allow(missing_docs)]
492    pub fn virtual_(source: impl Into<String>) -> Self {
493        Self::new("<virtual>", source)
494    }
495
496    /// Returns the number of lines in the source file.
497    #[allow(missing_docs)]
498    pub fn line_count(&self) -> usize {
499        self.source.lines().count()
500    }
501
502    /// Returns `true` if the source file is empty.
503    #[allow(missing_docs)]
504    pub fn is_empty(&self) -> bool {
505        self.source.is_empty()
506    }
507
508    /// Returns the source text of a specific line (0-indexed).
509    ///
510    /// Returns `None` if the line index is out of bounds.
511    #[allow(missing_docs)]
512    pub fn line(&self, idx: usize) -> Option<&str> {
513        self.source.lines().nth(idx)
514    }
515}
516
/// The result of a parse operation: either a value or a parse error.
pub type ParseResult<T> = Result<T, String>;

/// Parse an OxiLean expression from a source string.
///
/// This is a convenience wrapper around the lexer and parser.
///
/// # Errors
/// Returns `Err` with a description if lexing or parsing fails.
///
/// # Example
/// ```ignore
/// let expr = parse_expr_str("1 + 2")?;
/// ```
pub fn parse_expr_str(source: &str) -> ParseResult<String> {
    if source.trim().is_empty() {
        return Err("empty source".to_string());
    }
    let tokens = Lexer::new(source).tokenize();
    let mut parser = Parser::new(tokens);
    let expr = parser
        .parse_expr()
        .map_err(|e| format!("parse error: {e}"))?;
    Ok(print_expr(&expr.value))
}

/// Parse an OxiLean declaration from a source string.
///
/// # Errors
/// Returns `Err` with a description if lexing or parsing fails.
pub fn parse_decl_str(source: &str) -> ParseResult<String> {
    if source.trim().is_empty() {
        return Err("empty declaration source".to_string());
    }
    let tokens = Lexer::new(source).tokenize();
    let mut parser = Parser::new(tokens);
    let decl = parser
        .parse_decl()
        .map_err(|e| format!("parse error: {e}"))?;
    Ok(print_decl(&decl.value))
}

/// Parse all top-level declarations from an OxiLean source file.
///
/// Returns the count of successfully parsed declarations. Errors within
/// individual declarations are collected into `errors` (if provided) and
/// parsing continues at the next declaration.
///
/// # Errors
/// This function currently never returns `Err`; an empty source yields `Ok(0)`.
pub fn parse_source_file(source: &str, mut errors: Option<&mut Vec<String>>) -> ParseResult<usize> {
    if source.trim().is_empty() {
        return Ok(0);
    }

    let mut lexer = Lexer::new(source);
    let tokens = lexer.tokenize();
    let mut parser = Parser::new(tokens);
    let mut count = 0usize;

    while !parser.is_eof() {
        match parser.parse_decl() {
            Ok(_) => count += 1,
            Err(e) => {
                if let Some(ref mut errs) = errors {
                    errs.push(e.to_string());
                }
                // Skip the offending token to avoid an infinite loop.
                parser.advance();
            }
        }
    }

    Ok(count)
}

/// Parse all top-level declarations from an OxiLean source string.
///
/// This is the top-level wrapper for parsing a complete source file.
/// It lexes the source, then calls `parse_decl` in a loop until EOF,
/// collecting all successfully parsed declarations.
///
/// # Errors
///
/// Returns `Err` with a description string on the first parse error.
/// Unlike [`parse_source_file`], this function stops at the first error
/// rather than attempting to recover.
///
/// # Example
///
/// ```ignore
/// use oxilean_parse::parse_file;
///
/// let source = "def foo : Nat := 0";
/// let decls = parse_file(source)?;
/// assert_eq!(decls.len(), 1);
/// ```
pub fn parse_file(source: &str) -> ParseResult<Vec<Located<Decl>>> {
    if source.trim().is_empty() {
        return Ok(Vec::new());
    }

    let mut lexer = Lexer::new(source);
    let tokens = lexer.tokenize();
    let mut parser = Parser::new(tokens);
    let mut decls = Vec::new();

    while !parser.is_eof() {
        match parser.parse_decl() {
            Ok(decl) => decls.push(decl),
            Err(e) => return Err(e.to_string()),
        }
    }

    Ok(decls)
}

/// Utility functions for working with tokens.
pub mod token_utils {
    /// Returns `true` if the given string slice is a valid OxiLean identifier.
    ///
    /// An identifier must start with a letter or underscore and contain only
    /// alphanumerics, underscores, or primes (`'`).
    pub fn is_valid_ident(s: &str) -> bool {
        let mut chars = s.chars();
        match chars.next() {
            Some(c) if c.is_alphabetic() || c == '_' => {
                chars.all(|c| c.is_alphanumeric() || c == '_' || c == '\'')
            }
            _ => false,
        }
    }

    /// Returns `true` if the string is a reserved keyword in OxiLean.
    pub fn is_keyword(s: &str) -> bool {
        super::keywords::ALL_KEYWORDS.contains(&s)
    }

    /// Escape a string for inclusion in an OxiLean string literal.
    pub fn escape_string(s: &str) -> String {
        s.replace('\\', "\\\\")
            .replace('"', "\\\"")
            .replace('\n', "\\n")
            .replace('\t', "\\t")
    }

    /// Count the number of Unicode characters (`char`s) in a string slice.
    pub fn char_count(s: &str) -> usize {
        s.chars().count()
    }
}

/// Operator precedence constants for the OxiLean grammar.
///
/// These match the precedence levels used in the parser's Pratt parsing table.
pub mod prec {
    /// Precedence for `→` (function arrow, right-associative).
    pub const ARROW: u32 = 25;
    /// Precedence for `∧` (logical and).
    pub const AND: u32 = 35;
    /// Precedence for `∨` (logical or).
    pub const OR: u32 = 30;
    /// Precedence for `↔` (iff).
    pub const IFF: u32 = 20;
    /// Precedence for equality `=`, inequality `≠`, `<`, `≤`, `>`, `≥`.
    pub const CMP: u32 = 50;
    /// Precedence for `+` and `-`.
    pub const ADD: u32 = 65;
    /// Precedence for `*` and `/`.
    pub const MUL: u32 = 70;
    /// Precedence for unary prefix operators (`¬`, `-`).
    pub const UNARY: u32 = 75;
    /// Precedence for function application (highest).
    pub const APP: u32 = 1024;
}

/// Reserved keywords and special identifiers in OxiLean.
pub mod keywords {
    /// All reserved keywords that cannot be used as identifiers.
    pub const ALL_KEYWORDS: &[&str] = &[
        "def",
        "theorem",
        "lemma",
        "axiom",
        "opaque",
        "abbrev",
        "fun",
        "forall",
        "exists",
        "let",
        "in",
        "have",
        "show",
        "if",
        "then",
        "else",
        "match",
        "with",
        "return",
        "structure",
        "class",
        "instance",
        "where",
        "namespace",
        "section",
        "end",
        "open",
        "import",
        "by",
        "do",
        "pure",
        "Prop",
        "Type",
        "Sort",
        "true",
        "false",
    ];

    /// Tactic keywords (used inside `by` blocks).
    pub const TACTIC_KEYWORDS: &[&str] = &[
        "intro",
        "intros",
        "exact",
        "apply",
        "rw",
        "rewrite",
        "simp",
        "ring",
        "linarith",
        "omega",
        "norm_num",
        "cases",
        "induction",
        "constructor",
        "left",
        "right",
        "split",
        "have",
        "show",
        "assumption",
        "contradiction",
        "trivial",
        "rfl",
        "refl",
        "push_neg",
        "by_contra",
        "by_contradiction",
        "contrapose",
        "exfalso",
        "use",
        "exists",
        "obtain",
        "clear",
        "rename",
        "revert",
        "repeat",
        "first",
        "try",
        "all_goals",
        "calc",
        "conv",
        "norm_cast",
        "push_cast",
    ];

    /// Built-in type names that the elaborator recognizes.
    pub const BUILTIN_TYPES: &[&str] = &[
        "Nat", "Int", "Float", "Bool", "Char", "String", "List", "Array", "Option", "Result",
        "Prod", "Sum", "Sigma", "Subtype", "And", "Or", "Not", "Iff", "Eq", "Ne", "HEq", "True",
        "False", "Empty", "Unit", "Fin", "Prop", "Type",
    ];

    /// Returns `true` if `s` is a tactic keyword.
    pub fn is_tactic_keyword(s: &str) -> bool {
        TACTIC_KEYWORDS.contains(&s)
    }

    /// Returns `true` if `s` is a built-in type name.
    pub fn is_builtin_type(s: &str) -> bool {
        BUILTIN_TYPES.contains(&s)
    }
}

#[cfg(test)]
mod extended_parse_tests {
    use super::*;

    #[test]
    fn test_parser_config_default() {
        let cfg = ParserConfig::default();
        assert_eq!(cfg.max_depth, 512);
        assert!(cfg.unicode_ops);
        assert!(cfg.suggestions);
        assert!(!cfg.error_recovery);
    }

    #[test]
    fn test_parser_config_builder() {
        let cfg = ParserConfig::new()
            .with_max_depth(100)
            .with_unicode_ops(false)
            .with_error_recovery(true);
        assert_eq!(cfg.max_depth, 100);
        assert!(!cfg.unicode_ops);
        assert!(cfg.error_recovery);
    }

    #[test]
    fn test_parser_config_strict() {
        let cfg = ParserConfig::strict();
        assert!(!cfg.unicode_ops);
        assert!(!cfg.error_recovery);
        assert!(!cfg.suggestions);
    }

    #[test]
    fn test_source_file_line_count() {
        let sf = SourceFile::new("test.lean", "line1\nline2\nline3");
        assert_eq!(sf.line_count(), 3);
    }

    #[test]
    fn test_source_file_is_empty() {
        let sf = SourceFile::virtual_("");
        assert!(sf.is_empty());
    }

    #[test]
    fn test_source_file_line() {
        let sf = SourceFile::new("f.lean", "alpha\nbeta\ngamma");
        assert_eq!(sf.line(1), Some("beta"));
        assert_eq!(sf.line(99), None);
    }

    #[test]
    fn test_source_file_byte_len() {
        let sf = SourceFile::virtual_("hello");
        assert_eq!(sf.byte_len, 5);
    }

    #[test]
    fn test_parse_expr_str_ok() {
        let result = parse_expr_str("1 + 2");
        assert!(result.is_ok());
    }

    #[test]
    fn test_parse_expr_str_empty() {
        let result = parse_expr_str("");
        assert!(result.is_err());
    }

    #[test]
    fn test_parse_decl_str_ok() {
        let result = parse_decl_str("def foo := 42");
        assert!(result.is_ok());
    }

    #[test]
    fn test_is_valid_ident_ok() {
        assert!(token_utils::is_valid_ident("foo"));
        assert!(token_utils::is_valid_ident("Bar_baz"));
        assert!(token_utils::is_valid_ident("x'"));
    }

    #[test]
    fn test_is_valid_ident_fail() {
        assert!(!token_utils::is_valid_ident("123"));
        assert!(!token_utils::is_valid_ident(""));
        assert!(!token_utils::is_valid_ident("a b"));
    }

    #[test]
    fn test_is_keyword() {
        assert!(token_utils::is_keyword("def"));
        assert!(token_utils::is_keyword("theorem"));
        assert!(!token_utils::is_keyword("myFunc"));
    }

    #[test]
    fn test_escape_string() {
        let s = token_utils::escape_string("hello\nworld");
        assert!(s.contains("\\n"));
        assert!(!s.contains('\n'));
    }

    #[test]
    fn test_prec_ordering() {
        // Use runtime variables to avoid "constant assertion" clippy warnings
        let (app, mul, add, cmp, and, or, iff, arrow) = (
            prec::APP,
            prec::MUL,
            prec::ADD,
            prec::CMP,
            prec::AND,
            prec::OR,
            prec::IFF,
            prec::ARROW,
        );
        assert!(app > mul);
        assert!(mul > add);
        assert!(add > cmp);
        assert!(cmp > and);
        assert!(and > or);
        assert!(or > iff);
        assert!(arrow > iff);
    }

    #[test]
    fn test_is_tactic_keyword() {
        assert!(keywords::is_tactic_keyword("intro"));
        assert!(keywords::is_tactic_keyword("simp"));
        assert!(!keywords::is_tactic_keyword("def"));
    }

    #[test]
    fn test_is_builtin_type() {
        assert!(keywords::is_builtin_type("Nat"));
        assert!(keywords::is_builtin_type("Bool"));
        assert!(!keywords::is_builtin_type("MyType"));
    }

    #[test]
    fn test_parser_version_nonempty() {
        assert!(!PARSER_VERSION.is_empty());
    }

    #[test]
    fn test_char_count_unicode() {
        let s = "αβγ";
        assert_eq!(token_utils::char_count(s), 3);
        assert!(s.len() > 3); // bytes > chars for multi-byte chars
    }
}
969
970// ============================================================
971// Additional parse utilities
972// ============================================================
973
974/// A span annotation: associates a value with its source location.
975#[allow(dead_code)]
976#[derive(Debug, Clone)]
977#[allow(missing_docs)]
978pub struct Annotated<T> {
979    /// The parsed value.
980    pub value: T,
981    /// Start byte offset in the source.
982    pub start: usize,
983    /// End byte offset (exclusive) in the source.
984    #[allow(missing_docs)]
985    pub end: usize,
986}
987
988impl<T> Annotated<T> {
989    /// Wrap a value with its source span.
990    #[allow(dead_code)]
991    #[allow(missing_docs)]
992    pub fn new(value: T, start: usize, end: usize) -> Self {
993        Annotated { value, start, end }
994    }
995
996    /// Map the inner value, preserving span information.
997    #[allow(dead_code)]
998    #[allow(missing_docs)]
999    pub fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Annotated<U> {
1000        Annotated {
1001            value: f(self.value),
1002            start: self.start,
1003            end: self.end,
1004        }
1005    }
1006
1007    /// Returns the length of this annotated span in bytes.
1008    #[allow(dead_code)]
1009    #[allow(missing_docs)]
1010    pub fn span_len(&self) -> usize {
1011        self.end.saturating_sub(self.start)
1012    }
1013}
1014
1015/// A parse warning — something suspicious that isn't a hard error.
1016#[allow(dead_code)]
1017#[derive(Debug, Clone)]
1018#[allow(missing_docs)]
1019pub struct ParseWarning {
1020    /// Human-readable warning message.
1021    pub message: String,
1022    /// Start byte of the suspicious token.
1023    pub start: usize,
    /// End byte (exclusive) of the suspicious token.
    pub end: usize,
1027}
1028
1029impl ParseWarning {
1030    /// Create a new parse warning.
1031    #[allow(dead_code)]
1032    #[allow(missing_docs)]
1033    pub fn new(message: impl Into<String>, start: usize, end: usize) -> Self {
1034        ParseWarning {
1035            message: message.into(),
1036            start,
1037            end,
1038        }
1039    }
1040}
1041
1042/// A collection of parse diagnostics (errors and warnings).
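///
/// # Example
///
/// Warnings do not make the collection fail; errors do:
///
/// ```ignore
/// let mut diags = ParseDiagnostics::new();
/// diags.warn("shadowed binding", 3, 8);
/// assert!(diags.is_ok());
/// diags.error("unexpected token");
/// assert!(!diags.is_ok());
/// assert_eq!(diags.total(), 2);
/// ```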
1043#[allow(dead_code)]
1044#[derive(Debug, Clone, Default)]
1045#[allow(missing_docs)]
1046pub struct ParseDiagnostics {
1047    /// All warnings collected during parsing.
1048    pub warnings: Vec<ParseWarning>,
1049    /// All error messages collected during parsing.
1050    pub errors: Vec<String>,
1051}
1052
1053impl ParseDiagnostics {
1054    /// Create an empty diagnostics collection.
1055    #[allow(dead_code)]
1056    #[allow(missing_docs)]
1057    pub fn new() -> Self {
1058        Self::default()
1059    }
1060
1061    /// Add a warning.
1062    #[allow(dead_code)]
1063    #[allow(missing_docs)]
1064    pub fn warn(&mut self, msg: impl Into<String>, start: usize, end: usize) {
1065        self.warnings.push(ParseWarning::new(msg, start, end));
1066    }
1067
1068    /// Add an error.
1069    #[allow(dead_code)]
1070    #[allow(missing_docs)]
1071    pub fn error(&mut self, msg: impl Into<String>) {
1072        self.errors.push(msg.into());
1073    }
1074
1075    /// Returns `true` if there are no errors (warnings are OK).
1076    #[allow(dead_code)]
1077    #[allow(missing_docs)]
1078    pub fn is_ok(&self) -> bool {
1079        self.errors.is_empty()
1080    }
1081
1082    /// Total number of diagnostics (warnings + errors).
1083    #[allow(dead_code)]
1084    #[allow(missing_docs)]
1085    pub fn total(&self) -> usize {
1086        self.warnings.len() + self.errors.len()
1087    }
1088}
1089
/// Normalize whitespace in a source string: trim the ends and collapse interior runs of whitespace to single spaces.
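///
/// # Example
/// ```ignore
/// assert_eq!(normalize_whitespace("  def  foo\n:= 1 "), "def foo := 1");
/// ```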
1091#[allow(dead_code)]
1092#[allow(missing_docs)]
1093pub fn normalize_whitespace(src: &str) -> String {
1094    src.split_whitespace().collect::<Vec<_>>().join(" ")
1095}
1096
/// Count whitespace-delimited occurrences of a keyword in a source string (heuristic, not lexer-based).
1098#[allow(dead_code)]
1099#[allow(missing_docs)]
1100pub fn count_keyword_occurrences(src: &str, keyword: &str) -> usize {
1101    src.split_whitespace().filter(|w| *w == keyword).count()
1102}
1103
1104/// Returns the set of unique identifiers in a source string (heuristic).
1105///
1106/// This is a rough approximation that does not use the full lexer.
1107#[allow(dead_code)]
1108#[allow(missing_docs)]
1109pub fn rough_ident_set(src: &str) -> std::collections::HashSet<String> {
1110    src.split(|c: char| !c.is_alphanumeric() && c != '_' && c != '\'')
1111        .filter(|s| !s.is_empty() && token_utils::is_valid_ident(s))
1112        .map(|s| s.to_string())
1113        .collect()
1114}
1115
1116#[cfg(test)]
1117mod extra_parse_tests {
1118    use super::*;
1119
1120    #[test]
1121    fn test_annotated_new_and_span_len() {
1122        let a = Annotated::new("hello", 0, 5);
1123        assert_eq!(a.span_len(), 5);
1124    }
1125
1126    #[test]
1127    fn test_annotated_map() {
1128        let a = Annotated::new(42u32, 10, 12);
1129        let b = a.map(|v| v * 2);
1130        assert_eq!(b.value, 84);
1131        assert_eq!(b.start, 10);
1132    }
1133
1134    #[test]
1135    fn test_parse_warning_new() {
1136        let w = ParseWarning::new("unused variable", 0, 5);
1137        assert!(w.message.contains("unused"));
1138    }
1139
1140    #[test]
1141    fn test_parse_diagnostics_ok() {
1142        let mut d = ParseDiagnostics::new();
1143        d.warn("something odd", 0, 3);
1144        assert!(d.is_ok()); // warnings don't fail
1145        d.error("bad token");
1146        assert!(!d.is_ok());
1147    }
1148
1149    #[test]
1150    fn test_parse_diagnostics_total() {
1151        let mut d = ParseDiagnostics::new();
1152        d.warn("w1", 0, 1);
1153        d.error("e1");
1154        assert_eq!(d.total(), 2);
1155    }
1156
1157    #[test]
1158    fn test_normalize_whitespace() {
1159        let result = normalize_whitespace("  hello   world  ");
1160        assert_eq!(result, "hello world");
1161    }
1162
1163    #[test]
1164    fn test_count_keyword_occurrences() {
1165        let src = "def foo def bar theorem baz";
1166        assert_eq!(count_keyword_occurrences(src, "def"), 2);
1167        assert_eq!(count_keyword_occurrences(src, "theorem"), 1);
1168    }
1169
1170    #[test]
1171    fn test_rough_ident_set() {
1172        let src = "def foo (x : Nat) := x + x";
1173        let ids = rough_ident_set(src);
1174        assert!(ids.contains("foo"));
1175        assert!(ids.contains("x"));
1176        assert!(ids.contains("Nat"));
1177    }
1178}
1179
1180// ============================================================
1181// Extended parse utilities — Part 2
1182// ============================================================
1183
1184/// A simple token buffer for look-ahead parsing.
1185///
1186/// Wraps a token stream and provides lookahead operations without
1187/// consuming the underlying iterator.
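///
/// # Example
///
/// Look-ahead with mark/reset backtracking:
///
/// ```ignore
/// let mut buf = TokenBuffer::new(vec!["def".into(), "foo".into()]);
/// let mark = buf.mark();
/// assert!(buf.eat("def"));
/// assert_eq!(buf.peek(), Some("foo"));
/// buf.reset_to(mark);
/// assert_eq!(buf.peek(), Some("def"));
/// ```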
1188#[allow(dead_code)]
1189#[derive(Debug, Clone)]
1190#[allow(missing_docs)]
1191pub struct TokenBuffer {
1192    /// All tokens stored in the buffer.
1193    tokens: Vec<String>,
1194    /// Current read position.
1195    pos: usize,
1196}
1197
1198impl TokenBuffer {
1199    /// Create a token buffer from a list of token strings.
1200    #[allow(dead_code)]
1201    #[allow(missing_docs)]
1202    pub fn new(tokens: Vec<String>) -> Self {
1203        Self { tokens, pos: 0 }
1204    }
1205
1206    /// Peek at the current token without consuming.
1207    #[allow(dead_code)]
1208    #[allow(missing_docs)]
1209    pub fn peek(&self) -> Option<&str> {
1210        self.tokens.get(self.pos).map(String::as_str)
1211    }
1212
1213    /// Peek `n` positions ahead (0 = current).
1214    #[allow(dead_code)]
1215    #[allow(missing_docs)]
1216    pub fn peek_at(&self, n: usize) -> Option<&str> {
1217        self.tokens.get(self.pos + n).map(String::as_str)
1218    }
1219
1220    /// Advance and return the consumed token.
1221    #[allow(dead_code)]
1222    #[allow(missing_docs)]
1223    pub fn advance(&mut self) -> Option<&str> {
1224        let tok = self.tokens.get(self.pos).map(String::as_str);
1225        if tok.is_some() {
1226            self.pos += 1;
1227        }
1228        tok
1229    }
1230
1231    /// Consume the current token if it equals `expected`.
1232    #[allow(dead_code)]
1233    #[allow(missing_docs)]
1234    pub fn eat(&mut self, expected: &str) -> bool {
1235        if self.peek() == Some(expected) {
1236            self.pos += 1;
1237            true
1238        } else {
1239            false
1240        }
1241    }
1242
1243    /// Return remaining token count.
1244    #[allow(dead_code)]
1245    #[allow(missing_docs)]
1246    pub fn remaining(&self) -> usize {
1247        self.tokens.len().saturating_sub(self.pos)
1248    }
1249
1250    /// Check if at end of token stream.
1251    #[allow(dead_code)]
1252    #[allow(missing_docs)]
1253    pub fn is_eof(&self) -> bool {
1254        self.pos >= self.tokens.len()
1255    }
1256
1257    /// Mark the current position (for backtracking).
1258    #[allow(dead_code)]
1259    #[allow(missing_docs)]
1260    pub fn mark(&self) -> usize {
1261        self.pos
1262    }
1263
1264    /// Reset to a previously marked position.
1265    #[allow(dead_code)]
1266    #[allow(missing_docs)]
1267    pub fn reset_to(&mut self, mark: usize) {
1268        self.pos = mark;
1269    }
1270}
1271
1272/// A parse context: wraps a source file plus configuration.
1273#[allow(dead_code)]
1274#[derive(Debug, Clone)]
1275#[allow(missing_docs)]
1276pub struct ParseContext {
1277    /// The source file being parsed.
1278    pub file: SourceFile,
1279    /// Parser configuration.
1280    pub config: ParserConfig,
1281    /// Accumulated diagnostics during parsing.
1283    pub diagnostics: ParseDiagnostics,
1284}
1285
1286impl ParseContext {
1287    /// Create a new parse context.
1288    #[allow(dead_code)]
1289    #[allow(missing_docs)]
1290    pub fn new(file: SourceFile, config: ParserConfig) -> Self {
1291        Self {
1292            file,
1293            config,
1294            diagnostics: ParseDiagnostics::new(),
1295        }
1296    }
1297
1298    /// Create a context from a source string with default config.
1299    #[allow(dead_code)]
1300    #[allow(missing_docs)]
1301    pub fn from_source(source: impl Into<String>) -> Self {
1302        Self::new(SourceFile::virtual_(source), ParserConfig::default())
1303    }
1304
1305    /// Check if any errors occurred during parsing.
1306    #[allow(dead_code)]
1307    #[allow(missing_docs)]
1308    pub fn has_errors(&self) -> bool {
1309        !self.diagnostics.is_ok()
1310    }
1311
1312    /// Emit an error diagnostic.
1313    #[allow(dead_code)]
1314    #[allow(missing_docs)]
1315    pub fn emit_error(&mut self, msg: impl Into<String>) {
1316        self.diagnostics.error(msg);
1317    }
1318
1319    /// Emit a warning diagnostic.
1320    #[allow(dead_code)]
1321    #[allow(missing_docs)]
1322    pub fn emit_warning(&mut self, msg: impl Into<String>, start: usize, end: usize) {
1323        self.diagnostics.warn(msg, start, end);
1324    }
1325}
1326
1327/// Represents a syntactic "hole" in the AST: a position where a term is missing.
1328///
1329/// Holes arise from incomplete expressions like `_` or unresolved placeholders.
1330#[allow(dead_code)]
1331#[derive(Debug, Clone)]
1332#[allow(missing_docs)]
1333pub struct SyntacticHole {
1334    /// Optional human hint about what should fill this hole.
1335    pub hint: Option<String>,
1336    /// Source byte offset.
1337    pub offset: usize,
1338}
1339
1340impl SyntacticHole {
1341    /// Create a hole without a hint.
1342    #[allow(dead_code)]
1343    #[allow(missing_docs)]
1344    pub fn anonymous(offset: usize) -> Self {
1345        Self { hint: None, offset }
1346    }
1347
1348    /// Create a hole with a hint.
1349    #[allow(dead_code)]
1350    #[allow(missing_docs)]
1351    pub fn with_hint(hint: impl Into<String>, offset: usize) -> Self {
1352        Self {
1353            hint: Some(hint.into()),
1354            offset,
1355        }
1356    }
1357
1358    /// Returns `true` if this hole has a hint.
1359    #[allow(dead_code)]
1360    #[allow(missing_docs)]
1361    pub fn has_hint(&self) -> bool {
1362        self.hint.is_some()
1363    }
1364}
1365
1366/// Split a qualified name like `Foo.Bar.baz` into namespace + basename.
1367///
1368/// Returns `(namespace_parts, basename)`.
1369///
1370/// # Example
1371/// ```ignore
1372/// let (ns, base) = split_qualified_name("Foo.Bar.baz");
1373/// assert_eq!(ns, vec!["Foo", "Bar"]);
1374/// assert_eq!(base, "baz");
1375/// ```
1376#[allow(dead_code)]
1377#[allow(missing_docs)]
1378pub fn split_qualified_name(name: &str) -> (Vec<&str>, &str) {
1379    let parts: Vec<&str> = name.split('.').collect();
1380    if parts.len() <= 1 {
1381        (vec![], name)
1382    } else {
1383        let (ns, base) = parts.split_at(parts.len() - 1);
1384        (ns.to_vec(), base[0])
1385    }
1386}
1387
1388/// Join namespace parts and a basename into a qualified name.
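///
/// This is the inverse of [`split_qualified_name`].
///
/// # Example
/// ```ignore
/// assert_eq!(join_qualified_name(&["Foo", "Bar"], "baz"), "Foo.Bar.baz");
/// assert_eq!(join_qualified_name(&[], "baz"), "baz");
/// ```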
1389#[allow(dead_code)]
1390#[allow(missing_docs)]
1391pub fn join_qualified_name(namespace: &[&str], basename: &str) -> String {
1392    if namespace.is_empty() {
1393        basename.to_string()
1394    } else {
1395        format!("{}.{}", namespace.join("."), basename)
1396    }
1397}
1398
1399/// Check whether a string looks like a qualified name.
1400#[allow(dead_code)]
1401#[allow(missing_docs)]
1402pub fn is_qualified_name(s: &str) -> bool {
1403    s.split('.')
1404        .all(|part| !part.is_empty() && token_utils::is_valid_ident(part))
1405}
1406
1407/// Strip a namespace prefix from a qualified name.
1408///
1409/// Returns the local name if the prefix matches, otherwise the original.
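///
/// # Example
/// ```ignore
/// assert_eq!(strip_namespace_prefix("Nat.add", "Nat"), "add");
/// assert_eq!(strip_namespace_prefix("List.map", "Nat"), "List.map");
/// ```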
1410#[allow(dead_code)]
1411#[allow(missing_docs)]
pub fn strip_namespace_prefix<'a>(name: &'a str, prefix: &str) -> &'a str {
    name.strip_prefix(prefix)
        .and_then(|rest| rest.strip_prefix('.'))
        .unwrap_or(name)
}
1420
1421#[cfg(test)]
1422mod extra2_parse_tests {
1423    use super::*;
1424
1425    #[test]
1426    fn test_token_buffer_peek_advance() {
1427        let mut buf = TokenBuffer::new(vec!["def".into(), "foo".into(), ":=".into()]);
1428        assert_eq!(buf.peek(), Some("def"));
1429        assert_eq!(buf.advance(), Some("def"));
1430        assert_eq!(buf.peek(), Some("foo"));
1431    }
1432
1433    #[test]
1434    fn test_token_buffer_eat_success() {
1435        let mut buf = TokenBuffer::new(vec!["theorem".into(), "foo".into()]);
1436        assert!(buf.eat("theorem"));
1437        assert_eq!(buf.peek(), Some("foo"));
1438    }
1439
1440    #[test]
1441    fn test_token_buffer_eat_fail() {
1442        let mut buf = TokenBuffer::new(vec!["def".into()]);
1443        assert!(!buf.eat("theorem"));
1444        assert_eq!(buf.peek(), Some("def"));
1445    }
1446
1447    #[test]
1448    fn test_token_buffer_eof() {
1449        let mut buf = TokenBuffer::new(vec!["x".into()]);
1450        buf.advance();
1451        assert!(buf.is_eof());
1452    }
1453
1454    #[test]
1455    fn test_token_buffer_backtrack() {
1456        let mut buf = TokenBuffer::new(vec!["a".into(), "b".into(), "c".into()]);
1457        let mark = buf.mark();
1458        buf.advance();
1459        buf.advance();
1460        buf.reset_to(mark);
1461        assert_eq!(buf.peek(), Some("a"));
1462    }
1463
1464    #[test]
1465    fn test_token_buffer_peek_at() {
1466        let buf = TokenBuffer::new(vec!["x".into(), "y".into(), "z".into()]);
1467        assert_eq!(buf.peek_at(1), Some("y"));
1468        assert_eq!(buf.peek_at(5), None);
1469    }
1470
1471    #[test]
1472    fn test_parse_context_has_errors() {
1473        let mut ctx = ParseContext::from_source("def foo := 1");
1474        assert!(!ctx.has_errors());
1475        ctx.emit_error("bad token");
1476        assert!(ctx.has_errors());
1477    }
1478
1479    #[test]
1480    fn test_parse_context_emit_warning() {
1481        let mut ctx = ParseContext::from_source("x");
1482        ctx.emit_warning("suspicious", 0, 1);
1483        assert!(!ctx.has_errors());
1484        assert_eq!(ctx.diagnostics.warnings.len(), 1);
1485    }
1486
1487    #[test]
1488    fn test_syntactic_hole_anonymous() {
1489        let hole = SyntacticHole::anonymous(5);
1490        assert!(!hole.has_hint());
1491        assert_eq!(hole.offset, 5);
1492    }
1493
1494    #[test]
1495    fn test_syntactic_hole_with_hint() {
1496        let hole = SyntacticHole::with_hint("expected Nat", 10);
1497        assert!(hole.has_hint());
1498        assert_eq!(hole.hint.as_deref(), Some("expected Nat"));
1499    }
1500
1501    #[test]
1502    fn test_split_qualified_name_simple() {
1503        let (ns, base) = split_qualified_name("foo");
1504        assert!(ns.is_empty());
1505        assert_eq!(base, "foo");
1506    }
1507
1508    #[test]
1509    fn test_split_qualified_name_dotted() {
1510        let (ns, base) = split_qualified_name("Nat.add_comm");
1511        assert_eq!(ns, vec!["Nat"]);
1512        assert_eq!(base, "add_comm");
1513    }
1514
1515    #[test]
1516    fn test_split_qualified_name_deep() {
1517        let (ns, base) = split_qualified_name("Foo.Bar.baz");
1518        assert_eq!(ns, vec!["Foo", "Bar"]);
1519        assert_eq!(base, "baz");
1520    }
1521
1522    #[test]
1523    fn test_join_qualified_name() {
1524        assert_eq!(join_qualified_name(&["Nat"], "succ"), "Nat.succ");
1525        assert_eq!(join_qualified_name(&[], "foo"), "foo");
1526    }
1527
1528    #[test]
1529    fn test_is_qualified_name_true() {
1530        assert!(is_qualified_name("Foo.Bar.baz"));
1531        assert!(is_qualified_name("foo"));
1532    }
1533
1534    #[test]
1535    fn test_is_qualified_name_false() {
1536        assert!(!is_qualified_name("foo..bar"));
1537        assert!(!is_qualified_name("123"));
1538    }
1539
1540    #[test]
1541    fn test_strip_namespace_prefix_matching() {
1542        let result = strip_namespace_prefix("Nat.add", "Nat");
1543        assert_eq!(result, "add");
1544    }
1545
1546    #[test]
1547    fn test_strip_namespace_prefix_not_matching() {
1548        let result = strip_namespace_prefix("List.map", "Nat");
1549        assert_eq!(result, "List.map");
1550    }
1551}
1552
1553// ============================================================
1554// Extended lib utilities
1555// ============================================================
1556
1557/// A simple string pool that interns unique strings.
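///
/// # Example
///
/// Interning the same string twice yields the same index:
///
/// ```ignore
/// let mut pool = SimpleStringPoolExt::new();
/// let a = pool.intern("x");
/// let b = pool.intern("x");
/// assert_eq!(a, b);
/// assert_eq!(pool.get(a), Some("x"));
/// ```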
1558#[allow(dead_code)]
1559#[allow(missing_docs)]
1560pub struct SimpleStringPoolExt {
1561    /// All interned strings
1562    pub pool: Vec<String>,
1563}
1564
1565impl SimpleStringPoolExt {
1566    /// Create an empty pool.
1567    #[allow(dead_code)]
1568    pub fn new() -> Self {
1569        SimpleStringPoolExt { pool: Vec::new() }
1570    }
1571
    /// Intern a string, returning its index (linear scan; suitable for small pools).
1573    #[allow(dead_code)]
1574    pub fn intern(&mut self, s: &str) -> usize {
1575        if let Some(idx) = self.pool.iter().position(|x| x == s) {
1576            return idx;
1577        }
1578        let idx = self.pool.len();
1579        self.pool.push(s.to_string());
1580        idx
1581    }
1582
1583    /// Retrieve an interned string by index.
1584    #[allow(dead_code)]
1585    pub fn get(&self, idx: usize) -> Option<&str> {
1586        self.pool.get(idx).map(|s| s.as_str())
1587    }
1588
1589    /// Returns the number of interned strings.
1590    #[allow(dead_code)]
1591    pub fn len(&self) -> usize {
1592        self.pool.len()
1593    }
1594
1595    /// Returns true if the pool is empty.
1596    #[allow(dead_code)]
1597    pub fn is_empty(&self) -> bool {
1598        self.pool.is_empty()
1599    }
1600}
1601
1602impl Default for SimpleStringPoolExt {
1603    fn default() -> Self {
1604        Self::new()
1605    }
1606}
1607
1608// ------------------------------------------------------------
1609
1610/// A simple name resolution table.
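///
/// # Example
///
/// Inner scopes shadow outer ones until popped:
///
/// ```ignore
/// let mut table = NameResolutionTableExt::new();
/// table.define("x", "Nat.x");
/// table.push_scope();
/// table.define("x", "Int.x");
/// assert_eq!(table.resolve("x"), Some("Int.x"));
/// table.pop_scope();
/// assert_eq!(table.resolve("x"), Some("Nat.x"));
/// ```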
1611#[allow(dead_code)]
1612#[allow(missing_docs)]
1613pub struct NameResolutionTableExt {
1614    /// Scoped name mappings
1615    scopes: Vec<std::collections::HashMap<String, String>>,
1616}
1617
1618impl NameResolutionTableExt {
1619    /// Create a new table with one empty scope.
1620    #[allow(dead_code)]
1621    pub fn new() -> Self {
1622        NameResolutionTableExt {
1623            scopes: vec![std::collections::HashMap::new()],
1624        }
1625    }
1626
1627    /// Push a new scope.
1628    #[allow(dead_code)]
1629    pub fn push_scope(&mut self) {
1630        self.scopes.push(std::collections::HashMap::new());
1631    }
1632
    /// Pop the current scope (the outermost scope is never popped).
1634    #[allow(dead_code)]
1635    pub fn pop_scope(&mut self) {
1636        if self.scopes.len() > 1 {
1637            self.scopes.pop();
1638        }
1639    }
1640
1641    /// Define a name in the current scope.
1642    #[allow(dead_code)]
1643    pub fn define(&mut self, name: &str, resolved: &str) {
1644        if let Some(scope) = self.scopes.last_mut() {
1645            scope.insert(name.to_string(), resolved.to_string());
1646        }
1647    }
1648
1649    /// Resolve a name (innermost scope first).
1650    #[allow(dead_code)]
1651    pub fn resolve(&self, name: &str) -> Option<&str> {
1652        for scope in self.scopes.iter().rev() {
1653            if let Some(v) = scope.get(name) {
1654                return Some(v.as_str());
1655            }
1656        }
1657        None
1658    }
1659}
1660
1661impl Default for NameResolutionTableExt {
1662    fn default() -> Self {
1663        Self::new()
1664    }
1665}
1666
1667// ------------------------------------------------------------
1668
1669/// A parse flag set for controlling parser behaviour.
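///
/// # Example
/// ```ignore
/// let flags = ParseFlagsExt::default_flags().strict();
/// assert!(flags.allow_sorry);
/// assert!(!flags.error_recovery);
/// ```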
1670#[allow(dead_code)]
1671#[allow(missing_docs)]
1672#[derive(Debug, Clone, Default)]
1673pub struct ParseFlagsExt {
1674    /// Whether to allow sorry in proofs
1675    pub allow_sorry: bool,
1676    /// Whether to enable unicode operators
1677    pub unicode_ops: bool,
1678    /// Whether to recover from errors
1679    pub error_recovery: bool,
1680    /// Whether to emit warnings as errors
1681    pub warnings_as_errors: bool,
1682}
1683
1684impl ParseFlagsExt {
    /// Create the standard flag set (more permissive than the derived `Default`, which is all-`false`).
1686    #[allow(dead_code)]
1687    pub fn default_flags() -> Self {
1688        ParseFlagsExt {
1689            allow_sorry: true,
1690            unicode_ops: true,
1691            error_recovery: true,
1692            warnings_as_errors: false,
1693        }
1694    }
1695
1696    /// Enable sorry.
1697    #[allow(dead_code)]
1698    pub fn with_sorry(mut self) -> Self {
1699        self.allow_sorry = true;
1700        self
1701    }
1702
1703    /// Disable error recovery.
1704    #[allow(dead_code)]
1705    pub fn strict(mut self) -> Self {
1706        self.error_recovery = false;
1707        self
1708    }
1709}
1710
1711// ------------------------------------------------------------
1712
/// A lightweight parse context carrying raw source text and flags.
1714#[allow(dead_code)]
1715#[allow(missing_docs)]
1716pub struct ParseContextExt {
1717    /// The source text
1718    pub source: String,
1719    /// Parse flags
1720    pub flags: ParseFlagsExt,
1721    /// Source filename (for error messages)
1722    pub filename: Option<String>,
1723}
1724
1725impl ParseContextExt {
1726    /// Create a new parse context.
1727    #[allow(dead_code)]
1728    pub fn new(source: &str) -> Self {
1729        ParseContextExt {
1730            source: source.to_string(),
1731            flags: ParseFlagsExt::default_flags(),
1732            filename: None,
1733        }
1734    }
1735
1736    /// Set the filename.
1737    #[allow(dead_code)]
1738    pub fn with_filename(mut self, name: &str) -> Self {
1739        self.filename = Some(name.to_string());
1740        self
1741    }
1742
1743    /// Set flags.
1744    #[allow(dead_code)]
1745    pub fn with_flags(mut self, flags: ParseFlagsExt) -> Self {
1746        self.flags = flags;
1747        self
1748    }
1749}
1750
1751// ------------------------------------------------------------
1752
1753/// A named phase in the compilation pipeline.
1754#[allow(dead_code)]
1755#[allow(missing_docs)]
1756#[derive(Debug, Clone, PartialEq, Eq)]
1757pub enum CompilePhaseExt {
1758    /// Lexing phase
1759    Lex,
1760    /// Parsing phase
1761    Parse,
1762    /// Elaboration phase
1763    Elaborate,
1764    /// Tactic evaluation phase
1765    Tactic,
1766    /// Code generation phase
1767    CodeGen,
1768}
1769
1770impl std::fmt::Display for CompilePhaseExt {
1771    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
1772        match self {
1773            CompilePhaseExt::Lex => write!(f, "lex"),
1774            CompilePhaseExt::Parse => write!(f, "parse"),
1775            CompilePhaseExt::Elaborate => write!(f, "elaborate"),
1776            CompilePhaseExt::Tactic => write!(f, "tactic"),
1777            CompilePhaseExt::CodeGen => write!(f, "codegen"),
1778        }
1779    }
1780}
1781
1782// ------------------------------------------------------------
1783
1784/// A timing record for a pipeline phase.
1785#[allow(dead_code)]
1786#[allow(missing_docs)]
1787#[derive(Debug, Clone)]
1788pub struct PhaseTimerExt {
1789    /// The phase
1790    pub phase: CompilePhaseExt,
    /// Duration in microseconds (recorded by the caller; this type performs no timing itself)
1792    pub duration_us: u64,
1793}
1794
1795/// A collection of phase timers.
1796#[allow(dead_code)]
1797#[allow(missing_docs)]
1798#[derive(Debug, Default)]
1799pub struct PipelineTimingsExt {
1800    /// All recorded timings
1801    pub timings: Vec<PhaseTimerExt>,
1802}
1803
1804impl PipelineTimingsExt {
1805    /// Create a new empty timings record.
1806    #[allow(dead_code)]
1807    pub fn new() -> Self {
1808        PipelineTimingsExt {
1809            timings: Vec::new(),
1810        }
1811    }
1812
1813    /// Record a phase timing.
1814    #[allow(dead_code)]
1815    pub fn record(&mut self, phase: CompilePhaseExt, duration_us: u64) {
1816        self.timings.push(PhaseTimerExt { phase, duration_us });
1817    }
1818
1819    /// Total duration in microseconds.
1820    #[allow(dead_code)]
1821    pub fn total_us(&self) -> u64 {
1822        self.timings.iter().map(|t| t.duration_us).sum()
1823    }
1824
1825    /// Format all timings.
1826    #[allow(dead_code)]
1827    pub fn format(&self) -> String {
1828        self.timings
1829            .iter()
1830            .map(|t| format!("{}: {}us", t.phase, t.duration_us))
1831            .collect::<Vec<_>>()
1832            .join(", ")
1833    }
1834}
1835
1836// ============================================================
1837// Extended lib tests
1838// ============================================================
1839
1840#[cfg(test)]
1841mod lib_ext_tests {
1842    use super::*;
1843
1844    #[test]
1845    fn test_simple_string_pool() {
1846        let mut pool = SimpleStringPoolExt::new();
1847        let i1 = pool.intern("hello");
1848        let i2 = pool.intern("world");
1849        let i3 = pool.intern("hello");
1850        assert_eq!(i1, i3);
1851        assert_ne!(i1, i2);
1852        assert_eq!(pool.get(i1), Some("hello"));
1853        assert_eq!(pool.len(), 2);
1854    }
1855
1856    #[test]
1857    fn test_name_resolution_table() {
1858        let mut table = NameResolutionTableExt::new();
1859        table.define("x", "Nat.x");
1860        table.push_scope();
1861        table.define("x", "Int.x");
1862        assert_eq!(table.resolve("x"), Some("Int.x"));
1863        table.pop_scope();
1864        assert_eq!(table.resolve("x"), Some("Nat.x"));
1865        assert_eq!(table.resolve("y"), None);
1866    }
1867
1868    #[test]
1869    fn test_parse_flags() {
1870        let flags = ParseFlagsExt::default_flags();
1871        assert!(flags.allow_sorry);
1872        assert!(flags.unicode_ops);
1873        let strict = ParseFlagsExt::default_flags().strict();
1874        assert!(!strict.error_recovery);
1875    }
1876
1877    #[test]
1878    fn test_parse_context() {
        let ctx = ParseContextExt::new("fun x => x").with_filename("test.oxilean");
        assert_eq!(ctx.filename.as_deref(), Some("test.oxilean"));
1881        assert!(ctx.flags.allow_sorry);
1882    }
1883
1884    #[test]
1885    fn test_compile_phase_display() {
1886        assert_eq!(CompilePhaseExt::Parse.to_string(), "parse");
1887        assert_eq!(CompilePhaseExt::Elaborate.to_string(), "elaborate");
1888    }
1889
1890    #[test]
1891    fn test_pipeline_timings() {
1892        let mut timings = PipelineTimingsExt::new();
1893        timings.record(CompilePhaseExt::Lex, 100);
1894        timings.record(CompilePhaseExt::Parse, 200);
1895        assert_eq!(timings.total_us(), 300);
1896        let out = timings.format();
1897        assert!(out.contains("lex"));
1898        assert!(out.contains("parse"));
1899    }
1900}
1901
1902// ============================================================
1903// More lib utilities: Source File Management
1904// ============================================================
1905
1906/// A source file registry that tracks all loaded files.
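///
/// # Example
/// ```ignore
/// let mut reg = SourceFileRegistry::new();
/// reg.add("main.oxilean", "def main := 0");
/// assert_eq!(reg.get("main.oxilean"), Some("def main := 0"));
/// assert_eq!(reg.len(), 1);
/// ```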
1907#[allow(dead_code)]
1908#[allow(missing_docs)]
1909pub struct SourceFileRegistry {
1910    /// Map from filename to source content
1911    pub files: std::collections::HashMap<String, String>,
1912    /// File IDs in order of loading
1913    pub file_order: Vec<String>,
1914}
1915
1916impl SourceFileRegistry {
1917    /// Create a new empty registry.
1918    #[allow(dead_code)]
1919    pub fn new() -> Self {
1920        SourceFileRegistry {
1921            files: std::collections::HashMap::new(),
1922            file_order: Vec::new(),
1923        }
1924    }
1925
1926    /// Add a source file.
1927    #[allow(dead_code)]
    pub fn add(&mut self, name: &str, content: &str) {
        self.files.insert(name.to_string(), content.to_string());
        if !self.file_order.iter().any(|f| f == name) {
            self.file_order.push(name.to_string());
        }
    }
1934
1935    /// Get the content of a file.
1936    #[allow(dead_code)]
1937    pub fn get(&self, name: &str) -> Option<&str> {
1938        self.files.get(name).map(|s| s.as_str())
1939    }
1940
1941    /// Returns the number of files.
1942    #[allow(dead_code)]
1943    pub fn len(&self) -> usize {
1944        self.files.len()
1945    }
1946
1947    /// Returns true if empty.
1948    #[allow(dead_code)]
1949    pub fn is_empty(&self) -> bool {
1950        self.files.is_empty()
1951    }
1952}
1953
1954impl Default for SourceFileRegistry {
1955    fn default() -> Self {
1956        Self::new()
1957    }
1958}
1959
1960// ------------------------------------------------------------
1961
1962/// A compilation unit consisting of source file, AST, and errors.
1963#[allow(dead_code)]
1964#[allow(missing_docs)]
1965pub struct CompilationUnit {
1966    /// The source filename
1967    pub filename: String,
1968    /// The source text
1969    pub source: String,
1970    /// Whether parsing succeeded
1971    pub parse_ok: bool,
1972    /// Number of errors
1973    pub error_count: usize,
1974}
1975
1976impl CompilationUnit {
1977    /// Create a new compilation unit.
1978    #[allow(dead_code)]
1979    pub fn new(filename: &str, source: &str) -> Self {
1980        CompilationUnit {
1981            filename: filename.to_string(),
1982            source: source.to_string(),
1983            parse_ok: false,
1984            error_count: 0,
1985        }
1986    }
1987
1988    /// Mark as successfully parsed.
1989    #[allow(dead_code)]
1990    pub fn mark_parsed(mut self) -> Self {
1991        self.parse_ok = true;
1992        self
1993    }
1994
1995    /// Mark with errors.
1996    #[allow(dead_code)]
1997    pub fn with_errors(mut self, count: usize) -> Self {
1998        self.error_count = count;
1999        self
2000    }
2001}
2002
2003// ------------------------------------------------------------
2004
2005/// A simple token frequency map.
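///
/// # Example
/// ```ignore
/// let mut freq = TokenFrequencyMapExt::new();
/// freq.record("def");
/// freq.record("def");
/// freq.record("fun");
/// assert_eq!(freq.most_frequent(), Some(("def", 2)));
/// assert_eq!(freq.total(), 3);
/// ```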
2006#[allow(dead_code)]
2007#[allow(missing_docs)]
2008pub struct TokenFrequencyMapExt {
2009    /// Map from token text to frequency
2010    pub freq: std::collections::HashMap<String, usize>,
2011}
2012
2013impl TokenFrequencyMapExt {
2014    /// Create a new empty map.
2015    #[allow(dead_code)]
2016    pub fn new() -> Self {
2017        TokenFrequencyMapExt {
2018            freq: std::collections::HashMap::new(),
2019        }
2020    }
2021
2022    /// Record a token.
2023    #[allow(dead_code)]
2024    pub fn record(&mut self, token: &str) {
2025        *self.freq.entry(token.to_string()).or_insert(0) += 1;
2026    }
2027
    /// Most frequent token (ties broken arbitrarily, since `HashMap` iteration order is unspecified).
    #[allow(dead_code)]
    pub fn most_frequent(&self) -> Option<(&str, usize)> {
        self.freq
            .iter()
            .max_by_key(|(_, c)| *c)
            .map(|(k, &c)| (k.as_str(), c))
    }

    /// Returns the total number of recorded tokens.
    #[allow(dead_code)]
    pub fn total(&self) -> usize {
        self.freq.values().sum()
    }
}

impl Default for TokenFrequencyMapExt {
    fn default() -> Self {
        Self::new()
    }
}

// ============================================================
// More lib tests
// ============================================================

#[cfg(test)]
mod lib_ext2_tests {
    use super::*;

    #[test]
    fn test_source_file_registry() {
        let mut reg = SourceFileRegistry::new();
        reg.add("a.lean", "def foo := 1");
        reg.add("b.lean", "def bar := 2");
        assert_eq!(reg.len(), 2);
        assert_eq!(reg.get("a.lean"), Some("def foo := 1"));
        assert_eq!(reg.get("c.lean"), None);
    }

    #[test]
    fn test_compilation_unit() {
        let unit = CompilationUnit::new("test.lean", "def x := 1").mark_parsed();
        assert!(unit.parse_ok);
        assert_eq!(unit.error_count, 0);
    }

    #[test]
    fn test_token_frequency_map_ext() {
        let mut m = TokenFrequencyMapExt::new();
        m.record("def");
        m.record("def");
        m.record("fun");
        assert_eq!(m.total(), 3);
        let (tok, count) = m.most_frequent().expect("test operation should succeed");
        assert_eq!(tok, "def");
        assert_eq!(count, 2);
    }
}
// More lib utilities padding
/// A declaration table mapping names to their source locations.
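///
/// Illustrative usage (`ignore`d doc example):
///
/// ```ignore
/// use oxilean_parse::DeclTable;
///
/// let mut table = DeclTable::new();
/// table.register("double", "arith.lean", 3);
/// assert_eq!(table.lookup("double"), Some(("arith.lean", 3)));
/// assert!(table.lookup("missing").is_none());
/// ```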
#[allow(dead_code)]
#[allow(missing_docs)]
pub struct DeclTable {
    /// Map from name to (filename, line)
    pub entries: std::collections::HashMap<String, (String, usize)>,
}

impl DeclTable {
    /// Create a new empty declaration table.
    #[allow(dead_code)]
    pub fn new() -> Self {
        DeclTable {
            entries: std::collections::HashMap::new(),
        }
    }

    /// Register a declaration.
    #[allow(dead_code)]
    pub fn register(&mut self, name: &str, file: &str, line: usize) {
        self.entries
            .insert(name.to_string(), (file.to_string(), line));
    }

    /// Look up a declaration's location.
    #[allow(dead_code)]
    pub fn lookup(&self, name: &str) -> Option<(&str, usize)> {
        self.entries.get(name).map(|(f, l)| (f.as_str(), *l))
    }

    /// Returns the number of declarations.
    #[allow(dead_code)]
    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Returns true if empty.
    #[allow(dead_code)]
    pub fn is_empty(&self) -> bool {
        self.entries.is_empty()
    }
}

impl Default for DeclTable {
    fn default() -> Self {
        Self::new()
    }
}

// ------------------------------------------------------------

/// A simple import graph.
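///
/// A sketch of how edges are added and queried (`ignore`d doc example;
/// the module names are illustrative only):
///
/// ```ignore
/// use oxilean_parse::ImportGraph;
///
/// let mut graph = ImportGraph::new();
/// graph.add_import("Main", "Prelude");
/// graph.add_import("Main", "Tactic");
/// assert_eq!(graph.imports_of("Main"), ["Prelude", "Tactic"]);
/// assert!(graph.imports_of("Prelude").is_empty());
/// assert_eq!(graph.edge_count(), 2);
/// ```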
#[allow(dead_code)]
#[allow(missing_docs)]
pub struct ImportGraph {
    /// Map from module name to list of imported modules
    pub imports: std::collections::HashMap<String, Vec<String>>,
}

impl ImportGraph {
    /// Create a new empty import graph.
    #[allow(dead_code)]
    pub fn new() -> Self {
        ImportGraph {
            imports: std::collections::HashMap::new(),
        }
    }

    /// Add an import edge.
    #[allow(dead_code)]
    pub fn add_import(&mut self, from: &str, to: &str) {
        self.imports
            .entry(from.to_string())
            .or_default()
            .push(to.to_string());
    }

    /// Returns all imports of a module.
    #[allow(dead_code)]
    pub fn imports_of(&self, module: &str) -> &[String] {
        self.imports
            .get(module)
            .map(|v| v.as_slice())
            .unwrap_or(&[])
    }

    /// Returns the total number of import edges.
    #[allow(dead_code)]
    pub fn edge_count(&self) -> usize {
        self.imports.values().map(|v| v.len()).sum()
    }
}

impl Default for ImportGraph {
    fn default() -> Self {
        Self::new()
    }
}

// ------------------------------------------------------------

/// A simple parse statistics record.
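///
/// Illustrative usage (`ignore`d doc example):
///
/// ```ignore
/// use oxilean_parse::ParseStatsExt;
///
/// let mut stats = ParseStatsExt::new();
/// stats.tokens_processed = 42;
/// stats.decls_parsed = 3;
/// assert_eq!(stats.format(), "tokens=42 decls=3 errors=0");
/// ```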
#[allow(dead_code)]
#[allow(missing_docs)]
#[derive(Debug, Default)]
pub struct ParseStatsExt {
    /// Number of tokens processed
    pub tokens_processed: usize,
    /// Number of declarations parsed
    pub decls_parsed: usize,
    /// Number of errors encountered
    pub errors: usize,
}

impl ParseStatsExt {
    /// Create a new empty record.
    #[allow(dead_code)]
    pub fn new() -> Self {
        ParseStatsExt::default()
    }

    /// Format the stats.
    #[allow(dead_code)]
    pub fn format(&self) -> String {
        format!(
            "tokens={} decls={} errors={}",
            self.tokens_processed, self.decls_parsed, self.errors
        )
    }
}

#[cfg(test)]
mod lib_ext3_tests {
    use super::*;

    #[test]
    fn test_decl_table() {
        let mut t = DeclTable::new();
        t.register("foo", "a.lean", 10);
        t.register("bar", "b.lean", 20);
        assert_eq!(t.len(), 2);
        let (file, line) = t.lookup("foo").expect("lookup should succeed");
        assert_eq!(file, "a.lean");
        assert_eq!(line, 10);
        assert!(t.lookup("baz").is_none());
    }

    #[test]
    fn test_import_graph() {
        let mut g = ImportGraph::new();
        g.add_import("A", "B");
        g.add_import("A", "C");
        g.add_import("B", "C");
        assert_eq!(g.imports_of("A").len(), 2);
        assert_eq!(g.edge_count(), 3);
    }

    #[test]
    fn test_parse_stats_ext() {
        let mut s = ParseStatsExt::new();
        s.tokens_processed = 100;
        s.decls_parsed = 5;
        let out = s.format();
        assert!(out.contains("tokens=100"));
    }
}
2252
// lib final padding
/// Counts whitespace-separated word frequencies in source text.
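///
/// A small `ignore`d doc example:
///
/// ```ignore
/// use oxilean_parse::word_frequency;
///
/// let freq = word_frequency("def id := fun x => x");
/// assert_eq!(freq["x"], 2);
/// assert_eq!(freq["def"], 1);
/// ```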
#[allow(dead_code)]
#[allow(missing_docs)]
pub fn word_frequency(text: &str) -> std::collections::HashMap<String, usize> {
    let mut freq = std::collections::HashMap::new();
    for word in text.split_whitespace() {
        *freq.entry(word.to_string()).or_insert(0) += 1;
    }
    freq
}

/// A source file summary.
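///
/// Illustrative usage (`ignore`d doc example):
///
/// ```ignore
/// use oxilean_parse::SourceSummary;
///
/// let summary = SourceSummary::from_str("def foo := 1\n\ndef bar := 2\n");
/// assert_eq!(summary.blank_lines, 1);
/// assert_eq!(summary.words, 8);
/// ```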
#[allow(dead_code)]
#[allow(missing_docs)]
#[derive(Debug, Default)]
pub struct SourceSummary {
    /// Number of lines
    pub lines: usize,
    /// Number of words
    pub words: usize,
    /// Number of characters
    pub chars: usize,
    /// Number of blank lines
    pub blank_lines: usize,
}

impl SourceSummary {
    /// Compute a summary from source text.
    ///
    /// A trailing newline counts as starting one additional (empty) line.
    #[allow(dead_code)]
    #[allow(clippy::should_implement_trait)]
    pub fn from_str(src: &str) -> Self {
        let mut lines = src.lines().count();
        if src.ends_with('\n') {
            lines += 1;
        }
        let blank_lines = src.lines().filter(|l| l.trim().is_empty()).count();
        let words = src.split_whitespace().count();
        let chars = src.chars().count();
        SourceSummary {
            lines,
            words,
            chars,
            blank_lines,
        }
    }

    /// Format as a string.
    #[allow(dead_code)]
    pub fn format(&self) -> String {
        format!(
            "lines={} words={} chars={} blank={}",
            self.lines, self.words, self.chars, self.blank_lines
        )
    }
}

#[cfg(test)]
mod lib_final_tests {
    use super::*;

    #[test]
    fn test_word_frequency() {
        let freq = word_frequency("a b a c b a");
        assert_eq!(freq["a"], 3);
        assert_eq!(freq["b"], 2);
        assert_eq!(freq["c"], 1);
    }

    #[test]
    fn test_source_summary() {
        let src = "def foo := 1\n\ndef bar := 2\n";
        let s = SourceSummary::from_str(src);
        assert_eq!(s.lines, 4);
        assert_eq!(s.blank_lines, 1);
        let out = s.format();
        assert!(out.contains("lines=4"));
    }
}

// lib pad
/// Returns the number of tokens in a source string (whitespace-split).
#[allow(dead_code)]
#[allow(missing_docs)]
pub fn rough_token_count(src: &str) -> usize {
    src.split_whitespace().count()
}
/// Returns a preview of the source: its first `n` characters, with `...` appended when truncated.
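///
/// A small `ignore`d doc example:
///
/// ```ignore
/// use oxilean_parse::source_preview;
///
/// assert_eq!(source_preview("hello world", 5), "hello...");
/// assert_eq!(source_preview("hi", 5), "hi");
/// ```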
#[allow(dead_code)]
#[allow(missing_docs)]
pub fn source_preview(src: &str, n: usize) -> String {
    let truncated: String = src.chars().take(n).collect();
    if src.chars().count() > n {
        format!("{}...", truncated)
    } else {
        truncated
    }
}
/// A namespace resolver.
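///
/// Illustrative usage (`ignore`d doc example; the namespace names are
/// made up for the sketch):
///
/// ```ignore
/// use oxilean_parse::NamespaceResolver;
///
/// let mut resolver = NamespaceResolver::new();
/// resolver.open("List");
/// resolver.open("Lemmas");
/// assert_eq!(resolver.resolve("map_id"), "List.Lemmas.map_id");
/// resolver.close();
/// assert_eq!(resolver.resolve("map_id"), "List.map_id");
/// ```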
#[allow(dead_code)]
#[allow(missing_docs)]
pub struct NamespaceResolver {
    /// Stack of open namespaces
    pub stack: Vec<String>,
}

impl NamespaceResolver {
    /// Create a new resolver.
    #[allow(dead_code)]
    pub fn new() -> Self {
        NamespaceResolver { stack: Vec::new() }
    }

    /// Open a namespace.
    #[allow(dead_code)]
    pub fn open(&mut self, ns: &str) {
        self.stack.push(ns.to_string());
    }

    /// Close the current namespace.
    #[allow(dead_code)]
    pub fn close(&mut self) {
        self.stack.pop();
    }

    /// Resolve a name against the current namespace.
    #[allow(dead_code)]
    pub fn resolve(&self, name: &str) -> String {
        if self.stack.is_empty() {
            return name.to_string();
        }
        format!("{}.{}", self.stack.join("."), name)
    }
}

impl Default for NamespaceResolver {
    fn default() -> Self {
        Self::new()
    }
}

#[cfg(test)]
mod lib_pad {
    use super::*;

    #[test]
    fn test_rough_token_count() {
        assert_eq!(rough_token_count("a b c d"), 4);
    }

    #[test]
    fn test_source_preview() {
        assert_eq!(source_preview("hello world", 5), "hello...");
        assert_eq!(source_preview("hi", 5), "hi");
    }

    #[test]
    fn test_namespace_resolver() {
        let mut r = NamespaceResolver::new();
        r.open("Foo");
        r.open("Bar");
        assert_eq!(r.resolve("baz"), "Foo.Bar.baz");
        r.close();
        assert_eq!(r.resolve("qux"), "Foo.qux");
    }
}
2408
// lib pad2
/// Returns a word frequency map for the given source string.
#[allow(dead_code)]
#[allow(missing_docs)]
pub fn word_frequency_ext2(src: &str) -> std::collections::HashMap<String, usize> {
    let mut map = std::collections::HashMap::new();
    for word in src.split_whitespace() {
        *map.entry(word.to_string()).or_insert(0) += 1;
    }
    map
}

/// A summary of a source file.
#[allow(dead_code)]
#[allow(missing_docs)]
#[derive(Debug, Clone)]
pub struct SourceSummaryExt2 {
    /// Number of lines
    pub line_count: usize,
    /// Number of characters
    pub char_count: usize,
    /// Number of words (whitespace-separated)
    pub word_count: usize,
    /// File path or name
    pub name: String,
}

impl SourceSummaryExt2 {
    /// Build a summary from source text.
    #[allow(dead_code)]
    pub fn from_str(name: &str, src: &str) -> Self {
        SourceSummaryExt2 {
            name: name.to_string(),
            line_count: src.lines().count(),
            char_count: src.chars().count(),
            word_count: src.split_whitespace().count(),
        }
    }

    /// Format as a one-line string.
    #[allow(dead_code)]
    pub fn summary_line(&self) -> String {
        format!(
            "{}: {} lines, {} chars, {} words",
            self.name, self.line_count, self.char_count, self.word_count
        )
    }
}

/// A declaration table mapping names to types.
#[allow(dead_code)]
#[allow(missing_docs)]
#[derive(Debug, Clone, Default)]
pub struct DeclTableExt2 {
    /// Map from name to type string
    pub entries: std::collections::HashMap<String, String>,
}

impl DeclTableExt2 {
    /// Create an empty table.
    #[allow(dead_code)]
    pub fn new() -> Self {
        DeclTableExt2 {
            entries: std::collections::HashMap::new(),
        }
    }

    /// Insert a declaration.
    #[allow(dead_code)]
    pub fn insert(&mut self, name: &str, ty: &str) {
        self.entries.insert(name.to_string(), ty.to_string());
    }

    /// Look up a declaration.
    #[allow(dead_code)]
    pub fn lookup(&self, name: &str) -> Option<&str> {
        self.entries.get(name).map(|s| s.as_str())
    }

    /// Returns the number of declarations.
    #[allow(dead_code)]
    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Returns true if empty.
    #[allow(dead_code)]
    pub fn is_empty(&self) -> bool {
        self.entries.is_empty()
    }
}

#[cfg(test)]
mod lib_pad2 {
    use super::*;

    #[test]
    fn test_word_frequency_ext2() {
        let freq = word_frequency_ext2("a b a c a b");
        assert_eq!(freq["a"], 3);
        assert_eq!(freq["b"], 2);
        assert_eq!(freq["c"], 1);
    }

    #[test]
    fn test_source_summary() {
        let s = SourceSummaryExt2::from_str("test.lean", "def foo := 42\ndef bar := 0");
        assert_eq!(s.line_count, 2);
        assert!(s.summary_line().contains("test.lean"));
    }

    #[test]
    fn test_decl_table() {
        let mut t = DeclTableExt2::new();
        t.insert("foo", "Nat");
        assert_eq!(t.lookup("foo"), Some("Nat"));
        assert_eq!(t.lookup("bar"), None);
        assert_eq!(t.len(), 1);
    }
}

// lib pad3
/// Returns the most common whitespace-separated word in a source string; ties are broken arbitrarily.
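///
/// A small `ignore`d doc example (input chosen so the maximum is unique):
///
/// ```ignore
/// use oxilean_parse::most_common_word;
///
/// assert_eq!(most_common_word("a b a c a"), Some("a".to_string()));
/// assert_eq!(most_common_word(""), None);
/// ```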
#[allow(dead_code)]
#[allow(missing_docs)]
pub fn most_common_word(src: &str) -> Option<String> {
    let freq = word_frequency_ext2(src);
    freq.into_iter()
        .max_by_key(|(_, count)| *count)
        .map(|(w, _)| w)
}
/// Returns a list of lines in a source string that contain a given keyword.
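///
/// Illustrative usage (`ignore`d doc example):
///
/// ```ignore
/// use oxilean_parse::lines_containing;
///
/// let src = "import A\nimport B\ndef x := 1";
/// assert_eq!(lines_containing(src, "import"), ["import A", "import B"]);
/// ```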
#[allow(dead_code)]
#[allow(missing_docs)]
pub fn lines_containing(src: &str, keyword: &str) -> Vec<String> {
    src.lines()
        .filter(|l| l.contains(keyword))
        .map(|l| l.to_string())
        .collect()
}
/// Checks whether a character may appear in an identifier.
#[allow(dead_code)]
#[allow(missing_docs)]
pub fn is_ident_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_' || c == '\''
}