Crate perl_parser

§perl-parser — Production-grade Perl parser and Language Server Protocol engine

A comprehensive Perl parser built on recursive descent principles, providing robust AST generation, LSP feature providers, workspace indexing, and test-driven development support.

§Key Features

  • Tree-sitter Compatible: AST with kinds, fields, and position tracking compatible with tree-sitter grammar
  • Comprehensive Parsing: ~100% edge case coverage for Perl 5.8-5.40 syntax
  • LSP Integration: Broad Language Server Protocol feature set (~82% coverage in v0.8.6)
  • TDD Workflow: Intelligent test generation with return value analysis
  • Incremental Parsing: Efficient re-parsing for real-time editing
  • Error Recovery: Graceful handling of malformed input with detailed diagnostics
  • Workspace Navigation: Cross-file symbol resolution and reference tracking

§Quick Start

§Basic Parsing

use perl_parser::Parser;

let code = r#"sub hello { print "Hello, world!\n"; }"#;
let mut parser = Parser::new(code);

match parser.parse() {
    Ok(ast) => {
        println!("AST: {}", ast.to_sexp());
        println!("Parsed {} nodes", ast.count_nodes());
    }
    Err(e) => eprintln!("Parse error: {}", e),
}
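
The S-expression output above follows the tree-sitter convention of printing each node kind in parentheses with its children nested inside. As a rough illustration of that rendering, here is a minimal sketch; `SketchNode` is a hypothetical stand-in and is not the crate's `Node` type, which carries fields and positions as well.

```rust
/// Hypothetical, simplified node type used only for this illustration.
/// The crate's real `Node` also tracks fields and source positions.
#[derive(Debug)]
pub struct SketchNode {
    pub kind: &'static str,
    pub children: Vec<SketchNode>,
}

impl SketchNode {
    /// Build a node without children.
    pub fn leaf(kind: &'static str) -> Self {
        SketchNode { kind, children: Vec::new() }
    }

    /// Render the tree as `(kind child child ...)`.
    pub fn to_sexp(&self) -> String {
        if self.children.is_empty() {
            format!("({})", self.kind)
        } else {
            let inner: Vec<String> = self.children.iter().map(|c| c.to_sexp()).collect();
            format!("({} {})", self.kind, inner.join(" "))
        }
    }

    /// Count this node plus all of its descendants.
    pub fn count_nodes(&self) -> usize {
        1 + self.children.iter().map(|c| c.count_nodes()).sum::<usize>()
    }
}
```

The node kinds the real parser emits may differ from the names you pass to this sketch.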

§Test-Driven Development

Generate tests automatically from parsed code:

use perl_parser::Parser;
use perl_parser::tdd::test_generator::{TestGenerator, TestFramework};

let code = r#"sub add { my ($a, $b) = @_; return $a + $b; }"#;
let mut parser = Parser::new(code);
let ast = parser.parse()?;

let generator = TestGenerator::new(TestFramework::TestMore);
let tests = generator.generate_tests(&ast, code);

// Returns test cases with intelligent assertions
assert!(!tests.is_empty());

§LSP Integration

Use as a library for LSP features (see perllsp for the standalone server):

use perl_parser::Parser;
use perl_parser::analysis::semantic::SemanticAnalyzer;

let code = "my $x = 42;";
let mut parser = Parser::new(code);
let ast = parser.parse()?;

// Semantic analysis for hover, completion, etc.
let model = SemanticAnalyzer::analyze(&ast);

§Architecture

The parser is organized into distinct layers for maintainability and testability:

§Core Engine (engine)

  • parser: Recursive descent parser with operator precedence
  • ast: Abstract Syntax Tree definitions and node types
  • error: Error classification, recovery strategies, and diagnostics
  • position: UTF-16 position mapping for LSP protocol compliance
  • quote_parser: Specialized parser for quote-like operators
  • heredoc_collector: FIFO heredoc collection with indent stripping
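
To make the heredoc_collector entry concrete: when several heredocs are announced on one line, their bodies follow in declaration order, and the `<<~` form removes the common leading indentation. The sketch below illustrates that idea under simplifying assumptions; `PendingHeredoc`, `strip_common_indent`, and `collect_bodies` are hypothetical names and do not reflect the module's actual API.

```rust
use std::collections::VecDeque;

/// A heredoc announced on a line and still waiting for its body
/// (hypothetical type for this sketch).
pub struct PendingHeredoc {
    pub terminator: String,
    /// True for the indented form `<<~TAG`.
    pub strip_indent: bool,
}

/// Remove the smallest leading-whitespace prefix shared by all
/// non-blank lines. Blank lines do not influence the amount removed.
pub fn strip_common_indent(lines: &[&str]) -> Vec<String> {
    let indent = lines
        .iter()
        .filter(|l| !l.trim().is_empty())
        .map(|l| l.len() - l.trim_start().len())
        .min()
        .unwrap_or(0);
    lines
        .iter()
        // Fall back to trimming when the line is shorter than `indent`.
        .map(|l| l.get(indent..).unwrap_or_else(|| l.trim_start()).to_string())
        .collect()
}

/// Read bodies for the pending heredocs in first-in, first-out order.
pub fn collect_bodies(
    mut pending: VecDeque<PendingHeredoc>,
    following: &[&str],
) -> Vec<Vec<String>> {
    let mut lines = following.iter().copied();
    let mut bodies = Vec::new();
    while let Some(doc) = pending.pop_front() {
        let mut raw: Vec<&str> = Vec::new();
        for line in lines.by_ref() {
            // The `<<~` form allows an indented terminator line.
            let candidate = if doc.strip_indent { line.trim_start() } else { line };
            if candidate == doc.terminator {
                break;
            }
            raw.push(line);
        }
        bodies.push(if doc.strip_indent {
            strip_common_indent(&raw)
        } else {
            raw.iter().map(|l| l.to_string()).collect()
        });
    }
    bodies
}
```

A queue keeps the declaration order explicit, so the first heredoc announced is the first to consume lines.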

§IDE Integration (LSP Provider Modules)

§Analysis (analysis)

§Workspace (workspace)

§Refactoring (refactor)

§Test Support (tdd)

  • test_generator: Intelligent test case generation
  • test_runner: Test execution and validation
  • tdd_workflow (test-only): TDD cycle management and coverage tracking

§LSP Feature Support

This crate provides the engine for LSP features. The public standalone server is in perllsp, backed by the perl-lsp-rs implementation crate.

§Implemented Features

  • Completion: Context-aware code completion with type inference
  • Hover: Documentation and type information on hover
  • Definition: Go-to-definition with cross-file support
  • References: Find all references with workspace indexing
  • Rename: Symbol renaming with conflict detection
  • Diagnostics: Syntax errors and semantic warnings
  • Formatting: Code formatting via perltidy integration
  • Folding: Code folding for blocks and regions
  • Semantic Tokens: Fine-grained syntax highlighting
  • Call Hierarchy: Function call navigation
  • Type Hierarchy: Class inheritance navigation

See docs/reference/LSP_CAPABILITY_POLICY.md for the complete capability matrix.

§Incremental Parsing

Enable efficient re-parsing for real-time editing:

use perl_parser::{IncrementalState, apply_edits, Edit};

let mut state = IncrementalState::new("my $x = 1;");
let ast = state.parse()?;

// Apply an edit
let edit = Edit {
    start_byte: 3,
    old_end_byte: 5,
    new_end_byte: 5,
    text: "$y".to_string(),
};
apply_edits(&mut state, vec![edit]);

// Incremental re-parse reuses unchanged nodes
let new_ast = state.parse()?;
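
The byte offsets in the edit above describe a splice: bytes 3..5 of the old text ("$x") are replaced, and the replacement ends at byte 5 of the new text. The following sketch shows that arithmetic on a plain `String`. `SketchEdit` and `apply_edit` are hypothetical and only mirror the field names shown above. The consistency check on `new_end_byte` is an assumption borrowed from the tree-sitter edit convention, not a documented rule of this crate.

```rust
/// Hypothetical mirror of the edit fields used in the example above.
pub struct SketchEdit {
    pub start_byte: usize,
    pub old_end_byte: usize,
    pub new_end_byte: usize,
    pub text: String,
}

/// Splice `edit.text` over `start_byte..old_end_byte` of `source`.
/// Returns an error instead of panicking on inconsistent offsets.
pub fn apply_edit(source: &mut String, edit: &SketchEdit) -> Result<(), String> {
    if edit.start_byte > edit.old_end_byte || edit.old_end_byte > source.len() {
        return Err("edit range is out of bounds".to_string());
    }
    if !source.is_char_boundary(edit.start_byte) || !source.is_char_boundary(edit.old_end_byte) {
        return Err("edit range splits a UTF-8 character".to_string());
    }
    // Assumed convention: the new end is the start plus the inserted length.
    if edit.new_end_byte != edit.start_byte + edit.text.len() {
        return Err("new_end_byte does not match inserted text length".to_string());
    }
    source.replace_range(edit.start_byte..edit.old_end_byte, &edit.text);
    Ok(())
}
```

Validating offsets before splicing matters in an editor setting, where a stale edit could otherwise panic on a non-character boundary.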

§Error Recovery

The parser uses intelligent error recovery to continue parsing after errors:

use perl_parser::Parser;

let code = "sub broken { if (";  // Incomplete code
let mut parser = Parser::new(code);

// Parser recovers and builds partial AST
let result = parser.parse();
assert!(result.is_ok());

// Check recorded errors
let errors = parser.errors();
assert!(!errors.is_empty());
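
One common way to keep parsing after an error is panic-mode recovery: record the error, skip ahead to a synchronization token such as `;`, and resume. The sketch below demonstrates that technique on a toy grammar of whitespace-separated tokens. It is not the crate's recovery strategy, and `Assign`, `parse_assign`, and `parse_with_recovery` are hypothetical names.

```rust
/// Toy statement: `my $name = <int> ;` (tokens separated by whitespace).
#[derive(Debug, PartialEq)]
pub struct Assign {
    pub name: String,
    pub value: i64,
}

/// Try to parse one statement at the start of `tokens`.
/// On success, also return how many tokens were consumed.
fn parse_assign(tokens: &[&str]) -> Result<(Assign, usize), String> {
    match tokens {
        ["my", name, "=", value, ";", ..] if name.starts_with('$') => {
            let value = value
                .parse::<i64>()
                .map_err(|_| format!("expected integer, found `{}`", value))?;
            Ok((Assign { name: name.to_string(), value }, 5))
        }
        _ => Err("expected `my $name = <int> ;`".to_string()),
    }
}

/// Parse every statement, collecting errors instead of stopping at the first.
pub fn parse_with_recovery(src: &str) -> (Vec<Assign>, Vec<String>) {
    let tokens: Vec<&str> = src.split_whitespace().collect();
    let mut statements = Vec::new();
    let mut errors = Vec::new();
    let mut pos = 0;
    while pos < tokens.len() {
        match parse_assign(&tokens[pos..]) {
            Ok((stmt, used)) => {
                statements.push(stmt);
                pos += used;
            }
            Err(msg) => {
                errors.push(format!("token {}: {}", pos, msg));
                // Synchronize: skip to just past the next `;`.
                while pos < tokens.len() && tokens[pos] != ";" {
                    pos += 1;
                }
                pos += 1;
            }
        }
    }
    (statements, errors)
}
```

Because the error branch always advances past at least one token, the loop terminates even on input that never contains a `;`.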

§Workspace Indexing

Build cross-file indexes for workspace-wide navigation:

use perl_parser::workspace_index::WorkspaceIndex;

let mut index = WorkspaceIndex::new();
index.index_file("lib/Foo.pm", "package Foo; sub bar { }");
index.index_file("lib/Baz.pm", "use Foo; Foo::bar();");

// Find all references to Foo::bar
let refs = index.find_references("Foo::bar");
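
Conceptually, a cross-file index is a map from symbol names to the locations where they occur, refreshed per file. The sketch below illustrates that data structure with a plain textual scan. It is far simpler than semantic indexing and is not how `WorkspaceIndex` is implemented; `SketchIndex` and `Location` are hypothetical, and the caller must supply the symbol names to look for.

```rust
use std::collections::HashMap;

/// Where a symbol occurs (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub struct Location {
    pub path: String,
    pub byte_offset: usize,
}

/// Symbol name -> every location where that name appears.
#[derive(Default)]
pub struct SketchIndex {
    references: HashMap<String, Vec<Location>>,
}

impl SketchIndex {
    /// (Re)index one file for the given symbol names.
    /// This is a textual scan; a real index would resolve symbols semantically.
    pub fn index_file(&mut self, path: &str, text: &str, symbols: &[&str]) {
        // Drop stale entries from a previous version of this file.
        for locations in self.references.values_mut() {
            locations.retain(|l| l.path != path);
        }
        for symbol in symbols {
            for (offset, _) in text.match_indices(*symbol) {
                self.references
                    .entry(symbol.to_string())
                    .or_default()
                    .push(Location { path: path.to_string(), byte_offset: offset });
            }
        }
    }

    /// All recorded locations for `symbol`, or an empty slice.
    pub fn find_references(&self, symbol: &str) -> &[Location] {
        match self.references.get(symbol) {
            Some(locations) => locations.as_slice(),
            None => &[],
        }
    }
}
```

Clearing a file's old entries before re-indexing keeps the map consistent when a buffer changes.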

§Testing with perl-corpus

The parser is tested against the comprehensive perl-corpus test suite:

# Run parser tests with full corpus coverage
cargo test -p perl-parser

# Run specific test category
cargo test -p perl-parser --test regex_tests

# Validate documentation examples
cargo test --doc

§Command-Line Tools

Build and install the LSP server binary:

# Build LSP server
cargo build -p perllsp --release

# Install globally
cargo install --path crates/perllsp

# Run LSP server
perllsp --stdio

# Check server health
perllsp --health

§Integration Examples

§VSCode Extension

Configure the LSP server in VSCode settings:

{
  "perl.lsp.path": "/path/to/perllsp",
  "perl.lsp.args": ["--stdio"]
}

§Neovim Integration

require'lspconfig'.perl.setup{
  cmd = { "/path/to/perllsp", "--stdio" },
}

§Performance Characteristics

  • Single-pass parsing: O(n) complexity for well-formed input
  • UTF-16 mapping: Fast bidirectional offset conversion for LSP
  • Incremental updates: Reuses unchanged AST nodes for efficiency
  • Memory efficiency: Streaming token processing with bounded lookahead
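
The UTF-16 mapping item exists because LSP positions default to UTF-16 code units, while Rust strings are indexed by UTF-8 bytes. The sketch below shows the conversion in both directions for a single line. It scans linearly and is only an illustration; `byte_to_utf16_col` and `utf16_col_to_byte` are hypothetical helpers, not the crate's `PositionMapper`.

```rust
/// Convert a UTF-8 byte offset within `line` to a UTF-16 column.
/// Returns `None` if the offset is out of range or splits a character.
pub fn byte_to_utf16_col(line: &str, byte_offset: usize) -> Option<usize> {
    let prefix = line.get(..byte_offset)?;
    Some(prefix.encode_utf16().count())
}

/// Convert a UTF-16 column within `line` to a UTF-8 byte offset.
/// Returns `None` if the column is past the end or inside a surrogate pair.
pub fn utf16_col_to_byte(line: &str, utf16_col: usize) -> Option<usize> {
    let mut units = 0;
    for (byte_idx, ch) in line.char_indices() {
        if units == utf16_col {
            return Some(byte_idx);
        }
        if units > utf16_col {
            // The requested column points into the middle of a surrogate pair.
            return None;
        }
        units += ch.len_utf16();
    }
    if units == utf16_col {
        Some(line.len())
    } else {
        None
    }
}
```

For example, in the line `a😀b` the emoji occupies four UTF-8 bytes but two UTF-16 units, so `b` sits at byte 5 and UTF-16 column 3.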

§Compatibility

  • Perl Versions: 5.8 through 5.40 (covers 99% of CPAN)
  • LSP Protocol: LSP 3.17 specification
  • Tree-sitter: Compatible AST format and position tracking
  • UTF-16: Full Unicode support with correct LSP position mapping

§Related Crates

  • perllsp: Public Cargo entry point for the standalone LSP server
  • perl-lsp-rs: Standalone LSP server runtime implementation (moved from this crate)
  • perl-lexer: Context-aware Perl tokenizer
  • perl-corpus: Comprehensive test corpus and generators
  • perl-dap: Debug Adapter Protocol implementation

§Documentation

  • API Docs: See module documentation below
  • LSP Guide: docs/reference/LSP_IMPLEMENTATION_GUIDE.md
  • Capability Policy: docs/reference/LSP_CAPABILITY_POLICY.md
  • Commands: docs/reference/COMMANDS_REFERENCE.md
  • Current Status: docs/project/CURRENT_STATUS.md

§Architecture

The parser follows a recursive descent design with operator precedence handling, maintaining a clean separation from the lexing phase. This modular approach enables:

  • Independent testing of parsing logic
  • Easy integration with different lexer implementations
  • Clear error boundaries between lexing and parsing phases
  • Optimal performance through single-pass parsing
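
To illustrate recursive descent with operator precedence in general terms, the sketch below uses precedence climbing on a toy arithmetic grammar with whitespace-separated tokens. It is not the crate's parser: `Expr`, `ExprParser`, and `parse_arithmetic` are hypothetical, and the precedence table covers only five operators, with `**` treated as right-associative as in Perl.

```rust
/// Toy expression tree for this sketch.
#[derive(Debug, PartialEq)]
pub enum Expr {
    Num(i64),
    Binary(Box<Expr>, String, Box<Expr>),
}

impl Expr {
    /// Render as a prefix S-expression, e.g. `(+ 1 2)`.
    pub fn to_sexp(&self) -> String {
        match self {
            Expr::Num(n) => n.to_string(),
            Expr::Binary(l, op, r) => format!("({} {} {})", op, l.to_sexp(), r.to_sexp()),
        }
    }
}

pub struct ExprParser<'a> {
    tokens: Vec<&'a str>,
    pos: usize,
}

impl<'a> ExprParser<'a> {
    pub fn new(src: &'a str) -> Self {
        ExprParser { tokens: src.split_whitespace().collect(), pos: 0 }
    }

    /// (precedence, right-associative) for each known binary operator.
    fn binding_power(op: &str) -> Option<(u8, bool)> {
        match op {
            "+" | "-" => Some((1, false)),
            "*" | "/" => Some((2, false)),
            "**" => Some((3, true)),
            _ => None,
        }
    }

    /// Parse operators whose precedence is at least `min_prec`.
    pub fn parse_expr(&mut self, min_prec: u8) -> Result<Expr, String> {
        let mut lhs = self.parse_primary()?;
        while let Some(&op) = self.tokens.get(self.pos) {
            let (prec, right_assoc) = match Self::binding_power(op) {
                Some(bp) if bp.0 >= min_prec => bp,
                _ => break,
            };
            self.pos += 1;
            // Left-associative operators require a strictly tighter right side.
            let next_min = if right_assoc { prec } else { prec + 1 };
            let rhs = self.parse_expr(next_min)?;
            lhs = Expr::Binary(Box::new(lhs), op.to_string(), Box::new(rhs));
        }
        Ok(lhs)
    }

    fn parse_primary(&mut self) -> Result<Expr, String> {
        let tok = self.tokens.get(self.pos).copied().ok_or("unexpected end of input")?;
        self.pos += 1;
        tok.parse::<i64>()
            .map(Expr::Num)
            .map_err(|_| format!("expected number, found `{}`", tok))
    }
}

/// Parse a whole expression and reject trailing tokens.
pub fn parse_arithmetic(src: &str) -> Result<Expr, String> {
    let mut parser = ExprParser::new(src);
    let expr = parser.parse_expr(0)?;
    if parser.pos != parser.tokens.len() {
        return Err(format!("unexpected token `{}`", parser.tokens[parser.pos]));
    }
    Ok(expr)
}
```

Each token is consumed exactly once, which is what keeps this style of parser single-pass on well-formed input.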

§Example

use perl_parser::Parser;

let code = "my $x = 42;";
let mut parser = Parser::new(code);

match parser.parse() {
    Ok(ast) => println!("AST: {}", ast.to_sexp()),
    Err(e) => eprintln!("Parse error: {}", e),
}

Re-exports§

pub use tooling::performance;
pub use tooling::perl_critic;
pub use tooling::perltidy;
pub use engine::ast_v2;
pub use engine::edit;
pub use engine::heredoc_collector;
pub use engine::pragma_tracker;
pub use engine::quote_parser;
pub use builtins::builtin_signatures_phf;
pub use perl_dead_code as dead_code_detector;

Modules§

analysis
Semantic analysis, scope resolution, and type inference. Compatibility re-export of semantic analysis modules.
ast
Abstract Syntax Tree (AST) definitions for Perl parsing. Parser engine components and supporting utilities. AST facade for the core parser engine.
builtin_signatures
Builtin function signature lookup tables. Builtin function signatures and metadata. Comprehensive built-in function signatures for Perl scripting.
builtins
Perl builtin function signatures and metadata. Re-exported builtin signature tables from perl-parser-core.
code_actions
LSP code actions for automated refactoring and fixes.
completion
LSP completion for code suggestions.
declaration
Variable and subroutine declaration analysis. Semantic analysis, symbol extraction, and type inference. Go-to-declaration support and parent map construction. Declaration Provider for LSP
diagnostics
LSP diagnostics for error reporting.
document_links
LSP document links provider for file and URL navigation.
document_store
In-memory document storage for open editor buffers. Workspace indexing and refactoring orchestration. Document store for managing in-memory text content
engine
Parser engine components and supporting utilities. Re-exported parser engine modules from perl-parser-core.
error
Legacy module aliases for moved engine components. Parser engine components and supporting utilities. Error types and recovery strategies for parser failures. Error types and recovery helpers for the parser engine.
error_classifier
Error classification and recovery strategies for parse failures. Error classification and diagnostic generation for parsed Perl code. Error classification and diagnostic generation for Perl parsing workflows
error_recovery
Error recovery strategies for resilient parsing. Error recovery strategies and traits for the Perl parser. Error recovery for the Perl parser
implementation_provider
LSP implementation provider.
import_optimizer
Import statement analysis and optimization. Refactoring and modernization helpers. Import optimization for Perl modules
index
File and symbol indexing for workspace-wide navigation. Semantic analysis, symbol extraction, and type inference. Lightweight workspace symbol index. Cross-file workspace indexing for Perl symbols
inlay_hints
LSP inlay hints for inline type and parameter information.
inlay_hints_provider
LSP inlay hints provider implementation.
line_index
Line-to-byte offset index for fast position lookups. Line indexing and position mapping utilities.
modernize
Code modernization utilities for Perl best practices. Refactoring and modernization helpers. Legacy Perl modernization helpers.
modernize_refactored
Enhanced code modernization with refactoring capabilities. Refactoring and modernization helpers. Refactored modernization engine with structured pattern definitions.
parser
Legacy module aliases for moved engine components. Parser engine components and supporting utilities. Core parser implementation for Perl source. Recursive descent Perl parser.
parser_context
Parser context with error recovery support. Parser engine components and supporting utilities.
position
Legacy module aliases for moved engine components. Parser engine components and supporting utilities. Position tracking types and UTF-16 mapping utilities. Enhanced position tracking for incremental parsing
refactor
Code refactoring, modernization, and import optimization. Compatibility re-export of refactoring modules.
refactoring
Unified refactoring engine for comprehensive code transformations. Refactoring and modernization helpers. Unified refactoring engine for Perl code transformations
references
LSP references provider for symbol usage analysis.
rename
LSP rename for symbol renaming.
scope_analyzer
Scope analysis for variable and subroutine resolution. Semantic analysis, symbol extraction, and type inference. Scope analysis and variable tracking for Perl parsing workflows.
semantic
Semantic model with hover information and token classification. Semantic analysis, symbol extraction, and type inference. Semantic analyzer and token classification. Semantic analysis for IDE features.
semantic_tokens
LSP semantic tokens provider for syntax highlighting.
semantic_tokens_provider
LSP semantic tokens provider implementation.
symbol
Symbol table, extraction, and reference tracking. Semantic analysis, symbol extraction, and type inference. Symbol extraction and symbol table construction. Symbol extraction and symbol table for IDE features
tdd
Test-driven development support and test generation. Compatibility re-export of TDD support modules.
tdd_basic
Basic TDD utilities and test helpers. Test-driven development helpers and generators. Basic TDD workflow support for LSP
test_generator
Intelligent test case generation from parsed Perl code. Test-driven development helpers and generators. Test generator for TDD workflow support
test_runner
Test execution and TDD support functionality. Test-driven development helpers and generators.
token_stream
Token stream with position-aware iteration. Token stream and trivia utilities for the parser. Token stream adapters used during the Parse stage for LSP workflows. Token stream facade for the core parser engine.
token_wrapper
Lightweight token wrapper for AST integration. Token stream and trivia utilities for the parser. Token wrapper with enhanced position tracking
tokens
Token stream, trivia, and token wrapper utilities. Re-exported token stream utilities from perl-parser-core.
tooling
External tooling integration (perltidy, perlcritic, performance). Compatibility re-export of tooling integrations.
trivia
Trivia (whitespace and comments) representation. Token stream and trivia utilities for the parser. Trivia (comments and whitespace) handling for the Perl parser
trivia_parser
Parser that preserves trivia tokens for formatting. Token stream and trivia utilities for the parser. Trivia-preserving parser implementation
type_definition
LSP type definition provider.
type_hierarchy
LSP type hierarchy provider for inheritance navigation.
type_inference
Type inference engine for Perl variable analysis. Semantic analysis, symbol extraction, and type inference.
util
Parser utilities and helpers. Utility functions for the Perl parser
workspace
Workspace indexing, document store, and cross-file operations. Compatibility re-export of workspace indexing modules.
workspace_index
Cross-file symbol index for workspace-wide navigation. Workspace indexing and refactoring orchestration. Workspace-wide symbol index for fast cross-file lookups in Perl LSP.
workspace_refactor
Multi-file refactoring operations across a workspace. Workspace-wide refactoring operations for Perl codebases
workspace_rename
Cross-file symbol renaming with conflict detection. Workspace indexing and refactoring orchestration. LSP feature module (deprecated)
workspace_symbols
LSP workspace symbols provider.

Structs§

DuplicateImport
Import analysis, optimization, and unused import detection. A module that is imported multiple times
EnhancedCodeActionsProvider
Enhanced code actions provider with workspace-aware refactoring. Enhanced code actions provider with additional refactorings
HoverInfo
Semantic analysis types for hover, tokens, and code understanding. Hover information for symbols displayed in LSP hover requests.
ImportAnalysis
Import analysis, optimization, and unused import detection. Result of import analysis containing all detected issues and suggestions
ImportEntry
Import analysis, optimization, and unused import detection. A single import statement discovered during analysis
ImportOptimizer
Import analysis, optimization, and unused import detection. Import optimizer for analyzing and optimizing Perl import statements
MissingImport
Import analysis, optimization, and unused import detection. A symbol that is used but not imported
Node
AST node, node kind enum, and source location types. Core AST node representing any Perl language construct within parsing workflows.
NodeWithTrivia
Trivia (whitespace/comments) attached to AST nodes. A node with attached trivia
OrganizationSuggestion
Import analysis, optimization, and unused import detection. A suggestion for improving import organization
Parser
Recursive descent Perl parser with error recovery and AST generation. Parser state for a single Perl source input.
PositionMapper
Line ending detection and UTF-16 position mapping for LSP compliance. Centralized position mapper using rope for efficiency.
PragmaState
Pragma state tracking for use strict, use warnings, etc. Pragma state at a given point in the code
PragmaTracker
Pragma state tracking for use strict, use warnings, etc. Tracks pragma state throughout a Perl file
RefactoringConfig
Refactoring engine types: configuration, operations, and results. Configuration for refactoring operations
RefactoringEngine
Refactoring engine types: configuration, operations, and results. Unified refactoring engine that coordinates all refactoring operations
RefactoringOperation
Refactoring engine types: configuration, operations, and results. Record of a refactoring operation for rollback support
RefactoringResult
Refactoring engine types: configuration, operations, and results. Result of a refactoring operation
ScopeAnalyzer
Scope analysis issue types and analyzer.
ScopeIssue
Scope analysis issue types and analyzer.
SemanticAnalyzer
Semantic analysis types for hover, tokens, and code understanding. Semantic analyzer providing comprehensive IDE features for Perl code.
SemanticModel
Semantic analysis types for hover, tokens, and code understanding. A stable, query-oriented view of semantic information over a parsed file.
SemanticToken
Semantic analysis types for hover, tokens, and code understanding. A semantic token with type and modifiers for LSP syntax highlighting.
Symbol
Symbol extraction, table, and reference types for navigation. A symbol definition in Perl code with comprehensive metadata for Index/Navigate workflows.
SymbolExtractor
Symbol extraction, table, and reference types for navigation. Extract symbols from an AST for Parse/Index workflows.
SymbolReference
Symbol extraction, table, and reference types for navigation. A reference to a symbol with usage context for Navigate/Analyze workflows.
SymbolTable
Symbol extraction, table, and reference types for navigation. Comprehensive symbol table for Perl code analysis and LSP features in Index/Analyze stages.
Token
Token types and token stream for lexer output. Token produced by the lexer and consumed by the parser.
TokenStream
Token types and token stream for lexer output. Token stream that wraps perl-lexer
TriviaPreservingParser
Trivia-preserving parser and formatting utilities. Parser that preserves trivia
TriviaToken
Trivia (whitespace/comments) attached to AST nodes. A trivia token with position information
TypeBasedCompletion
Type inference types: Perl types, constraints, and inference engine. Type-based code completion suggestions
TypeConstraint
Type inference types: Perl types, constraints, and inference engine. Type constraint for type checking
TypeEnvironment
Type inference types: Perl types, constraints, and inference engine. Type environment for tracking variable types
TypeInferenceEngine
Type inference types: Perl types, constraints, and inference engine. Main type inference engine
TypeLocation
Type inference types: Perl types, constraints, and inference engine. Location information for type errors
UnusedImport
Import analysis, optimization, and unused import detection. An import statement containing unused symbols

Enums§

IssueKind
Scope analysis issue types and analyzer.
LineEnding
Line ending detection and UTF-16 position mapping for LSP compliance. Line ending style detected in a document
ModernizationPattern
Refactoring engine types: configuration, operations, and results. Modernization patterns for legacy code
NodeKind
AST node, node kind enum, and source location types. Comprehensive enumeration of all Perl language constructs supported by the parser.
ParseError
Parse error and result types for parser output. Comprehensive error types that can occur during Perl parsing workflows
PerlType
Type inference types: Perl types, constraints, and inference engine. Represents a Perl type
RefactoringScope
Refactoring engine types: configuration, operations, and results. Scope of refactoring operations
RefactoringType
Refactoring engine types: configuration, operations, and results. Types of refactoring operations supported by the engine
ScalarType
Type inference types: Perl types, constraints, and inference engine. Represents specific scalar types in Perl
SemanticTokenModifier
Semantic analysis types for hover, tokens, and code understanding. Semantic token modifiers for Analyze/Complete stage highlighting.
SemanticTokenType
Semantic analysis types for hover, tokens, and code understanding. Semantic token types for syntax highlighting in the Parse/Complete workflow.
SuggestionPriority
Import analysis, optimization, and unused import detection. Priority level for organization suggestions
SymbolKind
Symbol extraction, table, and reference types for navigation. Unified Perl symbol classification for LSP tooling.
TokenKind
Token types and token stream for lexer output. Token classification for Perl parsing.
Trivia
Trivia (whitespace/comments) attached to AST nodes. Trivia represents non-semantic tokens like comments and whitespace

Functions§

format_with_trivia
Trivia-preserving parser and formatting utilities. Format an AST with trivia back to source code

Type Aliases§

ParseResult
Parse error and result types for parser output. Result type for parser operations in the Perl parsing workflow pipeline
SourceLocation
AST node, node kind enum, and source location types. Type alias for backward compatibility with SourceLocation.