Comprehensive Perl test corpus and property-based testing infrastructure
This crate provides a curated collection of Perl code samples for testing parser correctness, edge case coverage, and LSP feature validation. It includes both manually curated test cases and property-based test generators for comprehensive coverage.
§Architecture
The corpus is organized into several layers:
- Curated Test Cases: Hand-written examples covering Perl syntax edge cases
- Property-Based Generators: Randomized code generation for fuzz testing
- Real-World Samples: Code from CPAN and production Perl projects
- Metadata System: Tag-based organization with section markers and test IDs
§Corpus Organization
Test cases are stored in text files with section markers and metadata:
==========================================
Basic Variable Declaration
==========================================
# @id: vars.basic.my
# @tags: variables, declaration
my $x = 42;
---
(expected AST representation)
Each section includes:
- Title: Human-readable test case name
- Metadata: ID, tags, Perl version requirements, flags
- Body: Perl code to parse
- Expected Output: Optional AST or error expectations (after ---)
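The layout described above can be sketched as a small standalone parser. This is only an illustration of the file format, not the crate's own implementation: `SketchSection`, `is_marker`, and `parse_section` are hypothetical names, and the sketch assumes a marker is a line of four or more `=` characters and that only the `# @id:` and `# @tags:` metadata keys shown in the example need handling.

```rust
/// Hypothetical in-memory form of one corpus section (not the crate's own type).
#[derive(Debug, Default, PartialEq)]
struct SketchSection {
    title: String,
    id: Option<String>,
    tags: Vec<String>,
    body: String,
    expected: Option<String>,
}

/// Assumption: a section marker is a line made only of `=`, at least four long.
fn is_marker(line: &str) -> bool {
    let trimmed = line.trim();
    trimmed.len() >= 4 && trimmed.chars().all(|c| c == '=')
}

/// Parses one section laid out as: marker, title, marker, metadata and body,
/// then an optional `---` line followed by the expected output.
fn parse_section(text: &str) -> Option<SketchSection> {
    let mut lines = text.lines();

    // Header: marker line, title line, marker line.
    if !is_marker(lines.next()?) {
        return None;
    }
    let title = lines.next()?.trim().to_string();
    if !is_marker(lines.next()?) {
        return None;
    }

    let mut section = SketchSection { title, ..Default::default() };
    let mut body: Vec<&str> = Vec::new();
    let mut expected: Vec<&str> = Vec::new();
    let mut in_expected = false;

    for line in lines {
        if in_expected {
            // Everything after the separator is expected output.
            expected.push(line);
        } else if line.trim() == "---" {
            in_expected = true;
        } else if let Some(id) = line.strip_prefix("# @id:") {
            section.id = Some(id.trim().to_string());
        } else if let Some(tags) = line.strip_prefix("# @tags:") {
            section.tags = tags
                .split(',')
                .map(|tag| tag.trim().to_string())
                .filter(|tag| !tag.is_empty())
                .collect();
        } else {
            // Any other line, including ordinary Perl comments, is body text.
            body.push(line);
        }
    }

    section.body = body.join("\n");
    section.expected = if in_expected { Some(expected.join("\n")) } else { None };
    Some(section)
}
```

Fed the example section shown earlier, this sketch would yield the title `Basic Variable Declaration`, the ID `vars.basic.my`, two tags, and the single-line body. Keeping `expected` as an `Option` distinguishes a section with no `---` separator from one whose expected output is empty.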
§Usage
§Loading Corpus Files
use perl_corpus::{CorpusPaths, get_corpus_files};
let files = get_corpus_files();
for file in files {
println!("Found corpus file: {:?}", file.path);
}
§Parsing Corpus Sections
use perl_corpus::parse_file;
use std::path::Path;
let sections = parse_file(path)?;
for section in sections {
println!("Section: {} (id: {})", section.title, section.id);
println!("Tags: {:?}", section.tags);
println!("Code:\n{}", section.body);
}
§Finding Cases by Tag
use perl_corpus::{parse_dir, find_by_tag};
use std::path::Path;
let all_sections = parse_dir(corpus_dir)?;
let regex_tests = find_by_tag(&all_sections, "regex");
println!("Found {} regex test cases", regex_tests.len());
§Using Property-Based Generators
use perl_corpus::{generate_perl_code, generate_perl_code_with_seed, CodegenOptions};
// Generate random valid Perl code
let code = generate_perl_code_with_seed(10, 42);
println!("Generated:\n{}", code);
// Generate with specific options
let options = CodegenOptions::default();
let modern_code = generate_perl_code(&options);
§Specialized Test Case Modules
The corpus includes focused generators for specific Perl features:
§Complex Data Structures
use perl_corpus::{complex_data_structure_cases, find_complex_case};
let cases = complex_data_structure_cases();
if let Some(nested) = find_complex_case("nested-arrays") {
println!("Test: {}", nested.description);
println!("Code:\n{}", nested.code);
}
§Continue/Redo Blocks
use perl_corpus::{continue_redo_cases, valid_continue_redo_cases};
let all_cases = continue_redo_cases();
let valid_only = valid_continue_redo_cases();
§Format Statements
use perl_corpus::{format_statement_cases, FormatStatementGenerator};
let cases = format_statement_cases();
let generator = FormatStatementGenerator::new(42);
§Glob Expressions
use perl_corpus::{glob_expression_cases, GlobExpressionGenerator};
let cases = glob_expression_cases();
let generator = GlobExpressionGenerator::new(42);
§Tie Interface
use perl_corpus::{tie_interface_cases, tie_cases_by_tag};
let all_tie = tie_interface_cases();
let scalar_tie = tie_cases_by_tag("scalar");
§Corpus Layers
The corpus is organized into three layers accessible via CorpusLayer:
- CorpusLayer::Main: Core test cases in test_corpus/
- CorpusLayer::TreeSitter: Tree-sitter grammar tests in tree-sitter-perl/test/corpus/
- CorpusLayer::Fuzz: Fuzzing inputs and edge cases in crates/perl-corpus/fuzz/
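To make the layer-to-directory mapping concrete, here is a minimal sketch. `SketchLayer` is a hypothetical stand-in for the crate's `CorpusLayer` (whose real methods are not shown here), and the sketch assumes the three directories listed above are relative to a workspace root supplied by the caller.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `CorpusLayer`; variant names follow the list above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SketchLayer {
    Main,
    TreeSitter,
    Fuzz,
}

impl SketchLayer {
    /// Directory for this layer as given in the list above.
    /// Assumption: the path is relative to the workspace root.
    fn relative_dir(self) -> &'static str {
        match self {
            SketchLayer::Main => "test_corpus",
            SketchLayer::TreeSitter => "tree-sitter-perl/test/corpus",
            SketchLayer::Fuzz => "crates/perl-corpus/fuzz",
        }
    }

    /// Joins the layer directory onto a caller-supplied workspace root.
    fn dir_under(self, workspace_root: &Path) -> PathBuf {
        workspace_root.join(self.relative_dir())
    }
}
```

A caller could then iterate over `[SketchLayer::Main, SketchLayer::TreeSitter, SketchLayer::Fuzz]` and call `dir_under` on each to visit every layer in turn.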
§Environment Configuration
Override the corpus root with the CORPUS_ROOT environment variable:
export CORPUS_ROOT=/path/to/custom/corpus
cargo test
§Integration with Parser Testing
The corpus integrates with perl-parser test suites:
use perl_parser::Parser;
use perl_corpus::{parse_dir, find_by_tag};
let sections = parse_dir(corpus_dir)?;
let regex_cases = find_by_tag(&sections, "regex");
for case in regex_cases {
let mut parser = Parser::new(&case.body);
let result = parser.parse();
assert!(result.is_ok(), "Failed to parse: {}", case.title);
}
§Test Case Validation
Corpus files can include validation flags:
- parser-sensitive: Requires specific parser version
- perl-version:5.26: Requires Perl 5.26+ features
- expected-error: Test case should produce parse error
- wip: Work in progress, may not parse correctly yet
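One way a test harness might interpret these flags is sketched below. The names `SketchFlag`, `parse_flag`, and `outcome_acceptable` are hypothetical and not part of the crate's API. The sketch assumes that `expected-error` inverts the pass condition and that `wip` cases are never counted as failures; the crate's own handling may differ.

```rust
/// Hypothetical typed form of the validation flags listed above.
#[derive(Debug, PartialEq)]
enum SketchFlag {
    ParserSensitive,
    PerlVersion { major: u32, minor: u32 },
    ExpectedError,
    Wip,
    Unknown(String),
}

/// Converts one raw flag string (for example "perl-version:5.26") into a typed flag.
fn parse_flag(raw: &str) -> SketchFlag {
    let raw = raw.trim();
    if let Some(version) = raw.strip_prefix("perl-version:") {
        // Split "5.26" into major and minor parts.
        let mut parts = version.trim().splitn(2, '.');
        let major = parts.next().and_then(|p| p.parse::<u32>().ok());
        let minor = parts.next().and_then(|p| p.parse::<u32>().ok());
        return match (major, minor) {
            (Some(major), Some(minor)) => SketchFlag::PerlVersion { major, minor },
            _ => SketchFlag::Unknown(raw.to_string()),
        };
    }
    match raw {
        "parser-sensitive" => SketchFlag::ParserSensitive,
        "expected-error" => SketchFlag::ExpectedError,
        "wip" => SketchFlag::Wip,
        other => SketchFlag::Unknown(other.to_string()),
    }
}

/// Decides whether a parse outcome is acceptable for a case with these flags.
/// Assumption: `wip` cases always pass; `expected-error` cases pass only on failure.
fn outcome_acceptable(flags: &[SketchFlag], parsed_ok: bool) -> bool {
    if flags.contains(&SketchFlag::Wip) {
        return true;
    }
    let expect_error = flags.contains(&SketchFlag::ExpectedError);
    parsed_ok != expect_error
}
```

Mapping unrecognized strings to `Unknown` rather than returning an error lets a harness warn about typos in flags without aborting the whole run.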
§Contributing Test Cases
To add new test cases:
- Create or edit a corpus file in test_corpus/
- Use section markers (====) to separate cases
- Add metadata tags for categorization
- Include expected output after the --- separator
- Run cargo test to validate
See existing corpus files for examples and conventions.
§Re-exports
pub use api::*;
§Modules
- api
- cases
- Static edge case fixtures and complex data structure samples.
- codegen
- Randomized Perl code generation utilities.
- concepts
- continue_redo
- Continue and redo loop control statement test fixtures.
- files
- Corpus file discovery helpers.
- fixture_expectations
- format_statements
- Format statement test fixtures for Perl LSP corpus.
- gen
- glob_expressions
- Glob expression test fixtures for file pattern matching and diamond operator.
- gold
- index
- inventory
- lint
- loading
- meta
- metadata
- prelude
- sidecar
- tie_interface
- Tie/untie interface corpus - comprehensive test fixtures for Perl’s tie mechanism.