//! Comprehensive Perl test corpus and property-based testing infrastructure
//!
//! This crate provides a curated collection of Perl code samples for testing parser correctness,
//! edge case coverage, and LSP feature validation. It includes both manually curated test cases
//! and property-based test generators for comprehensive coverage.
//!
//! # Architecture
//!
//! The corpus combines four components:
//!
//! - **Curated Test Cases**: Hand-written examples covering Perl syntax edge cases
//! - **Property-Based Generators**: Randomized code generation for fuzz testing
//! - **Real-World Samples**: Code from CPAN and production Perl projects
//! - **Metadata System**: Tag-based organization with section markers and test IDs
//!
//! # Corpus Organization
//!
//! Test cases are stored in text files with section markers and metadata:
//!
//! ```text
//! ==========================================
//! Basic Variable Declaration
//! ==========================================
//! # @id: vars.basic.my
//! # @tags: variables, declaration
//! my $x = 42;
//! ---
//! (expected AST representation)
//! ```
//!
//! Each section includes:
//! - **Title**: Human-readable test case name
//! - **Metadata**: ID, tags, Perl version requirements, flags
//! - **Body**: Perl code to parse
//! - **Expected Output**: Optional AST or error expectations (after `---`)
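//!
//! As a rough illustration of this layout, the sketch below splits a corpus file
//! into sections by hand. It is not the crate's implementation (use `parse_file`
//! for real work). `SketchSection`, `is_marker`, and `split_sections` are
//! hypothetical names, and treating any line of four or more `=` characters as a
//! marker is an assumption.
//!
//! ```rust
//! /// Hypothetical section record for this sketch; not the crate's own type.
//! #[derive(Debug, Default, PartialEq)]
//! struct SketchSection {
//!     title: String,
//!     id: Option<String>,
//!     tags: Vec<String>,
//!     body: String,
//!     expected: Option<String>,
//! }
//!
//! /// Assumption: a marker is a line made only of `=` characters (at least four).
//! fn is_marker(line: &str) -> bool {
//!     let trimmed = line.trim();
//!     trimmed.len() >= 4 && trimmed.chars().all(|c| c == '=')
//! }
//!
//! fn split_sections(input: &str) -> Vec<SketchSection> {
//!     let lines: Vec<&str> = input.lines().collect();
//!     let mut sections = Vec::new();
//!     let mut current: Option<SketchSection> = None;
//!     let mut i = 0;
//!     while i < lines.len() {
//!         // A header is three lines: marker, title, marker.
//!         if is_marker(lines[i]) && i + 2 < lines.len() && is_marker(lines[i + 2]) {
//!             if let Some(done) = current.take() {
//!                 sections.push(done);
//!             }
//!             current = Some(SketchSection {
//!                 title: lines[i + 1].trim().to_string(),
//!                 ..Default::default()
//!             });
//!             i += 3;
//!             continue;
//!         }
//!         if let Some(section) = current.as_mut() {
//!             let line = lines[i];
//!             if let Some(expected) = section.expected.as_mut() {
//!                 // Everything after the first `---` is expected output.
//!                 expected.push_str(line);
//!                 expected.push('\n');
//!             } else if line.trim() == "---" {
//!                 section.expected = Some(String::new());
//!             } else if let Some(id) = line.strip_prefix("# @id:") {
//!                 section.id = Some(id.trim().to_string());
//!             } else if let Some(tags) = line.strip_prefix("# @tags:") {
//!                 section.tags = tags.split(',').map(|t| t.trim().to_string()).collect();
//!             } else {
//!                 section.body.push_str(line);
//!                 section.body.push('\n');
//!             }
//!         }
//!         i += 1;
//!     }
//!     // Flush the last open section, if any.
//!     sections.extend(current);
//!     sections
//! }
//! ```
//!
//! Applied to the sample above, `split_sections` would return a single section
//! titled `Basic Variable Declaration` with id `vars.basic.my` and two tags.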
//!
//! # Usage
//!
//! ## Loading Corpus Files
//!
//! ```rust,ignore
//! use perl_corpus::{CorpusPaths, get_corpus_files};
//!
//! # fn main() -> Result<(), Box<dyn std::error::Error>> {
//! let files = get_corpus_files();
//!
//! for file in files {
//!     println!("Found corpus file: {:?}", file.path);
//! }
//! # Ok(())
//! # }
//! ```
//!
//! ## Parsing Corpus Sections
//!
//! ```rust
//! use perl_corpus::parse_file;
//! use std::path::Path;
//!
//! # fn example() -> anyhow::Result<()> {
//! # let path = Path::new("test_corpus/variables.txt");
//! # if !path.exists() { return Ok(()); }
//! let sections = parse_file(path)?;
//!
//! for section in sections {
//!     println!("Section: {} (id: {})", section.title, section.id);
//!     println!("Tags: {:?}", section.tags);
//!     println!("Code:\n{}", section.body);
//! }
//! # Ok(())
//! # }
//! ```
//!
//! ## Finding Cases by Tag
//!
//! ```rust
//! use perl_corpus::{parse_dir, find_by_tag};
//! use std::path::Path;
//!
//! # fn example() -> anyhow::Result<()> {
//! # let corpus_dir = Path::new("test_corpus");
//! # if !corpus_dir.exists() { return Ok(()); }
//! let all_sections = parse_dir(corpus_dir)?;
//! let regex_tests = find_by_tag(&all_sections, "regex");
//!
//! println!("Found {} regex test cases", regex_tests.len());
//! # Ok(())
//! # }
//! ```
//!
//! ## Using Property-Based Generators
//!
//! ```rust,ignore
//! use perl_corpus::{generate_perl_code, generate_perl_code_with_seed, CodegenOptions};
//!
//! // Generate random valid Perl code
//! let code = generate_perl_code_with_seed(10, 42);
//! println!("Generated:\n{}", code);
//!
//! // Generate with specific options
//! let options = CodegenOptions::default();
//! let modern_code = generate_perl_code(&options);
//! ```
//!
//! ## Specialized Test Case Modules
//!
//! The corpus includes focused generators for specific Perl features:
//!
//! ### Complex Data Structures
//!
//! ```rust,ignore
//! use perl_corpus::{complex_data_structure_cases, find_complex_case};
//!
//! let cases = complex_data_structure_cases();
//! if let Some(nested) = find_complex_case("nested-arrays") {
//!     println!("Test: {}", nested.description);
//!     println!("Code:\n{}", nested.code);
//! }
//! ```
//!
//! ### Continue/Redo Blocks
//!
//! ```rust
//! use perl_corpus::{continue_redo_cases, valid_continue_redo_cases};
//!
//! let all_cases = continue_redo_cases();
//! let valid_only = valid_continue_redo_cases();
//! ```
//!
//! ### Format Statements
//!
//! ```rust,ignore
//! use perl_corpus::{format_statement_cases, FormatStatementGenerator};
//!
//! let cases = format_statement_cases();
//! let generator = FormatStatementGenerator::new(42);
//! ```
//!
//! ### Glob Expressions
//!
//! ```rust,ignore
//! use perl_corpus::{glob_expression_cases, GlobExpressionGenerator};
//!
//! let cases = glob_expression_cases();
//! let generator = GlobExpressionGenerator::new(42);
//! ```
//!
//! ### Tie Interface
//!
//! ```rust
//! use perl_corpus::{tie_interface_cases, tie_cases_by_tag};
//!
//! let all_tie = tie_interface_cases();
//! let scalar_tie = tie_cases_by_tag("scalar");
//! ```
//!
//! # Corpus Layers
//!
//! The corpus is organized into three layers accessible via [`CorpusLayer`]:
//!
//! - **`CorpusLayer::Main`**: Core test cases in `test_corpus/`
//! - **`CorpusLayer::TreeSitter`**: Tree-sitter grammar tests in `tree-sitter-perl/test/corpus/`
//! - **`CorpusLayer::Fuzz`**: Fuzzing inputs and edge cases in `crates/perl-corpus/fuzz/`
//!
//! ## Environment Configuration
//!
//! Override the corpus root with the `CORPUS_ROOT` environment variable:
//!
//! ```bash
//! export CORPUS_ROOT=/path/to/custom/corpus
//! cargo test
//! ```
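//!
//! The override logic can be pictured with the sketch below. It is illustrative
//! only: `resolve_corpus_root` and `corpus_root_from_env` are hypothetical
//! helpers, the `test_corpus` fallback is assumed from the main layer's path, and
//! ignoring an empty `CORPUS_ROOT` is an assumption about sensible behavior rather
//! than a statement of what this crate does.
//!
//! ```rust
//! use std::env;
//! use std::path::PathBuf;
//!
//! /// Hypothetical helper: use the override when it is set and non-empty,
//! /// otherwise fall back to the supplied default root.
//! fn resolve_corpus_root(override_value: Option<String>, default_root: &str) -> PathBuf {
//!     match override_value {
//!         Some(value) if !value.trim().is_empty() => PathBuf::from(value),
//!         _ => PathBuf::from(default_root),
//!     }
//! }
//!
//! /// Reads `CORPUS_ROOT` from the process environment (assumed default: `test_corpus`).
//! fn corpus_root_from_env() -> PathBuf {
//!     resolve_corpus_root(env::var("CORPUS_ROOT").ok(), "test_corpus")
//! }
//! ```
//!
//! Keeping the environment lookup separate from the decision makes the fallback
//! rule testable without mutating process-wide state.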
//!
//! # Integration with Parser Testing
//!
//! The corpus integrates with `perl-parser` test suites:
//!
//! ```rust,ignore
//! use perl_parser::Parser;
//! use perl_corpus::{parse_dir, find_by_tag};
//!
//! # fn test_parser_with_corpus() -> anyhow::Result<()> {
//! # let corpus_dir = std::path::Path::new("test_corpus");
//! let sections = parse_dir(corpus_dir)?;
//! let regex_cases = find_by_tag(&sections, "regex");
//!
//! for case in regex_cases {
//!     let mut parser = Parser::new(&case.body);
//!     let result = parser.parse();
//!     assert!(result.is_ok(), "Failed to parse: {}", case.title);
//! }
//! # Ok(())
//! # }
//! ```
//!
//! # Test Case Validation
//!
//! Corpus files can include validation flags:
//!
//! - **`parser-sensitive`**: Requires a specific parser version
//! - **`perl-version:5.26`**: Requires Perl 5.26+ features
//! - **`expected-error`**: The test case should produce a parse error
//! - **`wip`**: Work in progress; may not parse correctly yet
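//!
//! One way a test harness might act on these flags is sketched below. The
//! function names and the skip policy are hypothetical, and the sketch assumes
//! the flags of a case are already available as string slices; how this crate
//! stores or interprets them is not shown here.
//!
//! ```rust
//! /// Hypothetical: extract `(major, minor)` from a `perl-version:5.26` flag.
//! fn required_perl_version(flags: &[&str]) -> Option<(u32, u32)> {
//!     flags.iter().find_map(|flag| {
//!         let version = flag.strip_prefix("perl-version:")?;
//!         let (major, minor) = version.split_once('.')?;
//!         Some((major.parse::<u32>().ok()?, minor.parse::<u32>().ok()?))
//!     })
//! }
//!
//! /// Assumed policy: skip `wip` cases and cases that need a newer Perl.
//! fn should_run(flags: &[&str], available: (u32, u32)) -> bool {
//!     if flags.contains(&"wip") {
//!         return false;
//!     }
//!     match required_perl_version(flags) {
//!         // Tuples compare lexicographically, so (5, 8) < (5, 26).
//!         Some(required) => available >= required,
//!         None => true,
//!     }
//! }
//!
//! /// An `expected-error` case passes when parsing fails; any other case
//! /// passes when parsing succeeds.
//! fn outcome_matches(flags: &[&str], parse_succeeded: bool) -> bool {
//!     flags.contains(&"expected-error") != parse_succeeded
//! }
//! ```
//!
//! Comparing versions as integer pairs rather than as floating-point numbers
//! avoids ordering 5.8 above 5.26.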
//!
//! # Contributing Test Cases
//!
//! To add new test cases:
//!
//! 1. Create or edit a corpus file in `test_corpus/`
//! 2. Use section markers (`====`) to separate cases
//! 3. Add metadata tags for categorization
//! 4. Include expected output after `---` separator
//! 5. Run `cargo test` to validate
//!
//! See existing corpus files for examples and conventions.
// Corpus crate - focus on core clippy lints only
// Lint enforcement: library code must use tracing, not direct stderr/stdout prints.
pub use *;