Oak CSV Parser
High-performance incremental CSV parser for the oak ecosystem with flexible configuration, optimized for data processing and analysis.
🎯 Overview
Oak-csv is a CSV parser built on oak-core. It handles the full CSV syntax, including quoted fields and optional header rows, and offers both a high-level API and a detailed AST for data processing and analysis.
✨ Features
- Complete CSV Syntax: Handles quoted fields, field delimiters, and optional header rows
- Full AST Generation: Generates comprehensive Abstract Syntax Trees
- Lexer Support: Built-in tokenization with proper span information
- Error Recovery: Graceful handling of syntax errors with detailed diagnostics
🚀 Quick Start
Basic example:
use oak_csv::CsvParser;
📋 Parsing Examples
Document Parsing
use oak_csv::CsvParser;
let parser = CsvParser::new();
let csv_content = r#"
product_id,product_name,price,stock
001,Smartphone,599.99,50
002,Laptop,1299.99,25
003,Headphones,199.99,100
"#;
let document = parser.parse_document(csv_content)?;
println!;
println!;
Record Parsing
use oak_csv::CsvParser;
let parser = CsvParser::new();
let csv_content = "Alice,28,Engineer,Seattle";
let record = parser.parse_record(csv_content)?;
println!;
println!;
Field Parsing
use oak_csv::CsvParser;
let parser = CsvParser::new();
let field_content = "\"John \"JD\" Doe\"";
let field = parser.parse_field(field_content)?;
println!;
println!;
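For reference, RFC 4180 writes a literal quote inside a quoted field as two consecutive quotes. The sketch below shows that convention with the standard library only. `unquote_field` is a hypothetical helper for illustration and is not part of the oak-csv API; how oak-csv itself treats bare inner quotes is not covered here.

```rust
/// Decode one CSV field using the RFC 4180 quoting convention:
/// a quoted field is wrapped in double quotes, and a literal quote
/// inside it is written as two consecutive quotes.
/// Hypothetical helper for illustration; not part of oak-csv.
fn unquote_field(raw: &str) -> String {
    // Fields without surrounding quotes are taken verbatim.
    if raw.len() < 2 || !raw.starts_with('"') || !raw.ends_with('"') {
        return raw.to_string();
    }
    // Strip the surrounding quotes, then collapse each doubled quote.
    let inner = &raw[1..raw.len() - 1];
    inner.replace("\"\"", "\"")
}
```

Under this convention the name from the example above would be written as `"John ""JD"" Doe"` in the file and decoded back to `John "JD" Doe`.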
🔧 Advanced Features
Token-Level Parsing
use oak_csv::CsvParser;
let parser = CsvParser::new();
let tokens = parser.tokenize?;
for token in tokens
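To make the idea of tokens with span information concrete, here is a minimal standalone sketch of a lexer for a single CSV line. The `TokenKind`, `Token`, and `tokenize_line` names are assumptions made for this illustration and do not describe oak-csv's actual token types.

```rust
/// Token kinds for one CSV line. Hypothetical names, not oak-csv's.
#[derive(Debug, PartialEq)]
enum TokenKind {
    Field,
    Comma,
}

/// A token with a byte span (`start..end`) into the source line.
#[derive(Debug, PartialEq)]
struct Token {
    kind: TokenKind,
    start: usize,
    end: usize,
}

/// Split a line into field and comma tokens, treating commas
/// inside double quotes as part of the field.
fn tokenize_line(line: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut field_start = 0;
    let mut in_quotes = false;
    for (i, ch) in line.char_indices() {
        match ch {
            // A doubled quote toggles twice, so the state stays correct.
            '"' => in_quotes = !in_quotes,
            ',' if !in_quotes => {
                tokens.push(Token { kind: TokenKind::Field, start: field_start, end: i });
                tokens.push(Token { kind: TokenKind::Comma, start: i, end: i + 1 });
                field_start = i + 1;
            }
            _ => {}
        }
    }
    // The last field runs to the end of the line.
    tokens.push(Token { kind: TokenKind::Field, start: field_start, end: line.len() });
    tokens
}
```

Because each token carries a byte span, the original text of a token can be recovered by slicing the source line with `&line[token.start..token.end]`.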
Error Handling
use oak_csv::CsvParser;
let parser = CsvParser::new();
let invalid_csv = r#"
name,age,city
John,25,NYC
Jane,30 // Missing field
Bob,35,London,UK
"#;
match parser.parse_document
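The kind of diagnostic this example is after, a record whose field count differs from the header's, can be sketched independently of the parser. The code below uses only the standard library and a naive comma split that ignores quoting; `FieldCountError` and `check_field_counts` are hypothetical names, not oak-csv types.

```rust
/// A record whose field count differs from the header's.
/// Hypothetical type for illustration; not part of oak-csv.
#[derive(Debug, PartialEq)]
struct FieldCountError {
    line: usize,     // 1-based line number in the input
    expected: usize, // number of header fields
    found: usize,    // number of fields in this record
}

/// Compare every non-blank line against the first non-blank line
/// (taken as the header) and report mismatched field counts.
/// Uses a naive split on ',' and does not handle quoted commas.
fn check_field_counts(csv: &str) -> Vec<FieldCountError> {
    let mut lines = csv
        .lines()
        .enumerate()
        .filter(|(_, l)| !l.trim().is_empty());
    let expected = match lines.next() {
        Some((_, header)) => header.split(',').count(),
        None => return Vec::new(),
    };
    lines
        .filter_map(|(idx, line)| {
            let found = line.split(',').count();
            (found != expected).then(|| FieldCountError { line: idx + 1, expected, found })
        })
        .collect()
}
```

Collecting every mismatch instead of stopping at the first one mirrors the error-recovery goal described in the feature list: one bad record should not hide problems in later ones.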
🏗️ AST Structure
The parser generates a comprehensive AST with the following main structures:
- Document: Root container for CSV documents
- Record: CSV records with field values
- Field: Individual CSV fields with optional quoting
- Header: Optional header row with column names
- Delimiter: Field delimiter (usually comma)
- Quote: Quote character for quoted fields
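As a rough picture of how these structures could relate, the following sketch models them as plain Rust types and fills them with a naive builder. The type layout, field names, and `build_document` function are assumptions for illustration only; the actual oak-csv AST may be organized differently.

```rust
/// One CSV field and whether it was quoted in the source.
#[derive(Debug, PartialEq)]
struct Field {
    value: String,
    quoted: bool,
}

/// One CSV record: an ordered list of fields.
#[derive(Debug, PartialEq)]
struct Record {
    fields: Vec<Field>,
}

/// Root container: optional header, data records, and the
/// delimiter and quote characters used to read them.
#[derive(Debug, PartialEq)]
struct Document {
    header: Option<Record>,
    records: Vec<Record>,
    delimiter: char,
    quote: char,
}

/// Naive row parser: splits on ',' and strips surrounding quotes.
/// It does not handle delimiters inside quoted fields.
fn parse_row(line: &str) -> Record {
    let fields = line
        .split(',')
        .map(|raw| {
            let quoted = raw.len() >= 2 && raw.starts_with('"') && raw.ends_with('"');
            let value = if quoted { raw[1..raw.len() - 1].to_string() } else { raw.to_string() };
            Field { value, quoted }
        })
        .collect();
    Record { fields }
}

/// Build a Document from text, optionally treating the first
/// non-blank line as the header row.
fn build_document(text: &str, has_header: bool) -> Document {
    let mut rows = text.lines().filter(|l| !l.trim().is_empty()).map(parse_row);
    let header = if has_header { rows.next() } else { None };
    Document { header, records: rows.collect(), delimiter: ',', quote: '"' }
}
```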
📊 Performance
- Streaming: Parse large CSV files without loading entirely into memory
- Incremental: Re-parse only changed sections
- Memory Efficient: Smart AST node allocation
- Fast Recovery: Quick error recovery for better IDE integration
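The streaming point can be illustrated without the parser itself: the sketch below reads records one line at a time from any `BufRead` source, so only the current line is held in memory. It uses the standard library only, assumes a header row and no quoted delimiters, and `sum_column` is a hypothetical helper rather than an oak-csv function.

```rust
use std::io::{self, BufRead};

/// Sum one numeric column while reading records line by line.
/// Assumes the first line is a header and fields contain no
/// quoted commas. Cells that are missing or non-numeric are skipped.
fn sum_column<R: BufRead>(reader: R, column: usize) -> io::Result<f64> {
    let mut total = 0.0;
    // skip(1) drops the header row.
    for line in reader.lines().skip(1) {
        let line = line?;
        if let Some(cell) = line.split(',').nth(column) {
            if let Ok(value) = cell.trim().parse::<f64>() {
                total += value;
            }
        }
    }
    Ok(total)
}
```

In practice the reader would typically be a `std::io::BufReader` wrapped around a `std::fs::File`; an in-memory byte slice also works because `&[u8]` implements `BufRead`.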
🔗 Integration
Oak-csv is intended for use in:
- Data Processing: Extract data from CSV files
- Configuration Files: Parse CSV configuration files
- Data Analysis: Process CSV data for analysis
- IDE Support: Language server protocol compatibility
- ETL Pipelines: CSV parsing for data transformation
📚 Examples
The examples directory covers:
- Complete CSV document parsing
- Record and field analysis
- Data extraction and transformation
- Integration with development workflows
🤝 Contributing
Contributions are welcome!
Please open an issue or submit a pull request at the project repository.