spark-ddl-parser
A zero-dependency Rust crate for parsing PySpark DDL schema strings into structured types. API-compatible in behavior with the Python package of the same name.
Features
- Zero dependencies (optional serde feature)
- PySpark compatible – parses standard PySpark DDL format
- Type safe – returns structured enums and structs
- Comprehensive – supports all PySpark data types (nested structs, arrays, maps, decimal)
- Well tested – 150+ tests ported from the Python suite
Installation
Add to your Cargo.toml:
[dependencies]
spark-ddl-parser = "0.1"
With optional serde support for Serialize/Deserialize:
[dependencies]
spark-ddl-parser = { version = "0.1", features = ["serde"] }
Quick start
use spark_ddl_parser::parse_ddl_schema;
let schema = parse_ddl_schema("id long, name string").unwrap();
assert_eq!;
assert_eq!;
assert_eq!;
assert_eq!;
Supported types
- Simple: string, int, integer, long, bigint, double, float, boolean, date, timestamp, binary, short, byte, etc.
- Arrays: array<string>, array<long>
- Maps: map<string,int>, map<string,array<long>>
- Structs: struct<name:string,age:int>
- Decimal: decimal(10,2) (with precision and scale)
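To make the shapes in this list concrete, here is a minimal, self-contained sketch of how such types could be modeled as a recursive Rust enum and rendered back to DDL. The name SketchType and its variants are hypothetical and chosen for illustration only; the crate's actual enum and field names may differ.

```rust
use std::fmt;

// Hypothetical model of the type shapes listed above.
// This is NOT the crate's real type; names are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
enum SketchType {
    Simple(String),
    Array(Box<SketchType>),
    Map(Box<SketchType>, Box<SketchType>),
    Struct(Vec<(String, SketchType)>),
    Decimal { precision: u8, scale: u8 },
}

impl fmt::Display for SketchType {
    // Render the type back into its DDL spelling, recursing into nested types.
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            SketchType::Simple(name) => write!(f, "{}", name),
            SketchType::Array(elem) => write!(f, "array<{}>", elem),
            SketchType::Map(key, value) => write!(f, "map<{},{}>", key, value),
            SketchType::Struct(fields) => {
                write!(f, "struct<")?;
                for (i, (name, ty)) in fields.iter().enumerate() {
                    if i > 0 {
                        write!(f, ",")?;
                    }
                    write!(f, "{}:{}", name, ty)?;
                }
                write!(f, ">")
            }
            SketchType::Decimal { precision, scale } => {
                write!(f, "decimal({},{})", precision, scale)
            }
        }
    }
}

fn main() {
    let nested = SketchType::Map(
        Box::new(SketchType::Simple("string".into())),
        Box::new(SketchType::Array(Box::new(SketchType::Simple("long".into())))),
    );
    let person = SketchType::Struct(vec![
        ("name".to_string(), SketchType::Simple("string".into())),
        ("age".to_string(), SketchType::Simple("int".into())),
    ]);
    let money = SketchType::Decimal { precision: 10, scale: 2 };

    println!("{}", nested); // map<string,array<long>>
    println!("{}", person); // struct<name:string,age:int>
    println!("{}", money);  // decimal(10,2)
}
```

Boxing the element, key, and value types is what lets arrays, maps, and structs nest to arbitrary depth in this sketch.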
Both space and colon separators are supported between a field name and its type: "id long, name string" or "id:long, name:string".
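As an illustration of what accepting both separators involves, the sketch below splits a DDL string into fields at top-level commas (ignoring commas nested inside <...> or (...)) and then splits each field into a name and a type at the first colon or whitespace. The helpers split_top_level and split_field are hypothetical and are not part of the crate's API; the crate's own parser may work differently.

```rust
// Split a DDL string at commas that are not nested inside <...> or (...).
// Hypothetical helper for illustration; not the crate's implementation.
fn split_top_level(ddl: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let mut depth = 0usize;
    let mut start = 0;
    for (i, c) in ddl.char_indices() {
        match c {
            '<' | '(' => depth += 1,
            '>' | ')' => depth = depth.saturating_sub(1),
            ',' if depth == 0 => {
                parts.push(ddl[start..i].trim());
                start = i + 1;
            }
            _ => {}
        }
    }
    let last = ddl[start..].trim();
    if !last.is_empty() {
        parts.push(last);
    }
    parts
}

// Split one field into (name, type) at the first colon or whitespace.
// Returns None when no separator is found or either side is empty.
fn split_field(field: &str) -> Option<(&str, &str)> {
    let is_sep = |c: char| c == ':' || c.is_whitespace();
    let idx = field.find(is_sep)?;
    let name = field[..idx].trim();
    let ty = field[idx..].trim_start_matches(is_sep).trim();
    if name.is_empty() || ty.is_empty() {
        None
    } else {
        Some((name, ty))
    }
}

fn main() {
    for ddl in ["id long, name string", "id:long, name:string"] {
        let fields: Vec<(&str, &str)> = split_top_level(ddl)
            .into_iter()
            .filter_map(split_field)
            .collect();
        // Both spellings yield the same (name, type) pairs.
        println!("{:?}", fields);
    }
}
```

Because only the first separator in a field is treated as the name/type boundary, colons inside a nested type such as struct<name:string,age:int> stay part of the type text.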
License
MIT – see LICENSE in the repository root.