Robin Sparkless - A Rust DataFrame library with PySpark-like API
This library provides a PySpark-compatible API built on top of Polars, offering high-performance data processing in pure Rust.
§Panics and errors
Some functions panic when used with invalid or empty inputs (e.g. calling
when(cond).otherwise(val) without .then(), or passing no columns to
format_string, elt, concat, coalesce, or named_struct in Rust).
In Rust, create_map and array return Result for empty input instead of
panicking. From Python, empty columns for coalesce, format_string,
printf, and named_struct raise ValueError. See the documentation for
each function for details.
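The difference between the two failure styles can be sketched as follows. This is a minimal illustration only: `coalesce_names`, `array_names`, and `safe_coalesce` are hypothetical stand-ins that operate on column names as strings. They are not this crate's functions or signatures, and the error type here is a plain `String` chosen for brevity.

```rust
// Hypothetical stand-ins, NOT this crate's API. They only illustrate
// the two failure styles described above.

/// Panicking style: empty input is treated as a programmer error.
fn coalesce_names(cols: &[&str]) -> String {
    assert!(!cols.is_empty(), "coalesce requires at least one column");
    format!("coalesce({})", cols.join(", "))
}

/// Result style: empty input is reported back to the caller.
fn array_names(cols: &[&str]) -> Result<String, String> {
    if cols.is_empty() {
        return Err("array requires at least one column".to_string());
    }
    Ok(format!("array({})", cols.join(", ")))
}

/// Caller-side guard: check for emptiness before calling a function
/// that would otherwise panic.
fn safe_coalesce(cols: &[&str]) -> Option<String> {
    if cols.is_empty() {
        None
    } else {
        Some(coalesce_names(cols))
    }
}

fn main() {
    let empty: [&str; 0] = [];

    // Result style: the error can be handled without unwinding.
    match array_names(&empty) {
        Ok(expr) => println!("{expr}"),
        Err(msg) => eprintln!("error: {msg}"),
    }

    // Panicking style: guard first when the input may be empty.
    match safe_coalesce(&empty) {
        Some(expr) => println!("{expr}"),
        None => eprintln!("skipped: no columns given"),
    }
}
```

When column lists are built dynamically (for example from user input or a schema), a guard of this kind placed before the call is one way to avoid the panics listed above.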
§API stability
While the crate is in the 0.x series, we follow semver but may introduce breaking changes in minor releases (e.g. 0.1 → 0.2) until 1.0. For behavioral caveats and intentional differences from PySpark, see the repository documentation.
Re-exports§
- pub use column::Column;
- pub use dataframe::CubeRollupData;
- pub use dataframe::DataFrame;
- pub use dataframe::GroupedData;
- pub use dataframe::JoinType;
- pub use dataframe::SaveMode;
- pub use dataframe::WriteFormat;
- pub use dataframe::WriteMode;
- pub use functions::SortOrder;
- pub use schema::StructField;
- pub use schema::StructType;
- pub use session::DataFrameReader;
- pub use session::SparkSession;
- pub use session::SparkSessionBuilder;
- pub use functions::*;
Modules§
- column
- dataframe
- DataFrame module: main tabular type and submodules for transformations, aggregations, joins, stats.
- expression
- functions
- plan
- Plan interpreter: execute a serialized logical plan (list of ops) using the existing DataFrame API.
- schema
- session
- type_coercion