Apache Spark Connect Client for Rust
This project houses an experimental Spark Connect client for Apache Spark, written in Rust.
Current State of the Project
The Spark Connect client for Rust is highly experimental and should not be used in any production setting. It is currently a proof of concept for identifying ways to interact with a Spark cluster from Rust.
Quick Start
The spark-connect-rs crate aims to provide an entry point to Spark Connect and DataFrame API interactions similar to those in Spark's other clients.
use spark_connect_rs;
use ;
async
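A Spark Connect client reaches the server through a connection string of the form `sc://host:port/;key=value`. The sketch below shows one way such a string could be split into its parts using only the standard library. `ConnectTarget` and `parse_connect_string` are hypothetical names for illustration and are not part of spark-connect-rs. The address and `user_id` value are placeholders, and the fallback to port 15002 assumes the Spark Connect default.

```rust
use std::collections::BTreeMap;

/// Hypothetical holder for the parts of a Spark Connect connection string.
/// This is an illustration only, not a type from spark-connect-rs.
#[derive(Debug, PartialEq)]
struct ConnectTarget {
    host: String,
    port: u16,
    params: BTreeMap<String, String>,
}

/// Parse `sc://host[:port][/;key=value;...]`.
/// Limitation: bracketed IPv6 hosts are not handled in this sketch.
fn parse_connect_string(s: &str) -> Result<ConnectTarget, String> {
    let rest = s
        .strip_prefix("sc://")
        .ok_or_else(|| "connection string must start with sc://".to_string())?;

    // Separate the authority (host:port) from the optional parameter path.
    let (authority, path) = match rest.split_once('/') {
        Some((a, p)) => (a, p),
        None => (rest, ""),
    };

    // Use the explicit port if present, otherwise fall back to 15002.
    let (host, port) = match authority.rsplit_once(':') {
        Some((h, p)) => (h, p.parse::<u16>().map_err(|e| format!("bad port: {e}"))?),
        None => (authority, 15002),
    };
    if host.is_empty() {
        return Err("missing host".to_string());
    }

    // Parameters are `;`-separated `key=value` pairs; skip empty segments.
    let mut params = BTreeMap::new();
    for pair in path.split(';').filter(|p| !p.is_empty()) {
        let (k, v) = pair
            .split_once('=')
            .ok_or_else(|| format!("parameter without '=': {pair}"))?;
        params.insert(k.to_string(), v.to_string());
    }

    Ok(ConnectTarget { host: host.to_string(), port, params })
}

fn main() {
    // Placeholder address matching a locally running Spark Connect server.
    let target = parse_connect_string("sc://127.0.0.1:15002/;user_id=example_rs").unwrap();
    println!("{target:?}");
}
```

Returning a `Result` keeps malformed strings from turning into a panic at connection time. A real client would likely validate known parameter names as well.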
Getting Started
git clone https://github.com/sjrusso8/spark-connect-rs.git
cd spark-connect-rs
git submodule update --init --recursive
docker compose up --build -d
cargo build && cargo test
Features
This section lists some of the functions that are implemented and working with a Spark Connect session.
SparkSession
| SparkSession | API | Comment |
|---|---|---|
| range | | |
| sql | | Does not include the new Spark Connect 3.5 feature with positional arguments |
| read | | |
| createDataFrame | | |
| getActiveSession | | |
| many more!! | | |
DataFrame
| DataFrame | API | Comment |
|---|---|---|
| select | | |
| selectExpr | | Does not include the new Spark Connect 3.5 feature with positional arguments |
| filter | | |
| createTempView | | Currently has an error; the functions are private until it is fixed |
| show | | |
| tail | | |
| withColumns | | |
| drop | | |
| sort | | |
| groupBy | | |
| many more! | | |