pub struct SparkSession { /* private fields */ }
Main entry point for creating DataFrames and executing queries. Similar to PySpark’s SparkSession, but using Polars as the backend.
Implementations
impl SparkSession
pub fn new( app_name: Option<String>, master: Option<String>, config: HashMap<String, String>, ) -> Self
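new constructs a session directly from an optional app name, optional master, and a config map. A minimal sketch with illustrative values (the builder shown in the examples below is the more common entry point):
use std::collections::HashMap;
use robin_sparkless::session::SparkSession;
let spark = SparkSession::new(
    Some("my_app".to_string()),
    None,
    HashMap::new(),
);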
pub fn create_or_replace_temp_view(&self, name: &str, df: DataFrame)
Register a DataFrame as a temporary view (PySpark: createOrReplaceTempView). The view is session-scoped and is dropped when the session is dropped.
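A minimal sketch that registers a view and reads it back with table (data and names are illustrative):
use robin_sparkless::session::SparkSession;
let spark = SparkSession::builder().app_name("views").get_or_create();
let df = spark.create_dataframe(
    vec![(1, 25, "Alice".to_string()), (2, 30, "Bob".to_string())],
    vec!["id", "age", "name"],
)?;
spark.create_or_replace_temp_view("people", df);
let people = spark.table("people")?;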
pub fn create_global_temp_view(&self, name: &str, df: DataFrame)
Global temp view (PySpark: createGlobalTempView). Persists across sessions within the same process.
pub fn create_or_replace_global_temp_view(&self, name: &str, df: DataFrame)
Global temp view (PySpark: createOrReplaceGlobalTempView). Persists across sessions within the same process.
pub fn drop_temp_view(&self, name: &str)
Drop a temporary view by name (PySpark: catalog.dropTempView). No error if the view does not exist.
pub fn drop_global_temp_view(&self, name: &str) -> bool
Drop a global temporary view (PySpark: catalog.dropGlobalTempView). Removes from process-wide catalog.
pub fn register_table(&self, name: &str, df: DataFrame)
Register a DataFrame as a saved table (PySpark: saveAsTable). Inserts into the tables catalog only.
pub fn get_saved_table(&self, name: &str) -> Option<DataFrame>
Get a saved table by name (tables map only). Returns None if not in saved tables (temp views not checked).
pub fn saved_table_exists(&self, name: &str) -> bool
True if the name exists in the saved-tables map (not temp views).
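A sketch of the saved-table flow across register_table, saved_table_exists, and get_saved_table (data and names are illustrative):
use robin_sparkless::session::SparkSession;
let spark = SparkSession::builder().app_name("tables").get_or_create();
let df = spark.create_dataframe(
    vec![(1, 25, "Alice".to_string())],
    vec!["id", "age", "name"],
)?;
spark.register_table("people_saved", df);
assert!(spark.saved_table_exists("people_saved"));
let saved = spark.get_saved_table("people_saved"); // Some(DataFrame)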
pub fn table_exists(&self, name: &str) -> bool
Check if a table or temp view exists (PySpark: catalog.tableExists). True if name is in temp views, saved tables, global temp, or warehouse.
pub fn list_global_temp_view_names(&self) -> Vec<String>
Return global temp view names (process-scoped). PySpark: catalog.listTables(dbName="global_temp").
pub fn list_temp_view_names(&self) -> Vec<String>
Return temporary view names in this session.
pub fn list_table_names(&self) -> Vec<String>
Return saved table names in this session (saveAsTable / write_delta_table).
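A sketch that registers one temp view and one saved table, then lists both catalogs (names are illustrative):
use robin_sparkless::session::SparkSession;
let spark = SparkSession::builder().app_name("catalog").get_or_create();
let df1 = spark.create_dataframe(vec![(1, 25, "Alice".to_string())], vec!["id", "age", "name"])?;
let df2 = spark.create_dataframe(vec![(2, 30, "Bob".to_string())], vec!["id", "age", "name"])?;
spark.create_or_replace_temp_view("v_people", df1);
spark.register_table("t_people", df2);
let temp_views = spark.list_temp_view_names(); // contains "v_people"
let tables = spark.list_table_names();         // contains "t_people"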
pub fn drop_table(&self, name: &str) -> bool
Drop a saved table by name (removes from tables catalog only). No-op if not present.
pub fn warehouse_dir(&self) -> Option<&str>
Return spark.sql.warehouse.dir from config if set. Enables disk-backed saveAsTable.
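A sketch, assuming the key is read verbatim from the config map passed at construction:
use std::collections::HashMap;
use robin_sparkless::session::SparkSession;
let mut config = HashMap::new();
config.insert(
    "spark.sql.warehouse.dir".to_string(),
    "/tmp/warehouse".to_string(),
);
let spark = SparkSession::new(Some("wh".to_string()), None, config);
assert_eq!(spark.warehouse_dir(), Some("/tmp/warehouse"));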
pub fn table(&self, name: &str) -> Result<DataFrame, PolarsError>
Look up a table or temp view by name (PySpark: table(name)). Resolution order: (1) global_temp.xyz from global catalog, (2) temp view, (3) saved table, (4) warehouse.
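A sketch of the global_temp resolution path, together with drop_global_temp_view (names are illustrative):
use robin_sparkless::session::SparkSession;
let spark = SparkSession::builder().app_name("resolve").get_or_create();
let df = spark.create_dataframe(
    vec![(1, 25, "Alice".to_string())],
    vec!["id", "age", "name"],
)?;
spark.create_global_temp_view("dims", df);
// Resolved via the process-wide global catalog (resolution step 1):
let dims = spark.table("global_temp.dims")?;
// Remove it from the process-wide catalog (returns bool):
let dropped = spark.drop_global_temp_view("dims");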
pub fn builder() -> SparkSessionBuilder
pub fn get_config(&self) -> &HashMap<String, String>
Return a reference to the session config (for catalog/conf compatibility).
pub fn is_case_sensitive(&self) -> bool
Whether column names are case-sensitive (PySpark: spark.sql.caseSensitive). Default is false (case-insensitive matching).
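A sketch, assuming the session reads the spark.sql.caseSensitive key from the config map (the PySpark setting referenced above):
use std::collections::HashMap;
use robin_sparkless::session::SparkSession;
let mut config = HashMap::new();
config.insert("spark.sql.caseSensitive".to_string(), "true".to_string());
let spark = SparkSession::new(None, None, config);
assert!(spark.is_case_sensitive());
// Without the key, matching is case-insensitive by default:
let default_spark = SparkSession::builder().app_name("cs").get_or_create();
assert!(!default_spark.is_case_sensitive());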
pub fn register_udf<F>(&self, name: &str, f: F) -> Result<(), PolarsError>
Register a Rust UDF. Session-scoped. Use with call_udf. PySpark equivalent: spark.udf.register.
pub fn create_dataframe( &self, data: Vec<(i64, i64, String)>, column_names: Vec<&str>, ) -> Result<DataFrame, PolarsError>
Create a DataFrame from a vector of tuples (i64, i64, String).
Example
use robin_sparkless::session::SparkSession;
let spark = SparkSession::builder().app_name("test").get_or_create();
let df = spark.create_dataframe(
    vec![
        (1, 25, "Alice".to_string()),
        (2, 30, "Bob".to_string()),
    ],
    vec!["id", "age", "name"],
)?;
pub fn create_dataframe_from_polars(&self, df: PlDataFrame) -> DataFrame
Create a DataFrame from a Polars DataFrame.
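A sketch, assuming PlDataFrame is the crate's alias for polars::prelude::DataFrame:
use polars::prelude::*;
use robin_sparkless::session::SparkSession;
let spark = SparkSession::builder().app_name("from_polars").get_or_create();
let pl_df = df!("id" => &[1i64, 2], "name" => &["Alice", "Bob"])?;
let df = spark.create_dataframe_from_polars(pl_df);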
pub fn create_dataframe_from_rows( &self, rows: Vec<Vec<JsonValue>>, schema: Vec<(String, String)>, ) -> Result<DataFrame, PolarsError>
Create a DataFrame from rows and a schema (arbitrary column count and types).
rows: each inner vec is one row; length must match schema length. Values are JSON-like (i64, f64, string, bool, null, object, array).
schema: list of (column_name, dtype_string), e.g. [("id", "bigint"), ("name", "string")].
Supported dtype strings: bigint, int, long, double, float, string, str, varchar, boolean, bool, date, timestamp, datetime, array<element_type>, struct<field:type, ...>.
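A sketch mirroring the schema example above, assuming JsonValue is a serde_json::Value-style JSON value:
use robin_sparkless::session::SparkSession;
let spark = SparkSession::builder().app_name("rows").get_or_create();
let rows = vec![
    vec![serde_json::json!(1), serde_json::json!("Alice")],
    vec![serde_json::json!(2), serde_json::json!("Bob")],
];
let schema = vec![
    ("id".to_string(), "bigint".to_string()),
    ("name".to_string(), "string".to_string()),
];
let df = spark.create_dataframe_from_rows(rows, schema)?;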
pub fn range( &self, start: i64, end: i64, step: i64, ) -> Result<DataFrame, PolarsError>
Create a DataFrame with a single column id (bigint) containing values from start to end (exclusive) with step.
PySpark: spark.range(end) or spark.range(start, end, step).
range(end) → 0 to end-1, step 1
range(start, end) → start to end-1, step 1
range(start, end, step) → start, start+step, … up to but not including end
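In this API all three arguments are required (see the signature above); a short sketch:
use robin_sparkless::session::SparkSession;
let spark = SparkSession::builder().app_name("range").get_or_create();
// id column: 0, 2, 4, 6, 8 (end is exclusive)
let df = spark.range(0, 10, 2)?;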
pub fn read_csv(&self, path: impl AsRef<Path>) -> Result<DataFrame, PolarsError>
Read a CSV file.
Uses Polars’ CSV reader with default options:
- Header row is inferred (default: true)
- Schema is inferred from first 100 rows
Example
use robin_sparkless::SparkSession;
let spark = SparkSession::builder().app_name("test").get_or_create();
let df_result = spark.read_csv("data.csv");
// Handle the Result as appropriate in your application
pub fn read_parquet( &self, path: impl AsRef<Path>, ) -> Result<DataFrame, PolarsError>
Read a Parquet file.
Uses Polars’ Parquet reader. Parquet files have embedded schema, so schema inference is automatic.
Example
use robin_sparkless::SparkSession;
let spark = SparkSession::builder().app_name("test").get_or_create();
let df_result = spark.read_parquet("data.parquet");
// Handle the Result as appropriate in your application
pub fn read_json( &self, path: impl AsRef<Path>, ) -> Result<DataFrame, PolarsError>
Read a JSON file (JSONL format - one JSON object per line).
Uses Polars’ JSONL reader with default options:
- Schema is inferred from first 100 rows
Example
use robin_sparkless::SparkSession;
let spark = SparkSession::builder().app_name("test").get_or_create();
let df_result = spark.read_json("data.json");
// Handle the Result as appropriate in your application
pub fn sql(&self, _query: &str) -> Result<DataFrame, PolarsError>
Execute a SQL query (stub when sql feature is disabled).
pub fn read_delta(&self, name_or_path: &str) -> Result<DataFrame, PolarsError>
Stub when delta feature is disabled. Still supports reading by table name.
pub fn read_delta_with_version( &self, name_or_path: &str, version: Option<i64>, ) -> Result<DataFrame, PolarsError>
pub fn read_delta_from_path( &self, _path: impl AsRef<Path>, ) -> Result<DataFrame, PolarsError>
impl SparkSession
pub fn read(&self) -> DataFrameReader
Get a DataFrameReader for reading files.
Trait Implementations
impl Clone for SparkSession
fn clone(&self) -> SparkSession
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations
impl Freeze for SparkSession
impl RefUnwindSafe for SparkSession
impl Send for SparkSession
impl Sync for SparkSession
impl Unpin for SparkSession
impl UnwindSafe for SparkSession
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.