pub struct Column { /* private fields */ }
Column - represents a column in a DataFrame, used for building expressions
Thin wrapper around Polars Expr. May carry a DeferredRandom for rand/randn so with_column can produce one value per row.
Implementations
impl Column
pub fn from_rand(seed: Option<u64>) -> Self
Create a Column for rand(seed). When used in with_column, generates one value per row (PySpark-like).
pub fn from_randn(seed: Option<u64>) -> Self
Create a Column for randn(seed). When used in with_column, generates one value per row (PySpark-like).
pub fn asc(&self) -> SortOrder
Ascending sort, nulls first (Spark default for ASC). PySpark asc.
pub fn asc_nulls_first(&self) -> SortOrder
Ascending sort, nulls first. PySpark asc_nulls_first.
pub fn asc_nulls_last(&self) -> SortOrder
Ascending sort, nulls last. PySpark asc_nulls_last.
pub fn desc(&self) -> SortOrder
Descending sort, nulls last (Spark default for DESC). PySpark desc.
pub fn desc_nulls_first(&self) -> SortOrder
Descending sort, nulls first. PySpark desc_nulls_first.
pub fn desc_nulls_last(&self) -> SortOrder
Descending sort, nulls last. PySpark desc_nulls_last.
pub fn is_not_null(&self) -> Column
Check if column is not null
pub fn like(&self, pattern: &str, escape_char: Option<char>) -> Column
SQL LIKE pattern matching (% = any chars, _ = one char). PySpark like. When escape_char is Some(esc), esc + char treats that char as literal (e.g. \% = literal %).
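The wildcard and escape rules can be sketched with a small standalone matcher. This is a plain-Rust illustration of the documented semantics, not the crate's Polars-based implementation:

```rust
// SQL LIKE semantics: '%' matches any run of characters, '_' exactly one.
// An escape char marks the following character as literal.
fn sql_like(s: &str, pattern: &str, escape: Option<char>) -> bool {
    // Tokenize the pattern into (char, is_literal) pairs, honoring the escape.
    let mut toks = Vec::new();
    let mut chars = pattern.chars();
    while let Some(c) = chars.next() {
        if Some(c) == escape {
            if let Some(next) = chars.next() {
                toks.push((next, true)); // escaped: always literal
            }
        } else {
            toks.push((c, false));
        }
    }
    fn matches(s: &[char], p: &[(char, bool)]) -> bool {
        match p.split_first() {
            None => s.is_empty(),
            Some((&(c, literal), rest)) => {
                if !literal && c == '%' {
                    (0..=s.len()).any(|i| matches(&s[i..], rest))
                } else if !literal && c == '_' {
                    !s.is_empty() && matches(&s[1..], rest)
                } else {
                    s.first() == Some(&c) && matches(&s[1..], rest)
                }
            }
        }
    }
    matches(&s.chars().collect::<Vec<_>>(), &toks)
}

fn main() {
    assert!(sql_like("abc", "a%", None));
    assert!(sql_like("abc", "a_c", None));
    assert!(sql_like("50%", "50\\%", Some('\\'))); // escaped % is literal
    assert!(!sql_like("505", "50\\%", Some('\\')));
}
```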
pub fn ilike(&self, pattern: &str, escape_char: Option<char>) -> Column
Case-insensitive LIKE. PySpark ilike. When escape_char is Some(esc), esc + char treats that char as literal.
pub fn eq_pyspark(&self, other: &Column) -> Column
PySpark-style equality comparison (NULL == NULL returns NULL, not True). Any comparison involving NULL returns NULL.
Explicitly wraps comparisons with null checks to ensure PySpark semantics. If either side is NULL, the result is NULL.
pub fn ne_pyspark(&self, other: &Column) -> Column
PySpark-style inequality comparison (NULL != NULL returns NULL, not False). Any comparison involving NULL returns NULL.
pub fn eq_null_safe(&self, other: &Column) -> Column
Null-safe equality (NULL <=> NULL returns True). PySpark’s eqNullSafe() method.
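The contrast between the two equality flavors can be modeled with `Option` as a nullable value. A sketch of the documented three-valued semantics, not the crate's implementation:

```rust
// PySpark-style equality: any NULL operand makes the result NULL.
fn eq_pyspark(a: Option<i64>, b: Option<i64>) -> Option<bool> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x == y), // both non-null: ordinary equality
        _ => None,                          // NULL involved: result is NULL
    }
}

// Null-safe equality: NULL <=> NULL is true, NULL <=> value is false.
fn eq_null_safe(a: Option<i64>, b: Option<i64>) -> bool {
    a == b
}

fn main() {
    assert_eq!(eq_pyspark(None, None), None);
    assert_eq!(eq_pyspark(Some(1), None), None);
    assert_eq!(eq_pyspark(Some(1), Some(1)), Some(true));
    assert!(eq_null_safe(None, None));
    assert!(!eq_null_safe(Some(1), None));
}
```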
pub fn gt_pyspark(&self, other: &Column) -> Column
PySpark-style greater-than comparison (NULL > value returns NULL). Any comparison involving NULL returns NULL.
pub fn ge_pyspark(&self, other: &Column) -> Column
PySpark-style greater-than-or-equal comparison. Any comparison involving NULL returns NULL.
pub fn lt_pyspark(&self, other: &Column) -> Column
PySpark-style less-than comparison. Any comparison involving NULL returns NULL.
pub fn le_pyspark(&self, other: &Column) -> Column
PySpark-style less-than-or-equal comparison. Any comparison involving NULL returns NULL.
pub fn substr(&self, start: i64, length: Option<i64>) -> Column
Substring with 1-based start (PySpark substring semantics)
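The 1-based indexing can be sketched as follows, assuming the crate follows PySpark's substring(str, pos, len) for positive pos (negative-start handling is not covered here):

```rust
// 1-based substring sketch for positive `start` values; a start of 0 is
// treated as 1, matching PySpark. Not the crate's implementation.
fn substr(s: &str, start: i64, length: Option<i64>) -> String {
    let skip = (start.max(1) - 1) as usize; // 1-based -> 0-based
    let rest = s.chars().skip(skip);
    match length {
        Some(l) if l >= 0 => rest.take(l as usize).collect(),
        Some(_) => String::new(),
        None => rest.collect(),
    }
}

fn main() {
    assert_eq!(substr("Spark", 2, Some(3)), "par");
    assert_eq!(substr("Spark", 1, None), "Spark");
}
```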
pub fn bit_length(&self) -> Column
Bit length of string: byte length * 8 (PySpark bit_length).
pub fn octet_length(&self) -> Column
Length of string in bytes (PySpark octet_length).
pub fn char_length(&self) -> Column
Length of string in characters (PySpark char_length). Alias of length().
pub fn character_length(&self) -> Column
Length of string in characters (PySpark character_length). Alias of length().
pub fn encode(&self, charset: &str) -> Column
Encode string to binary (PySpark encode). Charset: UTF-8. Returns hex string.
pub fn decode(&self, charset: &str) -> Column
Decode binary (hex string) to string (PySpark decode). Charset: UTF-8.
pub fn to_binary(&self, fmt: &str) -> Column
Convert to binary (PySpark to_binary). fmt: ‘utf-8’, ‘hex’. Returns hex string.
pub fn try_to_binary(&self, fmt: &str) -> Column
Try convert to binary; null on failure (PySpark try_to_binary).
pub fn aes_encrypt(&self, key: &str) -> Column
AES encrypt (PySpark aes_encrypt). Key as string; AES-128-GCM. Output hex(nonce||ciphertext).
pub fn aes_decrypt(&self, key: &str) -> Column
AES decrypt (PySpark aes_decrypt). Input hex(nonce||ciphertext). Null on failure.
pub fn try_aes_decrypt(&self, key: &str) -> Column
Try AES decrypt (PySpark try_aes_decrypt). Returns null on failure.
pub fn btrim(&self, trim_str: Option<&str>) -> Column
Trim leading and trailing characters (PySpark btrim). trim_str defaults to whitespace.
pub fn locate(&self, substr: &str, pos: i64) -> Column
Find substring position 1-based, starting at pos (PySpark locate). 0 if not found.
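The 1-based/0-sentinel convention can be sketched in plain Rust (byte-oriented, so exact for ASCII input; not the crate's implementation):

```rust
// locate: 1-based position of `needle` in `haystack`, searching from `pos`
// (itself 1-based); 0 means "not found".
fn locate(haystack: &str, needle: &str, pos: i64) -> i64 {
    let start = (pos.max(1) - 1) as usize;
    if start > haystack.len() {
        return 0;
    }
    match haystack[start..].find(needle) {
        Some(i) => (start + i + 1) as i64, // back to 1-based
        None => 0,
    }
}

fn main() {
    assert_eq!(locate("barbarbar", "bar", 1), 1);
    assert_eq!(locate("barbarbar", "bar", 2), 4); // skips the first hit
    assert_eq!(locate("barbarbar", "baz", 1), 0);
}
```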
pub fn conv(&self, from_base: i32, to_base: i32) -> Column
Base conversion (PySpark conv). num_str from from_base to to_base.
pub fn bit_and(&self, other: &Column) -> Column
Bitwise AND of two integer/boolean columns (PySpark bit_and).
pub fn bit_or(&self, other: &Column) -> Column
Bitwise OR of two integer/boolean columns (PySpark bit_or).
pub fn bit_xor(&self, other: &Column) -> Column
Bitwise XOR of two integer/boolean columns (PySpark bit_xor).
pub fn bit_count(&self) -> Column
Count of set bits in the integer representation (PySpark bit_count).
pub fn assert_true(&self, err_msg: Option<&str>) -> Column
Assert that all boolean values are true; errors otherwise (PySpark assert_true). When err_msg is Some, it is used in the error message when assertion fails.
pub fn bitwise_not(&self) -> Column
Bitwise NOT of an integer/boolean column (PySpark bitwise_not / bitwiseNOT).
pub fn str_to_map(&self, pair_delim: &str, key_value_delim: &str) -> Column
Parse string to map (PySpark str_to_map). “k1:v1,k2:v2” -> map.
pub fn regexp_extract(&self, pattern: &str, group_index: usize) -> Column
Extract first match of regex pattern (PySpark regexp_extract). Group 0 = full match.
pub fn regexp_replace(&self, pattern: &str, replacement: &str) -> Column
Replace first match of regex pattern (PySpark regexp_replace). literal=false for regex.
pub fn replace(&self, search: &str, replacement: &str) -> Column
Replace all occurrences of literal search string with replacement (PySpark replace for literal).
pub fn startswith(&self, prefix: &str) -> Column
True if string starts with prefix (PySpark startswith).
pub fn endswith(&self, suffix: &str) -> Column
True if string ends with suffix (PySpark endswith).
pub fn contains(&self, substring: &str) -> Column
True if string contains substring (literal, not regex). PySpark contains.
pub fn split(&self, delimiter: &str) -> Column
Split string by delimiter (PySpark split). Returns list of strings. Uses literal split so “|” is not interpreted as regex alternation.
pub fn initcap(&self) -> Column
Title case: first letter of each word uppercase (PySpark initcap). Approximates with lowercase when Polars to_titlecase is not enabled.
pub fn regexp_extract_all(&self, pattern: &str) -> Column
Extract all matches of regex (PySpark regexp_extract_all). Returns list of strings.
pub fn regexp_like(&self, pattern: &str) -> Column
Check if string matches regex (PySpark regexp_like / rlike).
pub fn regexp_count(&self, pattern: &str) -> Column
Count of non-overlapping regex matches (PySpark regexp_count).
pub fn regexp_substr(&self, pattern: &str) -> Column
First substring matching regex (PySpark regexp_substr). Null if no match.
pub fn regexp_instr(&self, pattern: &str, group_idx: Option<usize>) -> Column
1-based position of first regex match (PySpark regexp_instr). group_idx 0 = full match; null if no match.
pub fn find_in_set(&self, set_column: &Column) -> Column
1-based index of self in comma-delimited set column (PySpark find_in_set). 0 if not found or self contains comma.
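The lookup rule (including the comma edge case) can be sketched on plain strings; this is an illustration of the documented semantics, not the crate's code:

```rust
// find_in_set: 1-based index of `needle` in a comma-delimited `set`;
// 0 when absent, or when the needle itself contains a comma.
fn find_in_set(needle: &str, set: &str) -> i64 {
    if needle.contains(',') {
        return 0;
    }
    set.split(',')
        .position(|item| item == needle)
        .map(|i| i as i64 + 1)
        .unwrap_or(0)
}

fn main() {
    assert_eq!(find_in_set("ab", "abc,b,ab,c,def"), 3);
    assert_eq!(find_in_set("x", "a,b,c"), 0);
    assert_eq!(find_in_set("a,b", "a,b,c"), 0); // needle contains a comma
}
```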
pub fn repeat(&self, n: i32) -> Column
Repeat each string element n times (PySpark repeat).
pub fn instr(&self, substr: &str) -> Column
Find substring position (1-based; 0 if not found). PySpark instr(col, substr).
pub fn lpad(&self, length: i32, pad: &str) -> Column
Left-pad string to length with pad character (PySpark lpad).
pub fn rpad(&self, length: i32, pad: &str) -> Column
Right-pad string to length with pad character (PySpark rpad).
pub fn translate(&self, from_str: &str, to_str: &str) -> Column
Character-by-character translation (PySpark translate). Replaces each char in from_str with corresponding in to_str; if to_str is shorter, extra from chars are removed.
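The per-character mapping, including the drop-when-shorter rule, looks like this as a standalone sketch (not the crate's implementation):

```rust
// translate: map each char found in `from` to the char at the same index in
// `to`; when `to` is shorter, the unmatched chars are removed.
fn translate(s: &str, from: &str, to: &str) -> String {
    let from: Vec<char> = from.chars().collect();
    let to: Vec<char> = to.chars().collect();
    s.chars()
        .filter_map(|c| match from.iter().position(|&f| f == c) {
            Some(i) => to.get(i).copied(), // None here drops the char
            None => Some(c),
        })
        .collect()
}

fn main() {
    assert_eq!(translate("rust", "ru", "RU"), "RUst");
    assert_eq!(translate("abcba", "abc", "xy"), "xyyx"); // 'c' removed
}
```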
pub fn mask(
&self,
upper_char: Option<char>,
lower_char: Option<char>,
digit_char: Option<char>,
other_char: Option<char>,
) -> Column
Mask string: replace uppercase with upper_char, lowercase with lower_char, digits with digit_char (PySpark mask). Defaults: upper ‘X’, lower ‘x’, digit ‘n’; other chars unchanged.
pub fn split_part(&self, delimiter: &str, part_num: i64) -> Column
Split by delimiter and return 1-based part (PySpark split_part). part_num > 0: from left; part_num < 0: from right; part_num = 0: null; out-of-range: empty string.
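The sign conventions above can be sketched with an `Option` standing in for SQL null (an illustration of the documented semantics, not the crate's code):

```rust
// split_part: 1-based from the left; negative part numbers count from the
// right; 0 yields null (None); out of range yields an empty string.
fn split_part(s: &str, delim: &str, part: i64) -> Option<String> {
    if part == 0 {
        return None;
    }
    let parts: Vec<&str> = s.split(delim).collect();
    let idx = if part > 0 {
        part - 1
    } else {
        parts.len() as i64 + part
    };
    if idx < 0 || idx >= parts.len() as i64 {
        Some(String::new()) // out of range: empty string
    } else {
        Some(parts[idx as usize].to_string())
    }
}

fn main() {
    assert_eq!(split_part("a,b,c", ",", 2), Some("b".to_string()));
    assert_eq!(split_part("a,b,c", ",", -1), Some("c".to_string()));
    assert_eq!(split_part("a,b,c", ",", 5), Some("".to_string()));
    assert_eq!(split_part("a,b,c", ",", 0), None);
}
```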
pub fn substring_index(&self, delimiter: &str, count: i64) -> Column
Substring before/after nth delimiter (PySpark substring_index). count > 0: before nth from left; count < 0: after nth from right.
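The left/right counting rule can be sketched as a join over split parts (an illustration of the documented semantics, not the crate's code):

```rust
// substring_index: everything before the nth delimiter from the left
// (count > 0) or after the nth delimiter from the right (count < 0).
fn substring_index(s: &str, delim: &str, count: i64) -> String {
    if count == 0 {
        return String::new();
    }
    let parts: Vec<&str> = s.split(delim).collect();
    if count > 0 {
        parts[..parts.len().min(count as usize)].join(delim)
    } else {
        let keep = ((-count) as usize).min(parts.len());
        parts[parts.len() - keep..].join(delim)
    }
}

fn main() {
    assert_eq!(substring_index("a.b.c", ".", 2), "a.b");
    assert_eq!(substring_index("a.b.c", ".", -2), "b.c");
    assert_eq!(substring_index("a.b.c", ".", 5), "a.b.c"); // fewer parts: whole string
}
```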
pub fn soundex(&self) -> Column
Soundex code (PySpark soundex). Implemented via map UDF (strsim/soundex crates).
pub fn levenshtein(&self, other: &Column) -> Column
Levenshtein distance to another string (PySpark levenshtein). Implemented via map_many UDF (strsim).
pub fn crc32(&self) -> Column
CRC32 checksum of string bytes (PySpark crc32). Implemented via map UDF (crc32fast).
pub fn xxhash64(&self) -> Column
XXH64 hash of string (PySpark xxhash64). Implemented via map UDF (twox-hash).
pub fn format_number(&self, decimals: u32) -> Column
Format numeric as string with fixed decimal places (PySpark format_number).
pub fn char(&self) -> Column
Int to single-character string (PySpark char / chr). Valid codepoint only.
pub fn unbase64(&self) -> Column
Base64 decode to string (PySpark unbase64). Invalid decode → null.
pub fn sha2(&self, bit_length: i32) -> Column
SHA2 hash; bit_length 256, 384, or 512 (PySpark sha2). Default 256.
pub fn overlay(&self, replace: &str, pos: i64, length: i64) -> Column
Replace substring at 1-based position (PySpark overlay). replace is literal string.
pub fn bround(&self, scale: i32) -> Column
Banker’s rounding - round half to even (PySpark bround).
pub fn multiply(&self, other: &Column) -> Column
Multiply by another column or literal (PySpark multiply). Broadcasts scalars.
pub fn add(&self, other: &Column) -> Column
Add another column or literal (PySpark +). Broadcasts scalars.
pub fn subtract(&self, other: &Column) -> Column
Subtract another column or literal (PySpark -). Broadcasts scalars.
pub fn divide(&self, other: &Column) -> Column
Divide by another column or literal (PySpark /). Broadcasts scalars.
pub fn pow(&self, exp: i64) -> Column
Power (PySpark pow). Exponent can be literal or expression.
pub fn atan2(&self, x: &Column) -> Column
Two-argument arc tangent (y, x) -> angle in radians. PySpark atan2.
pub fn to_degrees(&self) -> Column
Alias for degrees. PySpark toDegrees.
pub fn to_radians(&self) -> Column
Alias for radians. PySpark toRadians.
pub fn cast_to(&self, type_name: &str) -> Result<Column, String>
Cast to the given type (PySpark cast). Fails on invalid conversion.
pub fn try_cast_to(&self, type_name: &str) -> Result<Column, String>
Cast to the given type, null on invalid conversion (PySpark try_cast).
pub fn dayofmonth(&self) -> Column
Alias for day. PySpark dayofmonth.
pub fn quarter(&self) -> Column
Extract quarter (1-4) from date/datetime column (PySpark quarter).
pub fn weekofyear(&self) -> Column
Extract ISO week of year (1-53) (PySpark weekofyear / week).
pub fn dayofweek(&self) -> Column
Day of week: 1 = Sunday, 2 = Monday, …, 7 = Saturday (PySpark dayofweek). Polars weekday is Mon=1..Sun=7; we convert to Sun=1..Sat=7.
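The renumbering from Polars weekdays to Spark weekdays is a single mod-7 shift, sketched here in plain Rust:

```rust
// Polars weekday: Mon=1 .. Sun=7. Spark dayofweek: Sun=1 .. Sat=7.
fn spark_dayofweek(polars_weekday: u32) -> u32 {
    polars_weekday % 7 + 1
}

fn main() {
    assert_eq!(spark_dayofweek(1), 2); // Monday
    assert_eq!(spark_dayofweek(6), 7); // Saturday
    assert_eq!(spark_dayofweek(7), 1); // Sunday
}
```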
pub fn to_date(&self) -> Column
Cast to date (PySpark to_date). Drops time component from datetime/timestamp.
pub fn date_format(&self, format: &str) -> Column
Format date/datetime as string (PySpark date_format). Uses chrono strftime format.
pub fn extract(&self, field: &str) -> Column
Extract field from date/datetime (PySpark extract). field: “year”,“month”,“day”,“hour”,“minute”,“second”,“quarter”,“week”,“dayofweek”,“dayofyear”.
pub fn unix_micros(&self) -> Column
Timestamp to microseconds since epoch (PySpark unix_micros).
pub fn unix_millis(&self) -> Column
Timestamp to milliseconds since epoch (PySpark unix_millis).
pub fn unix_seconds(&self) -> Column
Timestamp to seconds since epoch (PySpark unix_seconds).
pub fn date_add(&self, n: i32) -> Column
Add n days to date/datetime column (PySpark date_add).
pub fn date_sub(&self, n: i32) -> Column
Subtract n days from date/datetime column (PySpark date_sub).
pub fn datediff(&self, other: &Column) -> Column
Number of days between two date/datetime columns, computed as end - start (PySpark datediff).
pub fn last_day(&self) -> Column
Last day of the month for date/datetime column (PySpark last_day).
pub fn timestampadd(&self, unit: &str, amount: &Column) -> Column
Add amount of unit to timestamp (PySpark timestampadd). unit: DAY, HOUR, MINUTE, SECOND, etc.
pub fn timestampdiff(&self, unit: &str, other: &Column) -> Column
Difference between timestamps in given unit (PySpark timestampdiff). unit: DAY, HOUR, MINUTE, SECOND.
pub fn from_utc_timestamp(&self, tz: &str) -> Column
Interpret timestamp as UTC, convert to target timezone (PySpark from_utc_timestamp).
pub fn to_utc_timestamp(&self, tz: &str) -> Column
Interpret timestamp as in tz, convert to UTC (PySpark to_utc_timestamp).
pub fn trunc(&self, format: &str) -> Column
Truncate date/datetime to unit (e.g. “mo”, “wk”, “day”). PySpark trunc.
pub fn add_months(&self, n: i32) -> Column
Add n months to date/datetime column (PySpark add_months). Month-aware.
pub fn months_between(&self, start: &Column, round_off: bool) -> Column
Number of months between end and start dates, as fractional (PySpark months_between). When round_off is true, rounds to 8 decimal places (PySpark default).
pub fn next_day(&self, day_of_week: &str) -> Column
Next date that is the given day of week (e.g. “Mon”, “Tue”) (PySpark next_day).
pub fn unix_timestamp(&self, format: Option<&str>) -> Column
Parse string timestamp to seconds since epoch (PySpark unix_timestamp).
pub fn from_unixtime(&self, format: Option<&str>) -> Column
Convert seconds since epoch to formatted string (PySpark from_unixtime).
pub fn timestamp_seconds(&self) -> Column
Convert seconds since epoch to timestamp (PySpark timestamp_seconds).
pub fn timestamp_millis(&self) -> Column
Convert milliseconds since epoch to timestamp (PySpark timestamp_millis).
pub fn timestamp_micros(&self) -> Column
Convert microseconds since epoch to timestamp (PySpark timestamp_micros).
pub fn date_from_unix_date(&self) -> Column
Days since epoch to date (PySpark date_from_unix_date).
pub fn pmod(&self, divisor: &Column) -> Column
Positive modulus (PySpark pmod). Column method: pmod(self, other).
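Positive modulus differs from Rust's `%` (which takes the sign of the dividend); the usual formulation, equivalent to `rem_euclid` for positive divisors, is:

```rust
// pmod: non-negative remainder for positive divisors.
fn pmod(a: i64, b: i64) -> i64 {
    ((a % b) + b) % b
}

fn main() {
    assert_eq!(pmod(-7, 3), 2); // plain % would give -1
    assert_eq!(pmod(7, 3), 1);
    assert_eq!(pmod(-7, 3), (-7i64).rem_euclid(3));
}
```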
pub fn over(&self, partition_by: &[&str]) -> Column
Apply window partitioning. Returns a new Column with .over(partition_by).
Use after rank(), dense_rank(), row_number(), lag(), lead().
pub fn rank(&self, descending: bool) -> Column
Rank (with ties, gaps). Use with .over(partition_by).
pub fn dense_rank(&self, descending: bool) -> Column
Dense rank (no gaps). Use with .over(partition_by).
pub fn row_number(&self, descending: bool) -> Column
Row number (1, 2, 3 by this column’s order). Use with .over(partition_by).
pub fn lag(&self, n: i64) -> Column
Lag: value from n rows before. Use with .over(partition_by).
pub fn lead(&self, n: i64) -> Column
Lead: value from n rows after. Use with .over(partition_by).
pub fn first_value(&self) -> Column
First value in partition (PySpark first_value). Use with .over(partition_by).
pub fn last_value(&self) -> Column
Last value in partition (PySpark last_value). Use with .over(partition_by).
pub fn percent_rank(&self, partition_by: &[&str], descending: bool) -> Column
Percent rank in partition: (rank - 1) / (count - 1). Window is applied; do not call .over() again.
pub fn cume_dist(&self, partition_by: &[&str], descending: bool) -> Column
Cumulative distribution in partition: row_number / count. Window is applied; do not call .over() again.
pub fn ntile(&self, n: u32, partition_by: &[&str], descending: bool) -> Column
Ntile: bucket 1..n by rank within partition (ceil(rank * n / count)). Window is applied; do not call .over() again.
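The three formulas quoted above (percent_rank, cume_dist, ntile) can be checked in isolation, given a row's rank or row number and the partition size (a sketch of the arithmetic, not the windowed implementation):

```rust
// percent_rank = (rank - 1) / (count - 1)
fn percent_rank(rank: u64, count: u64) -> f64 {
    (rank - 1) as f64 / (count - 1) as f64
}

// cume_dist = row_number / count
fn cume_dist(row_number: u64, count: u64) -> f64 {
    row_number as f64 / count as f64
}

// ntile = ceil(rank * n / count), as integer arithmetic
fn ntile(rank: u64, n: u64, count: u64) -> u64 {
    (rank * n + count - 1) / count
}

fn main() {
    assert_eq!(percent_rank(1, 5), 0.0);
    assert_eq!(percent_rank(5, 5), 1.0);
    assert_eq!(cume_dist(2, 4), 0.5);
    assert_eq!(ntile(1, 4, 8), 1);
    assert_eq!(ntile(8, 4, 8), 4);
}
```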
pub fn nth_value(
&self,
n: i64,
partition_by: &[&str],
descending: bool,
) -> Column
Nth value in partition by order (1-based n). Returns a Column with window already applied; do not call .over() again.
pub fn array_size(&self) -> Column
Number of elements in list (PySpark size / array_size). Returns Int32.
pub fn cardinality(&self) -> Column
Cardinality: number of elements in array/list (PySpark cardinality). Alias for array_size.
pub fn array_contains(&self, value: Expr) -> Column
Check if list contains value (PySpark array_contains).
pub fn array_join(&self, separator: &str) -> Column
Join list of strings with separator (PySpark array_join).
pub fn element_at(&self, index: i64) -> Column
Get element at 1-based index (PySpark element_at). Returns null if out of bounds.
pub fn array_sort(&self) -> Column
Sort list elements (PySpark array_sort). Ascending, nulls last.
pub fn array_distinct(&self) -> Column
Distinct elements in list (PySpark array_distinct). Preserves first-occurrence order.
pub fn mode(&self) -> Column
Mode aggregation - most frequent value (PySpark mode). Uses value_counts sorted by count descending, then first.
pub fn array_slice(&self, start: i64, length: Option<i64>) -> Column
Slice list from start with optional length (PySpark slice). 1-based start.
pub fn explode_outer(&self) -> Column
Explode list; null/empty produces one row with null (PySpark explode_outer).
pub fn posexplode_outer(&self) -> (Column, Column)
Posexplode with null preservation (PySpark posexplode_outer).
pub fn arrays_zip(&self, other: &Column) -> Column
Zip two arrays element-wise into array of structs (PySpark arrays_zip).
pub fn arrays_overlap(&self, other: &Column) -> Column
True if two arrays have any element in common (PySpark arrays_overlap).
pub fn array_agg(&self) -> Column
Collect to array (PySpark array_agg). Alias for implode in group context.
pub fn array_position(&self, value: Expr) -> Column
1-based index of first occurrence of value in list, or 0 if not found (PySpark array_position). Uses Polars list.eval with col(“”) as element (requires polars list_eval feature).
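The 1-based/0-sentinel lookup can be sketched on a plain slice (an illustration of the documented semantics, not the list.eval implementation):

```rust
// array_position: 1-based index of the first occurrence, 0 when absent.
fn array_position<T: PartialEq>(list: &[T], value: &T) -> i64 {
    list.iter()
        .position(|x| x == value)
        .map(|i| i as i64 + 1)
        .unwrap_or(0)
}

fn main() {
    assert_eq!(array_position(&[10, 20, 30], &20), 2);
    assert_eq!(array_position(&[10, 20, 30], &40), 0);
}
```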
pub fn array_compact(&self) -> Column
Remove null elements from list (PySpark array_compact). Preserves order.
pub fn array_remove(&self, value: Expr) -> Column
New list with all elements equal to value removed (PySpark array_remove). Uses list.eval + drop_nulls (requires polars list_eval and list_drop_nulls).
pub fn array_repeat(&self, n: i64) -> Column
Repeat each element n times (PySpark array_repeat). Implemented via map UDF.
pub fn array_flatten(&self) -> Column
Flatten list of lists to one list (PySpark flatten). Implemented via map UDF.
pub fn array_append(&self, elem: &Column) -> Column
Append element to end of list (PySpark array_append).
pub fn array_prepend(&self, elem: &Column) -> Column
Prepend element to start of list (PySpark array_prepend).
pub fn array_insert(&self, pos: &Column, elem: &Column) -> Column
Insert element at 1-based position (PySpark array_insert).
pub fn array_except(&self, other: &Column) -> Column
Elements in first array not in second (PySpark array_except).
pub fn array_intersect(&self, other: &Column) -> Column
Elements in both arrays (PySpark array_intersect).
pub fn array_union(&self, other: &Column) -> Column
Distinct elements from both arrays (PySpark array_union).
pub fn zip_with(&self, other: &Column, merge: Expr) -> Column
Zip two arrays element-wise with merge function (PySpark zip_with). Shorter array padded with null. Merge Expr uses col(“”).struct_().field_by_name(“left”) and field_by_name(“right”).
pub fn array_exists(&self, predicate: Expr) -> Column
True if any list element satisfies the predicate (PySpark exists). Uses list.eval(pred).list().any().
pub fn array_forall(&self, predicate: Expr) -> Column
True if all list elements satisfy the predicate (PySpark forall). Uses list.eval(pred).list().all().
pub fn array_filter(&self, predicate: Expr) -> Column
Filter list elements by predicate (PySpark filter). Keeps elements where predicate is true.
pub fn array_transform(&self, f: Expr) -> Column
Transform list elements by expression (PySpark transform). list.eval(expr).
pub fn array_sum(&self) -> Column
Sum of list elements (PySpark aggregate with sum). Uses list.sum().
pub fn array_aggregate(&self, zero: &Column) -> Column
Array fold/aggregate (PySpark aggregate). Simplified: zero + sum(list). Full (zero, merge, finish) deferred.
pub fn array_mean(&self) -> Column
Mean of list elements (PySpark aggregate with avg). Uses list.mean().
pub fn posexplode(&self) -> (Column, Column)
Explode list with position (PySpark posexplode). Returns (pos_col, value_col). pos is 1-based; uses list.eval(cum_count()).explode() and explode().
pub fn map_keys(&self) -> Column
Extract keys from a map column (PySpark map_keys). Map column is List(Struct{key, value}).
pub fn map_values(&self) -> Column
Extract values from a map column (PySpark map_values). Map column is List(Struct{key, value}).
pub fn map_entries(&self) -> Column
Return map as list of structs {key, value} (PySpark map_entries). Identity for List(Struct) column.
pub fn map_from_arrays(&self, values: &Column) -> Column
Build map from two array columns (keys, values) (PySpark map_from_arrays). Implemented via map_many UDF.
pub fn map_concat(&self, other: &Column) -> Column
Merge two map columns (PySpark map_concat). Last value wins for duplicate keys.
pub fn transform_keys(&self, key_expr: Expr) -> Column
Transform each map key by expr (PySpark transform_keys). key_expr should use col(“”).struct_().field_by_name(“key”).
pub fn transform_values(&self, value_expr: Expr) -> Column
Transform each map value by expr (PySpark transform_values). value_expr should use col(“”).struct_().field_by_name(“value”).
pub fn map_zip_with(&self, other: &Column, merge: Expr) -> Column
Merge two maps by key with merge function (PySpark map_zip_with). Merge Expr uses col(“”).struct_().field_by_name(“value1”) and field_by_name(“value2”).
pub fn map_filter(&self, predicate: Expr) -> Column
Filter map entries by predicate (PySpark map_filter). Keeps key-value pairs where predicate is true. Predicate uses col(“”).struct_().field_by_name(“key”) and field_by_name(“value”) to reference key/value.
pub fn map_from_entries(&self) -> Column
Array of structs {key, value} to map (PySpark map_from_entries). Identity for List(Struct) format.
pub fn map_contains_key(&self, key: &Column) -> Column
True if map contains key (PySpark map_contains_key).
pub fn get_json_object(&self, path: &str) -> Column
Extract JSON path from string column (PySpark get_json_object). Uses Polars str().json_path_match.
pub fn from_json(&self, schema: Option<DataType>) -> Column
Parse string column as JSON into struct (PySpark from_json). Uses Polars str().json_decode.
pub fn to_json(&self) -> Column
Serialize struct column to JSON string (PySpark to_json). Uses Polars struct().json_encode.
pub fn json_array_length(&self, path: &str) -> Column
Length of JSON array at path (PySpark json_array_length). UDF.
pub fn json_object_keys(&self) -> Column
Keys of JSON object (PySpark json_object_keys). Returns list of strings. UDF.
pub fn json_tuple(&self, keys: &[&str]) -> Column
Extract keys from JSON as struct (PySpark json_tuple). UDF. Returns struct with one string field per key.
pub fn from_csv(&self) -> Column
Parse CSV string to struct (PySpark from_csv). Minimal: split by comma, up to 32 columns. UDF.
pub fn parse_url(&self, part: &str, key: Option<&str>) -> Column
Parse URL and extract part (PySpark parse_url). UDF. When part is QUERY/QUERYSTRING and key is Some(k), returns the value for that query parameter only.
pub fn isin(&self, other: &Column) -> Column
Check if column values are in the other column’s list/series (PySpark isin).
pub fn url_decode(&self) -> Column
Percent-decode URL-encoded string (PySpark url_decode). Uses UDF.
pub fn url_encode(&self) -> Column
Percent-encode string for URL (PySpark url_encode). Uses UDF.
pub fn shift_left(&self, n: i32) -> Column
Bitwise left shift (PySpark shiftLeft). col << n = col * 2^n.
pub fn shift_right(&self, n: i32) -> Column
Bitwise signed right shift (PySpark shiftRight). col >> n = col / 2^n.
pub fn shift_right_unsigned(&self, n: i32) -> Column
Bitwise unsigned right shift (PySpark shiftRightUnsigned). Logical shift.
Trait Implementations
Auto Trait Implementations
impl !Freeze for Column
impl !RefUnwindSafe for Column
impl Send for Column
impl Sync for Column
impl Unpin for Column
impl !UnwindSafe for Column