pub struct GroupedData { /* private fields */ }
Represents a DataFrame grouped by one or more columns, similar to PySpark’s GroupedData.
Implementations
impl GroupedData
pub fn count(&self) -> Result<DataFrame, PolarsError>
Count rows in each group.
pub fn avg(&self, column: &str) -> Result<DataFrame, PolarsError>
Average (mean) of a column in each group.
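To make the per-group semantics of count and avg concrete, here is a minimal std-only sketch that groups (key, value) pairs by key. It does not call this crate or Polars; the helper name `group_count_avg` and the tuple layout are hypothetical and chosen only for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: returns (row count, mean) for each group key.
fn group_count_avg(rows: &[(&str, f64)]) -> BTreeMap<String, (usize, f64)> {
    // Accumulate (count, sum) per key.
    let mut acc: BTreeMap<String, (usize, f64)> = BTreeMap::new();
    for (key, value) in rows {
        let entry = acc.entry(key.to_string()).or_insert((0, 0.0));
        entry.0 += 1;
        entry.1 += value;
    }
    // Turn each running sum into a mean.
    acc.into_iter()
        .map(|(k, (n, sum))| (k, (n, sum / n as f64)))
        .collect()
}
```

Each key ends up with exactly one output entry, which mirrors the one-row-per-group shape of the DataFrame that the methods above return.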
pub fn min(&self, column: &str) -> Result<DataFrame, PolarsError>
Minimum value of a column in each group.
pub fn max(&self, column: &str) -> Result<DataFrame, PolarsError>
Maximum value of a column in each group.
pub fn first(&self, column: &str) -> Result<DataFrame, PolarsError>
First value of a column in each group (order not guaranteed unless explicitly sorted).
pub fn last(&self, column: &str) -> Result<DataFrame, PolarsError>
Last value of a column in each group (order not guaranteed unless explicitly sorted).
pub fn approx_count_distinct(&self, column: &str) -> Result<DataFrame, PolarsError>
Approximate count of distinct values in each group. The implementation uses n_unique, so the result is exact and matches count_distinct.
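An exact distinct count per group can be sketched with a set per key. This is a std-only illustration of the semantics described above, not the crate's code path; `group_n_unique` is a hypothetical name.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical helper: exact number of distinct values per group key.
fn group_n_unique(rows: &[(&str, i64)]) -> BTreeMap<String, usize> {
    // One set of seen values per key; duplicates are absorbed by the set.
    let mut seen: BTreeMap<String, HashSet<i64>> = BTreeMap::new();
    for (key, value) in rows {
        seen.entry(key.to_string()).or_default().insert(*value);
    }
    seen.into_iter().map(|(k, set)| (k, set.len())).collect()
}
```

Because the count is exact, it needs memory proportional to the number of distinct values, which is the trade-off a truly approximate sketch (such as HyperLogLog) would avoid.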
pub fn any_value(&self, column: &str) -> Result<DataFrame, PolarsError>
Any value of a column from each group (PySpark any_value). Returns the first value.
pub fn bool_and(&self, column: &str) -> Result<DataFrame, PolarsError>
Boolean AND of a column across each group (PySpark bool_and / every).
pub fn bool_or(&self, column: &str) -> Result<DataFrame, PolarsError>
Boolean OR of a column across each group (PySpark bool_or / some).
pub fn product(&self, column: &str) -> Result<DataFrame, PolarsError>
Product of column values in each group (PySpark product).
pub fn collect_list(&self, column: &str) -> Result<DataFrame, PolarsError>
Collect column values into a list per group (PySpark collect_list).
pub fn collect_set(&self, column: &str) -> Result<DataFrame, PolarsError>
Collect distinct column values into a list per group (PySpark collect_set).
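The difference between collect_list and collect_set is only whether duplicates survive. The following std-only sketch shows both behaviours; the function names are hypothetical, and the sorted order of the set output is a side effect of using `BTreeSet` here, not a documented property of this crate.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical helper: every value per key, duplicates kept, input order preserved.
fn collect_list_sketch(rows: &[(&str, i64)]) -> BTreeMap<String, Vec<i64>> {
    let mut out: BTreeMap<String, Vec<i64>> = BTreeMap::new();
    for (key, value) in rows {
        out.entry(key.to_string()).or_default().push(*value);
    }
    out
}

/// Hypothetical helper: distinct values per key, returned as a list.
fn collect_set_sketch(rows: &[(&str, i64)]) -> BTreeMap<String, Vec<i64>> {
    let mut sets: BTreeMap<String, BTreeSet<i64>> = BTreeMap::new();
    for (key, value) in rows {
        sets.entry(key.to_string()).or_default().insert(*value);
    }
    // Flatten each set back into a list, matching the "list per group" shape.
    sets.into_iter()
        .map(|(k, s)| (k, s.into_iter().collect()))
        .collect()
}
```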
pub fn count_if(&self, column: &str) -> Result<DataFrame, PolarsError>
Count of rows in each group where the boolean condition column is true (PySpark count_if).
pub fn percentile(&self, column: &str, p: f64) -> Result<DataFrame, PolarsError>
Percentile of a column in each group (PySpark percentile). p must be in the range 0.0..=1.0.
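The page does not say which interpolation rule the percentile uses. As one plausible reading, the sketch below computes a percentile over a single group's values with linear interpolation between neighbouring ranks. Both the interpolation choice and the name `percentile_linear` are assumptions made for this example.

```rust
/// Hypothetical helper: percentile of one group's values, with linear
/// interpolation between ranks (an assumed rule, not confirmed by the source).
/// Returns None for empty input or when p is outside 0.0..=1.0.
fn percentile_linear(values: &[f64], p: f64) -> Option<f64> {
    if values.is_empty() || !(0.0..=1.0).contains(&p) {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Fractional rank in 0..=(n - 1).
    let rank = p * (sorted.len() - 1) as f64;
    let lo = rank.floor() as usize;
    let hi = rank.ceil() as usize;
    let frac = rank - lo as f64;
    Some(sorted[lo] + (sorted[hi] - sorted[lo]) * frac)
}
```

With this rule, p = 0.0 yields the minimum, p = 1.0 the maximum, and p = 0.5 the usual median.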
pub fn max_by(&self, value_col: &str, ord_col: &str) -> Result<DataFrame, PolarsError>
Value of value_col in the row where ord_col is maximum within each group (PySpark max_by).
pub fn min_by(&self, value_col: &str, ord_col: &str) -> Result<DataFrame, PolarsError>
Value of value_col in the row where ord_col is minimum within each group (PySpark min_by).
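max_by and min_by return a value from one column, chosen by the extreme of another. A std-only sketch of that selection follows; `group_extreme_by`, its `pick_max` flag, and the (key, value, ordering) tuple layout are hypothetical. Ties keep the first row encountered, which is a choice made for this sketch only.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: for each key, the `value` from the row whose `ord`
/// is largest (pick_max = true) or smallest (pick_max = false).
fn group_extreme_by<'a>(
    rows: &[(&str, &'a str, i64)],
    pick_max: bool,
) -> BTreeMap<String, &'a str> {
    let mut best: BTreeMap<String, (&'a str, i64)> = BTreeMap::new();
    for &(key, value, ord) in rows {
        best.entry(key.to_string())
            .and_modify(|cur| {
                // Strict comparison, so the first row wins on ties.
                let better = if pick_max { ord > cur.1 } else { ord < cur.1 };
                if better {
                    *cur = (value, ord);
                }
            })
            .or_insert((value, ord));
    }
    best.into_iter().map(|(k, (v, _))| (k, v)).collect()
}
```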
pub fn covar_pop(&self, col1: &str, col2: &str) -> Result<DataFrame, PolarsError>
Population covariance between two columns in each group (PySpark covar_pop).
pub fn covar_samp(&self, col1: &str, col2: &str) -> Result<DataFrame, PolarsError>
Sample covariance between two columns in each group (PySpark covar_samp). Uses ddof=1.
pub fn corr(&self, col1: &str, col2: &str) -> Result<DataFrame, PolarsError>
Pearson correlation between two columns in each group (PySpark corr).
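The population and sample covariances differ only in the divisor (n versus n - 1, that is ddof = 0 versus ddof = 1), and Pearson correlation normalises the same co-moment by both spreads. The std-only sketch below works on one group's paired values; all function names are hypothetical, and null handling is left out.

```rust
/// Mean of a slice; callers guarantee it is non-empty.
fn mean(v: &[f64]) -> f64 {
    v.iter().sum::<f64>() / v.len() as f64
}

/// Sum of (x - mean_x) * (y - mean_y) over paired values.
fn co_moment(x: &[f64], y: &[f64]) -> f64 {
    let (mx, my) = (mean(x), mean(y));
    x.iter().zip(y).map(|(a, b)| (a - mx) * (b - my)).sum()
}

/// Covariance with `ddof` delta degrees of freedom (0 = population, 1 = sample).
fn covariance(x: &[f64], y: &[f64], ddof: usize) -> Option<f64> {
    if x.len() != y.len() || x.len() <= ddof {
        return None;
    }
    Some(co_moment(x, y) / (x.len() - ddof) as f64)
}

/// Pearson correlation; None when either input has zero spread.
fn pearson(x: &[f64], y: &[f64]) -> Option<f64> {
    if x.len() != y.len() || x.is_empty() {
        return None;
    }
    let denom = (co_moment(x, x) * co_moment(y, y)).sqrt();
    if denom == 0.0 {
        None
    } else {
        Some(co_moment(x, y) / denom)
    }
}
```

Note that the divisor cancels in the correlation, so corr is the same whether it is built from population or sample covariances.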
pub fn regr_count(&self, y_col: &str, x_col: &str) -> Result<DataFrame, PolarsError>
Regression count: the number of (y, x) pairs in each group where both values are non-null (PySpark regr_count).
pub fn regr_avgx(&self, y_col: &str, x_col: &str) -> Result<DataFrame, PolarsError>
Regression average of x (PySpark regr_avgx).
pub fn regr_avgy(&self, y_col: &str, x_col: &str) -> Result<DataFrame, PolarsError>
Regression average of y (PySpark regr_avgy).
pub fn regr_slope(&self, y_col: &str, x_col: &str) -> Result<DataFrame, PolarsError>
Regression slope (PySpark regr_slope).
pub fn regr_intercept(&self, y_col: &str, x_col: &str) -> Result<DataFrame, PolarsError>
Regression intercept (PySpark regr_intercept).
pub fn regr_r2(&self, y_col: &str, x_col: &str) -> Result<DataFrame, PolarsError>
Regression R-squared (PySpark regr_r2).
pub fn regr_sxx(&self, y_col: &str, x_col: &str) -> Result<DataFrame, PolarsError>
Regression sum of squares of x: the sum of (x - avg_x)^2 (PySpark regr_sxx).
pub fn regr_syy(&self, y_col: &str, x_col: &str) -> Result<DataFrame, PolarsError>
Regression sum of squares of y: the sum of (y - avg_y)^2 (PySpark regr_syy).
pub fn regr_sxy(&self, y_col: &str, x_col: &str) -> Result<DataFrame, PolarsError>
Regression sum of cross products: the sum of (x - avg_x)(y - avg_y) (PySpark regr_sxy).
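The regr_* aggregates are all pieces of one simple least-squares fit of y on x. The std-only sketch below derives them together for a single group, to show how they relate: slope = sxy / sxx, intercept = avg_y - slope * avg_x, and r2 = sxy^2 / (sxx * syy). The struct `Regr`, the function `regr`, and the handling of degenerate cases (None when x has no spread, r2 = 1.0 when y has no spread) are assumptions of this example, not statements about the crate.

```rust
/// Hypothetical summary of a least-squares fit y = intercept + slope * x.
#[derive(Debug)]
struct Regr {
    count: usize,
    avg_x: f64,
    avg_y: f64,
    sxx: f64,
    syy: f64,
    sxy: f64,
    slope: f64,
    intercept: f64,
    r2: f64,
}

/// Pairs are (y, x), matching the (y_col, x_col) argument order above.
fn regr(pairs: &[(Option<f64>, Option<f64>)]) -> Option<Regr> {
    // Keep only pairs where both y and x are present.
    let pts: Vec<(f64, f64)> = pairs
        .iter()
        .filter_map(|&(y, x)| Some((y?, x?)))
        .collect();
    if pts.is_empty() {
        return None;
    }
    let n = pts.len() as f64;
    let avg_y = pts.iter().map(|p| p.0).sum::<f64>() / n;
    let avg_x = pts.iter().map(|p| p.1).sum::<f64>() / n;
    let sxx: f64 = pts.iter().map(|p| (p.1 - avg_x).powi(2)).sum();
    let syy: f64 = pts.iter().map(|p| (p.0 - avg_y).powi(2)).sum();
    let sxy: f64 = pts.iter().map(|p| (p.1 - avg_x) * (p.0 - avg_y)).sum();
    if sxx == 0.0 {
        return None; // slope is undefined when x has no spread
    }
    let slope = sxy / sxx;
    let intercept = avg_y - slope * avg_x;
    // Sketch convention: a constant y is treated as perfectly explained.
    let r2 = if syy == 0.0 { 1.0 } else { sxy * sxy / (sxx * syy) };
    Some(Regr { count: pts.len(), avg_x, avg_y, sxx, syy, sxy, slope, intercept, r2 })
}
```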
pub fn kurtosis(&self, column: &str) -> Result<DataFrame, PolarsError>
Kurtosis of a column in each group (PySpark kurtosis). Uses the Fisher definition (excess kurtosis) with bias=true.
pub fn skewness(&self, column: &str) -> Result<DataFrame, PolarsError>
Skewness of a column in each group (PySpark skewness). Uses bias=true.
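With bias=true, both statistics are built from population central moments (divisor n, no small-sample correction): skewness is m3 / m2^1.5, and Fisher kurtosis is m4 / m2^2 - 3, so a normal distribution scores 0. A std-only sketch for one group's values follows; the function names are hypothetical, and returning None for empty or constant input is a choice made for this example.

```rust
/// k-th central moment with the biased (1/n) normalisation.
fn central_moment(v: &[f64], k: i32) -> f64 {
    let n = v.len() as f64;
    let mean = v.iter().sum::<f64>() / n;
    v.iter().map(|x| (x - mean).powi(k)).sum::<f64>() / n
}

/// Biased skewness: m3 / m2^1.5.
fn skewness_biased(v: &[f64]) -> Option<f64> {
    if v.is_empty() {
        return None;
    }
    let m2 = central_moment(v, 2);
    if m2 == 0.0 {
        return None; // undefined for constant input
    }
    Some(central_moment(v, 3) / m2.powf(1.5))
}

/// Biased Fisher (excess) kurtosis: m4 / m2^2 - 3.
fn kurtosis_fisher_biased(v: &[f64]) -> Option<f64> {
    if v.is_empty() {
        return None;
    }
    let m2 = central_moment(v, 2);
    if m2 == 0.0 {
        return None; // undefined for constant input
    }
    Some(central_moment(v, 4) / (m2 * m2) - 3.0)
}
```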
pub fn agg(&self, aggregations: Vec<Expr>) -> Result<DataFrame, PolarsError>
Apply multiple aggregations at once (generic agg method).
pub fn grouping_columns(&self) -> &[String]
Returns the names of the grouping columns.
Auto Trait Implementations
impl !Freeze for GroupedData
impl !RefUnwindSafe for GroupedData
impl Send for GroupedData
impl Sync for GroupedData
impl Unpin for GroupedData
impl !UnwindSafe for GroupedData
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.