pub struct GroupBy<'df> {
    pub df: &'df DataFrame,
    /* private fields */
}

Returned by a groupby operation on a DataFrame. This struct supports several aggregations.

Unless stated otherwise, the examples in this struct are performed on the following DataFrame:

use polars_core::prelude::*;

let dates = &[
    "2020-08-21",
    "2020-08-21",
    "2020-08-22",
    "2020-08-23",
    "2020-08-22",
];
// date format
let fmt = "%Y-%m-%d";
// create date series
let s0 = DateChunked::parse_from_str_slice("date", dates, fmt)
        .into_series();
// create temperature series
let s1 = Series::new("temp", [20, 10, 7, 9, 1]);
// create rain series
let s2 = Series::new("rain", [0.2, 0.1, 0.3, 0.1, 0.01]);
// create a new DataFrame
let df = DataFrame::new(vec![s0, s1, s2]).unwrap();
println!("{:?}", df);

Outputs:

+------------+------+------+
| date       | temp | rain |
| ---        | ---  | ---  |
| Date       | i32  | f64  |
+============+======+======+
| 2020-08-21 | 20   | 0.2  |
+------------+------+------+
| 2020-08-21 | 10   | 0.1  |
+------------+------+------+
| 2020-08-22 | 7    | 0.3  |
+------------+------+------+
| 2020-08-23 | 9    | 0.1  |
+------------+------+------+
| 2020-08-22 | 1    | 0.01 |
+------------+------+------+

Fields

df: &'df DataFrame

Implementations

Select the column(s) that should be aggregated. You can select a single column or a slice of columns.

Note that making a selection with this method is not required. If you skip it, all columns (except for the keys) will be selected for aggregation.

Get the internal representation of the GroupBy operation. The returned Vec contains (first_idx, Vec) tuples, where the second value in each tuple is a vector with all matching indexes.
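To make the (first_idx, Vec) layout concrete, here is a minimal std-only sketch that builds such tuples from a key column. The function name `group_indices` is hypothetical and this is not how polars computes its groups; in particular, the sketch emits groups in order of first appearance, which the real implementation does not promise.

```rust
use std::collections::HashMap;

/// Hypothetical helper: build `(first_idx, all_idx)` tuples for each distinct
/// key, in order of first appearance. Illustration only, not the polars code.
fn group_indices(keys: &[&str]) -> Vec<(u32, Vec<u32>)> {
    // Maps each key to the position of its group inside `groups`.
    let mut slot_of: HashMap<&str, usize> = HashMap::new();
    let mut groups: Vec<(u32, Vec<u32>)> = Vec::new();

    for (row, key) in keys.iter().enumerate() {
        let row = row as u32;
        match slot_of.get(key) {
            // Key seen before: append this row to its group.
            Some(&slot) => groups[slot].1.push(row),
            // New key: this row is both the first index and the first member.
            None => {
                slot_of.insert(*key, groups.len());
                groups.push((row, vec![row]));
            }
        }
    }
    groups
}
```

Applied to the `date` column of the example DataFrame, this yields the index sets [0, 1], [2, 4] and [3].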

Get the internal representation of the GroupBy operation. The returned Vec contains (first_idx, Vec) tuples, where the second value in each tuple is a vector with all matching indexes.

Safety

Groups should always be in bounds of the DataFrame held by this GroupBy. If you mutate them, you must uphold that invariant.
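The invariant can be expressed as a simple check. The sketch below assumes the (first_idx, Vec) tuple layout described above and takes the DataFrame height as a plain `usize`; `groups_in_bounds` is a hypothetical helper, not part of the polars API.

```rust
/// Hypothetical check: every index stored in `groups` must be smaller than
/// `height`, the number of rows of the DataFrame the groups refer to.
fn groups_in_bounds(groups: &[(u32, Vec<u32>)], height: usize) -> bool {
    groups.iter().all(|(first, idx)| {
        (*first as usize) < height && idx.iter().all(|&i| (i as usize) < height)
    })
}
```

Running a check like this after mutating the groups, and before any aggregation, is one way to catch a violated invariant early.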

👎Deprecated since 0.24.1: use polars.lazy aggregations

Aggregate grouped series and compute the mean per group.

Example
fn example(df: DataFrame) -> PolarsResult<DataFrame> {
    df.groupby(["date"])?.select(&["temp", "rain"]).mean()
}

Returns:

+------------+-----------+-----------+
| date       | temp_mean | rain_mean |
| ---        | ---       | ---       |
| Date       | f64       | f64       |
+============+===========+===========+
| 2020-08-23 | 9         | 0.1       |
+------------+-----------+-----------+
| 2020-08-22 | 4         | 0.155     |
+------------+-----------+-----------+
| 2020-08-21 | 15        | 0.15      |
+------------+-----------+-----------+
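The values in the table can be reproduced by hand. The following std-only sketch computes a per-group mean over plain slices; `mean_per_group` is a hypothetical name, and the sketch returns keys in sorted order, whereas the row order of the polars result is not something to rely on.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: mean of `values` per distinct key, keys sorted.
/// Illustration only, not the polars implementation.
fn mean_per_group(keys: &[&str], values: &[f64]) -> Vec<(String, f64)> {
    // Accumulate (sum, count) per key.
    let mut acc: BTreeMap<&str, (f64, usize)> = BTreeMap::new();
    for (key, value) in keys.iter().zip(values) {
        let entry = acc.entry(*key).or_insert((0.0, 0));
        entry.0 += *value;
        entry.1 += 1;
    }
    // Divide each sum by its count to obtain the mean.
    acc.into_iter()
        .map(|(key, (sum, n))| (key.to_string(), sum / n as f64))
        .collect()
}
```

For the `temp` column this gives 15, 4 and 9 for the three dates, and for `rain` roughly 0.15, 0.155 and 0.1, matching the table.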
👎Deprecated since 0.24.1: use polars.lazy aggregations

Aggregate grouped series and compute the sum per group.

Example
fn example(df: DataFrame) -> PolarsResult<DataFrame> {
    df.groupby(["date"])?.select(["temp"]).sum()
}

Returns:

+------------+----------+
| date       | temp_sum |
| ---        | ---      |
| Date       | i32      |
+============+==========+
| 2020-08-23 | 9        |
+------------+----------+
| 2020-08-22 | 8        |
+------------+----------+
| 2020-08-21 | 30       |
+------------+----------+
👎Deprecated since 0.24.1: use polars.lazy aggregations

Aggregate grouped series and compute the minimal value per group.

Example
fn example(df: DataFrame) -> PolarsResult<DataFrame> {
    df.groupby(["date"])?.select(["temp"]).min()
}

Returns:

+------------+----------+
| date       | temp_min |
| ---        | ---      |
| Date       | i32      |
+============+==========+
| 2020-08-23 | 9        |
+------------+----------+
| 2020-08-22 | 1        |
+------------+----------+
| 2020-08-21 | 10       |
+------------+----------+
👎Deprecated since 0.24.1: use polars.lazy aggregations

Aggregate grouped series and compute the maximum value per group.

Example
fn example(df: DataFrame) -> PolarsResult<DataFrame> {
    df.groupby(["date"])?.select(["temp"]).max()
}

Returns:

+------------+----------+
| date       | temp_max |
| ---        | ---      |
| Date       | i32      |
+============+==========+
| 2020-08-23 | 9        |
+------------+----------+
| 2020-08-22 | 7        |
+------------+----------+
| 2020-08-21 | 20       |
+------------+----------+
👎Deprecated since 0.24.1: use polars.lazy aggregations

Aggregate grouped Series and find the first value per group.

Example
fn example(df: DataFrame) -> PolarsResult<DataFrame> {
    df.groupby(["date"])?.select(["temp"]).first()
}

Returns:

+------------+------------+
| date       | temp_first |
| ---        | ---        |
| Date       | i32        |
+============+============+
| 2020-08-23 | 9          |
+------------+------------+
| 2020-08-22 | 7          |
+------------+------------+
| 2020-08-21 | 20         |
+------------+------------+
👎Deprecated since 0.24.1: use polars.lazy aggregations

Aggregate grouped Series and return the last value per group.

Example
fn example(df: DataFrame) -> PolarsResult<DataFrame> {
    df.groupby(["date"])?.select(["temp"]).last()
}

Returns:

+------------+------------+
| date       | temp_last  |
| ---        | ---        |
| Date       | i32        |
+============+============+
| 2020-08-23 | 9          |
+------------+------------+
| 2020-08-22 | 1          |
+------------+------------+
| 2020-08-21 | 10         |
+------------+------------+
👎Deprecated since 0.24.1: use polars.lazy aggregations

Aggregate grouped Series by counting the number of unique values.

Example
fn example(df: DataFrame) -> PolarsResult<DataFrame> {
    df.groupby(["date"])?.select(["temp"]).n_unique()
}

Returns:

+------------+---------------+
| date       | temp_n_unique |
| ---        | ---           |
| Date       | u32           |
+============+===============+
| 2020-08-23 | 1             |
+------------+---------------+
| 2020-08-22 | 2             |
+------------+---------------+
| 2020-08-21 | 2             |
+------------+---------------+
👎Deprecated since 0.24.1: use polars.lazy aggregations

Aggregate grouped Series and determine the quantile per group.

Example

fn example(df: DataFrame) -> PolarsResult<DataFrame> {
    df.groupby(["date"])?.select(["temp"]).quantile(0.2, QuantileInterpolOptions::default())
}
👎Deprecated since 0.24.1: use polars.lazy aggregations

Aggregate grouped Series and determine the median per group.

Example
fn example(df: DataFrame) -> PolarsResult<DataFrame> {
    df.groupby(["date"])?.select(["temp"]).median()
}
👎Deprecated since 0.24.1: use polars.lazy aggregations

Aggregate grouped Series and determine the variance per group.

👎Deprecated since 0.24.1: use polars.lazy aggregations

Aggregate grouped Series and determine the standard deviation per group.
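The variance and standard deviation aggregations come without an example, so here is a small std-only sketch of what such a per-group computation could look like. It assumes the sample variance (denominator n - 1); whether these methods use that convention is an assumption, not something stated on this page. Both function names are hypothetical.

```rust
/// Hypothetical helper: sample variance (denominator n - 1) of one group.
/// Returns `None` when the group has fewer than two values.
fn sample_variance(values: &[f64]) -> Option<f64> {
    let n = values.len();
    if n < 2 {
        return None;
    }
    let mean = values.iter().sum::<f64>() / n as f64;
    // Sum of squared deviations from the mean.
    let ss: f64 = values.iter().map(|v| (v - mean).powi(2)).sum();
    Some(ss / (n - 1) as f64)
}

/// Hypothetical helper: sample standard deviation, the square root of the
/// sample variance.
fn sample_std(values: &[f64]) -> Option<f64> {
    sample_variance(values).map(f64::sqrt)
}
```

Under this assumption, the `temp` values of the 2020-08-21 group (20 and 10) have variance 50, those of 2020-08-22 (7 and 1) have variance 18, and the single-row 2020-08-23 group has no defined sample variance.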

Aggregate grouped series and compute the number of values per group.

Example
fn example(df: DataFrame) -> PolarsResult<DataFrame> {
    df.groupby(["date"])?.select(["temp"]).count()
}

Returns:

+------------+------------+
| date       | temp_count |
| ---        | ---        |
| Date       | u32        |
+============+============+
| 2020-08-23 | 1          |
+------------+------------+
| 2020-08-22 | 2          |
+------------+------------+
| 2020-08-21 | 2          |
+------------+------------+

Get the groupby group indexes.

Example
fn example(df: DataFrame) -> PolarsResult<DataFrame> {
    df.groupby(["date"])?.groups()
}

Returns:

+--------------+------------+
| date         | groups     |
| ---          | ---        |
| Date(days)   | list [u32] |
+==============+============+
| 2020-08-23   | "[3]"      |
+--------------+------------+
| 2020-08-22   | "[2, 4]"   |
+--------------+------------+
| 2020-08-21   | "[0, 1]"   |
+--------------+------------+
👎Deprecated since 0.24.1: use polars.lazy aggregations

Aggregate the groups of the groupby operation into lists.

Example
fn example(df: DataFrame) -> PolarsResult<DataFrame> {
    // GroupBy and aggregate to Lists
    df.groupby(["date"])?.select(["temp"]).agg_list()
}

Returns:

+------------+------------------------+
| date       | temp_agg_list          |
| ---        | ---                    |
| Date       | list [i32]             |
+============+========================+
| 2020-08-23 | "[Some(9)]"            |
+------------+------------------------+
| 2020-08-22 | "[Some(7), Some(1)]"   |
+------------+------------------------+
| 2020-08-21 | "[Some(20), Some(10)]" |
+------------+------------------------+
👎Deprecated since 0.24.1: use polars.lazy aggregations

Apply a closure over the groups as a new DataFrame in parallel.

Apply a closure over the groups as a new DataFrame.
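As a rough picture of what applying a closure over the groups means, the std-only sketch below splits rows by key, hands each group to a closure, and collects one result per group. The name `apply_per_group` and its signature are hypothetical; the polars methods operate on DataFrames rather than slices, and this sketch runs sequentially.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: call `f` once per group of rows sharing a key and
/// collect `(key, result)` pairs in sorted key order. Illustration only.
fn apply_per_group<T, R>(
    keys: &[&str],
    rows: &[T],
    f: impl Fn(&[T]) -> R,
) -> Vec<(String, R)>
where
    T: Clone,
{
    // Bucket the rows by key, preserving row order inside each bucket.
    let mut buckets: BTreeMap<&str, Vec<T>> = BTreeMap::new();
    for (key, row) in keys.iter().zip(rows) {
        buckets.entry(*key).or_default().push(row.clone());
    }
    // Run the closure on each bucket.
    buckets
        .into_iter()
        .map(|(key, group)| (key.to_string(), f(group.as_slice())))
        .collect()
}
```

For instance, passing a closure that returns the difference between the largest and smallest `temp` in a group gives 10, 6 and 0 for the three dates of the example data.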

Trait Implementations

Returns a copy of the value.
Performs copy-assignment from source.
Formats the value using the given formatter.
