Struct polars::prelude::LazyFrame
pub struct LazyFrame { /* fields omitted */ }
Lazy abstraction over an eager DataFrame.
It really is an abstraction over a logical plan. The methods of this struct incrementally
modify a logical plan until output is requested (via collect).
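The idea of building a plan and deferring execution can be sketched with a toy model. The `Plan` enum, `ToyLazy` struct, and `execute` function below are hypothetical names made up for this illustration; they are not polars types and say nothing about how polars represents or optimizes its plans.

```rust
// Toy model only: these types are hypothetical and are NOT polars internals.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Source data held in memory.
    Scan(Vec<i64>),
    /// Keep values strictly greater than the threshold.
    FilterGt { input: Box<Plan>, threshold: i64 },
    /// Reverse the row order.
    Reverse { input: Box<Plan> },
}

#[derive(Debug, Clone)]
struct ToyLazy {
    plan: Plan,
}

impl ToyLazy {
    fn scan(data: Vec<i64>) -> Self {
        ToyLazy { plan: Plan::Scan(data) }
    }

    // Each method only wraps the existing plan in a new node; no rows are touched.
    fn filter_gt(self, threshold: i64) -> Self {
        ToyLazy {
            plan: Plan::FilterGt { input: Box::new(self.plan), threshold },
        }
    }

    fn reverse(self) -> Self {
        ToyLazy {
            plan: Plan::Reverse { input: Box::new(self.plan) },
        }
    }

    // The plan is evaluated only when output is requested.
    fn collect(self) -> Vec<i64> {
        execute(self.plan)
    }
}

// Walk the plan from the outermost node down to the scan and evaluate it.
fn execute(plan: Plan) -> Vec<i64> {
    match plan {
        Plan::Scan(data) => data,
        Plan::FilterGt { input, threshold } => execute(*input)
            .into_iter()
            .filter(|v| *v > threshold)
            .collect(),
        Plan::Reverse { input } => {
            let mut rows = execute(*input);
            rows.reverse();
            rows
        }
    }
}
```

In this sketch, `ToyLazy::scan(data).filter_gt(2).reverse()` only nests plan nodes; the filter and the reversal run when `collect()` is called.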
Implementations
Create a LazyFrame directly from a parquet scan.
Get a dot language representation of the LogicalPlan.
Toggle projection pushdown optimization.
Toggle predicate pushdown optimization.
Toggle type coercion optimization.
Toggle expression simplification optimization.
Toggle aggregate pushdown.
Toggle global string cache.
Toggle join pruning optimization.
Describe the logical plan.
Describe the optimized logical plan.
Add a sort operation to the logical plan.
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

/// Sort DataFrame by 'sepal.width' column
fn example(df: DataFrame) -> LazyFrame {
    df.lazy()
        .sort("sepal.width", false)
}
Add a sort operation to the logical plan, sorting by one or more expressions.
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

/// Sort DataFrame by 'sepal.width' column
fn example(df: DataFrame) -> LazyFrame {
    df.lazy()
        .sort_by_exprs(vec![col("sepal.width")], vec![false])
}
Reverse the DataFrame.
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

fn example(df: DataFrame) -> LazyFrame {
    df.lazy()
        .reverse()
}
Rename a column in the DataFrame.
Shift the values by a given period and fill the parts that will be empty due to this operation
with Nones.
See the method on Series for more info on the shift operation.
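As a rough illustration of the shift-and-fill-with-None behaviour described above, here is a minimal sketch on a plain vector. The `shift` function is a hypothetical helper written for this note, not the polars implementation, and it assumes that a positive period moves values toward the end.

```rust
/// Hypothetical helper: shift `values` by `periods` positions.
/// Assumption: a positive period moves values toward the end, a negative one
/// toward the start. Slots left empty by the shift become `None`.
fn shift(values: &[Option<i32>], periods: i64) -> Vec<Option<i32>> {
    let len = values.len();
    let mut out = vec![None; len];
    for (i, v) in values.iter().enumerate() {
        // Skip positions whose target index would overflow or fall outside the vector.
        let target = match (i as i64).checked_add(periods) {
            Some(t) => t,
            None => continue,
        };
        if target >= 0 && (target as usize) < len {
            out[target as usize] = *v;
        }
    }
    out
}
```

Values shifted past either end are dropped, so the output always has the same length as the input.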
Shift the values by a given period and fill the parts that will be empty due to this operation
with the result of the fill_value expression.
See the method on Series for more info on the shift operation.
Caches the result into a new LazyFrame. This should be used to prevent computations from running multiple times.
Fetch is like a collect operation, but it overwrites the number of rows read by every scan operation. This is a utility that helps debug a query on a smaller number of rows.
Note that the fetch does not guarantee the final number of rows in the DataFrame. Filter, join operations and a lower number of rows available in the scanned file influence the final number of rows.
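Why a row limit applied at the scan does not fix the final row count can be seen in a small sketch. The `scan` and `run_query` functions are hypothetical stand-ins for a file scan and a query with one filter; they do not use or mirror the polars API.

```rust
/// Hypothetical scan: reads at most `n_rows` rows when a limit is given.
fn scan(source: &[i32], n_rows: Option<usize>) -> Vec<i32> {
    match n_rows {
        Some(n) => source.iter().copied().take(n).collect(),
        None => source.to_vec(),
    }
}

/// Hypothetical query: scan the source, then keep only even values.
/// The limit applies to the scan, not to the final result.
fn run_query(source: &[i32], n_rows: Option<usize>) -> Vec<i32> {
    scan(source, n_rows)
        .into_iter()
        .filter(|v| v % 2 == 0)
        .collect()
}
```

With a source of `[1, 2, 3, 4, 5, 6]` and a limit of 4, the scan reads four rows but the filter keeps only two of them. A limit larger than the source simply returns whatever the source holds.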
pub fn optimize(
self,
lp_arena: &mut Arena<ALogicalPlan>,
expr_arena: &mut Arena<AExpr>
) -> Result<Node, PolarsError>
Execute all the lazy operations and collect them into a DataFrame. The query is optimized before execution.
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

fn example(df: DataFrame) -> Result<DataFrame> {
    df.lazy()
        .groupby(vec![col("foo")])
        .agg(vec![col("bar").sum(), col("ham").mean().alias("avg_ham")])
        .collect()
}
Filter by some predicate expression.
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

fn example(df: DataFrame) -> LazyFrame {
    df.lazy()
        .filter(col("sepal.width").is_not_null())
        .select(&[col("sepal.width"), col("sepal.length")])
}
Select (and rename) columns from the query.
Columns can be selected with col.
If you want to select all columns, use col("*").
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

/// This function selects column "foo" and column "bar".
/// Column "bar" is renamed to "ham".
fn example(df: DataFrame) -> LazyFrame {
    df.lazy()
        .select(&[col("foo"), col("bar").alias("ham")])
}

/// This function selects all columns except "foo"
fn exclude_a_column(df: DataFrame) -> LazyFrame {
    df.lazy()
        .select(&[col("*"), except("foo")])
}
Group by and aggregate.
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

fn example(df: DataFrame) -> LazyFrame {
    df.lazy()
        .groupby(vec![col("date")])
        .agg(vec![
            col("rain").min(),
            col("rain").sum(),
            col("rain").quantile(0.5).alias("median_rain"),
        ])
        .sort("date", false)
}
Left join this query with another lazy query.
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

fn join_dataframes(ldf: LazyFrame, other: LazyFrame) -> LazyFrame {
    ldf
        .left_join(other, col("foo"), col("bar"))
}
Outer join this query with another lazy query.
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

fn join_dataframes(ldf: LazyFrame, other: LazyFrame) -> LazyFrame {
    ldf
        .outer_join(other, col("foo"), col("bar"))
}
Inner join this query with another lazy query.
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

fn join_dataframes(ldf: LazyFrame, other: LazyFrame) -> LazyFrame {
    ldf
        .inner_join(other, col("foo"), col("bar").cast(DataType::Utf8))
}
Generic join function that can join on multiple columns.
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

fn example(ldf: LazyFrame, other: LazyFrame) -> LazyFrame {
    ldf
        .join(
            other,
            vec![col("foo"), col("bar")],
            vec![col("foo"), col("bar")],
            JoinType::Inner,
        )
}
Add a column to a DataFrame.
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

fn add_column(df: DataFrame) -> LazyFrame {
    df.lazy()
        .with_column(
            when(col("sepal.length").lt(lit(5.0)))
                .then(lit(10))
                .otherwise(lit(1))
                .alias("new_column_name"),
        )
}
Add multiple columns to a DataFrame.
Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

fn add_columns(df: DataFrame) -> LazyFrame {
    df.lazy()
        .with_columns(
            vec![lit(10).alias("foo"), lit(100).alias("bar")]
        )
}
Aggregate all the columns as their quantile values.
Drop duplicate rows. See the eager DataFrame method for details.
Drop null rows.
Equal to LazyFrame::filter(col("*").is_not_null()).
Melt the DataFrame from wide to long format.
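For readers unfamiliar with the wide-to-long reshaping that melt refers to, here is a minimal sketch on plain Rust collections. The `melt` function, its argument layout, and the output ordering (one value column at a time) are assumptions chosen for this illustration; they are not taken from the polars implementation.

```rust
/// Hypothetical wide-to-long reshape.
/// `ids` holds one identifier per row; `columns` holds (column name, values) pairs,
/// with one value per row. The result has one (id, variable, value) triple per cell.
fn melt(ids: &[&str], columns: &[(&str, Vec<f64>)]) -> Vec<(String, String, f64)> {
    let mut long = Vec::new();
    // Assumed ordering: emit all rows of the first value column, then the next.
    for (name, values) in columns {
        for (id, value) in ids.iter().zip(values.iter()) {
            long.push((id.to_string(), name.to_string(), *value));
        }
    }
    long
}
```

A wide table with 2 rows and 2 value columns becomes a long table with 4 rows, each naming the source column in its "variable" field.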
Limit the DataFrame to the first n rows. Note that if you don't want the rows to be scanned,
use fetch.
Apply a function/closure once the logical plan gets executed.
Warning
This can blow up in your face if the schema is changed due to the operation. The optimizer relies on a correct schema.
You can toggle certain optimizations off.
Trait Implementations
Performs the conversion.
Auto Trait Implementations
impl !RefUnwindSafe for LazyFrame
impl !UnwindSafe for LazyFrame
Blanket Implementations
Mutably borrows from an owned value.
pub fn vzip(self) -> V