df-derive
Procedural derive macros for converting your Rust types into Polars DataFrames.
What this crate does
Deriving ToDataFrame on your structs and tuple structs generates fast, allocation-conscious code to:
- Convert a single value to a polars::prelude::DataFrame
- Convert a slice of values via a columnar path (efficient batch conversion)
- Inspect the schema (column names and DataTypes) at compile time via a generated method
It supports nested structs (flattened with dot notation), Option<T>, Vec<T>, tuple structs, and key domain types like chrono::DateTime<Utc> and rust_decimal::Decimal.
Installation
Add the macro crate and Polars. You will also need a trait defining the to_dataframe behavior (you can use your own runtime crate/traits; see the override section below). For a minimal inline trait you can copy, see the Quick start example.
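[dependencies]
df-derive = "0.1"
polars = { version = "0.50", features = ["timezones", "dtype-decimal"] }
# If you use these types in your models
chrono = { version = "0.4", features = ["serde"] }
rust_decimal = { version = "1.36", features = ["serde"] }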
Quick start
A copy-paste runnable example that needs no external runtime traits. It is a complete working example that you can also run with cargo run --example quickstart.
Cargo.toml:
[package]
name = "quickstart"
version = "0.1.0"
edition = "2024"
[dependencies]
df-derive = "0.1"
polars = { version = "0.50", features = ["timezones", "dtype-decimal"] }
src/main.rs:
use df_derive::ToDataFrame;
// Columnar path auto-infers to crate::dataframe::Columnar
Run it:
Features
- Nested structs (flattening): fields of nested structs appear as outer.inner columns
- Vec of primitives and structs: becomes Polars List columns; Vec<Nested> becomes multiple outer.subfield list columns
- Option<T>: null-aware materialization for both scalars and lists
- Tuple structs: supported; columns are named field_0, field_1, ...
- Empty structs: produce a DataFrame of shape (1, 0) for instances and (0, 0) for empty frames
- Schema discovery: T::schema() -> PolarsResult<Vec<(&'static str, DataType)>>
- Columnar batch conversion: [T]::to_dataframe() via the Columnar implementation
Attribute helpers
Use #[df_derive(as_string)] to stringify values during conversion. This is particularly useful for enums:
// Required: implement Display for the enum
Columns will use DataType::String (or List<String> for Vec<_>), and values are produced via ToString. See the complete working example with cargo run --example as_string.
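Because the values are produced via ToString, the text that ends up in the column is whatever the enum's Display implementation writes. The following is a minimal, dependency-free sketch of that requirement. The Side enum, its variants, and the stringify_all helper are hypothetical names chosen for illustration, and the derive itself is left out so the snippet compiles without Polars.

```rust
use std::fmt;

// Hypothetical enum used as a field type; not part of df-derive.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Side {
    Buy,
    Sell,
}

// Implementing Display is what makes `to_string()` available, through the
// standard library's blanket `impl<T: Display> ToString for T`.
impl fmt::Display for Side {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let label = match self {
            Side::Buy => "BUY",
            Side::Sell => "SELL",
        };
        f.write_str(label)
    }
}

// Hypothetical helper: the strings a stringified `Vec<Side>` would hold.
fn stringify_all(sides: &[Side]) -> Vec<String> {
    sides.iter().map(ToString::to_string).collect()
}
```

With this in place, Side::Buy.to_string() yields "BUY", which is the kind of value a field marked #[df_derive(as_string)] would carry into its String column.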
Supported types
- Primitives: String, bool, integer types (i8/i16/i32/i64/isize, u8/u16/u32/u64/usize), f32, f64
- Time: chrono::DateTime<Utc> → materialized as Datetime(Milliseconds, None)
- Decimal: rust_decimal::Decimal → Decimal(38, 10)
- Wrappers: Option<T>, Vec<T> in any nesting order
- Custom structs: any other struct deriving ToDataFrame (supports nesting and Vec<Nested>)
- Tuple structs: unnamed fields are emitted as field_{index}
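The two domain dtypes above can be read as integer layouts: a Datetime(Milliseconds, None) column holds milliseconds since the Unix epoch, and a Decimal(38, 10) column holds integers scaled by 10^10 with at most 38 digits. The sketch below illustrates that reading with the standard library only. The function names are hypothetical, and the truncation and overflow behaviour shown is an assumption, not a description of what the macro emits.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Milliseconds since the Unix epoch. Sub-millisecond digits are dropped by
// integer truncation (assumption about the conversion, for illustration).
fn epoch_millis(t: SystemTime) -> Option<i64> {
    let since_epoch = t.duration_since(UNIX_EPOCH).ok()?;
    i64::try_from(since_epoch.as_millis()).ok()
}

// Scale from Decimal(38, 10): ten digits after the decimal point.
const TARGET_SCALE: u32 = 10;
// Precision from Decimal(38, 10): at most 38 decimal digits in total.
const MAX_DIGITS: u32 = 38;

// Rescales a (mantissa, scale) pair, e.g. 123.45 = (12345, 2), to the
// integer used at TARGET_SCALE. Returns None if digits would be lost or
// the result would need more than 38 digits.
fn rescale(mantissa: i128, scale: u32) -> Option<i128> {
    let shift = TARGET_SCALE.checked_sub(scale)?;
    let scaled = mantissa.checked_mul(10_i128.checked_pow(shift)?)?;
    (scaled.unsigned_abs() < 10_u128.pow(MAX_DIGITS)).then_some(scaled)
}
```

For example, 123.45 expressed as (12345, 2) rescales to 1_234_500_000_000, which is 123.45 multiplied by 10^10.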
Column naming
- Named struct fields: field_name
- Nested structs: outer.inner (recursively)
- Vec of custom structs: vec_field.subfield (list dtype)
- Tuple structs: field_0, field_1, ...
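The naming rules above can be mimicked with a small recursive function. This is only a model of the rules: the macro works on the real struct definition at compile time, and the Field enum, column_names, tuple_column_names, and person_columns below are hypothetical helpers written for this illustration.

```rust
// Hypothetical model of a struct's shape, used only to mimic the naming rules.
enum Field {
    Leaf(&'static str),
    Nested(&'static str, Vec<Field>),
}

// Collects flattened column names, joining nesting levels with '.'.
fn column_names(fields: &[Field], prefix: &str, out: &mut Vec<String>) {
    for field in fields {
        match field {
            Field::Leaf(name) => out.push(format!("{prefix}{name}")),
            Field::Nested(name, inner) => {
                let nested_prefix = format!("{prefix}{name}.");
                column_names(inner, &nested_prefix, out);
            }
        }
    }
}

// Tuple structs have no field names, so columns are numbered by position.
fn tuple_column_names(arity: usize) -> Vec<String> {
    (0..arity).map(|i| format!("field_{i}")).collect()
}

// Names for a hypothetical person struct with a nested address struct.
fn person_columns() -> Vec<String> {
    let shape = vec![
        Field::Leaf("name"),
        Field::Leaf("age"),
        Field::Nested(
            "address",
            vec![Field::Leaf("street"), Field::Leaf("city"), Field::Leaf("zip")],
        ),
    ];
    let mut out = Vec::new();
    column_names(&shape, "", &mut out);
    out
}
```

Because the prefix is threaded through the recursion, deeper nesting simply adds more dot-separated segments.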
Generated API
For every #[derive(ToDataFrame)] type T the macro generates implementations of two traits (paths configurable via #[df_derive(...)]):
- ToDataFrame for T:
  - fn to_dataframe(&self) -> PolarsResult<DataFrame>
  - fn empty_dataframe() -> PolarsResult<DataFrame>
  - fn schema() -> PolarsResult<Vec<(&'static str, DataType)>>
- Columnar for T:
  - fn columnar_to_dataframe(items: &[Self]) -> PolarsResult<DataFrame>
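To show how these two traits can fit together with slice conversion, here is a dependency-free model. Everything Polars-specific is replaced by stand-ins: Frame (tracking only shape) takes the place of DataFrame, FrameResult the place of PolarsResult, and dtypes are plain strings. The ToDataFrameVec extension trait, the empty-slice fallback, and the hand-written impls for a hypothetical Trade struct are assumptions made for illustration, not the macro's actual output.

```rust
// Stand-in for polars::prelude::DataFrame: only the shape is tracked.
#[derive(Debug, PartialEq)]
struct Frame {
    height: usize,
    width: usize,
}

// Stand-in for PolarsResult<T>.
type FrameResult<T> = Result<T, String>;

// Same method set as listed above, with stand-in types.
trait ToDataFrame {
    fn to_dataframe(&self) -> FrameResult<Frame>;
    fn empty_dataframe() -> FrameResult<Frame>;
    fn schema() -> FrameResult<Vec<(&'static str, &'static str)>>;
}

trait Columnar: Sized {
    fn columnar_to_dataframe(items: &[Self]) -> FrameResult<Frame>;
}

// Hypothetical extension trait that gives slices a `to_dataframe` method.
trait ToDataFrameVec {
    fn to_dataframe(&self) -> FrameResult<Frame>;
}

impl<T: ToDataFrame + Columnar> ToDataFrameVec for [T] {
    fn to_dataframe(&self) -> FrameResult<Frame> {
        if self.is_empty() {
            // Assumption: an empty slice yields a zero-row frame.
            return T::empty_dataframe();
        }
        T::columnar_to_dataframe(self)
    }
}

// Hypothetical struct with hand-written impls standing in for the derive.
struct Trade {
    symbol: String,
    price: f64,
}

impl ToDataFrame for Trade {
    fn to_dataframe(&self) -> FrameResult<Frame> {
        // Assumption: a single value is treated as a one-element slice.
        Self::columnar_to_dataframe(std::slice::from_ref(self))
    }
    fn empty_dataframe() -> FrameResult<Frame> {
        Ok(Frame { height: 0, width: Self::schema()?.len() })
    }
    fn schema() -> FrameResult<Vec<(&'static str, &'static str)>> {
        Ok(vec![("symbol", "String"), ("price", "Float64")])
    }
}

impl Columnar for Trade {
    fn columnar_to_dataframe(items: &[Self]) -> FrameResult<Frame> {
        Ok(Frame { height: items.len(), width: Self::schema()?.len() })
    }
}
```

In this model, calling to_dataframe() on a two-element slice of Trade produces a frame of shape (2, 2), while an empty slice falls back to empty_dataframe().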
Examples
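This crate includes several runnable examples in the examples/ directory. You can run any example with:
cargo run --example <example_name>
Or run all examples to see the full feature set:
cargo run --example quickstart && \
cargo run --example nested && \
cargo run --example vec_custom && \
cargo run --example tuple && \
cargo run --example datetime_decimal && \
cargo run --example as_string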
Available Examples
- quickstart - Basic usage with single and batch DataFrame conversion
- nested - Nested structs with dot notation column naming
- vec_custom - Vec of custom structs creating List columns
- tuple - Tuple structs with field_0, field_1 naming
- datetime_decimal - DateTime and Decimal type support
- as_string - #[df_derive(as_string)] attribute for enum conversion
Example Code Snippets
Nested structs
// Columns: name, age, address.street, address.city, address.zip
Note: the runnable examples define a small dataframe module with the traits used by the macro. Some helper trait items are not used in every snippet (for example empty_dataframe or Columnar). To avoid noise during cargo run --example …, the examples annotate that module with #[allow(dead_code)].
Vec of custom structs
// Columns include: symbol, quotes.ts, quotes.open, quotes.high, ... (each a List)
Tuple structs
;
// Columns: field_0 (Int32), field_1 (String), field_2 (Float64)
DateTime<Utc> and Decimal
// Schema dtypes: amount = Decimal(38, 10), ts = Datetime(Milliseconds, None)
Why #[allow(dead_code)] in examples? The examples include a minimal dataframe module to provide the traits that the macro implements. Not every example calls every method (e.g., empty_dataframe, schema), and compile-time warnings would otherwise distract from the output. Adding #[allow(dead_code)] to that module keeps the examples clean while remaining fully correct.
As string attribute
// Columns use DataType::String or List<String>
Note: All examples require the trait definitions shown in the Quick start section. See the complete working examples in the examples/ directory.
Limitations and guidance
- Unsupported container types: maps/sets like HashMap<_, _> are not supported.
- Enums: derive on enums is not supported; use #[df_derive(as_string)] on enum fields.
- Generics: generic structs are not supported by the derive (see tests/fail for examples).
- All nested types must also derive: if you nest a struct, it must also derive ToDataFrame.
Performance notes
- The derive implements an internal Columnar path used by the runtime to convert slices efficiently, avoiding per-row DataFrame builds.
- Criterion benches in benches/ exercise wide, deep, and nested-Vec shapes (100k+ rows), demonstrating consistent performance across shapes.
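The idea behind a columnar path can be sketched without Polars: instead of building one small frame per row and stitching them together, the slice is walked once and each field is pushed into its own pre-sized buffer. The Trade struct, the TradeColumns stand-in for a DataFrame, and to_columns are hypothetical names for this illustration and say nothing about the code the macro actually generates.

```rust
// Hypothetical row type standing in for a struct that derives ToDataFrame.
struct Trade {
    symbol: String,
    price: f64,
    size: u64,
}

// Stand-in for a DataFrame: one contiguous buffer per column.
#[derive(Debug, Default, PartialEq)]
struct TradeColumns {
    symbol: Vec<String>,
    price: Vec<f64>,
    size: Vec<u64>,
}

// Single pass over the slice. Each column buffer is allocated once up front
// and filled in place, so no intermediate one-row frames are created.
fn to_columns(items: &[Trade]) -> TradeColumns {
    let mut cols = TradeColumns {
        symbol: Vec::with_capacity(items.len()),
        price: Vec::with_capacity(items.len()),
        size: Vec::with_capacity(items.len()),
    };
    for item in items {
        cols.symbol.push(item.symbol.clone());
        cols.price.push(item.price);
        cols.size.push(item.size);
    }
    cols
}
```

The per-column buffers map directly onto how a columnar DataFrame stores data, which is why this shape of loop tends to scale better than per-row construction.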
Performance tracking
Performance is continuously monitored and tracked using Bencher.
Compatibility
- Rust edition: 2024
- Polars: 0.50 (tested)
- Enable Polars features timezones and dtype-decimal if you use DateTime<Utc> or Decimal.
License
MIT. See LICENSE.
Crate path override (about paft)
This crate currently resolves default trait paths to a dataframe module under the paft ecosystem. Concretely, it attempts to implement:
paft::dataframe::ToDataFrame and paft::dataframe::Columnar (or paft-core::dataframe::...) if those crates are present.
You can override these paths for any runtime by annotating your type with #[df_derive(...)]:
// Columnar will be inferred as my_runtime::dataframe::Columnar
If you need to override both explicitly: