ambers
Pure Rust SPSS .sav/.zsav reader — Arrow-native, zero C dependencies.
Features
- Read .sav (bytecode) and .zsav (zlib) files
- Arrow RecordBatch output: zero-copy to Polars, DataFusion, DuckDB
- Rich metadata: variable labels, value labels, missing values, MR sets, measure levels
- 2–3x faster than pyreadstat
- Python + Rust dual API from a single crate
Installation
Python:
Rust:
Quick Start
Python
# Read data + metadata
, =
# Explore metadata
# Read metadata only (fast, skips data)
=
Rust
use ;
// Read data + metadata
let = read_sav?;
println!;
// Read metadata only
let meta = read_sav_metadata?;
println!;
Metadata API (Python)
| Method | Description |
|---|---|
| meta.summary() | Formatted overview: file info, type distribution, annotations |
| meta.describe("Q1") | Deep-dive into a single variable (or list of variables) |
| meta.diff(other) | Compare two metadata objects, returns MetaDiff |
| meta.label("Q1") | Variable label |
| meta.value("Q1") | Value labels dict |
| meta.format("Q1") | SPSS format string (e.g. "F8.2", "A50") |
| meta.measure("Q1") | Measurement level ("nominal", "ordinal", "scale") |
| meta.schema | Full metadata as a nested Python dict |
All variable-name methods raise KeyError for unknown variables.
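
Because unknown variable names raise KeyError, callers that probe for optional variables may want a small guard. The sketch below illustrates that calling pattern only. `ToyMeta` and `safe_label` are hypothetical names invented for this example; `ToyMeta` is a stand-in that mimics the documented KeyError behaviour of `meta.label(...)` and is not the ambers class.

```python
class ToyMeta:
    """Hypothetical stand-in for a metadata object (not the ambers class)."""

    def __init__(self, labels):
        self._labels = dict(labels)

    def label(self, name):
        # A plain dict lookup raises KeyError for unknown names,
        # matching the contract described above.
        return self._labels[name]


def safe_label(meta, name, default=None):
    """Return the variable label, or `default` if the variable is unknown."""
    try:
        return meta.label(name)
    except KeyError:
        return default


meta = ToyMeta({"Q1": "Overall satisfaction"})
known = safe_label(meta, "Q1")
unknown = safe_label(meta, "Q99", default="(no such variable)")
print(known)
print(unknown)
```

The same try/except shape should apply to any of the variable-name methods in the table, assuming they all follow the KeyError contract stated above.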
Streaming Reader (Rust)
let mut scanner = scan_sav?;
scanner.select?;
scanner.limit;
while let Some = scanner.next_batch?
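
The select / limit / next_batch pattern above can be illustrated with a toy model. This is a pure-Python sketch of the general idea (column projection, a row cap, and pulling batches until the source is exhausted), not the ambers implementation or its API. `ToyScanner`, its `batch_size` parameter, and the sample rows are all assumptions made up for the example.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]


class ToyScanner:
    """Hypothetical batch scanner over in-memory rows (not the ambers API)."""

    def __init__(self, rows: List[Row], batch_size: int = 2) -> None:
        self._rows = rows
        self._batch_size = batch_size
        self._columns: Optional[List[str]] = None
        self._limit: Optional[int] = None
        self._pos = 0

    def select(self, columns: List[str]) -> None:
        # Column projection: later batches carry only these keys.
        self._columns = list(columns)

    def limit(self, n: int) -> None:
        # Cap the total number of rows returned across all batches.
        self._limit = n

    def next_batch(self) -> Optional[List[Row]]:
        end = len(self._rows)
        if self._limit is not None:
            end = min(end, self._limit)
        if self._pos >= end:
            return None  # exhausted: signals the caller to stop looping
        stop = min(self._pos + self._batch_size, end)
        chunk = self._rows[self._pos:stop]
        self._pos = stop
        if self._columns is not None:
            chunk = [{c: row[c] for c in self._columns} for row in chunk]
        return chunk


rows = [{"id": i, "Q1": i % 3, "Q2": i * 10} for i in range(5)]
scanner = ToyScanner(rows, batch_size=2)
scanner.select(["id", "Q1"])
scanner.limit(3)

batches = []
while (batch := scanner.next_batch()) is not None:
    batches.append(batch)

print([len(b) for b in batches])
```

Only one batch is held at a time inside the loop, which is the point of a streaming reader: memory use is bounded by the batch size rather than the file size.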
Performance
| File | Size | Rows | Cols | ambers | pyreadstat | Speedup |
|---|---|---|---|---|---|---|
| 251001.sav | 147 MB | 22,070 | 677 | 1.27s | 3.07s | 2.4x |
| rpm_2025_tracking | 1.1 GB | 79,066 | 915 | 2.42s | 6.40s | 2.6x |
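
The Speedup column appears to be the pyreadstat time divided by the ambers time, rounded to one decimal place. A quick check of that arithmetic, using only the timings from the table (the `speedup` helper is a name made up for this example):

```python
# (file, ambers seconds, pyreadstat seconds), copied from the table above
timings = [
    ("251001.sav", 1.27, 3.07),
    ("rpm_2025_tracking", 2.42, 6.40),
]


def speedup(ambers_s: float, pyreadstat_s: float) -> float:
    """Ratio of baseline time to ambers time, rounded to one decimal."""
    return round(pyreadstat_s / ambers_s, 1)


speedups = {name: speedup(a, p) for name, a, p in timings}
for name, ratio in speedups.items():
    print(f"{name}: {ratio}x")
```

Both ratios agree with the values reported in the table.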