qvd
A high-performance Rust library for reading, writing, and converting Qlik QVD files, with Parquet/Arrow interop, DataFusion SQL, a streaming reader, a CLI tool, and Python bindings (PyArrow, pandas, Polars).
The first and only QVD crate on crates.io.
Features
- Read/Write QVD — byte-identical roundtrip, zero-copy where possible
- Parquet ↔ QVD — convert in both directions with compression support (snappy, zstd, gzip, lz4)
- Arrow RecordBatch — convert QVD to/from Arrow for integration with DataFusion, DuckDB, Polars
- DataFusion SQL — register QVD files as tables and query them with SQL
- DuckDB integration — use QVD data in DuckDB via Arrow bridge (Rust and Python)
- Streaming reader — read QVD files in chunks without loading everything into memory
- EXISTS() index — O(1) hash lookup, like Qlik's `EXISTS()` function
- CLI tool — `qvd-cli` with `convert`, `inspect`, `head`, and `schema` commands
- Python bindings — PyArrow, pandas, and Polars support via a zero-copy Arrow bridge; 20-35x faster than PyQvd
- Zero dependencies for core QVD read/write (Parquet/Arrow/DataFusion/Python are optional features)
Performance
Tested on 20 real QVD files (11 KB to 2.8 GB):
| File | Size | Rows | Columns | Read | Write |
|---|---|---|---|---|---|
| sample_tiny.qvd | 11 KB | 12 | 5 | 0.0s | 0.0s |
| sample_small.qvd | 418 KB | 2,746 | 8 | 0.0s | 0.0s |
| sample_medium.qvd | 41 MB | 465,810 | 12 | 0.5s | 0.0s |
| sample_large.qvd | 587 MB | 5,458,618 | 15 | 6.1s | 0.4s |
| sample_xlarge.qvd | 1.7 GB | 87,617,047 | 6 | 36.8s | 1.6s |
| sample_huge.qvd | 2.8 GB | 11,907,648 | 42 | 24.3s | 2.4s |
All 20 files — byte-identical roundtrip (MD5 match).
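"Byte-identical" here means the rewritten file hashes the same as the source. A quick way to check this yourself, using only Python's standard library (independent of this crate; the file paths are placeholders):

```python
import hashlib

def md5_of(path: str) -> str:
    """Return the MD5 hex digest of a file's bytes, read in 1 MiB chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# A roundtrip is byte-identical when both digests match:
# assert md5_of("original.qvd") == md5_of("roundtrip.qvd")
```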
vs PyQvd (Pure Python)
| File | PyQvd | qvd (Rust) | Speedup |
|---|---|---|---|
| 10 MB, 1.4M rows | 5.0s | 0.17s | 29x |
| 41 MB, 466K rows | 8.5s | 0.5s | 16x |
| 480 MB, 12M rows | 79.4s | 2.3s | 35x |
| 1.7 GB, 87M rows | >10 min | 29.6s | >20x |
Installation
Rust
```toml
# Core QVD read/write (zero dependencies)
[dependencies]
qvd = "0.2"

# With Parquet/Arrow support
[dependencies]
qvd = { version = "0.2", features = ["parquet_support"] }

# With DataFusion SQL support
[dependencies]
qvd = { version = "0.2", features = ["datafusion_support"] }
```
CLI
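A sketch of installing the CLI, assuming the binary is gated behind the `cli` feature of the `qvd` crate (the exact feature and binary names are assumptions):

```sh
cargo install qvd --features cli
```

The resulting binary is referred to as `qvd-cli` elsewhere in this README.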
Python

Install from PyPI with pip or uv.
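Assuming the package is published on PyPI under the name `qvd` (an assumption; check the project page), installation might look like:

```sh
pip install qvd

# or with uv:
uv pip install qvd
```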
Quick Start — Rust
Read/Write QVD
```rust
use qvd::{read_qvd_file, write_qvd_file};

let table = read_qvd_file("data.qvd")?;
println!("{} rows, {} columns", table.row_count(), table.column_count()); // accessor names illustrative

// Byte-identical roundtrip
write_qvd_file("copy.qvd", &table)?;
```
Convert Parquet ↔ QVD
```rust
use qvd::{convert_parquet_to_qvd, convert_qvd_to_parquet};

// Parquet → QVD
convert_parquet_to_qvd("input.parquet", "output.qvd")?;

// QVD → Parquet (with zstd compression; argument shape illustrative)
convert_qvd_to_parquet("input.qvd", "output.parquet", "zstd")?;
```
Arrow RecordBatch
```rust
use qvd::{read_qvd_file, qvd_to_record_batch, record_batch_to_qvd};

let table = read_qvd_file("data.qvd")?;
let batch = qvd_to_record_batch(&table)?;
// Use with DataFusion, DuckDB, Polars, etc.

// Arrow → QVD
let qvd_table = record_batch_to_qvd(&batch)?;
```
DataFusion SQL (feature `datafusion_support`)

```rust
use datafusion::prelude::*;
use qvd::register_qvd;

// inside an async context, e.g. #[tokio::main]
let ctx = SessionContext::new();
register_qvd(&ctx, "sales", "sales.qvd")?; // argument order illustrative
let df = ctx.sql("SELECT * FROM sales LIMIT 10").await?;
df.show().await?;
```

You can also register multiple QVD files and JOIN them:

```rust
register_qvd(&ctx, "orders", "orders.qvd")?;
register_qvd(&ctx, "clients", "clients.qvd")?;
let df = ctx
    .sql("SELECT * FROM orders JOIN clients ON orders.ClientID = clients.ClientID")
    .await?;
```
DuckDB via Arrow (Rust)
DuckDB can ingest Arrow RecordBatches directly — no file conversion needed:
```rust
use qvd::{read_qvd_file, qvd_to_record_batch};

let table = read_qvd_file("data.qvd")?;
let batch = qvd_to_record_batch(&table)?;
// Pass the Arrow RecordBatch to DuckDB via its Arrow interface
// See: https://docs.rs/duckdb/latest/duckdb/
```
Streaming Reader
```rust
use qvd::open_qvd_stream;

let mut reader = open_qvd_stream("big.qvd")?;
println!("{} total rows", reader.total_rows()); // accessor name illustrative

while let Some(chunk) = reader.next_chunk()? {
    // process each chunk without loading the whole file into memory
}
```
EXISTS() — O(1) Lookup
Like Qlik's EXISTS() function — build an index of unique values from one table
and use it to check or filter another table in O(1) per row.
```rust
use qvd::{read_qvd_file, ExistsIndex, filter_rows_by_exists, filter_rows_by_exists_fast};

// Build index from the "clients" table
let clients = read_qvd_file("clients.qvd")?;
let index = ExistsIndex::from_column(&clients, "ClientID").unwrap();

// O(1) lookup — does this value exist? (method name and value illustrative)
assert!(index.contains("C1001"));
println!("{} unique values", index.len());

// Filter another table — get row indices where ClientID exists in the clients table
let facts = read_qvd_file("facts.qvd")?;

// By column name (convenient)
let matching_rows = filter_rows_by_exists(&facts, "ClientID", &index);
println!("{} matching rows", matching_rows.len());

// By column index (faster for large tables — pre-computes symbol matches)
let col_idx = 0; // index of "ClientID" column in facts table
let matching_rows = filter_rows_by_exists_fast(&facts, col_idx, &index);

// Access the filtered rows
for &row in &matching_rows {
    // `row` is a row index into `facts`
}
```
Quick Start — Python
Basic usage
```python
import qvd  # module name assumed; API names below are illustrative

# Read QVD
table = qvd.read("data.qvd")

# Save QVD
table.save("copy.qvd")

# Parquet ↔ QVD
qvd.parquet_to_qvd("input.parquet", "output.qvd")
qvd.qvd_to_parquet("input.qvd", "output.parquet")

# Load Parquet as QvdTable
table = qvd.from_parquet("input.parquet")

# EXISTS — O(1) lookup (like Qlik's EXISTS() function)
clients = qvd.read("clients.qvd")
index = clients.exists_index("ClientID")

# Check if a value exists
index.contains("C1001")  # True/False
"C1001" in index         # same thing
len(index)               # number of unique values

# Check multiple values at once
flags = index.contains_many(["C1001", "C1002", "missing"])
# [True, True, False]

# Filter rows from another table — returns list of matching row indices
facts = qvd.read("facts.qvd")
rows = facts.filter_by_exists("ClientID", index)
PyArrow
```python
# QVD → PyArrow RecordBatch (zero-copy via Arrow C Data Interface)
table = qvd.read("data.qvd")
batch = table.to_arrow()  # method names illustrative

# Or directly:
batch = qvd.read_arrow("data.qvd")

# PyArrow → QVD
table = qvd.QvdTable.from_arrow(batch)
```
pandas
```python
import pyarrow as pa

# QVD → pandas DataFrame (via Arrow, zero-copy where possible)
df = qvd.read("data.qvd").to_pandas()  # method names illustrative

# Or directly:
df = qvd.read_pandas("data.qvd")

# pandas → QVD (via PyArrow round-trip)
batch = pa.RecordBatch.from_pandas(df)
qvd.QvdTable.from_arrow(batch).save("out.qvd")
```
Polars
```python
# QVD → Polars DataFrame
df = qvd.read("data.qvd").to_polars()  # method names illustrative

# Or directly:
df = qvd.read_polars("data.qvd")

# Polars → QVD (via PyArrow round-trip)
table = qvd.QvdTable.from_arrow(df.to_arrow())
table.save("out.qvd")
```
DuckDB (Python)
```python
import duckdb

# QVD → DuckDB (via Arrow, zero-copy; qvd function names illustrative)
batch = qvd.read_arrow("data.qvd")
con = duckdb.connect()
rows = con.execute("SELECT COUNT(*) FROM batch").fetchall()

# Or query multiple QVD files:
sales = qvd.read_arrow("sales.qvd")
clients = qvd.read_arrow("clients.qvd")
joined = con.execute("SELECT * FROM sales JOIN clients USING (ClientID)").fetchall()
```
CLI
Install: see the CLI entry under Installation above.
Convert between formats
```sh
# (flag names below are illustrative)

# Parquet → QVD
qvd-cli convert input.parquet output.qvd

# QVD → Parquet (default compression: snappy)
qvd-cli convert input.qvd output.parquet

# QVD → Parquet with specific compression
qvd-cli convert input.qvd output.parquet --compression zstd

# Rewrite QVD (re-generate from internal representation)
qvd-cli convert input.qvd rewritten.qvd

# Recompress Parquet
qvd-cli convert input.parquet recompressed.parquet --compression lz4
```
Inspect QVD metadata

```sh
qvd-cli inspect data.qvd
```

Output example:
```
File:       data.qvd
Size:       41.3 MB
Table:      SalesData
Rows:       465,810
Columns:    12
Created:    2024-01-15 10:30:00
Build:      14.0
RecordSize: 89 bytes
Read time:  0.50s

Column        Symbols  BitWidth  Bias  FmtType  Tags
--------------------------------------------------------------------------------
OrderID        465810        20     0        0  $numeric, $integer
CustomerID      12500        14     0        0  $numeric, $integer
Region              5         3     0        0  $text
Amount         389201        19     0        2  $numeric
```
Preview rows
```sh
# Show first 10 rows (default)
qvd-cli head data.qvd

# Show first 50 rows (flag name illustrative)
qvd-cli head data.qvd -n 50
```
Show Arrow schema

```sh
qvd-cli schema data.qvd
```

Output example:
```
Arrow Schema for 'data.qvd':
  OrderID     Int64
  CustomerID  Int64
  Region      Utf8
  Amount      Float64 (nullable)
  OrderDate   Date32
```
Architecture
```
src/
├── lib.rs         — public API, re-exports
├── error.rs       — error types (QvdError, QvdResult)
├── header.rs      — XML header parser/writer (custom, zero-dep)
├── value.rs       — QVD data types (QvdSymbol, QvdValue)
├── symbol.rs      — symbol table binary reader/writer
├── index.rs       — index table bit-stuffing reader/writer
├── reader.rs      — high-level QVD reader
├── writer.rs      — high-level QVD writer + QvdTableBuilder
├── exists.rs      — ExistsIndex with HashSet + filter functions
├── streaming.rs   — streaming chunk-based QVD reader
├── parquet.rs     — Parquet/Arrow ↔ QVD conversion (optional)
├── datafusion.rs  — DataFusion TableProvider for SQL on QVD (optional)
├── python.rs      — PyO3 bindings with PyArrow/pandas/Polars (optional)
└── bin/qvd.rs     — CLI binary (optional)
```
Feature Flags
| Feature | Dependencies | Description |
|---|---|---|
| (default) | none | Core QVD read/write |
| `parquet_support` | arrow, parquet, chrono | Parquet/Arrow conversion |
| `datafusion_support` | + datafusion, tokio | SQL queries on QVD via DataFusion |
| `cli` | + clap | CLI binary |
| `python` | + pyo3, arrow/pyarrow | Python bindings with PyArrow/pandas/Polars |
Author
Stanislav Chernov (@bintocher)
License
MIT — see LICENSE