Crate qvd


§qvd — High-performance Qlik QVD file library

Read, write, and convert Qlik QVD files with zero-copy roundtrip fidelity. First and only QVD crate on crates.io.

§Features

  • Read/Write QVD — byte-identical roundtrip (MD5 match on 20 real files up to 2.8 GB)
  • Parquet ↔ QVD — bidirectional conversion with compression (snappy, zstd, gzip, lz4). Requires feature parquet_support.
  • Arrow RecordBatch — convert QVD to/from Arrow for DataFusion, DuckDB, Polars integration. Requires feature parquet_support.
  • DataFusion SQL — register QVD as a table, query with SQL. Requires feature datafusion_support.
  • Streaming reader — read QVD in chunks without loading entire file into memory
  • EXISTS() index — O(1) hash lookup, like Qlik’s EXISTS() function
  • Concatenate — merge QVD tables with Qlik CONCATENATE semantics (schema union, NULL fill)
  • Concatenate with PK — upsert/dedup merge with primary key: Replace, Skip, or Error on conflict. First QVD library in any language with PK-based merge
  • write_arrow — write PyArrow RecordBatch/Table directly to QVD (no Parquet roundtrip)
  • Python bindings — PyArrow, pandas, Polars via zero-copy Arrow bridge
  • Zero dependencies for core read/write (Parquet/Arrow/DataFusion are optional)

§Quick Start

§Read and write QVD files

use qvd::{read_qvd_file, write_qvd_file};

let table = read_qvd_file("data.qvd").unwrap();
println!("Rows: {}, Cols: {}", table.num_rows(), table.num_cols());
println!("Columns: {:?}", table.column_names());

// Byte-identical roundtrip
write_qvd_file(&table, "output.qvd").unwrap();

§EXISTS() — O(1) lookup

use qvd::{read_qvd_file, ExistsIndex, filter_rows_by_exists_fast};

let clients = read_qvd_file("clients.qvd").unwrap();
let index = ExistsIndex::from_column(&clients, "ClientID").unwrap();

assert!(index.exists("12345"));

let facts = read_qvd_file("facts.qvd").unwrap();
// col_idx = column index for "ClientID" in facts table
let col_idx = 0;
let filtered = filter_rows_by_exists_fast(&facts, col_idx, &index);

§Streaming reader

use qvd::open_qvd_stream;

let mut reader = open_qvd_stream("huge_file.qvd").unwrap();
while let Some(chunk) = reader.next_chunk(65536).unwrap() {
    println!("Chunk: {} rows", chunk.num_rows);
}

§Parquet ↔ QVD (feature parquet_support)

use qvd::{convert_parquet_to_qvd, convert_qvd_to_parquet, ParquetCompression};

convert_parquet_to_qvd("input.parquet", "output.qvd").unwrap();
convert_qvd_to_parquet("input.qvd", "output.parquet", ParquetCompression::Zstd).unwrap();

§Arrow RecordBatch (feature parquet_support)

use qvd::{read_qvd_file, qvd_to_record_batch, record_batch_to_qvd};

let table = read_qvd_file("data.qvd").unwrap();
let batch = qvd_to_record_batch(&table).unwrap();
// Use with DataFusion, DuckDB, Polars...

§Concatenate — merge QVD tables

use qvd::{read_qvd_file, concatenate, write_qvd_file};

let a = read_qvd_file("data_jan.qvd").unwrap();
let b = read_qvd_file("data_feb.qvd").unwrap();
let merged = concatenate(&a, &b).unwrap();
write_qvd_file(&merged, "data_all.qvd").unwrap();

§Concatenate with PK — upsert/dedup merge

use qvd::{read_qvd_file, concatenate_with_pk, OnConflict, write_qvd_file};

let existing = read_qvd_file("master.qvd").unwrap();
let updates = read_qvd_file("delta.qvd").unwrap();
// New rows win on PK collision (upsert)
let merged = concatenate_with_pk(&existing, &updates, &["ID"], OnConflict::Replace).unwrap();
write_qvd_file(&merged, "master_updated.qvd").unwrap();

§DataFusion SQL (feature datafusion_support)

use datafusion::prelude::*;
use qvd::register_qvd;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ctx = SessionContext::new();
    register_qvd(&ctx, "sales", "sales.qvd")?;
    let df = ctx.sql("SELECT Region, SUM(Amount) FROM sales GROUP BY Region").await?;
    df.show().await?;
    Ok(())
}

§Feature Flags

| Feature | Dependencies | Description |
|---|---|---|
| (default) | none | Core QVD read/write, streaming, EXISTS |
| parquet_support | arrow, parquet, chrono | Parquet/Arrow ↔ QVD conversion |
| datafusion_support | + datafusion, tokio | SQL queries on QVD via DataFusion |
| cli | + clap | CLI binary qvd-cli |
| python | + pyo3, arrow/pyarrow | Python bindings with PyArrow/pandas/Polars |
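Optional features are enabled the usual Cargo way; a sketch of the dependency entry (version string illustrative):

```toml
[dependencies]
# Core read/write only (no optional dependencies)
qvd = "*"

# Or with optional features enabled:
# qvd = { version = "*", features = ["parquet_support", "datafusion_support"] }
```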

Re-exports§

pub use error::QvdError;
pub use error::QvdResult;
pub use header::QvdTableHeader;
pub use reader::read_qvd;
pub use reader::read_qvd_file;
pub use reader::QvdTable;
pub use writer::write_qvd;
pub use writer::write_qvd_file;
pub use writer::QvdTableBuilder;
pub use exists::ExistsIndex;
pub use exists::filter_rows_by_exists;
pub use exists::filter_rows_by_exists_fast;
pub use value::QvdSymbol;
pub use value::QvdValue;
pub use streaming::QvdStreamReader;
pub use streaming::QvdChunk;
pub use streaming::open_qvd_stream;
pub use concat::concatenate;
pub use concat::concatenate_with_schema;
pub use concat::concatenate_with_pk;
pub use concat::concatenate_with_pk_schema;
pub use concat::OnConflict;
pub use concat::SchemaMode;

Modules§

concat
QVD table concatenation and merge operations. See concatenate for pure append and concatenate_with_pk for PK-based upsert.
error
Error types for QVD operations.
exists
O(1) EXISTS() index and fast row filtering. See ExistsIndex.
header
QVD XML header parser and writer.
index
Bit-stuffed index table reader and writer.
reader
High-level QVD file reader. See read_qvd_file and QvdTable.
streaming
Streaming chunk-based QVD reader for memory-efficient processing of large files. See QvdStreamReader, open_qvd_stream, and QvdStreamReader::read_filtered for EXISTS()-style filtered reads that are 2.5x faster than Qlik Sense.
symbol
Binary symbol table reader and writer.
value
QVD value types: QvdSymbol and QvdValue.
writer
High-level QVD file writer and QvdTableBuilder for creating QVD files from scratch.