
Crate datafusion_ducklake


§DataFusion-DuckLake

A DataFusion extension that adds support for DuckLake, an integrated data lake and catalog format.

§Overview

DuckLake has two components:

  • Catalog Database: a SQL database (DuckDB, SQLite, PostgreSQL, or MySQL) that stores metadata as SQL tables
  • Data Storage: Apache Parquet files stored on local disk or object storage

This extension provides read-only access to DuckLake catalogs through DataFusion’s catalog and table provider interfaces.
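The split between catalog and data can be pictured with a small, self-contained sketch. It does not use this crate's API: `CatalogSketch`, `resolve_files`, `sample_catalog`, and all table, file, and bucket names are hypothetical and exist only to illustrate the idea that the catalog holds metadata pointing at Parquet files, while readers only look things up.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the catalog database: it only records
/// which Parquet files belong to which table.
struct CatalogSketch {
    /// Root location of the data files (local directory or object-store prefix).
    data_path: String,
    /// Table name -> relative paths of its Parquet files.
    files_by_table: HashMap<String, Vec<String>>,
}

impl CatalogSketch {
    /// Resolve the full paths of the Parquet files backing `table`.
    /// Takes `&self` only: the sketch never modifies the catalog.
    fn resolve_files(&self, table: &str) -> Option<Vec<String>> {
        let root = self.data_path.trim_end_matches('/');
        let files = self.files_by_table.get(table)?;
        Some(files.iter().map(|f| format!("{root}/{f}")).collect())
    }
}

/// Build a small in-memory catalog with made-up table and file names.
fn sample_catalog() -> CatalogSketch {
    let mut files_by_table = HashMap::new();
    files_by_table.insert(
        "my_table".to_string(),
        vec!["part-0.parquet".to_string(), "part-1.parquet".to_string()],
    );
    CatalogSketch {
        data_path: "s3://bucket/lake/".to_string(),
        files_by_table,
    }
}
```

In the real extension the lookup is presumably a SQL query against the catalog database rather than a `HashMap` access; the sketch only shows the direction of the dependency, from metadata to data files.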

§Example

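use std::sync::Arc;

use datafusion::prelude::*;
use datafusion_ducklake::{DuckLakeCatalog, DuckdbMetadataProvider};

// Assumptions: a Tokio runtime drives the async calls, and a boxed error
// lets `?` propagate errors from both DataFusion and this crate.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a DataFusion session context
    let ctx = SessionContext::new();

    // Create a DuckDB metadata provider backed by the catalog file
    let provider = DuckdbMetadataProvider::new("path/to/catalog.ducklake")?;

    // Register a DuckLake catalog with the provider
    let catalog = DuckLakeCatalog::new(provider)?;
    ctx.register_catalog("ducklake", Arc::new(catalog));

    // Query tables from the catalog as catalog.schema.table
    let df = ctx.sql("SELECT * FROM ducklake.main.my_table").await?;
    df.show().await?;

    Ok(())
}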

§Re-exports

pub use catalog::DuckLakeCatalog;
pub use error::DuckLakeError;
pub use metadata_provider::MetadataProvider;
pub use schema::DuckLakeSchema;
pub use table::DuckLakeTable;
pub use table_functions::register_ducklake_functions;
pub use metadata_provider_duckdb::DuckdbMetadataProvider;

§Modules

  • catalog: DuckLake catalog provider implementation
  • column_rename: Custom execution plan for renaming columns
  • delete_filter: Custom execution plan for filtering deleted rows
  • encryption: Encryption support for reading encrypted Parquet files in DuckLake
  • error: Error types for the DuckLake DataFusion extension
  • information_schema: Information schema implementation for DuckLake catalog metadata
  • metadata_provider
  • metadata_provider_duckdb
  • path_resolver: Path resolution utilities for DuckLake
  • schema: DuckLake schema provider implementation
  • table: DuckLake table provider implementation
  • table_changes: Table changes (CDC) functionality for DuckLake
  • table_deletions: Table deletions functionality for DuckLake
  • table_functions: User-Defined Table Functions (UDTFs) for DuckLake catalog metadata
  • types: Type mapping from DuckLake types to Arrow types
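
As a rough illustration of what "filtering deleted rows" means for the delete_filter module, the sketch below drops rows by position. It assumes that deletions are recorded as row positions within a data file, which is not stated on this page, and `filter_deleted` is a hypothetical helper, not part of this crate. The real module is described as a DataFusion execution plan and would operate on Arrow record batches rather than slices.

```rust
use std::collections::HashSet;

/// Keep only the rows whose position is not marked as deleted.
/// `deleted_positions` holds zero-based row indexes (an assumption
/// made for this sketch).
fn filter_deleted<T: Clone>(rows: &[T], deleted_positions: &HashSet<usize>) -> Vec<T> {
    rows.iter()
        .enumerate()
        .filter(|(position, _)| !deleted_positions.contains(position))
        .map(|(_, row)| row.clone())
        .collect()
}
```

Applying the filter at read time leaves the underlying data file untouched, which fits the read-only access this extension provides.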

§Type Aliases

Result