Crate deltalake_core

Native Delta Lake implementation in Rust

§Usage

Load a Delta Table by path:

async {
  let table = deltalake_core::open_table("../test/tests/data/simple_table").await.unwrap();
  let version = table.version();
};

Load a specific version of a Delta Table by path, then filter its files by partition values:

async {
  let table = deltalake_core::open_table_with_version("../test/tests/data/simple_table", 0).await.unwrap();
  let files = table.get_files_by_partitions(&[deltalake_core::PartitionFilter {
      key: "month".to_string(),
      value: deltalake_core::PartitionValue::Equal("12".to_string()),
  }]);
};
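
To make the filter semantics concrete, here is a minimal, std-only sketch of what an equality partition filter does conceptually. It is not the crate's implementation: `FileEntry`, `entry`, and `filter_by_partition` are hypothetical names invented for illustration, and the file paths are made up.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a data file and its partition values.
struct FileEntry {
    path: String,
    partitions: HashMap<String, String>,
}

/// Keeps only the files whose partition column `key` equals `value`.
/// Files that lack the column entirely are dropped.
fn filter_by_partition<'a>(files: &'a [FileEntry], key: &str, value: &str) -> Vec<&'a str> {
    files
        .iter()
        .filter(|f| f.partitions.get(key).map(String::as_str) == Some(value))
        .map(|f| f.path.as_str())
        .collect()
}

/// Helper (illustrative only): builds a file partitioned by `month`.
fn entry(path: &str, month: &str) -> FileEntry {
    FileEntry {
        path: path.to_string(),
        partitions: HashMap::from([("month".to_string(), month.to_string())]),
    }
}

fn main() {
    let files = vec![
        entry("month=11/a.parquet", "11"),
        entry("month=12/b.parquet", "12"),
    ];
    let hits = filter_by_partition(&files, "month", "12");
    println!("{hits:?}");
}
```

Because the match is made on partition values recorded as metadata, a filter of this kind can select files without reading any data files.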

Load a specific version of a Delta Table by path and datetime:

async {
  let table = deltalake_core::open_table_with_ds(
      "../test/tests/data/simple_table",
      "2020-05-02T23:47:31-07:00",
  ).await.unwrap();
  let version = table.version();
};
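
One way to picture the datetime lookup is as a search over commit timestamps. The std-only sketch below assumes the datetime resolves to the latest version committed at or before it; that reading is an assumption, not something this page states, and `version_at` is a hypothetical helper rather than part of the crate.

```rust
/// Returns the version whose commit is the latest one at or before `target_ms`.
///
/// `commit_ts_ms[v]` is the (hypothetical) commit time of version `v` in
/// milliseconds, sorted ascending. Returns `None` when the target precedes
/// the first commit; how the crate handles that case is not modeled here.
fn version_at(commit_ts_ms: &[i64], target_ms: i64) -> Option<usize> {
    // `partition_point` binary-searches for the number of leading elements
    // that satisfy the predicate, i.e. the count of commits <= target.
    let committed = commit_ts_ms.partition_point(|&ts| ts <= target_ms);
    committed.checked_sub(1)
}

fn main() {
    // Three versions (0, 1, 2) with made-up commit times.
    let commits = [1_000, 2_000, 3_000];
    println!("{:?}", version_at(&commits, 2_500));
    println!("{:?}", version_at(&commits, 500));
}
```

A binary search keeps the number of timestamp lookups logarithmic in the number of versions, which matters when each lookup is a request to remote storage.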

§Optional cargo package features

  • s3, gcs, azure - enable the storage backends for AWS S3, Google Cloud Storage (GCS), or Azure Blob Storage / Azure Data Lake Storage Gen2 (ADLS2). Enable s3-native-tls to use native TLS instead of the Rust TLS implementation.
  • datafusion - enable the datafusion::datasource::TableProvider trait implementation for Delta Tables, allowing them to be queried using DataFusion.
  • datafusion-ext - DEPRECATED: alias for datafusion feature.

§Querying Delta Tables with DataFusion

Querying from local filesystem:

use std::sync::Arc;
use datafusion::prelude::SessionContext;

async {
  let ctx = SessionContext::new();
  let table = deltalake_core::open_table("../test/tests/data/simple_table")
      .await
      .unwrap();
  ctx.register_table("demo", Arc::new(table)).unwrap();

  let batches = ctx
      .sql("SELECT * FROM demo").await.unwrap()
      .collect()
      .await.unwrap();
};

Re-exports§

pub use self::data_catalog::DataCatalog;
pub use self::data_catalog::DataCatalogError;
pub use self::table::builder::DeltaTableBuilder;
pub use self::table::builder::DeltaTableConfig;
pub use self::table::builder::DeltaVersion;
pub use self::table::config::TableProperty;
pub use self::table::DeltaTable;
pub use operations::DeltaOps;
pub use protocol::checkpoints;
pub use arrow;
pub use datafusion;
pub use parquet;
pub use self::errors::*;
pub use self::schema::partitions::*;
pub use self::schema::*;

Modules§

data_catalog
Catalog abstraction for Delta Table
delta_datafusion
DataFusion integration for Delta Table
errors
Error types for the deltalake crate
kernel
Delta Kernel module
logstore
DeltaLake storage system
operations
High level operations API to interact with Delta tables
protocol
Actions included in Delta table transaction logs
schema
Delta Table schema implementation.
table
Delta Table read and write implementation
writer
Abstractions and implementations for writing data to delta tables

Structs§

ObjectMeta
The metadata that describes an object.
Path
A parsed path representation that can be safely written to object storage

Enums§

ObjectStoreError
A specialized Error for object store-related errors

Traits§

ObjectStore
Universal API to multiple object store services.

Functions§

crate_version
Returns the Rust core version, or a custom client_version if one has been set (for example, by the Python binding)
init_client_version
open_table
Creates and loads a DeltaTable from the given path with current metadata. Infers the storage backend to use from the scheme in the given table path.
open_table_with_ds
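Creates a DeltaTable from the given path and loads it with the metadata from the version that matches the given datetime string (for example, "2020-05-02T23:47:31-07:00").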
open_table_with_storage_options
Same as open_table, but also accepts storage options to aid in building the table for a deduced StorageService.
open_table_with_version
Creates a DeltaTable from the given path and loads it with the metadata from the given version. Infers the storage backend to use from the scheme in the given table path.
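
Several of the functions above infer the storage backend from the scheme in the table path. The std-only sketch below illustrates the general idea. The `Backend` enum, the `infer_backend` function, and the scheme-to-backend mapping are all assumptions made for illustration; the crate's actual mapping may differ.

```rust
/// Hypothetical set of backends; not a type exported by the crate.
#[derive(Debug, PartialEq)]
enum Backend {
    Local,
    S3,
    Gcs,
    Azure,
    Unknown(String),
}

/// Picks a backend from the URI scheme. A path without `://` is treated as
/// a local filesystem path. The scheme list here is illustrative only.
fn infer_backend(table_uri: &str) -> Backend {
    match table_uri.split_once("://") {
        None => Backend::Local,
        Some((scheme, _rest)) => match scheme.to_ascii_lowercase().as_str() {
            "file" => Backend::Local,
            "s3" | "s3a" => Backend::S3,
            "gs" => Backend::Gcs,
            "az" | "abfs" | "abfss" => Backend::Azure,
            other => Backend::Unknown(other.to_string()),
        },
    }
}

fn main() {
    for uri in ["../test/tests/data/simple_table", "s3://bucket/table", "gs://bucket/table"] {
        println!("{uri} -> {:?}", infer_backend(uri));
    }
}
```

Under this model, the relative path used in the usage examples has no scheme and resolves to the local filesystem, while the cloud schemes only work when the matching cargo feature is enabled.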