Crate lance

Lance Columnar Data Format

Lance is a columnar data format and an alternative to Parquet. It provides 100x faster random access and automatic versioning, and is optimized for computer vision, bioinformatics, spatial, and ML data. It is compatible with Apache Arrow and DuckDB.

§Create a Dataset

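use std::sync::Arc;

use arrow_array::{RecordBatch, RecordBatchIterator};
use arrow_schema::{DataType, Field, Schema};
use lance::{dataset::WriteParams, Dataset};

// A schema with a single non-nullable Int64 column named "test".
let schema = Arc::new(Schema::new(vec![Field::new("test", DataType::Int64, false)]));
let batches = vec![RecordBatch::new_empty(schema.clone())];
let reader = RecordBatchIterator::new(
    batches.into_iter().map(Ok), schema
);

// `uri` is the dataset destination; it is assumed to be defined by the caller.
let write_params = WriteParams::default();
Dataset::write(reader, &uri, Some(write_params)).await.unwrap();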

§Scan a Dataset

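use arrow_array::RecordBatch;
use futures::StreamExt;
use lance::Dataset;

// `path` is the location of an existing dataset; it is assumed to be defined by the caller.
let dataset = Dataset::open(&path).await.unwrap();
let mut scanner = dataset.scan();
// Drain the scan stream and collect every record batch into memory.
let batches: Vec<RecordBatch> = scanner
    .try_into_stream()
    .await
    .unwrap()
    .map(|b| b.unwrap())
    .collect::<Vec<RecordBatch>>()
    .await;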

Re-exports§

pub use dataset::Dataset;

Modules§

arrow
Extend Arrow Functionality
datafusion
Extends DataFusion
dataset
Lance Dataset
datatypes
Lance data types, Schema and Field
error
index
Secondary Index
io
I/O utilities.
session
table
utils
Various utilities

Structs§

DIST_FIELD

Enums§

Error

Functions§

open_dataset
Creates and loads a Dataset from the given path. Infers the storage backend to use from the scheme in the given table path.
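
The idea of inferring a storage backend from a path's scheme can be illustrated with a small, standard-library-only sketch. This is not Lance's implementation: the `Backend` enum, the `infer_backend` function, and the set of recognized schemes are all hypothetical names chosen for illustration, and the sketch assumes that a path without a `://` separator refers to the local filesystem.

```rust
/// Hypothetical storage backends (illustrative only, not Lance types).
#[derive(Debug, PartialEq, Eq)]
enum Backend {
    LocalFs,
    S3,
    Gcs,
    /// A scheme this sketch does not recognize, kept for error reporting.
    Unknown(String),
}

/// Picks a backend from the scheme prefix of a path or URI.
/// Assumption: a path with no "://" separator is a local file path.
fn infer_backend(path: &str) -> Backend {
    match path.split_once("://") {
        None => Backend::LocalFs,
        // Schemes are compared case-insensitively.
        Some((scheme, _rest)) => match scheme.to_ascii_lowercase().as_str() {
            "file" => Backend::LocalFs,
            "s3" => Backend::S3,
            "gs" => Backend::Gcs,
            other => Backend::Unknown(other.to_string()),
        },
    }
}
```

Under these assumptions, `infer_backend("s3://bucket/data.lance")` yields `Backend::S3`, while a bare path such as `/tmp/data.lance` falls back to `Backend::LocalFs`. Returning an explicit `Unknown` variant rather than panicking lets the caller decide how to report an unsupported scheme.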

Type Aliases§

Result