Crate lance


Lance Columnar Data Format

Lance is a columnar data format and an alternative to Parquet. It provides 100x faster random access and automatic versioning, and is optimized for computer vision, bioinformatics, spatial, and ML data. It is compatible with Apache Arrow and DuckDB.

§Create a Dataset

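use std::sync::Arc;

use arrow_array::{RecordBatch, RecordBatchIterator};
use arrow_schema::{DataType, Field, Schema};
use lance::{dataset::WriteParams, Dataset};

// `uri` is the destination path or URI of the dataset, assumed to be defined by the caller.
// Schema with a single non-nullable Int64 column named "test".
let schema = Arc::new(Schema::new(vec![Field::new("test", DataType::Int64, false)]));
// A single empty batch is enough to create the dataset.
let batches = vec![RecordBatch::new_empty(schema.clone())];
let reader = RecordBatchIterator::new(
    batches.into_iter().map(Ok), schema
);

// Write with default parameters. This must run inside an async context.
let write_params = WriteParams::default();
Dataset::write(reader, &uri, Some(write_params)).await.unwrap();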

§Scan a Dataset

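use arrow_array::RecordBatch;
use futures::StreamExt;
use lance::Dataset;

// `path` is the location of an existing dataset, assumed to be defined by the caller.
let dataset = Dataset::open(&path).await.unwrap();
let mut scanner = dataset.scan();
// Turn the scan into a stream and collect every record batch into memory.
let batches: Vec<RecordBatch> = scanner
    .try_into_stream()
    .await
    .unwrap()
    .map(|b| b.unwrap())
    .collect::<Vec<RecordBatch>>()
    .await;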

Structs§

  • Row ID field. This is nullable because its validity bitmap is sometimes used as a selection vector.
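To illustrate the idea of a validity bitmap doubling as a selection vector, here is a minimal sketch in plain Rust. It does not use Lance or Arrow types: `Option<u64>` stands in for a nullable row ID, and the helper names `apply_selection` and `selected_row_ids` are hypothetical, chosen only for this example.

```rust
/// Hypothetical helper: mask a list of row IDs with a boolean selection.
/// A row that is not selected becomes `None`, which plays the role of a
/// cleared bit in a validity bitmap. If the slices differ in length, the
/// extra elements of the longer one are ignored.
pub fn apply_selection(row_ids: &[u64], mask: &[bool]) -> Vec<Option<u64>> {
    row_ids
        .iter()
        .zip(mask.iter())
        .map(|(&id, &keep)| if keep { Some(id) } else { None })
        .collect()
}

/// Hypothetical helper: recover the selected row IDs by keeping only the
/// entries that are still valid (non-null).
pub fn selected_row_ids(row_ids: &[Option<u64>]) -> Vec<u64> {
    row_ids.iter().filter_map(|id| *id).collect()
}
```

In this sketch the nullability of the column carries the selection itself, so no separate mask has to travel alongside the row IDs.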

Functions§

  • Creates and loads a Dataset from the given path. Infers the storage backend to use from the scheme in the given table path.
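Scheme-based backend inference can be sketched as follows. This is an illustration of the general technique only, not Lance's implementation: the `Backend` enum, the `infer_backend` function, and the set of recognized schemes are all assumptions made for this example.

```rust
/// Hypothetical set of storage backends for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    LocalFs,
    S3,
    Gcs,
    Azure,
    /// A scheme this sketch does not recognize.
    Unknown(String),
}

/// Hypothetical helper: pick a backend from the scheme of a table path.
/// Paths without a `scheme://` prefix are treated as local filesystem paths.
pub fn infer_backend(path: &str) -> Backend {
    match path.split_once("://") {
        None => Backend::LocalFs,
        // Schemes are compared case-insensitively.
        Some((scheme, _rest)) => match scheme.to_ascii_lowercase().as_str() {
            "file" => Backend::LocalFs,
            "s3" => Backend::S3,
            "gs" => Backend::Gcs,
            "az" => Backend::Azure,
            other => Backend::Unknown(other.to_string()),
        },
    }
}
```

Under these assumptions, `infer_backend("s3://bucket/data.lance")` yields `Backend::S3`, while a bare path such as `/tmp/data.lance` falls back to `Backend::LocalFs`.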
