Lance Columnar Data Format
Lance is a columnar data format and an alternative to Parquet. It provides 100x faster random access and automatic versioning, and it is optimized for computer vision, bioinformatics, spatial, and ML data. It is compatible with Apache Arrow and DuckDB.
§Create a Dataset
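use std::sync::Arc;
use arrow_array::{RecordBatch, RecordBatchIterator};
use arrow_schema::{DataType, Field, Schema};
use lance::{dataset::WriteParams, Dataset};
// `uri` (the dataset location) is assumed to be defined by the caller.
// Build a one-column schema and an empty batch that conforms to it.
let schema = Arc::new(Schema::new(vec![Field::new("test", DataType::Int64, false)]));
let batches = vec![RecordBatch::new_empty(schema.clone())];
let reader = RecordBatchIterator::new(
    batches.into_iter().map(Ok), schema
);
let write_params = WriteParams::default();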
Dataset::write(reader, &uri, Some(write_params)).await.unwrap();
§Scan a Dataset
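use arrow_array::RecordBatch;
use futures::StreamExt;
use lance::Dataset;
// `path` (the dataset location) is assumed to be defined by the caller.
let dataset = Dataset::open(&path).await.unwrap();
let mut scanner = dataset.scan();
// Stream the record batches and collect them into memory.
let batches: Vec<RecordBatch> = scanner
    .try_into_stream()
    .await
    .unwrap()
    .map(|b| b.unwrap())
    .collect::<Vec<RecordBatch>>()
    .await;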
Re-exports§
pub use dataset::Dataset;
Modules§
- arrow
- Arrow-related utilities and extensions for Lance
- datafusion
- Utilities for integrating Lance into DataFusion
- dataset
- Lance Dataset
- datatypes
- Lance data types, Schema and Field
- deps
- Re-exports of 3rd party dependencies used in lance public APIs
- index
- Secondary Index
- io
- I/O utilities.
- session
- table
- utils
- Various utilities
Enums§
Statics§
Functions§
- open_dataset
- Creates and loads a Dataset from the given path. Infers the storage backend to use from the scheme in the given table path.
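To illustrate what inferring a storage backend from a URI scheme can look like, here is a minimal, standard-library-only sketch. It is not Lance's implementation: the Backend enum, the infer_backend function, and the set of recognized schemes (file, s3, gs, az) are all hypothetical names chosen for this example, and the real open_dataset may recognize a different set of schemes or resolve them differently.

```rust
/// Hypothetical storage backends, for illustration only
/// (not types from the lance crate).
#[derive(Debug, PartialEq, Eq)]
enum Backend {
    LocalFile,
    S3,
    Gcs,
    Azure,
    /// A scheme this sketch does not recognize.
    Unknown(String),
}

/// Infer a backend from the `scheme://` prefix of `uri`.
/// A path with no scheme prefix is assumed to be a local file.
fn infer_backend(uri: &str) -> Backend {
    match uri.split_once("://") {
        // No scheme separator: treat the input as a plain filesystem path.
        None => Backend::LocalFile,
        // Schemes are compared case-insensitively.
        Some((scheme, _rest)) => match scheme.to_ascii_lowercase().as_str() {
            "file" => Backend::LocalFile,
            "s3" => Backend::S3,
            "gs" => Backend::Gcs,
            "az" => Backend::Azure,
            other => Backend::Unknown(other.to_string()),
        },
    }
}
```

With this sketch, infer_backend("s3://bucket/data.lance") yields Backend::S3, while a bare path such as "/tmp/data.lance" falls back to Backend::LocalFile. Returning an explicit Unknown variant, rather than silently defaulting to local storage, keeps an unsupported scheme visible to the caller.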