Struct TableWriter 

pub struct TableWriter { /* private fields */ }

Implementations

impl TableWriter

pub fn new( catalog: Arc<dyn CatalogProvider>, store: Arc<dyn Store>, policy: VectorStoragePolicy, table: TableIdent, ) -> Self

pub fn with_parent_snapshot(self, id: SnapshotId) -> Self

pub async fn write_batch_deferred( &mut self, batch: &RecordBatch, embeddings: &[Vec<f32>], ) -> AilakeResult<()>

Write the batch as a Parquet-only file immediately and build the HNSW index in the background.

Returns as soon as the Parquet file is persisted, so write latency is roughly comparable to a LanceDB write. A tokio task then runs concurrently to build the HNSW index, rewrite the file with the AILK section, and update the catalog entry.

During the build window, SearchSession serves this shard via flat scan (brute-force, exact) instead of HNSW. The transition is automatic once the background task commits the updated manifest entry.
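To make "flat scan (brute-force, exact)" concrete, here is a minimal, self-contained sketch of an exact top-k search that scores every vector in a shard. It is an illustration only, not the SearchSession implementation: the function names and the choice of squared Euclidean distance are assumptions.

```rust
/// Squared Euclidean distance between two equal-length vectors.
/// (The metric is an assumption; the real shard may use another one.)
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exact top-k search that scores every vector, so no index is required.
/// Returns (row index, squared distance) pairs, nearest first.
fn flat_scan(vectors: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = vectors
        .iter()
        .enumerate()
        .map(|(i, v)| (i, squared_l2(v, query)))
        .collect();
    // total_cmp defines a total order on f32, so NaN values cannot panic the sort.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}
```

A flat scan costs time linear in the number of vectors but needs no prebuilt structure, which is why it can serve a shard whose HNSW index is still being built.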

pub async fn write_batch_idempotent( &mut self, batch: &RecordBatch, embeddings: &[Vec<f32>], batch_id: &str, ) -> AilakeResult<()>

Idempotent variant of write_batch.

Before any I/O, checks whether batch_id already appears in the current snapshot. If it does, the call is a no-op, which makes it safe for Airflow/Kestra retries. If it does not, writes the batch and tags the DataFileEntry with batch_id so that future retries can detect it.

commit() is likewise a no-op when pending_files is empty.
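The retry behaviour described above can be sketched with a small in-memory model. Everything here is hypothetical: `ModelWriter` and its fields are stand-ins used for illustration, not the real TableWriter or DataFileEntry types, and no I/O is performed.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a writer; not the real `TableWriter`.
#[derive(Default)]
struct ModelWriter {
    /// Batch ids already recorded in the current snapshot.
    committed_batch_ids: HashSet<String>,
    /// Batch ids of files staged by this writer but not yet committed.
    pending_files: Vec<String>,
}

impl ModelWriter {
    /// Returns true if the batch was staged, false if it was skipped as a retry.
    fn write_batch_idempotent(&mut self, batch_id: &str) -> bool {
        // Check before doing any work: a known batch id makes the call a no-op.
        if self.committed_batch_ids.contains(batch_id) {
            return false;
        }
        // Tag the staged file with its batch id so later retries can detect it.
        self.pending_files.push(batch_id.to_string());
        true
    }

    /// Moves staged batch ids into the committed set; returns how many were committed.
    fn commit(&mut self) -> usize {
        let n = self.pending_files.len();
        self.committed_batch_ids.extend(self.pending_files.drain(..));
        n
    }
}
```

In this model, a second call with the same batch_id after a commit stages nothing, so the following commit has no pending files to process.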

pub async fn write_batch( &mut self, batch: &RecordBatch, embeddings: &[Vec<f32>], ) -> AilakeResult<()>

Write a batch to a new AI-Lake file and stage it for commit.

pub async fn write_batch_auto( &mut self, batch: &RecordBatch, embeddings: &[Vec<f32>], ) -> AilakeResult<()>

Write batch, auto-selecting the index based on detected hardware.

Picks IVF-PQ when a CUDA GPU or ≥ 8 CPU cores are present AND the batch has ≥ 5,000 vectors. Falls back to HNSW otherwise, that is, on weaker or local hardware or for smaller batches. Uses IvfPqConfig::for_dataset to scale nlist with dataset size.
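The selection rule can be written as a small pure function. This is a sketch of the documented thresholds only; `Hardware`, `IndexKind`, and `choose_index` are hypothetical names, and how the library actually detects a GPU or counts cores is not documented here.

```rust
/// Index families named in the documentation (hypothetical enum for illustration).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum IndexKind {
    Hnsw,
    IvfPq,
}

/// Hypothetical summary of the detected hardware.
struct Hardware {
    has_cuda_gpu: bool,
    cpu_cores: usize,
}

/// IVF-PQ needs strong hardware (CUDA GPU or at least 8 cores)
/// AND a batch of at least 5,000 vectors; everything else gets HNSW.
fn choose_index(hw: &Hardware, n_vectors: usize) -> IndexKind {
    let strong_hardware = hw.has_cuda_gpu || hw.cpu_cores >= 8;
    if strong_hardware && n_vectors >= 5_000 {
        IndexKind::IvfPq
    } else {
        IndexKind::Hnsw
    }
}
```

Both conditions must hold: a 16-core machine with a 4,999-vector batch still gets HNSW, as does a 4-core laptop with a very large batch.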

pub async fn write_batch_ivf_pq( &mut self, batch: &RecordBatch, embeddings: &[Vec<f32>], ivf_config: IvfPqConfig, ) -> AilakeResult<()>

Write batch with IVF-PQ index built synchronously (no background task).

Produces a smaller index than HNSW, which suits S3 sequential-scan workloads better.

pub async fn write_batch_multi( &mut self, batch: &RecordBatch, columns: &[MultiVectorBatch<'_>], ) -> AilakeResult<()>

Write a batch with multiple vector columns into a single AI-Lake file.

The first entry in columns is treated as the primary column (used for geometric pruning). Additional columns each get their own HNSW section.

pub async fn commit(self) -> AilakeResult<SnapshotId>

Commit all staged files as a new Iceberg snapshot.

No-op when pending_files is empty (e.g., all write_batch_idempotent calls were skipped because their batch_id was already committed). Returns the current snapshot id in that case (or 0 if no snapshot exists yet).
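The return-value rule for an empty commit can be modelled in a few lines. This sketch assumes SnapshotId behaves like a u64 and invents a `ModelTable` type and a `commit_snapshot` function for illustration; it does not touch a real catalog.

```rust
/// Hypothetical model of a table's snapshot state; not the real catalog types.
struct ModelTable {
    /// Id of the latest snapshot, if any snapshot exists yet.
    current_snapshot: Option<u64>,
    /// Id that the next successful commit will receive.
    next_snapshot: u64,
}

/// Commits the staged files, or does nothing when there are none.
fn commit_snapshot(table: &mut ModelTable, pending_files: Vec<String>) -> u64 {
    if pending_files.is_empty() {
        // No-op: report the current snapshot id, or 0 if none exists yet.
        return table.current_snapshot.unwrap_or(0);
    }
    // Otherwise record a new snapshot and make it current.
    let id = table.next_snapshot;
    table.next_snapshot += 1;
    table.current_snapshot = Some(id);
    id
}
```

Under this model, callers cannot tell a skipped commit from a real one by the return value alone, so a pipeline that needs that distinction would have to track it separately.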

pub async fn create_or_open( catalog: Arc<dyn CatalogProvider>, store: Arc<dyn Store>, policy: VectorStoragePolicy, table: TableIdent, ) -> AilakeResult<Self>

Create a table if it doesn’t exist, then return a writer for it.

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Same for T

type Output = T

Should always be Self

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
where ST: ?Sized, DT: ?Sized,

impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
where ST: ?Sized, DT: ?Sized,

impl<T> Read<Exclusive, BecauseExclusive> for T
where T: ?Sized,