

Function delete_rows 

pub async fn delete_rows(
    catalog: Arc<dyn CatalogProvider>,
    store: Arc<dyn Store>,
    table: &TableIdent,
    file_path: &str,
    row_ids: &[u32],
) -> AilakeResult<()>

Logically delete rows from a V3 AI-Lake table using Iceberg Deletion Vectors.

§What this does

  1. Verifies the table is format-version=3 (DVs require V3).
  2. Reads the current file list from the catalog.
  3. Finds file_path in the snapshot (exact match or suffix match for tables where the catalog prefixes absolute paths).
  4. Merges row_ids into the existing DV bitmap for that file (or creates a new one if the file has no DV yet).
  5. Writes a new Puffin .dvd file to {table_location}/metadata/dv-{snap_id}.dvd.
  6. Commits a Replace snapshot so all readers see the updated DV immediately.

After the call returns, scanner.rs (Phase B) automatically excludes the deleted rows from HNSW and flat-scan results. The data file itself is not modified; only the DV and the snapshot change.
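Steps 3 and 4 above can be illustrated with a minimal, standard-library-only sketch. It is not the crate's implementation: `find_file` and `merge_row_ids` are hypothetical names, the bucket paths are made up, and a `BTreeSet<u32>` stands in for whatever bitmap type the real deletion vector uses. The actual suffix-matching rule may also be stricter than a plain `ends_with`.

```rust
use std::collections::BTreeSet;

/// Step 3 (sketch): prefer an exact match, then fall back to a suffix match.
/// `files` stands in for the file list read from the catalog (assumption).
fn find_file<'a>(files: &'a [String], file_path: &str) -> Option<&'a str> {
    files
        .iter()
        .find(|f| f.as_str() == file_path)
        .or_else(|| files.iter().find(|f| f.ends_with(file_path)))
        .map(|f| f.as_str())
}

/// Step 4 (sketch): union the new row positions into the existing set,
/// or start from an empty set when the file has no DV yet.
fn merge_row_ids(existing: Option<BTreeSet<u32>>, row_ids: &[u32]) -> BTreeSet<u32> {
    let mut dv = existing.unwrap_or_default();
    dv.extend(row_ids.iter().copied());
    dv
}

fn main() {
    // Hypothetical absolute paths, as a catalog might return them.
    let files = vec![
        "s3://bucket/tbl/data/part-00000.parquet".to_string(),
        "s3://bucket/tbl/data/part-00001.parquet".to_string(),
    ];
    let hit = find_file(&files, "data/part-00001.parquet");
    println!("matched file: {:?}", hit);

    // Rows 3 and 7 were deleted earlier; delete 7 (again) and 1 now.
    let existing: BTreeSet<u32> = [3u32, 7].iter().copied().collect();
    let merged = merge_row_ids(Some(existing), &[7, 1]);
    println!("merged positions: {:?}", merged);
}
```

Because the merge is a set union, deleting the same position twice has no additional effect in this sketch.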

§Arguments

  • catalog — catalog for manifest reads and snapshot commits.
  • store — object store for Puffin file I/O.
  • table — fully-qualified table identifier (namespace.name).
  • file_path — path of the data file whose rows are being deleted. May be a relative path (e.g. "data/part-00001.parquet") or an absolute path as returned by catalog.list_files(). If no exact match is found in the snapshot, suffix matching is applied.
  • row_ids — 0-based row positions to delete (within the data file).
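Since row_ids are positions within one data file rather than primary keys, a caller usually has to derive them from the file's rows in stored order. A minimal sketch of that derivation follows; `positions_where` is a hypothetical helper, not part of the crate, and it assumes the rows are already available in memory in file order.

```rust
/// Collect the 0-based positions of rows that satisfy `should_delete`.
/// Hypothetical helper: assumes `rows` is in the same order as the data file
/// and that the file holds fewer than 2^32 rows.
fn positions_where<T>(rows: &[T], should_delete: impl Fn(&T) -> bool) -> Vec<u32> {
    rows.iter()
        .enumerate()
        .filter(|&(_, row)| should_delete(row))
        .map(|(i, _)| i as u32)
        .collect()
}

fn main() {
    // Made-up column values, one per row, in file order.
    let labels = ["keep", "spam", "keep", "spam"];
    let row_ids = positions_where(&labels, |label| *label == "spam");
    println!("positions to delete: {:?}", row_ids);
}
```

The resulting vector can be borrowed as `&[u32]`, which is the type the row_ids parameter expects.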

§Errors

  • InvalidArgument if the table's format-version is lower than 3.
  • Catalog if the table has no current snapshot or file_path is not found.
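The error conditions above can be sketched as a standalone pre-check. Everything here is an assumption made for illustration: `SketchError` is a local stand-in that only borrows the two variant names from this page, not the crate's real error type, and `precheck` is a hypothetical function whose payloads and messages are invented.

```rust
/// Local stand-in for the crate's error type (assumption); only the variant
/// names come from the documentation above.
#[derive(Debug, PartialEq)]
enum SketchError {
    InvalidArgument(String),
    Catalog(String),
}

/// Hypothetical pre-check mirroring the documented error conditions.
/// `snapshot_files` is `None` when the table has no current snapshot.
fn precheck(
    format_version: u8,
    snapshot_files: Option<&[&str]>,
    file_path: &str,
) -> Result<(), SketchError> {
    if format_version < 3 {
        return Err(SketchError::InvalidArgument(format!(
            "deletion vectors require format-version 3, got {}",
            format_version
        )));
    }
    let files = snapshot_files
        .ok_or_else(|| SketchError::Catalog("table has no current snapshot".to_string()))?;
    // `ends_with` also covers the exact-match case.
    if !files.iter().any(|f| f.ends_with(file_path)) {
        return Err(SketchError::Catalog(format!(
            "file not found in snapshot: {}",
            file_path
        )));
    }
    Ok(())
}

fn main() {
    let files = ["s3://bucket/tbl/data/part-00001.parquet"];
    println!("{:?}", precheck(2, Some(&files[..]), "data/part-00001.parquet"));
    println!("{:?}", precheck(3, None, "data/part-00001.parquet"));
    println!("{:?}", precheck(3, Some(&files[..]), "data/part-00001.parquet"));
}
```

A caller of delete_rows would typically match on the returned error in a similar way to separate bad input (a pre-V3 table) from catalog state problems (no snapshot, unknown file).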