

Function persist_data 

pub async fn persist_data(
    processed: &ProcessedInput,
    database: &dyn IngestDb,
    dataset_name: &str,
    owner_id: Uuid,
    tenant_id: Option<Uuid>,
) -> Result<Data, Box<dyn Error>>

Persist a ProcessedInput as a Data record: resolve or create the dataset, deduplicate by content hash, create the record if new, and attach it to the dataset.

Dataset resolution uses a deterministic UUIDv5 ID, so the lookup and the optional INSERT OR IGNORE are idempotent and cheap. This makes the function safe to call once per item.
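The steps described above can be sketched with an in-memory store. This is only an illustration under stated assumptions, not the crate's implementation: `MemoryStore`, `dataset_id`, and `content_hash` are hypothetical names, the dataset ID is assumed to be derived from the owner and dataset name, and the standard library's `DefaultHasher` stands in for both UUIDv5 and the real content hash so the example needs no external crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, HashSet};
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a deterministic UUIDv5: the same
/// (owner, name) pair always yields the same ID within a run.
/// Assumption: the real ID inputs are not documented here.
fn dataset_id(owner_id: u64, dataset_name: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    owner_id.hash(&mut hasher);
    dataset_name.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical content hash used as the deduplication key.
fn content_hash(content: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical in-memory replacement for the database.
#[derive(Default)]
struct MemoryStore {
    datasets: HashMap<u64, String>,  // dataset ID -> dataset name
    records: HashMap<u64, Vec<u8>>,  // content hash -> content
    links: HashSet<(u64, u64)>,      // (dataset ID, content hash)
}

impl MemoryStore {
    /// Mirrors the four documented steps and returns the record key.
    fn persist(&mut self, content: &[u8], dataset_name: &str, owner_id: u64) -> u64 {
        // 1. Resolve or create the dataset ("insert or ignore" semantics).
        let ds = dataset_id(owner_id, dataset_name);
        self.datasets
            .entry(ds)
            .or_insert_with(|| dataset_name.to_string());

        // 2 and 3. Deduplicate by content hash; create the record only if new.
        let key = content_hash(content);
        self.records.entry(key).or_insert_with(|| content.to_vec());

        // 4. Attach the record to the dataset; a repeated link is a no-op.
        self.links.insert((ds, key));
        key
    }
}
```

Because every step is an insert-if-absent keyed on a deterministic value, repeating a call with the same arguments leaves the store unchanged, which is the property that makes calling the function once per item safe.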

This is the second step of the ingest pipeline (Task 2).