pub struct Dataset {
    pub id: Uuid,
    pub original_id: String,
    pub source_portal: String,
    pub url: String,
    pub title: String,
    pub description: Option<String>,
    pub embedding: Option<Vec<f32>>,
    pub metadata: Value,
    pub first_seen_at: DateTime<Utc>,
    pub last_updated_at: DateTime<Utc>,
    pub content_hash: Option<String>,
    pub is_stale: bool,
}
Complete representation of a row from the ‘datasets’ table.
This structure represents a persisted dataset with all database fields, including system-generated identifiers and timestamps. It maps directly to the PostgreSQL schema and is used for reading data from the database.
§Fields
- id - Unique identifier (UUID) generated by the database
- original_id - Original identifier from the source portal
- source_portal - Base URL of the originating CKAN portal
- url - Public landing page URL for the dataset
- title - Human-readable dataset title
- description - Optional detailed description
- embedding - Optional vector of floats for semantic search
- metadata - Additional metadata as JSON
- first_seen_at - Timestamp when the dataset was first indexed
- last_updated_at - Timestamp of the most recent update
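Because embedding is optional, any code that compares datasets semantically has to handle rows that have not been embedded yet. The following is a minimal sketch of one way to do that using only the standard library. The function name `cosine_similarity` is hypothetical and not part of the documented API, and cosine similarity is assumed here as the comparison metric; the source does not say which metric the search actually uses.

```rust
/// Hypothetical helper (not part of the documented API).
///
/// Returns the cosine similarity of two embeddings, or `None` when either
/// embedding is missing, the lengths differ, the vectors are empty, or one
/// of them has zero magnitude.
pub fn cosine_similarity(a: Option<&[f32]>, b: Option<&[f32]>) -> Option<f32> {
    // Bail out early if either dataset has no embedding yet.
    let (a, b) = (a?, b?);
    if a.len() != b.len() || a.is_empty() {
        return None;
    }

    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();

    // A zero vector has no direction, so similarity is undefined.
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

With a `Dataset` value in hand, `dataset.embedding.as_deref()` converts the `Option<Vec<f32>>` field into the `Option<&[f32]>` this sketch expects.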
Fields§
id: Uuid
Unique identifier (UUID) generated by the database

original_id: String
Original identifier from the source portal

source_portal: String
Base URL of the originating CKAN portal

url: String
Public landing page URL for the dataset

title: String
Human-readable dataset title

description: Option<String>
Optional detailed description

embedding: Option<Vec<f32>>
Optional embedding vector for semantic search

metadata: Value
Additional metadata as JSON

first_seen_at: DateTime<Utc>
Timestamp when the dataset was first indexed

last_updated_at: DateTime<Utc>
Timestamp of the most recent update

content_hash: Option<String>
SHA-256 hash of title + description for delta detection

is_stale: bool
Whether this dataset has been removed from its source portal
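
The content_hash and is_stale fields suggest a simple sync check: compare the stored hash against a freshly computed one, and skip rows that are gone from their portal. The sketch below illustrates that idea under stated assumptions. `StoredDataset`, `needs_reprocessing`, and `is_searchable` are hypothetical names introduced for illustration, the struct is a cut-down stand-in for `Dataset`, and the incoming hash is assumed to be computed elsewhere with the same SHA-256 scheme.

```rust
/// Simplified stand-in for `Dataset`, keeping only the two fields used here.
/// (Hypothetical type; the real struct carries all fields listed above.)
pub struct StoredDataset {
    pub content_hash: Option<String>,
    pub is_stale: bool,
}

/// Returns `true` when the dataset should be reprocessed: either no hash
/// was ever stored, or the stored hash differs from the incoming one.
/// `incoming_hash` is assumed to be produced by the caller.
pub fn needs_reprocessing(stored: &StoredDataset, incoming_hash: &str) -> bool {
    stored.content_hash.as_deref() != Some(incoming_hash)
}

/// Returns `true` when the dataset is still present on its source portal
/// and could therefore be shown in results (an assumed policy).
pub fn is_searchable(stored: &StoredDataset) -> bool {
    !stored.is_stale
}
```

Treating a missing content_hash as "changed" is a conservative choice in this sketch: rows without a stored hash are always reprocessed once, after which the hash can be recorded.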