pub struct CkanClient { /* private fields */ }
HTTP client for interacting with CKAN open data portals.
CKAN (Comprehensive Knowledge Archive Network) is an open-source data management system used by many government open data portals worldwide.
§Examples
use ceres_client::CkanClient;
let client = CkanClient::new("https://dati.gov.it")?;
let dataset_ids = client.list_package_ids().await?;
println!("Found {} datasets", dataset_ids.len());
Implementations§
impl CkanClient
pub fn new(base_url_str: &str) -> Result<Self, AppError>
Creates a new CKAN client for the specified portal.
§Arguments
base_url_str - The base URL of the CKAN portal (e.g., https://dati.gov.it)
§Returns
Returns a configured CkanClient instance.
§Errors
Returns AppError::Generic if the URL is invalid or malformed.
Returns AppError::ClientError if the HTTP client cannot be built.
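To illustrate how a base URL maps to API endpoints, here is a minimal sketch that validates a portal URL and joins it with CKAN's Action API path (/api/3/action/). The helper action_url is hypothetical and is not part of ceres_client; the crate's real validation and its AppError type are not reproduced, so a plain String error stands in.

```rust
/// Builds the URL of a CKAN Action API endpoint from a portal base URL.
/// Hypothetical helper for illustration; not part of `ceres_client`.
fn action_url(base_url: &str, action: &str) -> Result<String, String> {
    let base = base_url.trim();
    // Reject anything that is not an absolute HTTP(S) URL.
    if !(base.starts_with("http://") || base.starts_with("https://")) {
        return Err(format!("invalid portal URL: {base_url:?}"));
    }
    // Drop trailing slashes so the joined path has exactly one separator.
    let base = base.trim_end_matches('/');
    Ok(format!("{base}/api/3/action/{action}"))
}
```

Under these assumptions, action_url("https://dati.gov.it/", "package_list") yields the package_list endpoint of that portal, while a string without an http or https scheme is rejected.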
pub async fn list_package_ids(&self) -> Result<Vec<String>, AppError>
Fetches the complete list of dataset IDs from the CKAN portal.
This method calls the CKAN package_list API endpoint, which returns
all dataset identifiers available in the portal.
§Returns
A vector of dataset ID strings.
§Errors
Returns AppError::ClientError if the HTTP request fails.
Returns AppError::Generic if the CKAN API returns an error.
§Performance Note
TODO(performance): Add pagination for large portals.
Large portals can have 100k+ datasets, and CKAN supports limit/offset parameters.
One option is a paginated method: list_package_ids_paginated(limit: usize, offset: usize)
Another is a streaming method: list_package_ids_stream() -> impl Stream<Item = ...>
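The pagination idea in this note could look roughly like the following sketch. It is not the crate's implementation: collect_paginated is a hypothetical helper, the fetch_page closure stands in for one package_list HTTP call with limit and offset, and errors are plain strings rather than AppError.

```rust
/// Collects all IDs by requesting pages of `limit` items until a short page arrives.
/// `fetch_page(limit, offset)` stands in for one `package_list` call (an assumption).
fn collect_paginated<F>(limit: usize, mut fetch_page: F) -> Result<Vec<String>, String>
where
    F: FnMut(usize, usize) -> Result<Vec<String>, String>,
{
    assert!(limit > 0, "limit must be positive");
    let mut ids = Vec::new();
    let mut offset = 0;
    loop {
        let page = fetch_page(limit, offset)?;
        let received = page.len();
        ids.extend(page);
        // A page shorter than `limit` (including an empty one) means the end was reached.
        if received < limit {
            return Ok(ids);
        }
        offset += received;
    }
}
```

Stopping on a short page avoids needing the total count up front, at the cost of one extra request when the total is an exact multiple of the page size.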
pub async fn show_package(&self, id: &str) -> Result<CkanDataset, AppError>
pub async fn search_modified_since(
    &self,
    since: DateTime<Utc>,
) -> Result<Vec<CkanDataset>, AppError>
Searches for datasets modified since a given timestamp.
Uses CKAN’s package_search API with a metadata_modified filter to fetch
only datasets that have been updated since the last sync. This enables
incremental harvesting with ~99% fewer API calls in steady state.
§Arguments
since - Only return datasets modified after this timestamp
§Returns
A vector of CkanDataset containing all datasets modified since the given time.
Unlike list_package_ids() + show_package(), this returns complete dataset
objects in a single paginated query.
§Errors
Returns AppError::ClientError if the HTTP request fails.
Returns AppError::Generic if the CKAN API returns an error or doesn’t support
the package_search endpoint (some older CKAN instances).
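As a rough illustration of the metadata_modified filter, the sketch below builds package_search query parameters using Solr range syntax in the fq parameter. The function names are hypothetical, the timestamp is passed as a pre-formatted UTC string instead of DateTime<Utc> to keep the example dependency-free, and the exact filter string the crate sends is an assumption.

```rust
/// Builds an `fq` filter matching datasets modified at or after the given time.
/// `since_rfc3339` must already be formatted in UTC, e.g. "2024-01-01T00:00:00Z".
/// Solr's square brackets make the range inclusive of the lower bound.
fn modified_since_filter(since_rfc3339: &str) -> String {
    format!("metadata_modified:[{since_rfc3339} TO *]")
}

/// Query parameters for one page of `package_search` results.
/// Hypothetical helper; the parameter names are CKAN's `fq`, `start`, and `rows`.
fn search_params(since_rfc3339: &str, start: usize, rows: usize) -> Vec<(&'static str, String)> {
    vec![
        ("fq", modified_since_filter(since_rfc3339)),
        ("start", start.to_string()),
        ("rows", rows.to_string()),
    ]
}
```

The returned pairs still need URL encoding before being sent; an HTTP client's query builder would normally handle that.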
pub async fn search_all_datasets(&self) -> Result<Vec<CkanDataset>, AppError>
Fetches all datasets from the portal using paginated package_search.
This makes ~N/1000 API calls instead of N individual package_show calls,
which is critical for large portals like HDX (~40k datasets) that enforce
strict rate limits.
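The call-count comparison can be checked with a small calculation. This sketch assumes a page size of 1000 rows, as implied by the N/1000 estimate; search_calls is a hypothetical helper and not part of the crate.

```rust
/// Number of `package_search` calls needed to page through `total` datasets
/// when each call returns at most `rows_per_page` results (ceiling division).
fn search_calls(total: usize, rows_per_page: usize) -> usize {
    assert!(rows_per_page > 0, "rows_per_page must be positive");
    (total + rows_per_page - 1) / rows_per_page
}
```

For a portal of about 40,000 datasets, search_calls(40_000, 1000) gives 40 requests, compared with 40,000 individual package_show calls.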
pub async fn dataset_count(&self) -> Result<usize, AppError>
Returns the total dataset count from a lightweight rows=0 query.
pub fn into_new_dataset(
    dataset: CkanDataset,
    portal_url: &str,
    url_template: Option<&str>,
    language: &str,
) -> NewDataset
Converts a CKAN dataset into Ceres’ internal NewDataset model.
This helper method transforms CKAN-specific data structures into the format used by Ceres for database storage. Multilingual fields are resolved using the specified language preference.
§Arguments
dataset - The CKAN dataset to convert
portal_url - The base URL of the CKAN portal
url_template - Optional URL template with {id} and {name} placeholders
language - Preferred language for resolving multilingual fields
§Returns
A NewDataset ready to be inserted into the database.
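The url_template argument can be pictured with the following sketch of placeholder substitution. The helper dataset_url is hypothetical; the fallback of {portal_url}/dataset/{name} is inferred from the documented example for this method and is not confirmed against the crate's source.

```rust
/// Resolves the public URL of a dataset.
/// With a template, `{id}` and `{name}` are substituted; without one, the
/// assumed default `<portal_url>/dataset/<name>` is used.
fn dataset_url(portal_url: &str, url_template: Option<&str>, id: &str, name: &str) -> String {
    match url_template {
        Some(template) => template.replace("{id}", id).replace("{name}", name),
        // Trim trailing slashes so the default path has a single separator.
        None => format!("{}/dataset/{}", portal_url.trim_end_matches('/'), name),
    }
}
```

A template such as "https://example.org/view/{id}" (an invented URL) would produce a link keyed by the dataset ID rather than by its name.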
§Examples
use ceres_client::CkanClient;
use ceres_client::ckan::CkanDataset;
use ceres_core::LocalizedField;
let ckan_dataset = CkanDataset {
id: "abc-123".to_string(),
name: "air-quality-data".to_string(),
title: LocalizedField::Plain("Air Quality Monitoring".to_string()),
notes: Some(LocalizedField::Plain("Data from air quality sensors".to_string())),
extras: serde_json::Map::new(),
};
let new_dataset = CkanClient::into_new_dataset(
ckan_dataset,
"https://dati.gov.it",
None,
"en",
);
assert_eq!(new_dataset.original_id, "abc-123");
assert_eq!(new_dataset.url, "https://dati.gov.it/dataset/air-quality-data");
assert_eq!(new_dataset.title, "Air Quality Monitoring");
Trait Implementations§
impl Clone for CkanClient
fn clone(&self) -> CkanClient
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.