Crate midas_fetcher

§MIDAS Fetcher

High-performance concurrent downloader for UK Met Office MIDAS Open weather data

MIDAS Fetcher is a Rust library and command-line tool for efficiently downloading large volumes of historical weather data from the UK Met Office MIDAS Open Archive. It is built for climate researchers and data scientists who need reliable, fast, and resumable downloads that respect CEDA’s infrastructure.

§Features

  • 🚀 Concurrent Downloads: Work-stealing queue prevents worker starvation
  • 📦 Intelligent Caching: Hierarchical organization with deduplication and verification
  • Data Integrity: Atomic file operations with MD5 verification
  • 🔄 Resumable Downloads: Continues from exactly where interrupted
  • 📊 Real-time Progress: ETA calculations and comprehensive status reporting
  • 🛡️ CEDA-Respectful: Built-in rate limiting and exponential backoff
  • 🎯 Selective Downloads: Filter by dataset, county, station, or time period
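
Atomic file operations of the kind listed above are commonly implemented by writing to a temporary sibling file and then renaming it over the destination. The following is a minimal sketch of that general technique using only the standard library. It is not taken from the crate's source: `write_atomically` and `temp_path_for` are hypothetical names, and MD5 verification is left out because it would need an external crate.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: the sibling temporary path used while writing.
/// For `data/station.csv` this yields `data/station.csv.tmp`.
fn temp_path_for(dest: &Path) -> PathBuf {
    let mut name = dest
        .file_name()
        .map(|n| n.to_os_string())
        .unwrap_or_default();
    name.push(".tmp");
    dest.with_file_name(name)
}

/// Hypothetical helper: writes `bytes` to `dest` so that a reader sees
/// either the old file or the complete new file, never a partial one.
fn write_atomically(dest: &Path, bytes: &[u8]) -> io::Result<()> {
    let tmp = temp_path_for(dest);
    {
        let mut file = File::create(&tmp)?;
        file.write_all(bytes)?;
        // Flush file contents to disk before publishing the file.
        file.sync_all()?;
    }
    // On the same filesystem, rename replaces the destination in one step.
    fs::rename(&tmp, dest)
}
```

An interrupted write leaves only the `.tmp` file behind, so a resumed run can discard it and fetch the file again without mistaking a partial download for a finished one.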

This library is specifically designed to support both command-line usage and future integration with Tauri-based GUI applications, providing a clean separation between core functionality and presentation layers.

§Key Features

  • Authenticated CEDA client with automatic session management
  • Work-stealing concurrent downloads preventing worker starvation
  • Atomic file operations ensuring data integrity
  • Rate limiting with exponential backoff to respect server limits
  • Manifest-based verification for instant cache validation
  • Progress monitoring with real-time statistics
  • Graceful error handling with detailed error reporting
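
To make the exponential backoff item above concrete, here is a minimal sketch of how a retry delay can be derived from the attempt number. It illustrates the general technique only. The function name `backoff_delay`, its parameters, and the example values are assumptions and do not reflect the crate's actual implementation or defaults.

```rust
use std::time::Duration;

/// Hypothetical helper: delay before retry number `attempt` (0-based).
/// The delay doubles with each attempt (`base * 2^attempt`) and is capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate instead of overflowing for very large attempt numbers.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}
```

With a base of 500 ms and a cap of 30 s, attempts 0 through 3 wait 0.5 s, 1 s, 2 s, and 4 s, and every attempt from the seventh onward waits the full 30 s. Production implementations often add random jitter so that many workers do not retry at the same moment.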

§Architecture Overview

The library is organized into several key modules:

  • app - Core application logic including download coordination
  • auth - CEDA authentication and credential management
  • errors - Comprehensive error types and handling
  • prelude - Common imports for convenient usage

§Quick Start

use midas_fetcher::prelude::*;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    // Setup authentication (interactive)
    if !check_credentials() {
        setup_credentials().await?;
    }

    // Create shared components
    let cache = Arc::new(CacheManager::new(CacheConfig::default()).await?);
    let client = Arc::new(CedaClient::new().await?);
    let queue = Arc::new(WorkQueue::new());

    // Load files from manifest
    let files = collect_all_files("manifest.txt", ManifestConfig::default()).await?;
    for file in files.into_iter().take(100) {
        queue.add_work(file).await?;
    }

    // Setup and run downloads
    let config = CoordinatorConfig::default();
    let mut coordinator = Coordinator::new(config, queue, cache, client);
    let result = coordinator.run_downloads().await?;

    println!("Downloaded {} files successfully", result.stats.files_completed);
    Ok(())
}

§For Tauri Integration

This library is designed to work seamlessly with Tauri applications. The core types implement Serialize and Deserialize for easy JSON communication:

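// Example for Tauri integration (requires the tauri and serde_json dependencies)
use midas_fetcher::prelude::*;
use std::sync::Arc;

// Tauri command example: returns the download result serialized as a JSON string.
// The return type is spelled out in full because the prelude's `Result` alias
// may shadow the standard two-parameter `Result`.
// `dataset` is not used in this minimal example.
async fn start_download(dataset: String, workers: usize) -> std::result::Result<String, String> {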
    let cache = Arc::new(CacheManager::new(CacheConfig::default()).await
        .map_err(|e| e.to_string())?);
    let client = Arc::new(CedaClient::new().await
        .map_err(|e| e.to_string())?);
    let queue = Arc::new(WorkQueue::new());

    // Configure downloads
    let config = CoordinatorConfig {
        worker_count: workers,
        ..Default::default()
    };

    let mut coordinator = Coordinator::new(config, queue, cache, client);
    let result = coordinator.run_downloads().await
        .map_err(|e| e.to_string())?;
     
    // Serialize result for Tauri
    serde_json::to_string(&result).map_err(|e| e.to_string())
}

§Progress Monitoring

For applications that need real-time progress updates:

// Example showing the progress monitoring pattern
use midas_fetcher::prelude::*;
use std::sync::Arc;

async fn download_with_progress() -> Result<()> {
    // Setup coordinator
    let cache = Arc::new(CacheManager::new(CacheConfig::default()).await?);
    let client = Arc::new(CedaClient::new().await?);
    let queue = Arc::new(WorkQueue::new());
    let mut coordinator = Coordinator::new(CoordinatorConfig::default(), queue, cache, client);

    // For progress monitoring, poll coordinator.get_stats() periodically
    // from a separate task or in your UI update loop
     
    // Run downloads
    let result = coordinator.run_downloads().await?;
    println!("Downloaded {} files", result.stats.files_completed);
    Ok(())
}
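
A polling loop like the one described in the comments above usually turns raw counters into a completion fraction and an ETA before showing them. The sketch below shows one simple way to do that with the standard library only. `ProgressSnapshot` and its fields are hypothetical stand-ins; the type actually returned by `get_stats()` may look different.

```rust
use std::time::Duration;

/// Hypothetical snapshot of download progress; the crate's real stats type may differ.
#[derive(Debug, Clone, Copy)]
struct ProgressSnapshot {
    files_completed: u64,
    files_total: u64,
    elapsed: Duration,
}

impl ProgressSnapshot {
    /// Fraction of files finished, in the range 0.0..=1.0.
    fn fraction_done(&self) -> f64 {
        if self.files_total == 0 {
            return 1.0;
        }
        (self.files_completed as f64 / self.files_total as f64).min(1.0)
    }

    /// Estimated time remaining, assuming the average rate so far continues.
    /// Returns `None` until at least one file has completed.
    fn eta(&self) -> Option<Duration> {
        if self.files_completed == 0 {
            return None;
        }
        let remaining = self.files_total.saturating_sub(self.files_completed);
        let ratio = remaining as f64 / self.files_completed as f64;
        Some(self.elapsed.mul_f64(ratio))
    }
}
```

This estimate uses the average rate over the whole run. A UI that needs to react quickly to throughput changes might instead average over a recent time window.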

§Error Handling

The library's error types are machine-readable, exposing a category and a recoverability flag, and also carry human-readable messages:

// Example showing error handling patterns
use midas_fetcher::prelude::*;

async fn handle_errors() -> Result<()> {
    match CedaClient::new().await {
        Ok(_client) => {
            println!("Authentication successful");
            Ok(())
        }
        Err(error) => {
            eprintln!("Error occurred: {}", error);
            eprintln!("Error category: {}", error.category());
            eprintln!("Is recoverable: {}", error.is_recoverable());
            Err(error)
        }
    }
}
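
A recoverability flag such as `is_recoverable()` is typically used to decide whether an operation is worth retrying. The following is a minimal, synchronous sketch of that pattern using only the standard library. `DemoError` and `retry_recoverable` are hypothetical and stand in for the crate's own error type and retry logic, which are not shown in this documentation.

```rust
/// Hypothetical error type standing in for the crate's `AppError`.
#[derive(Debug, PartialEq)]
enum DemoError {
    Timeout,
    Forbidden,
}

impl DemoError {
    /// Assumption for this sketch: timeouts are transient, permission errors are not.
    fn is_recoverable(&self) -> bool {
        matches!(self, DemoError::Timeout)
    }
}

/// Runs `op` up to `max_attempts` times, retrying only while the error
/// is reported as recoverable. Returns the first success or the last error.
fn retry_recoverable<T, E>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, E>,
    is_recoverable: impl Fn(&E) -> bool,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Retry only if attempts remain and the error is transient.
            Err(e) if attempt < max_attempts && is_recoverable(&e) => attempt += 1,
            Err(e) => return Err(e),
        }
    }
}
```

In a real downloader each retry would normally be preceded by a delay, and an unrecoverable error such as a rejected login would be surfaced to the user immediately.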

Re-exports§

pub use errors::AppError;
pub use errors::Result;
pub use app::collect_all_files;
pub use app::collect_datasets_and_years;
pub use app::filter_manifest_files;
pub use app::CacheConfig;
pub use app::CacheManager;
pub use app::CacheStats;
pub use app::CedaClient;
pub use app::ClientConfig;
pub use app::Coordinator;
pub use app::CoordinatorConfig;
pub use app::DatasetFileInfo;
pub use app::DownloadStats;
pub use app::FileInfo;
pub use app::ManifestConfig;
pub use app::ManifestStreamer;
pub use app::Md5Hash;
pub use app::QualityControlVersion;
pub use app::QueueStats;
pub use app::SessionResult;
pub use app::WorkQueue;
pub use app::WorkQueueConfig;
pub use app::WorkerConfig;
pub use auth::check_credentials;
pub use auth::get_auth_status;
pub use auth::setup_credentials;
pub use auth::verify_credentials;
pub use auth::AuthStatus;
pub use config::AppConfig;
pub use config::LoggingConfig;
pub use constants::DEFAULT_RATE_LIMIT_RPS;
pub use constants::DEFAULT_WORKER_COUNT;
pub use constants::ENV_PASSWORD;
pub use constants::ENV_USERNAME;
pub use constants::USER_AGENT;

Modules§

app
Core application logic for MIDAS Fetcher
auth
Authentication management for CEDA credentials
cli
Command-line interface components
config
Configuration management for MIDAS Fetcher
constants
Application constants for MIDAS Fetcher
errors
Error types for MIDAS Fetcher
prelude
Prelude module for MIDAS Fetcher Library

Constants§

DESCRIPTION
Library description
NAME
Library name
VERSION
Library version information