oxigdal-cloud 0.1.2

Advanced cloud storage backends for OxiGDAL - Pure Rust cloud integration
Documentation

oxigdal-cloud

Crates.io Documentation License

Advanced cloud storage backends for OxiGDAL - 100% Pure Rust cloud integration with multi-cloud abstraction, intelligent caching, prefetching, and comprehensive retry logic for seamless geospatial data access.

Features

  • Multi-Cloud Providers: AWS S3, Azure Blob Storage, Google Cloud Storage with unified API
  • HTTP/HTTPS Backend: Enhanced HTTP/HTTPS support with authentication and retry mechanisms
  • Unified Abstraction: Single API for accessing data across different cloud providers
  • Advanced Caching: Multi-level cache with memory and disk tiers, LRU+LFU eviction, compression
  • Intelligent Prefetching: Predictive prefetch with access pattern analysis and bandwidth management
  • Robust Retry Logic: Exponential backoff with jitter, circuit breaker, and retry budgets
  • Comprehensive Authentication: OAuth 2.0, service accounts, API keys, SAS tokens, IAM roles
  • Performance Optimized: Zero-copy operations, efficient streaming, bandwidth throttling
  • 100% Pure Rust: No C/Fortran dependencies, everything written in safe Rust

Installation

Add to your Cargo.toml:

[dependencies]
oxigdal-cloud = { version = "0.1", features = ["s3", "cache", "retry"] }

Feature Flags

Feature Description Dependencies
s3 AWS S3 support aws-sdk-s3
azure-blob Azure Blob Storage support azure_storage, azure_identity
gcs Google Cloud Storage support google-cloud-storage
http HTTP/HTTPS backend reqwest
cache Multi-level caching lru, dashmap, flate2
prefetch Intelligent prefetching cache + async
retry Retry logic with backoff async
oauth2 OAuth 2.0 authentication oauth2
async Async/await support tokio, async-trait
std Standard library support (default) -

Quick Start

AWS S3

use oxigdal_cloud::backends::{S3Backend, CloudStorageBackend};

#[tokio::main]
async fn main() -> oxigdal_cloud::Result<()> {
    // Create S3 backend
    let backend = S3Backend::new("my-bucket", "data/zarr")
        .with_region("us-west-2");

    // Get object
    let data = backend.get("file.tif").await?;
    println!("Downloaded {} bytes", data.len());

    // Put object
    backend.put("output.tif", &data).await?;

    Ok(())
}

Multi-Cloud Abstraction

use oxigdal_cloud::CloudBackend;

#[tokio::main]
async fn main() -> oxigdal_cloud::Result<()> {
    // Parse URL and automatically create appropriate backend
    let backend = CloudBackend::from_url("s3://my-bucket/data/file.tif")?;
    let data = backend.get().await?;

    println!("Data size: {} bytes", data.len());

    Ok(())
}

Supported URL formats:

  • s3://bucket/key - AWS S3
  • az://account@container/blob - Azure Blob Storage
  • gs://bucket/object - Google Cloud Storage
  • http://example.com/path or https://example.com/path - HTTP/HTTPS

Advanced Caching

use oxigdal_cloud::cache::{CacheConfig, MultiLevelCache};
use bytes::Bytes;

#[tokio::main]
async fn main() -> oxigdal_cloud::Result<()> {
    // Configure multi-level cache
    let config = CacheConfig::new()
        .with_max_memory_size(100 * 1024 * 1024) // 100 MB
        .with_cache_dir("/tmp/oxigdal-cache")
        .with_compression_threshold(1024 * 1024); // Compress > 1MB

    let cache = MultiLevelCache::new(config)?;

    // Cache data
    cache.put("key".to_string(), Bytes::from("data")).await?;

    // Retrieve from cache (memory tier first, then disk)
    if let Some(data) = cache.get(&"key".to_string()).await? {
        println!("Retrieved from cache: {} bytes", data.len());
    }

    Ok(())
}

Intelligent Prefetching

use oxigdal_cloud::prefetch::PrefetchConfig;

#[tokio::main]
async fn main() -> oxigdal_cloud::Result<()> {
    let config = PrefetchConfig::new()
        .with_max_prefetch_size(10 * 1024 * 1024) // 10 MB
        .with_prediction_window(100) // Analyze last 100 accesses
        .with_bandwidth_limit(50 * 1024 * 1024); // 50 MB/s max

    // Prefetch configuration is applied to backends automatically

    Ok(())
}

Retry Configuration

use oxigdal_cloud::retry::{RetryConfig, RetryStrategy};
use std::time::Duration;

fn configure_retry() -> RetryConfig {
    RetryConfig::new()
        .with_strategy(RetryStrategy::ExponentialBackoff {
            initial_delay: Duration::from_millis(100),
            max_delay: Duration::from_secs(30),
            multiplier: 2.0,
        })
        .with_max_retries(5)
        .with_jitter(0.1)
}

Usage

Basic Backend Operations

use oxigdal_cloud::backends::{S3Backend, CloudStorageBackend};

#[tokio::main]
async fn main() -> oxigdal_cloud::Result<()> {
    let backend = S3Backend::new("bucket", "prefix");

    // Get object
    let data = backend.get("file.tif").await?;

    // Put object
    backend.put("output.tif", &data).await?;

    // Delete object
    backend.delete("temp.tmp").await?;

    // Check existence
    let exists = backend.exists("file.tif").await?;
    println!("File exists: {}", exists);

    // List objects with prefix
    let objects = backend.list_prefix("data/").await?;
    for obj in objects {
        println!("Object: {}", obj);
    }

    Ok(())
}

Azure Blob Storage

use oxigdal_cloud::backends::{AzureBlobBackend, CloudStorageBackend};

#[tokio::main]
async fn main() -> oxigdal_cloud::Result<()> {
    let backend = AzureBlobBackend::new("myaccount", "mycontainer")
        .with_prefix("data/blobs")
        .with_sas_token("?sv=2020-08-04&ss=b");

    let data = backend.get("file.tif").await?;

    Ok(())
}

Google Cloud Storage

use oxigdal_cloud::backends::{GcsBackend, CloudStorageBackend};

#[tokio::main]
async fn main() -> oxigdal_cloud::Result<()> {
    let backend = GcsBackend::new("my-bucket")
        .with_prefix("data/objects")
        .with_project_id("my-project");

    let data = backend.get("file.tif").await?;

    Ok(())
}

HTTP/HTTPS Backend

use oxigdal_cloud::backends::http::{HttpBackend, HttpAuth};

#[tokio::main]
async fn main() -> oxigdal_cloud::Result<()> {
    let backend = HttpBackend::new("https://example.com/data")
        .with_auth(HttpAuth::Bearer {
            token: "token".to_string(),
        })
        .with_header("X-Custom-Header", "value");

    let data = backend.get("file.tif").await?;

    Ok(())
}

Error Handling

The crate follows the "no unwrap" policy. All operations return Result<T, CloudError>:

use oxigdal_cloud::{CloudBackend, CloudError};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    match CloudBackend::from_url("s3://bucket/file.tif") {
        Ok(backend) => {
            match backend.get().await {
                Ok(data) => println!("Downloaded {} bytes", data.len()),
                Err(CloudError::NotFound { key }) => println!("File not found: {}", key),
                Err(CloudError::Timeout { .. }) => println!("Operation timed out"),
                Err(e) => println!("Error: {}", e),
            }
        }
        Err(CloudError::InvalidUrl { url }) => println!("Invalid URL: {}", url),
        Err(e) => println!("Error: {}", e),
    }

    Ok(())
}

API Overview

Core Modules

Module Description
backends Cloud storage backend implementations (S3, Azure, GCS, HTTP)
auth Authentication providers (OAuth2, service accounts, API keys)
cache Multi-level caching with memory and disk tiers
prefetch Intelligent prefetching with pattern analysis
retry Retry strategies with exponential backoff
error Comprehensive error types

Key Types

CloudBackend

Unified abstraction over multiple cloud providers:

  • CloudBackend::from_url() - Create from URL string
  • backend.get() - Download object
  • backend.put() - Upload object
  • backend.exists() - Check object existence

CloudStorageBackend Trait

Implemented by all backend types:

  • get(key) - Retrieve object
  • put(key, data) - Store object
  • delete(key) - Remove object
  • exists(key) - Check existence
  • list_prefix(prefix) - List objects

CacheConfig

Configure multi-level cache:

  • with_max_memory_size() - Memory tier size
  • with_cache_dir() - Disk cache location
  • with_compression_threshold() - Compression trigger size

RetryConfig

Configure retry behavior:

  • with_strategy() - Retry strategy (exponential backoff, etc.)
  • with_max_retries() - Maximum retry attempts
  • with_jitter() - Jitter factor for backoff

Performance

Performance characteristics on modern hardware (AWS EC2 m7i.xlarge, 100 Mbps network):

Operation Time Throughput
S3 GET (1 MB) ~50ms 20 MB/s
S3 PUT (1 MB) ~60ms 16.7 MB/s
Azure GET (1 MB) ~55ms 18.2 MB/s
Cache HIT (1 MB) <1ms >1 GB/s
Cache MISS (1 MB) ~50ms 20 MB/s

With prefetching enabled:

  • Sequential reads: 50-80% faster
  • Random access: 20-40% improvement
  • Cache hit rate: 60-85% typical

Examples

See the tests directory for integration test examples:

  • S3 Backend: Basic creation and configuration
  • Azure Backend: Container and account setup
  • GCS Backend: Bucket and project configuration
  • HTTP Backend: URL and authentication setup
  • URL Parsing: Multi-cloud abstraction usage

Run tests with:

# All tests
cargo test --all-features

# Specific feature tests
cargo test --features s3,cache,retry

Documentation

Full documentation is available at docs.rs.

For OxiGDAL integration, see the main repository.

Error Handling

This library follows the "no unwrap" policy. All fallible operations return Result<T, E> with descriptive error types:

pub enum CloudError {
    Io(IoError),
    S3(S3Error),
    Azure(AzureError),
    Gcs(GcsError),
    Http(HttpError),
    Auth(AuthError),
    Retry(RetryError),
    Cache(CacheError),
    InvalidUrl { url: String },
    UnsupportedProtocol { protocol: String },
    NotFound { key: String },
    PermissionDenied { message: String },
    Timeout { message: String },
    RateLimitExceeded { message: String },
    InvalidConfiguration { message: String },
    NotSupported { operation: String },
}

Security Considerations

  • Credentials Management: Use environment variables or credential files rather than hardcoding
  • HTTPS Only: HTTP backend strongly recommended for production over HTTPS
  • SAS/Access Tokens: Azure SAS tokens and API keys should be stored securely
  • IAM Roles: Prefer IAM roles over hardcoded credentials on cloud VMs
  • Encryption: Data in transit uses TLS 1.3, at-rest encryption depends on cloud provider settings

Contributing

Contributions are welcome! Please ensure:

  • No unwrap() calls in production code
  • 100% Pure Rust (no C/Fortran dependencies)
  • Comprehensive error handling
  • Tests for new features
  • Documentation with examples

License

This project is licensed under Apache-2.0.

Related Projects

  • OxiGDAL - Geospatial data abstraction layer
  • OxiBLAS - Pure Rust BLAS operations
  • OxiFFT - Pure Rust FFT library
  • SciRS2 - Scientific computing ecosystem

Part of the COOLJAPAN ecosystem - Pure Rust geospatial and scientific computing.