OxiGDAL Cluster - Distributed geospatial processing at scale.
This crate provides a comprehensive distributed computing framework for geospatial data processing, including:
- Distributed Scheduler: Work-stealing scheduler with priority queues and dynamic load balancing
- Task Graph Engine: DAG-based task dependencies with parallel execution planning (see the sketch after this list)
- Worker Pool Management: Worker registration, health monitoring, and automatic failover
- Data Locality Optimization: Minimize data transfer through intelligent task placement
- Fault Tolerance: Task retry, speculative execution, and checkpointing
- Distributed Cache: Cache coherency with compression and distributed LRU
- Data Replication: Quorum-based reads/writes with automatic re-replication
- Cluster Coordinator: Raft-based consensus and leader election
- Advanced Scheduling: Gang scheduling, fair-share, deadline-based, and multi-queue scheduling
- Resource Management: Quota management, resource reservation, and usage accounting
- Network Optimization: Topology-aware scheduling, bandwidth tracking, and congestion control
- Workflow Engine: Complex workflow orchestration with conditional execution and loops
- Autoscaling: Dynamic cluster sizing based on load metrics and predictive scaling
- Monitoring & Alerting: Real-time metrics, custom alerts, and anomaly detection
- Security & Access Control: Authentication, RBAC, audit logging, and secret management
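
As an illustration of the task-graph engine, the sketch below wires two independent ingest tasks into a downstream merge. TaskGraph and Task are re-exported by this crate (see Re-exports below), but the constructor and method names used here (TaskGraph::new, Task::new, add_task, add_dependency) are illustrative assumptions rather than the confirmed API.

use oxigdal_cluster::{Task, TaskGraph};

fn build_mosaic_graph() -> TaskGraph {
    // Assumed constructors and methods throughout; for illustration only.
    let mut graph = TaskGraph::new();

    // Two independent ingest tasks that can run in parallel.
    let ingest_a = graph.add_task(Task::new("ingest-tile-a"));
    let ingest_b = graph.add_task(Task::new("ingest-tile-b"));

    // A merge task that consumes both ingest outputs.
    let merge = graph.add_task(Task::new("merge-mosaic"));

    // The scheduler may only start the merge once both ingests finish.
    graph.add_dependency(merge, ingest_a);
    graph.add_dependency(merge, ingest_b);

    graph
}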
§Examples
§Basic Cluster Setup
use oxigdal_cluster::Cluster;

#[tokio::main]
async fn main() -> oxigdal_cluster::Result<()> {
    // Create a cluster with default configuration
    let cluster = Cluster::new();

    // Start the cluster
    cluster.start().await?;

    // Get cluster statistics
    let stats = cluster.get_statistics();
    println!("Active workers: {}", stats.worker_pool.total_workers);

    // Stop the cluster
    cluster.stop().await?;

    Ok(())
}

§Custom Cluster Configuration
use oxigdal_cluster::{ClusterBuilder, SchedulerConfig, LoadBalanceStrategy};
use oxigdal_cluster::worker_pool::WorkerPoolConfig;
use std::time::Duration;

#[tokio::main]
async fn main() -> oxigdal_cluster::Result<()> {
    // Configure the scheduler
    let scheduler_config = SchedulerConfig {
        max_queue_size: 10000,
        work_steal_threshold: 10,
        scheduling_interval: Duration::from_millis(100),
        load_balance_strategy: LoadBalanceStrategy::LeastLoaded,
        task_timeout: Duration::from_secs(300),
        enable_work_stealing: true,
        enable_backpressure: true,
        max_concurrent_tasks_per_worker: 1000,
    };

    // Configure the worker pool
    let worker_config = WorkerPoolConfig {
        heartbeat_timeout: Duration::from_secs(30),
        health_check_interval: Duration::from_secs(10),
        max_unhealthy_duration: Duration::from_secs(120),
        min_workers: 1,
        max_workers: 100,
    };

    // Build the cluster with custom configuration
    let cluster = ClusterBuilder::new()
        .with_scheduler_config(scheduler_config)
        .with_worker_pool_config(worker_config)
        .build();

    cluster.start().await?;

    // Your cluster operations here

    cluster.stop().await?;

    Ok(())
}
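§Guarding Worker Calls with a Circuit Breaker

The fault_tolerance module re-exports CircuitBreaker and CircuitBreakerConfig. The sketch below shows the usual circuit-breaker pattern around a remote worker call; the constructor and method names used here (CircuitBreaker::new, allow_request, record_success, record_failure) and the Default construction of the config are assumptions for illustration, not confirmed signatures from this crate.

use oxigdal_cluster::{CircuitBreaker, CircuitBreakerConfig};

// Hypothetical helper: run a remote call under a circuit breaker.
async fn guarded_call(breaker: &CircuitBreaker) {
    // Assumed method: ask the breaker whether the circuit is closed.
    if !breaker.allow_request() {
        // Circuit is open: fail fast instead of piling more load onto
        // a worker that is already failing.
        eprintln!("circuit open, skipping call");
        return;
    }

    // Assumed methods: report the outcome so the breaker can trip or reset.
    match remote_call().await {
        Ok(()) => breaker.record_success(),
        Err(_) => breaker.record_failure(),
    }
}

// Stand-in for a real RPC to a worker.
async fn remote_call() -> Result<(), std::io::Error> {
    Ok(())
}

#[tokio::main]
async fn main() {
    // Assumed constructor; Default for the config is also an assumption.
    let breaker = CircuitBreaker::new(CircuitBreakerConfig::default());
    guarded_call(&breaker).await;
}
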
Re-exports§
pub use autoscale::AutoscaleConfig;
pub use autoscale::AutoscaleStats;
pub use autoscale::Autoscaler;
pub use autoscale::MetricsSnapshot as AutoscaleMetrics;
pub use autoscale::ScaleDecision;
pub use cache_coherency::CacheConfig;
pub use cache_coherency::CacheKey;
pub use cache_coherency::DistributedCache;
pub use coordinator::ClusterCoordinator;
pub use coordinator::CoordinatorConfig;
pub use coordinator::NodeId;
pub use coordinator::NodeRole;
pub use data_locality::DataLocalityOptimizer;
pub use data_locality::LocalityConfig;
pub use data_locality::PlacementRecommendation;
pub use error::ClusterError;
pub use error::Result;
pub use fault_tolerance::Bulkhead;
pub use fault_tolerance::BulkheadConfig;
pub use fault_tolerance::BulkheadRegistry;
pub use fault_tolerance::BulkheadStats;
pub use fault_tolerance::CircuitBreaker;
pub use fault_tolerance::CircuitBreakerConfig;
pub use fault_tolerance::CircuitBreakerRegistry;
pub use fault_tolerance::CircuitBreakerStats;
pub use fault_tolerance::CircuitState;
pub use fault_tolerance::Deadline;
pub use fault_tolerance::DegradationConfig;
pub use fault_tolerance::DegradationLevel;
pub use fault_tolerance::DegradationManager;
pub use fault_tolerance::DegradationStats;
pub use fault_tolerance::FaultToleranceConfig;
pub use fault_tolerance::FaultToleranceManager;
pub use fault_tolerance::FaultToleranceStatistics;
pub use fault_tolerance::HealthCheck;
pub use fault_tolerance::HealthCheckConfig;
pub use fault_tolerance::HealthCheckManager;
pub use fault_tolerance::HealthCheckResult;
pub use fault_tolerance::HealthCheckStats;
pub use fault_tolerance::HealthStatus;
pub use fault_tolerance::RequestPriority;
pub use fault_tolerance::RetryDecision;
pub use fault_tolerance::TimeoutBudget;
pub use fault_tolerance::TimeoutConfig;
pub use fault_tolerance::TimeoutManager;
pub use fault_tolerance::TimeoutRegistry;
pub use fault_tolerance::TimeoutStats;
pub use metrics::ClusterMetrics;
pub use metrics::MetricsSnapshot;
pub use metrics::WorkerMetrics;
pub use monitoring::Alert;
pub use monitoring::AlertRule;
pub use monitoring::AlertSeverity;
pub use monitoring::MonitoringManager;
pub use monitoring::MonitoringStats;
pub use network::BandwidthTracker;
pub use network::CompressionManager;
pub use network::CongestionController;
pub use network::TopologyManager;
pub use replication::ReplicaSet;
pub use replication::ReplicationConfig;
pub use replication::ReplicationManager;
pub use resources::AccountingManager;
pub use resources::QuotaManager;
pub use resources::ReservationManager;
pub use scheduler::LoadBalanceStrategy;
pub use scheduler::Scheduler;
pub use scheduler::SchedulerConfig;
pub use scheduler::SchedulerStats;
pub use security::Role;
pub use security::SecurityManager;
pub use security::SecurityStats;
pub use security::User;
pub use task_graph::ExecutionPlan;
pub use task_graph::ResourceRequirements;
pub use task_graph::Task;
pub use task_graph::TaskGraph;
pub use task_graph::TaskId;
pub use task_graph::TaskStatus;
pub use worker_pool::SelectionStrategy;
pub use worker_pool::Worker;
pub use worker_pool::WorkerCapabilities;
pub use worker_pool::WorkerCapacity;
pub use worker_pool::WorkerId;
pub use worker_pool::WorkerPool;
pub use worker_pool::WorkerStatus;
pub use worker_pool::WorkerUsage;
pub use workflow::Workflow;
pub use workflow::WorkflowEngine;
pub use workflow::WorkflowExecution;
pub use workflow::WorkflowStatus;
Modules§
- autoscale - Cluster autoscaling for dynamic resource management.
- cache_coherency - Distributed cache with coherency protocol.
- coordinator - Cluster coordinator with leader election and membership management.
- data_locality - Data locality optimization for minimizing data transfer.
- error - Error types for the oxigdal-cluster crate.
- fault_tolerance - Advanced fault tolerance system for handling failures and recovery.
- metrics - Metrics collection and monitoring for the cluster.
- monitoring - Advanced monitoring and alerting for cluster management.
- network - Network optimization for distributed computing.
- replication - Data replication for reliability and availability.
- resources - Resource management modules for quota, reservation, and accounting.
- scheduler - Distributed task scheduler modules.
- security - Security and access control for cluster operations.
- task_graph - Task graph engine for managing task dependencies and execution order.
- worker_pool - Worker pool management for the cluster.
- workflow - Workflow orchestration engine for complex distributed tasks.
Structs§
- Cluster - Complete cluster instance.
- ClusterBuilder - Cluster builder for easy setup.
- ClusterStatistics - Cluster-wide statistics.
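
A minimal way to watch a running cluster is to poll Cluster::get_statistics() on a timer. The sketch below uses only the API shown in the examples above plus plain tokio; worker_pool.total_workers is the one ClusterStatistics field taken from the first example, and no other fields are assumed.

use std::time::Duration;
use oxigdal_cluster::Cluster;

#[tokio::main]
async fn main() -> oxigdal_cluster::Result<()> {
    let cluster = Cluster::new();
    cluster.start().await?;

    // Sample cluster-wide statistics every 10 seconds.
    let mut ticker = tokio::time::interval(Duration::from_secs(10));
    for _ in 0..3 {
        ticker.tick().await;
        let stats = cluster.get_statistics();
        println!("active workers: {}", stats.worker_pool.total_workers);
    }

    cluster.stop().await?;
    Ok(())
}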