oxigdal-cache-advanced 0.1.3

Advanced multi-tier caching with predictive prefetching and ML-based optimization for OxiGDAL

OxiGDAL Cache Advanced

Advanced multi-tier caching system for OxiGDAL with ML-powered predictive prefetching, adaptive compression, and distributed cache support. It learns access patterns to prefetch data and automatically promotes and demotes entries across memory, SSD, and network tiers. The benchmarks in the Performance section report hit rates of 88-92% with the Transformer model, against a 65-70% baseline without prefetching.

Features

  • Multi-Tier Architecture: Automatic data promotion/demotion across L1 (memory), L2 (SSD), and L3 (network/disk)
  • Predictive Prefetching: ML-based access pattern learning including Markov chains, neural networks, and Transformer models
  • Adaptive Compression: Intelligent compression selection (LZ4, Zstd, Snappy) based on data types and patterns
  • Advanced Eviction Policies: LRU, ARC, and W-TinyLFU eviction strategies per tier
  • Cache Coherency: Multi-node cache coherency protocols with write-through and write-back policies
  • Analytics & Observability: Detailed statistics, hit rate tracking, and distributed tracing support
  • Async-First Design: Built with Tokio for high-performance non-blocking operations
  • Pure Rust: 100% Pure Rust implementation with no C/Fortran dependencies

Installation

Add to your Cargo.toml:

[dependencies]
oxigdal-cache-advanced = "0.1"
bytes = "1"
tokio = { version = "1", features = ["full"] }

Quick Start

use oxigdal_cache_advanced::{
    CacheConfig, MultiTierCache,
    compression::DataType,
};
use bytes::Bytes;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a cache with an explicit configuration (these values match the defaults)
    let config = CacheConfig {
        l1_size: 128 * 1024 * 1024,        // 128 MB
        l2_size: 1024 * 1024 * 1024,       // 1 GB
        l3_size: 10 * 1024 * 1024 * 1024,  // 10 GB
        enable_compression: true,
        enable_prefetch: true,
        enable_distributed: false,
        cache_dir: None,
    };

    let cache = MultiTierCache::new(config).await?;

    // Store data in cache
    let key = "my_data".to_string();
    let data = Bytes::from("important cached data");
    cache.put(&key, data, DataType::Text).await?;

    // Retrieve from cache (automatic tier promotion on access)
    if let Some(value) = cache.get(&key).await? {
        println!("Cache hit: {:?}", value.data);
    }

    // Get cache statistics
    let stats = cache.stats().await;
    println!("Hit rate: {:.2}%", stats.hit_rate());
    println!("Items: {}", stats.item_count);

    Ok(())
}

Usage

Basic Cache Operations

use oxigdal_cache_advanced::{MultiTierCache, CacheConfig, compression::DataType};
use bytes::Bytes;

let cache = MultiTierCache::new(CacheConfig::default()).await?;
let key = "key1".to_string();

// Put data into cache
cache.put(&key, Bytes::from("data1"), DataType::Binary).await?;

// Get data from cache (a hit in a lower tier promotes the entry to a higher one)
let value = cache.get(&key).await?;

// Check if key exists (doesn't update access statistics)
let exists = cache.contains(&key).await;

// Remove from cache
cache.delete(&key).await?;

// Get aggregated statistics across all tiers
let stats = cache.stats().await;
println!("Hits: {}, Misses: {}, Hit Rate: {:.2}%",
    stats.hits, stats.misses, stats.hit_rate());
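
The promotion-on-access behavior described above can be pictured with a small standalone sketch. This is not the crate's implementation: `TwoTierCache`, its two `HashMap` tiers, and the "demote an arbitrary entry" rule are assumptions made purely for illustration, using only the standard library.

```rust
use std::collections::HashMap;

/// Two-tier cache sketch: `l1` stands in for memory, `l2` for a slower tier.
/// Hypothetical type for illustration; not part of oxigdal-cache-advanced.
pub struct TwoTierCache {
    l1_capacity: usize,
    l1: HashMap<String, Vec<u8>>,
    l2: HashMap<String, Vec<u8>>,
}

impl TwoTierCache {
    pub fn new(l1_capacity: usize) -> Self {
        assert!(l1_capacity > 0, "L1 must hold at least one entry");
        Self { l1_capacity, l1: HashMap::new(), l2: HashMap::new() }
    }

    /// New entries start in the slower tier (an assumption of this sketch).
    pub fn put(&mut self, key: &str, value: Vec<u8>) {
        self.l1.remove(key);
        self.l2.insert(key.to_string(), value);
    }

    /// An L2 hit promotes the entry to L1; if L1 is full, one L1 entry is demoted to L2.
    pub fn get(&mut self, key: &str) -> Option<Vec<u8>> {
        if let Some(v) = self.l1.get(key) {
            return Some(v.clone());
        }
        let value = self.l2.remove(key)?;
        if self.l1.len() >= self.l1_capacity {
            // A real cache would pick the victim through its eviction policy.
            if let Some(victim) = self.l1.keys().next().cloned() {
                if let Some(v) = self.l1.remove(&victim) {
                    self.l2.insert(victim, v);
                }
            }
        }
        self.l1.insert(key.to_string(), value.clone());
        Some(value)
    }

    pub fn in_l1(&self, key: &str) -> bool {
        self.l1.contains_key(key)
    }

    pub fn in_l2(&self, key: &str) -> bool {
        self.l2.contains_key(key)
    }
}
```

With an L1 capacity of 1, reading "a" and then "b" leaves "b" in L1 and pushes "a" back to L2, which is the promotion/demotion cycle in miniature.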

Predictive Prefetching

use oxigdal_cache_advanced::predictive::{MarkovPredictor, Prediction, AccessRecord, AccessType};
use chrono::Utc;

// Create predictor with Markov chain order 2
let mut predictor = MarkovPredictor::new(2);

// Record access patterns
let records = vec![
    AccessRecord {
        key: "tile_0".to_string(),
        timestamp: Utc::now(),
        access_type: AccessType::Read,
    },
    AccessRecord {
        key: "tile_1".to_string(),
        timestamp: Utc::now(),
        access_type: AccessType::Read,
    },
];

predictor.learn_from_records(&records);

// Make predictions with confidence scores
let predictions = predictor.predict("tile_1", 5, 0.6)?;
for pred in predictions {
    if pred.is_confident(0.6) {
        println!("Predict access to {} with {:.2}% confidence",
            pred.key, pred.confidence * 100.0);
    }
}
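
To make the idea behind Markov-chain prediction concrete, here is a minimal first-order sketch using only the standard library. It is not the crate's `MarkovPredictor` (which is created with order 2 above): `FirstOrderMarkov`, `learn`, and this `predict` signature are hypothetical names chosen for the example.

```rust
use std::collections::HashMap;

/// First-order Markov predictor sketch (hypothetical type, not the crate's API).
#[derive(Default)]
pub struct FirstOrderMarkov {
    // transitions[a][b] = how many times an access to `b` directly followed `a`.
    transitions: HashMap<String, HashMap<String, u32>>,
}

impl FirstOrderMarkov {
    /// Counts every consecutive pair in the access sequence.
    pub fn learn(&mut self, sequence: &[&str]) {
        for pair in sequence.windows(2) {
            *self
                .transitions
                .entry(pair[0].to_string())
                .or_default()
                .entry(pair[1].to_string())
                .or_insert(0) += 1;
        }
    }

    /// Returns up to `top_n` successors of `current` whose estimated
    /// probability is at least `min_confidence`, most likely first.
    pub fn predict(&self, current: &str, top_n: usize, min_confidence: f64) -> Vec<(String, f64)> {
        let Some(next) = self.transitions.get(current) else {
            return Vec::new();
        };
        let total: u32 = next.values().sum();
        let mut out: Vec<(String, f64)> = next
            .iter()
            .map(|(k, &c)| (k.clone(), f64::from(c) / f64::from(total)))
            .filter(|(_, p)| *p >= min_confidence)
            .collect();
        // Highest probability first; ties broken by key for stable output.
        out.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        out.truncate(top_n);
        out
    }
}
```

If "tile_1" was followed by "tile_2" twice and by "tile_3" once, a threshold of 0.6 keeps only "tile_2" (probability 2/3). A higher-order chain would key the table on the last several accesses instead of one.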

Adaptive Compression

use oxigdal_cache_advanced::compression::{AdaptiveCompressor, DataType};
use bytes::Bytes;

let compressor = AdaptiveCompressor::new();

// Compress data with adaptive algorithm selection
let data = Bytes::from("repetitive data...".repeat(100));
let compressed = compressor.compress(&data, DataType::Text)?;

// Compressed size as a percentage of the original (lower is better)
println!("Compressed size: {:.2}% of original",
    (compressed.compressed_size as f64 / data.len() as f64) * 100.0);

// Decompress transparently
let decompressed = compressor.decompress(&compressed)?;
assert_eq!(decompressed, data);
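
How the crate chooses an algorithm is not documented here. As one plausible illustration of "adaptive" selection, the sketch below estimates the byte-level Shannon entropy of a buffer and picks a strategy from it. The `Choice` enum, the `choose` function, and the 7.5 and 5.0 thresholds are assumptions for the example, not values taken from the library.

```rust
/// Shannon entropy of a byte slice, in bits per byte (0.0 to 8.0).
pub fn byte_entropy(data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    let mut counts = [0usize; 256];
    for &b in data {
        counts[b as usize] += 1;
    }
    let n = data.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical strategy classes: skip compression, use a fast codec
/// (LZ4-like), or use a stronger codec (Zstd-like).
#[derive(Debug, PartialEq, Eq)]
pub enum Choice {
    Skip,
    Fast,
    Strong,
}

/// Picks a strategy from the entropy estimate. Thresholds are illustrative.
pub fn choose(data: &[u8]) -> Choice {
    let h = byte_entropy(data);
    if h > 7.5 {
        Choice::Skip // looks random or already compressed
    } else if h > 5.0 {
        Choice::Fast
    } else {
        Choice::Strong
    }
}
```

Byte-level entropy ignores repeated substrings, so a production heuristic would likely combine it with the declared `DataType` or a trial compression of a small sample.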

Cache Warming

use oxigdal_cache_advanced::{CacheConfig, MultiTierCache};
use oxigdal_cache_advanced::warming::CacheWarmer;

let cache = MultiTierCache::new(CacheConfig::default()).await?;
let warmer = CacheWarmer::new(cache.clone());

// Pre-load frequently accessed data into cache
let keys_to_warm: Vec<String> = vec!["hot_data_1".into(), "hot_data_2".into()];
warmer.warm_keys(&keys_to_warm, None).await?;

println!("Cache warming complete");

Multi-Tier Statistics

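use oxigdal_cache_advanced::{CacheConfig, MultiTierCache};
// `CacheTierType` must also be in scope; check the API docs for its module path.

let cache = MultiTierCache::new(CacheConfig::default()).await?;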

// Per-tier statistics
let l1_stats = cache.tier_stats(CacheTierType::L1).await?;
let l2_stats = cache.tier_stats(CacheTierType::L2).await?;
let l3_stats = cache.tier_stats(CacheTierType::L3).await?;

println!("L1 (Memory) - Hits: {}, Misses: {}", l1_stats.hits, l1_stats.misses);
println!("L2 (SSD) - Size: {} bytes", l2_stats.bytes_stored);
println!("L3 (Network) - Items: {}", l3_stats.item_count);

Cache Coherency

use oxigdal_cache_advanced::coherency::{CacheCoherencyManager, CoherencyProtocol};

let coherency = CacheCoherencyManager::new(CoherencyProtocol::MESI)?;

// Update data with coherency guarantee
// (`updated_data` is the new value for the key, prepared by the caller)
coherency.update(&"shared_key".into(), updated_data, None).await?;

// Invalidate across all nodes
coherency.invalidate(&"shared_key".into()).await?;

// Check coherency status
let status = coherency.status(&"shared_key".into()).await?;
println!("Data is {} consistent", if status.consistent { "strongly" } else { "weakly" });
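
For readers unfamiliar with MESI, the textbook state machine behind the protocol name can be sketched in a few lines. This shows the standard per-entry transitions only; the `Mesi` and `Event` enums and `next_state` are hypothetical names, and the crate's `CacheCoherencyManager` may model things differently.

```rust
/// The four MESI states of a cached entry on one node.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mesi {
    Modified,
    Exclusive,
    Shared,
    Invalid,
}

/// Events seen by one node: its own reads/writes and those of other nodes.
#[derive(Debug, Clone, Copy)]
pub enum Event {
    LocalRead { others_have_copy: bool },
    LocalWrite,
    RemoteRead,
    RemoteWrite,
}

/// Standard MESI transition for a single entry on a single node.
pub fn next_state(state: Mesi, event: Event) -> Mesi {
    use Event::*;
    use Mesi::*;
    match (state, event) {
        // A read miss loads the entry: Shared if another node holds it, else Exclusive.
        (Invalid, LocalRead { others_have_copy: true }) => Shared,
        (Invalid, LocalRead { others_have_copy: false }) => Exclusive,
        // A read hit never changes the state.
        (s, LocalRead { .. }) => s,
        // Any local write leaves this node with the only valid, dirty copy.
        (_, LocalWrite) => Modified,
        // Another node reading forces a valid copy down to Shared
        // (a Modified copy is written back first).
        (Invalid, RemoteRead) => Invalid,
        (_, RemoteRead) => Shared,
        // Another node writing invalidates the local copy.
        (_, RemoteWrite) => Invalid,
    }
}
```

Read this way, an invalidation across nodes corresponds to every other node receiving a `RemoteWrite` event for the key.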

Advanced Prediction Models

use oxigdal_cache_advanced::predictive::advanced::{NeuralNetworkPredictor, Embedding};

// Initialize neural network with embedding dimension 64
let nn_predictor = NeuralNetworkPredictor::new(1000, 64)?;

// Train on access patterns
let embeddings = vec![
    Embedding::random(64),
    Embedding::random(64),
];
nn_predictor.update_embeddings(&["key1".into(), "key2".into()], &embeddings)?;

// Get predictions from neural network
let predictions = nn_predictor.predict(&"key1".into(), 5)?;
for pred in predictions {
    println!("NN Prediction: {} (confidence: {:.2}%)",
        pred.key, pred.confidence * 100.0);
}

API Overview

Module         Description
multi_tier     Multi-tier cache implementation with L1/L2/L3 tiers and automatic promotion
predictive     Access pattern learning and ML-based prediction models
compression    Adaptive compression with LZ4, Zstd, and Snappy algorithms
eviction       Eviction policies: LRU, ARC, W-TinyLFU
coherency      Cache coherency protocols for distributed environments
write_policy   Write-through and write-back policy implementations
tiering        Tier management and data migration logic
warming        Cache warming and preloading strategies
partitioning   Data partitioning for distributed cache
analytics      Cache analytics and performance tracking
distributed    Distributed cache protocol and communication
observability  Tracing and observability integration
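
Of the eviction policies listed for the eviction module, LRU is the simplest to show. The following is a minimal standalone sketch, assuming a small capacity where a linear scan of the recency queue is acceptable; `LruCache` here is a hypothetical type, not the crate's implementation, and ARC and W-TinyLFU are considerably more involved.

```rust
use std::collections::{HashMap, VecDeque};

/// Minimal LRU cache sketch (illustrative; not the crate's eviction module).
pub struct LruCache {
    capacity: usize,
    map: HashMap<String, Vec<u8>>,
    // Front = least recently used, back = most recently used.
    order: VecDeque<String>,
}

impl LruCache {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    // Moves `key` to the most-recently-used position (O(n) scan).
    fn touch(&mut self, key: &str) {
        if let Some(pos) = self.order.iter().position(|k| k == key) {
            self.order.remove(pos);
        }
        self.order.push_back(key.to_string());
    }

    /// Returns the value and marks the key as recently used.
    pub fn get(&mut self, key: &str) -> Option<&Vec<u8>> {
        if self.map.contains_key(key) {
            self.touch(key);
        }
        self.map.get(key)
    }

    /// Inserts a value and returns the evicted key, if any.
    pub fn put(&mut self, key: &str, value: Vec<u8>) -> Option<String> {
        let mut evicted = None;
        if !self.map.contains_key(key) && self.map.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
                evicted = Some(oldest);
            }
        }
        self.map.insert(key.to_string(), value);
        self.touch(key);
        evicted
    }
}
```

With capacity 2, inserting "a" and "b", reading "a", and then inserting "c" evicts "b", because the read made "a" the more recently used entry.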

Core Types

  • CacheConfig: Configuration for cache sizes and features
  • CacheValue: Cached data with metadata (timestamps, access count)
  • CacheStats: Aggregated statistics (hits, misses, evictions)
  • Prediction: ML prediction with confidence score
  • MultiTierCache: Main cache interface

Performance

Benchmarks on Apple M1 (8-core, 16GB RAM):

Operation                  Throughput
L1 Get (hit)               ~2.5M ops/sec
L1 Put                     ~1.8M ops/sec
L2 Get (SSD)               ~50K-100K ops/sec
Compression (LZ4)          ~500 MB/sec
Decompression              ~1500 MB/sec
Prediction (Markov)        ~10K predictions/sec
Neural Network Prediction  ~1K predictions/sec

Hit rate improvements with prefetching:

  • Baseline (no prefetch): 65-70%
  • With Markov predictor: 78-82%
  • With Neural Network: 84-88%
  • With Transformer model: 88-92%
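
A few points of hit rate matter more than they look, because what the backing store sees is the miss rate. The helper functions below are a hypothetical sketch (not part of the crate) that computes a hit rate in percent, the convention assumed here for figures like those above, and the relative reduction in misses between two hit rates.

```rust
/// Hit rate as a percentage of all lookups; 0.0 when there were no lookups.
pub fn hit_rate_percent(hits: u64, misses: u64) -> f64 {
    let total = hits + misses;
    if total == 0 {
        0.0
    } else {
        hits as f64 / total as f64 * 100.0
    }
}

/// Relative reduction in misses (in percent) when the hit rate moves
/// from `old_hit_rate` to `new_hit_rate`, both given in percent.
pub fn miss_reduction_percent(old_hit_rate: f64, new_hit_rate: f64) -> f64 {
    let old_miss = 100.0 - old_hit_rate;
    let new_miss = 100.0 - new_hit_rate;
    if old_miss == 0.0 {
        0.0
    } else {
        (old_miss - new_miss) / old_miss * 100.0
    }
}
```

Taking one value from each range above, moving from a 70% baseline to 88% cuts the miss rate from 30% to 12%, which is a 60% reduction in requests that fall through the cache.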

Examples

The repository's integration tests and benchmarks double as usage examples:

  • tests/multi_tier_test.rs - Multi-tier cache operations
  • tests/predictive_test.rs - Predictive prefetching examples
  • tests/advanced_prediction_test.rs - Advanced ML model usage
  • tests/coherency_test.rs - Cache coherency patterns
  • tests/write_policy_test.rs - Write policy configurations
  • benches/cache_bench.rs - Performance benchmarks

Configuration

Default Configuration

CacheConfig {
    l1_size: 128 * 1024 * 1024,        // 128 MB (in-memory)
    l2_size: 1024 * 1024 * 1024,       // 1 GB (SSD)
    l3_size: 10 * 1024 * 1024 * 1024,  // 10 GB (network)
    enable_compression: true,           // Enable adaptive compression
    enable_prefetch: true,              // Enable ML prefetching
    enable_distributed: false,          // Disabled by default
    cache_dir: None,                    // System temp dir for L2
}

Custom Configuration

use oxigdal_cache_advanced::{CacheConfig, MultiTierCache};
use std::path::PathBuf;

let config = CacheConfig {
    l1_size: 256 * 1024 * 1024,        // 256 MB
    l2_size: 2 * 1024 * 1024 * 1024,   // 2 GB
    l3_size: 50 * 1024 * 1024 * 1024,  // 50 GB
    enable_compression: true,
    enable_prefetch: true,
    enable_distributed: true,
    cache_dir: Some(PathBuf::from("/var/cache/oxigdal")),
};

let cache = MultiTierCache::new(config).await?;

Error Handling

This library follows the "no unwrap" policy. All fallible operations return Result<T, CacheError>:

use oxigdal_cache_advanced::{CacheConfig, CacheError, MultiTierCache, Result};

async fn cache_operation() -> Result<String> {
    let cache = MultiTierCache::new(CacheConfig::default()).await?;

    match cache.get(&"key".into()).await {
        Ok(Some(value)) => Ok(format!("Found: {:?}", value.data)),
        Ok(None) => Err(CacheError::KeyNotFound("key".into())),
        Err(e) => Err(e),
    }
}

Pure Rust

This library is 100% Pure Rust with no C/Fortran dependencies. All functionality works out of the box:

  • Compression algorithms (LZ4, Zstd, Snappy) are pure Rust implementations
  • ML models use Pure Rust numerical computation
  • Async runtime via Tokio (Pure Rust)
  • No external system dependencies

OxiGDAL Ecosystem

This project is part of the OxiGDAL ecosystem for geospatial data processing:

  • OxiGDAL-Core: Core geospatial data structures
  • OxiGDAL-Cache: Basic caching layer
  • OxiGDAL-Cache-Advanced: Advanced caching with ML (this crate)
  • OxiGDAL-Index: Spatial indexing for cached data

COOLJAPAN Policies

This project adheres to all COOLJAPAN development policies:

  • Pure Rust: No C/Fortran dependencies
  • No unwrap: All error handling via Result<T, E>
  • Latest Dependencies: Uses latest available versions on crates.io
  • Workspace: Uses workspace configuration for dependency management
  • Refactoring: All modules kept under 2000 lines

Documentation

Full API documentation is available at docs.rs.

Testing

Run the comprehensive test suite:

# All tests
cargo test --all-features

# Specific test suite
cargo test multi_tier
cargo test predictive
cargo test coherency

# With logging
RUST_LOG=debug cargo test

# Benchmarks
cargo bench

Contributing

Contributions are welcome! Please ensure:

  • All tests pass: cargo test --all-features
  • No warnings: cargo clippy -- -D warnings
  • Code is formatted: cargo fmt
  • Documentation is complete: cargo doc --no-deps

License

Licensed under the Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0).

Acknowledgments

Developed as part of the COOLJAPAN ecosystem by Team Kitasan.


Part of the COOLJAPAN Pure Rust Ecosystem