Atlas Common
⚠️ Disclaimer: This project is currently in active development. The code is not stable and not intended for use in production environments. Interfaces, features, and behaviors are subject to change without notice.
Core functionality for machine learning provenance tracking with C2PA (Coalition for Content Provenance and Authenticity) support.
Atlas Common provides essential building blocks for creating content authenticity systems that track the provenance of machine learning models, datasets, and related assets throughout their lifecycle.
Features
- 🔐 Cryptographic Hashing: SHA-256/384/512 with constant-time comparison
- 📋 C2PA Metadata: Types and utilities for C2PA manifest management
- 💾 Storage Abstractions: Backend-agnostic storage interfaces
- ✅ Validation: Manifest, URN, and hash validation
- 🛡️ Secure File Operations: Protection against symlink and hardlink attacks
- ⚡ Async Support: Optional async/await for storage operations
Installation
Add this to your Cargo.toml:
```toml
[dependencies]
atlas-common = "0.1.0"
```
Feature Flags
- `hash` (default): Cryptographic hash functions
- `c2pa` (default): C2PA manifest and asset types
- `storage`: Storage backend abstractions
- `validation`: Validation utilities
- `file-utils`: Secure file operation utilities
- `async`: Async support for storage operations
- `full`: Enable all features
To use specific features:
```toml
[dependencies]
atlas-common = { version = "0.1.0", features = ["full"] }
```
Quick Start
Hashing
```rust
use atlas_common::hash::{calculate_hash, calculate_hash_with_algorithm, verify_hash, HashAlgorithm};
// (Module path and call signatures are illustrative reconstructions.)

// Calculate hash with default algorithm (SHA-384)
let data = b"important data";
let hash = calculate_hash(data);

// Verify hash
assert!(verify_hash(data, &hash));

// Use a specific algorithm
let sha256_hash = calculate_hash_with_algorithm(data, HashAlgorithm::Sha256);

// Hardware-optimized hashing for large data
let optimized_hash = data.hash_optimized();
```
Hardware Optimization
Atlas Common includes hardware-optimized hashing implementations that automatically detect and utilize available CPU features:
- Intel Xeon: SHA-NI extensions and AVX-512 parallel processing
- Apple Silicon: ARM crypto extensions
- Multi-core systems: Parallel processing for large datasets
Optimizations are automatically selected at runtime based on available hardware and data size. Use the `hash_optimized()` methods or the `BatchHasher` for optimal performance with large files or multiple inputs.
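As a rough sketch of how runtime backend selection can work (the function below is hypothetical and not part of the crate's API), Rust's standard library exposes x86 feature detection directly:

```rust
/// Hypothetical sketch of runtime hash-backend selection; the crate's
/// actual dispatch logic may differ.
fn select_hash_backend() -> &'static str {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sha") {
            return "sha-ni"; // dedicated SHA instructions
        }
        if is_x86_feature_detected!("avx2") {
            return "avx2"; // wide SIMD lanes
        }
    }
    // ARM crypto extensions would be probed similarly on aarch64.
    "portable"
}

fn main() {
    println!("selected backend: {}", select_hash_backend());
}
```

Probing happens once at startup; the result can be cached so the per-call cost of dispatch is a single indirect jump.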
C2PA Manifests
```rust
use atlas_common::c2pa::{ManifestId, ManifestMetadata};
// (Module path is an illustrative reconstruction.)

// Create a manifest ID
let manifest_id = ManifestId::new();
println!("{}", manifest_id);

// Create manifest metadata
let metadata = ManifestMetadata {
    // …fields omitted…
};
```
Asset Type Detection
```rust
use atlas_common::c2pa::determine_asset_type;
use std::path::Path;

let model_path = Path::new("model.onnx"); // path is illustrative
let asset_type = determine_asset_type(model_path)?;
// Returns AssetType::ModelOnnx
```
Secure File Operations
```rust
use atlas_common::file_utils::{safe_create_file, safe_open_file};
use std::io::{Read, Write};
// (Paths and signatures shown are illustrative.)

// Safely create a file (blocks symlink attacks)
let mut file = safe_create_file("model.meta")?;
file.write_all(b"provenance data")?;

// Safely read a file
let mut file = safe_open_file("model.meta")?;
let mut contents = String::new();
file.read_to_string(&mut contents)?;
```
Validation
```rust
use atlas_common::validation::{validate_manifest_id, ensure_c2pa_urn};
// (Argument values shown are illustrative.)

// Validate a manifest ID
validate_manifest_id("urn:c2pa:0f1e2d3c")?;

// Ensure proper URN format
let urn = ensure_c2pa_urn("0f1e2d3c");
assert!(urn.starts_with("urn:c2pa:"));
```
Advanced Usage
Incremental Hashing
```rust
use atlas_common::hash::HashBuilder; // builder type name is illustrative

let mut builder = HashBuilder::new();
builder.update(b"first chunk");
builder.update(b"second chunk");
builder.update(b"third chunk");
let hash = builder.finalize();
```
Hash Trait
```rust
use atlas_common::hash::*;

let text = "Hello, World!";
let hash = text.hash(HashAlgorithm::Sha256); // algorithm argument illustrative

let bytes = b"raw bytes";
let hash2 = bytes.hash_default(); // Uses SHA-384
```
Storage Backend
```rust
use atlas_common::storage::StorageConfig;

// Configure a storage backend
let config = StorageConfig {
    // …fields omitted…
};
```
Examples
The repository includes several examples demonstrating various features:
- `basic_hashing` - Hash operations and verification
- `c2pa_manifest` - Working with C2PA manifests
- `full_example` - Complete demonstration of all features
Run examples with:

```sh
cargo run --example basic_hashing
```
Benchmarks
Performance benchmarks are available for hash operations:

```sh
cargo bench
```
Security Considerations
- Constant-time comparison: Hash verification uses constant-time comparison to prevent timing attacks
- Path validation: File operations validate paths to prevent symlink and hardlink attacks
- Input validation: All inputs are validated to prevent injection attacks
- Secure defaults: SHA-384 is the default hash algorithm for optimal security/performance balance
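The constant-time property can be illustrated with a minimal standalone sketch (this is not the crate's implementation, just the general technique):

```rust
/// Sketch of constant-time comparison: every byte is examined even after
/// a mismatch is found, so execution time reveals nothing about where two
/// hashes first differ.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // accumulate differences without branching
    }
    diff == 0
}

fn main() {
    assert!(constant_time_eq(b"abc123", b"abc123"));
    assert!(!constant_time_eq(b"abc123", b"abc124"));
}
```

By contrast, a naive `a == b` returns at the first mismatching byte, which a network attacker can measure to recover a secret prefix byte by byte.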
Supported Formats
Model Formats
- TensorFlow: `.pb`, `.savedmodel`, `.tf`
- PyTorch: `.pt`, `.pth`, `.pytorch`
- ONNX: `.onnx`
- OpenVINO: `.bin`, `.xml`
- Keras/HDF5: `.h5`, `.keras`, `.hdf5`
Dataset Formats
- Tabular: `.csv`, `.tsv`, `.txt`
- JSON: `.json`, `.jsonl`
- Big Data: `.parquet`, `.orc`, `.avro`
- TensorFlow: `.tfrecord`, `.tfrec`
- NumPy: `.npy`, `.npz`
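Extension-based detection of the formats above can be sketched as a simple match on the file extension (a hypothetical simplification; the crate's `determine_asset_type` may also inspect file contents, and the category labels below are made up for illustration):

```rust
use std::path::Path;

/// Hypothetical sketch: classify an asset purely by its file extension.
fn classify_extension(path: &Path) -> Option<&'static str> {
    // `?` bails out with None for paths without a UTF-8 extension.
    match path.extension()?.to_str()? {
        "pb" | "savedmodel" | "tf" => Some("model/tensorflow"),
        "pt" | "pth" | "pytorch" => Some("model/pytorch"),
        "onnx" => Some("model/onnx"),
        "bin" | "xml" => Some("model/openvino"),
        "h5" | "keras" | "hdf5" => Some("model/keras"),
        "csv" | "tsv" | "txt" => Some("dataset/tabular"),
        "json" | "jsonl" => Some("dataset/json"),
        "parquet" | "orc" | "avro" => Some("dataset/bigdata"),
        "tfrecord" | "tfrec" => Some("dataset/tensorflow"),
        "npy" | "npz" => Some("dataset/numpy"),
        _ => None,
    }
}

fn main() {
    assert_eq!(classify_extension(Path::new("net.onnx")), Some("model/onnx"));
    assert_eq!(classify_extension(Path::new("README")), None);
}
```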
License
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request