easy-async-cl3
A high-level, async-first Rust wrapper for OpenCL with intelligent multi-device management and declarative task execution.
Overview
easy-async-cl3 provides a modern, ergonomic interface to OpenCL that embraces Rust's async/await paradigm. The library automatically manages resources, distributes work across multiple devices, and provides compile-time safety guarantees.
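To make the async model concrete, here is a minimal sketch of how a completion notification can be surfaced as a Rust future. It uses only the standard library and is not this crate's implementation: `Completion`, `submit`, and `block_on` are hypothetical names, and a worker thread stands in for the point where an OpenCL driver would signal completion (for example through a callback registered with `clSetEventCallback`).

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// State shared between the future and the code that completes it.
struct Shared {
    result: Option<u32>,
    waker: Option<Waker>,
}

/// Hypothetical future that resolves once the "device" has finished.
struct Completion {
    shared: Arc<Mutex<Shared>>,
}

impl Future for Completion {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let mut shared = self.shared.lock().unwrap();
        match shared.result.take() {
            Some(value) => Poll::Ready(value),
            None => {
                // Remember who to wake when the result arrives.
                shared.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Hypothetical submit call: a thread doubles the input, standing in
/// for a kernel launch followed by a driver completion callback.
fn submit(input: u32) -> Completion {
    let shared = Arc::new(Mutex::new(Shared { result: None, waker: None }));
    let worker = Arc::clone(&shared);
    thread::spawn(move || {
        let value = input * 2;
        let mut state = worker.lock().unwrap();
        state.result = Some(value);
        if let Some(waker) = state.waker.take() {
            waker.wake();
        }
    });
    Completion { shared }
}

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Tiny single-future executor so the sketch runs without Tokio.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Spurious wake-ups are harmless: the loop simply polls again.
            Poll::Pending => thread::park(),
        }
    }
}
```

With this in place, `block_on(submit(21))` yields `42`. In an application, a runtime such as Tokio would take the role of `block_on`.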
Key Features
- Async/Await Integration: All GPU operations return futures for seamless async workflows
- Automatic Multi-Device Support: Intelligent work distribution across multiple GPUs based on device capabilities
- Type-Safe API: Compile-time guarantees prevent common errors (e.g., using unbuilt programs)
- Declarative Task Building: Fluent builder pattern for constructing GPU tasks
- Zero-Cost Abstractions: RAII-based resource management with no runtime overhead
- Comprehensive OpenCL Support: Full support for OpenCL 1.1 through 3.0 features including Pipes, SVM, and Images
- Built-in Profiling: Optional performance measurement with negligible overhead
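The compile-time guarantee against using unbuilt programs can be illustrated with the typestate pattern. The sketch below is an assumption about how such a guarantee might be expressed, not the crate's actual types: `Program`, `Unbuilt`, `Built`, and `Kernel` are hypothetical names, and the "build" step only checks for non-empty source instead of invoking an OpenCL compiler.

```rust
use std::marker::PhantomData;

/// Marker types for the two build states (hypothetical names).
struct Unbuilt;
struct Built;

/// A program whose build state is tracked in its type parameter.
struct Program<State> {
    source: String,
    _state: PhantomData<State>,
}

/// Hypothetical kernel handle obtained from a built program.
struct Kernel {
    name: String,
}

impl Program<Unbuilt> {
    fn from_source(source: &str) -> Self {
        Program { source: source.to_string(), _state: PhantomData }
    }

    /// Consumes the unbuilt program and returns a built one.
    /// A real wrapper would call the OpenCL compiler here; this
    /// sketch only rejects empty source.
    fn build(self) -> Result<Program<Built>, String> {
        if self.source.trim().is_empty() {
            return Err("empty program source".to_string());
        }
        Ok(Program { source: self.source, _state: PhantomData })
    }
}

impl Program<Built> {
    /// Exists only on `Program<Built>`, so calling it on an unbuilt
    /// program is a compile error, not a runtime failure.
    fn create_kernel(&self, name: &str) -> Option<Kernel> {
        if self.source.contains(name) {
            Some(Kernel { name: name.to_string() })
        } else {
            None
        }
    }
}
```

Because `build` takes `self` by value, the unbuilt program cannot be reused afterwards, and `Program::<Unbuilt>::from_source(src).create_kernel("add")` does not compile.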
Installation
Add this to your Cargo.toml:
[dependencies]
easy-async-cl3 = "0.1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Quick Start
use ;
async
Advanced Features
Multi-Device Execution
The library automatically detects and utilizes all available compute devices:
let executor = new_best_platform_with_options?; // Enable profiling
let report = executor.create_task
.arg_buffer
.global_work_dims
.run
.await?;
println!;
Shared Virtual Memory (OpenCL 2.0+)
Zero-copy memory sharing between CPU and GPU:
let mut svm_buffer = executor.?;
executor.create_task
.arg_svm
.global_work_dims
.run
.await?;
// Direct CPU access without explicit copy
let queue = &executor.get_queues;
let mapped = svm_buffer.map_mut?;
println!;
Image Processing
Native support for OpenCL images with hardware-accelerated filtering:
use ;
let format = rgba_unorm_int8;
let desc = ClImageDesc ;
let image = executor.create_image?;
executor.create_task
.arg_image
.global_work_dims
.run
.await?;
Pipes for Inter-Kernel Communication (OpenCL 2.0+)
Stream data between kernels without CPU involvement:
use ClPipe;
let pipe = new?;
// Producer writes to pipe
executor.create_task
.arg_pipe
.global_work_dims
.run
.await?;
// Consumer reads from pipe
executor.create_task
.arg_pipe
.global_work_dims
.run
.await?;
Architecture
The library is structured in three main layers:
- AsyncExecutor: High-level interface managing platforms, devices, and command queues
- TaskBuilder: Declarative API for constructing and executing GPU tasks
- CL Types: Type-safe wrappers around OpenCL objects (buffers, images, kernels, etc.)
Work is automatically distributed across available devices based on their compute capabilities and memory capacity.
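As an illustration of capability-based distribution, the following sketch splits a one-dimensional global work size across devices in proportion to a simple weight. The weighting (compute units times clock frequency) and the names `DeviceInfo` and `split_work` are assumptions made for this example. The library's actual heuristic, which also considers memory capacity, may differ.

```rust
/// Hypothetical per-device summary. The fields mirror values OpenCL
/// reports as CL_DEVICE_MAX_COMPUTE_UNITS and
/// CL_DEVICE_MAX_CLOCK_FREQUENCY.
struct DeviceInfo {
    compute_units: u64,
    clock_mhz: u64,
}

/// Splits `global_size` work-items into one contiguous
/// `(offset, count)` range per device, proportional to
/// `compute_units * clock_mhz`. The last device absorbs the
/// rounding remainder so the ranges always cover the full size.
fn split_work(global_size: u64, devices: &[DeviceInfo]) -> Vec<(u64, u64)> {
    let weights: Vec<u64> = devices
        .iter()
        .map(|d| d.compute_units * d.clock_mhz)
        .collect();
    let total: u64 = weights.iter().sum();
    if total == 0 {
        return Vec::new(); // no usable devices
    }

    let mut ranges = Vec::with_capacity(devices.len());
    let mut offset = 0u64;
    for (i, &weight) in weights.iter().enumerate() {
        let count = if i + 1 == weights.len() {
            global_size - offset
        } else {
            // u128 avoids overflow in the intermediate product.
            (global_size as u128 * weight as u128 / total as u128) as u64
        };
        ranges.push((offset, count));
        offset += count;
    }
    ranges
}
```

Each range could then be enqueued on its device's command queue with the matching global offset. A production scheduler would likely also round counts to a multiple of the work-group size.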
Documentation
- API Documentation - Complete API reference
- Examples - Comprehensive usage examples
- OpenCL Specification - Kernel programming reference
Requirements
- Rust 1.70 or later
- OpenCL runtime (provided by GPU vendor drivers)
- Tokio async runtime
Use Cases
- High-performance scientific computing
- Real-time image and video processing
- Machine learning inference and training
- Cryptographic operations
- Financial modeling and simulations
- Parallel data analytics
Contributing
Contributions are welcome. Please make sure all tests pass and that your changes follow the existing code style.
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Acknowledgments
Built on the cl3 library.