pyo3-dlpack
Zero-copy DLPack tensor interop for PyO3.
This crate provides a safe and ergonomic way to exchange tensor data between Rust and Python ML frameworks (PyTorch, JAX, TensorFlow, CuPy, etc.) using the DLPack protocol.
Features
- Zero-copy: Tensors are shared directly without copying data
- PyO3 0.27+: Uses the modern API (no deprecation warnings)
- Bidirectional: Import tensors from Python and export tensors to Python
- Device-agnostic: Works with CPU, CUDA, ROCm, and other devices
Installation
Add to your Cargo.toml:
[dependencies]
pyo3-dlpack = "0.1"
pyo3 = "0.27"
Usage
Importing a tensor from Python
use pyo3::prelude::*;
use pyo3_dlpack::PyTensor;
Exporting a tensor to Python
use pyo3::prelude::*;
use ;
use std::ffi::c_void;
Python side:
# Call your Rust function that returns a DLPack capsule
=
# Convert to PyTorch tensor (zero-copy)
=
Supported Data Types
- Float: f16, f32, f64, bf16
- Integer: i8, i16, i32, i64
- Unsigned: u8, u16, u32, u64
- Boolean
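To make the list above concrete, here is a minimal sketch of how these element types are described at the DLPack level. The numeric type codes and the `code`/`bits`/`lanes` layout follow the DLPack C header; the Rust names (`DLDataTypeCode`, `scalar`, `element_size`, and the constants) are hypothetical and are not taken from this crate's API.

```rust
/// Type codes as defined by `DLDataTypeCode` in the DLPack C header.
/// Only the codes relevant to the list above are included.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DLDataTypeCode {
    Int = 0,
    UInt = 1,
    Float = 2,
    Bfloat = 4,
    Bool = 6,
}

/// Mirrors the C struct `DLDataType { uint8_t code; uint8_t bits; uint16_t lanes; }`.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DLDataType {
    pub code: u8,
    pub bits: u8,
    pub lanes: u16,
}

/// Hypothetical helper: describe a scalar (non-vectorized) element type.
pub const fn scalar(code: DLDataTypeCode, bits: u8) -> DLDataType {
    DLDataType { code: code as u8, bits, lanes: 1 }
}

// A few of the supported types, expressed as DLPack descriptors.
pub const F32: DLDataType = scalar(DLDataTypeCode::Float, 32);
pub const BF16: DLDataType = scalar(DLDataTypeCode::Bfloat, 16);
pub const I64: DLDataType = scalar(DLDataTypeCode::Int, 64);
pub const U8: DLDataType = scalar(DLDataTypeCode::UInt, 8);
// DLPack represents booleans as one byte per element.
pub const BOOL: DLDataType = scalar(DLDataTypeCode::Bool, 8);

/// Bytes occupied by one element, rounded up to whole bytes.
pub const fn element_size(dt: DLDataType) -> usize {
    (dt.bits as usize * dt.lanes as usize + 7) / 8
}
```

A consumer can compare the descriptor it receives against the one it expects (for example `F32`) before reinterpreting the data pointer.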
Supported Devices
- CPU
- CUDA
- CUDA Host (pinned memory)
- ROCm
- Metal
- Vulkan
- And more (see DLDeviceType)
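The devices listed above correspond to integer codes in the DLPack C header. The following sketch shows that mapping and one way a consumer might decide whether host code can read the data pointer directly. The numeric values come from the DLPack header; the Rust names (`DeviceType`, `from_raw`, `is_host_accessible`) are illustrative assumptions and may differ from this crate's own definitions.

```rust
/// A subset of the device codes from `DLDeviceType` in the DLPack C header.
#[repr(i32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DeviceType {
    Cpu = 1,
    Cuda = 2,
    CudaHost = 3,
    Vulkan = 7,
    Metal = 8,
    Rocm = 10,
}

impl DeviceType {
    /// Hypothetical helper: decode a raw device code, returning `None`
    /// for codes this sketch does not cover.
    pub fn from_raw(raw: i32) -> Option<Self> {
        match raw {
            1 => Some(Self::Cpu),
            2 => Some(Self::Cuda),
            3 => Some(Self::CudaHost),
            7 => Some(Self::Vulkan),
            8 => Some(Self::Metal),
            10 => Some(Self::Rocm),
            _ => None,
        }
    }

    /// Whether ordinary host code may dereference the tensor's data pointer.
    /// CPU memory and CUDA pinned host memory qualify; device memory does not.
    pub fn is_host_accessible(self) -> bool {
        matches!(self, Self::Cpu | Self::CudaHost)
    }
}
```

Checking the device before touching the data matters because a CUDA or ROCm pointer is only valid on the device, even though the capsule itself is passed around on the host.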
Performance
DLPack enables true zero-copy tensor sharing. Benchmark results on Apple M3:
| Operation | Time | vs Copy |
|---|---|---|
| DLPack capsule export (1M f32) | 8.3 µs | 7.3x faster |
| DLPack capsule import (1M f32) | 7.9 µs | 7.7x faster |
| Vec clone baseline (1M f32) | 60.9 µs | - |
The DLPack overhead is constant regardless of tensor size: only metadata is processed, not the actual data. This makes it ideal for large tensors where copying would be expensive.
# Rust criterion benchmarks (cargo bench)
export_capsule_1k time: [155.44 ns 159.74 ns 166.84 ns]
export_capsule_1m time: [7.71 µs 8.26 µs 8.89 µs]
import_capsule_1m time: [7.44 µs 7.89 µs 8.41 µs]
vec_clone_1m time: [60.45 µs 60.90 µs 61.38 µs]
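The difference between a metadata-only handoff and a copy can be illustrated with the standard library alone. This is a simplified sketch, not the crate's implementation: `TensorView`, `export_view`, and `compare_with_clone` are hypothetical names, and a real DLPack tensor also carries shape, strides, dtype, and device information.

```rust
/// Hypothetical metadata-only description of an f32 buffer.
#[derive(Clone, Copy, Debug)]
pub struct TensorView {
    pub data: *const f32,
    pub len: usize,
}

/// "Export": record the pointer and length. No element is read or copied,
/// so the work done here does not depend on `buf.len()`.
pub fn export_view(buf: &[f32]) -> TensorView {
    TensorView { data: buf.as_ptr(), len: buf.len() }
}

/// Shares a buffer through a view and also clones it, then reports
/// `(view points at the original data, clone points at the original data)`.
pub fn compare_with_clone(n: usize) -> (bool, bool) {
    let original = vec![1.0f32; n];

    // Zero-copy path: rebuild a slice from the recorded metadata.
    let view = export_view(&original);
    // SAFETY: `original` is alive and unmodified for as long as `shared` is used.
    let shared: &[f32] = unsafe { std::slice::from_raw_parts(view.data, view.len) };

    // Copy path: allocates a new buffer and copies every element.
    let copied = original.clone();

    (
        shared.as_ptr() == original.as_ptr(),
        copied.as_ptr() == original.as_ptr(),
    )
}
```

For a non-empty buffer, the shared slice has the same data pointer as the original, while the clone lives in a separate allocation. The same pointer comparison is a simple way to check that a transfer really was zero-copy.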
Run benchmarks yourself:
- make bench-rust: Rust criterion benchmarks
- make bench-python: Python benchmarks
- make bench: All benchmarks
Testing
Validate correctness and zero-copy behavior:
- make test: Unit + integration tests (105 tests)
- Tests verify that data pointers are preserved across transfers
Python environment
The test module is built with maturin using the same Python interpreter that runs the tests.
Override it with PYTHON=/path/to/python if needed (e.g., a venv).
Default tests include PyTorch (pip install -e ".[test]"). For CI or lightweight runs, use pip install -e ".[test-lite]".
License
Licensed under the MIT license. See LICENSE for details.