🐙 Poulpy-HAL
Poulpy-HAL is a Rust crate that provides backend-agnostic layouts and trait-based low-level lattice arithmetic. This allows developers to implement lattice-based schemes generically, with the ability to plug in optimized backends (e.g. CPU, GPU, FPGA) at runtime.
The important design point is that the public API is centered on backend-native borrows rather than host byte slices. Shared crates should be written against *ToBackendRef / *ToBackendMut and the corresponding ...BackendRef / ...BackendMut view types. This remains true even for host backends: generic HAL-facing code should still go through ToBackendRef / ToBackendMut, not to_ref() / to_mut(). Host-view helpers are only escape hatches for explicitly host-side tasks.
Crate Organization
poulpy-hal/layouts
This module defines backend-agnostic layouts. There are two main categories: user-facing types and backend types. User-facing types, such as vec_znx, serve as both inputs and outputs of computations, while backend types, such as svp_ppol (a.k.a. scalar vector product prepared polynomial), are pre-processed, write-only types stored in a backend-specific representation for optimized evaluation. For example, in the FFT64 AVX2 CPU implementation, an svp_ppol (the prepared form of scalar_znx) is stored in the DFT domain with an AVX-optimized data ordering.
This module also provides helpers over these types, as well as serialization for the front-end types scalar_znx, vec_znx and mat_znx.
Backend Model
Each backend defines:
- OwnedBuf: the backend-owned storage type
- BufRef<'a> / BufMut<'a>: backend-native shared and mutable borrows
This means a layout like VecZnx<BE::OwnedBuf> is the owned form, while:
- VecZnxBackendRef<'a, BE> is the shared backend-native borrow
- VecZnxBackendMut<'a, BE> is the mutable backend-native borrow
The generic adapter traits follow the same pattern:
- VecZnxToBackendRef<BE>
- VecZnxToBackendMut<BE>
- VecZnxDftToBackendRef<BE>
- VecZnxDftToBackendMut<BE>
- SvpPPolToBackendRef<BE>
- SvpPPolToBackendMut<BE>
- VmpPMatToBackendRef<BE>
- VmpPMatToBackendMut<BE>
- etc.
Host-visible code should construct HostBytesBackend views directly, either through backend-native *ToBackendRef/*ToBackendMut impls or the small *_host_backend_ref/mut helpers used by shared host utilities. Generic HAL compute code should still be written against backend views, not raw host slices.
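To make the owned / borrowed split concrete, here is a minimal, self-contained sketch of how such a backend model could be expressed with generic associated types. It is a toy model under stated assumptions: the method names (alloc, view, view_mut, len), the HostBackend type and the byte_len helper are hypothetical and are not taken from poulpy-hal; only the names OwnedBuf, BufRef, BufMut, VecZnx, VecZnxBackendRef and VecZnxToBackendRef come from the text above.

```rust
// Toy model only: signatures are illustrative assumptions, not poulpy-hal's API.

/// A backend names its owned storage and its native borrow types.
pub trait Backend: 'static {
    type OwnedBuf;
    type BufRef<'a> where Self: 'a;
    type BufMut<'a> where Self: 'a;

    fn alloc(len: usize) -> Self::OwnedBuf;
    fn view(buf: &Self::OwnedBuf) -> Self::BufRef<'_>;
    fn view_mut(buf: &mut Self::OwnedBuf) -> Self::BufMut<'_>;
    /// Length in bytes of a shared backend-native borrow.
    fn len(buf: &Self::BufRef<'_>) -> usize;
}

/// Hypothetical host-resident backend: storage is a plain byte vector.
pub struct HostBackend;

impl Backend for HostBackend {
    type OwnedBuf = Vec<u8>;
    type BufRef<'a> = &'a [u8] where Self: 'a;
    type BufMut<'a> = &'a mut [u8] where Self: 'a;

    fn alloc(len: usize) -> Vec<u8> { vec![0u8; len] }
    fn view(buf: &Vec<u8>) -> &[u8] { buf.as_slice() }
    fn view_mut(buf: &mut Vec<u8>) -> &mut [u8] { buf.as_mut_slice() }
    fn len(buf: &&[u8]) -> usize { buf.len() }
}

/// Layout generic over its storage; `VecZnx<BE::OwnedBuf>` is the owned form.
pub struct VecZnx<D> {
    pub data: D,
    pub n: usize,    // polynomial degree (toy field)
    pub size: usize, // number of limbs (toy field)
}

/// Shared backend-native borrow of a `VecZnx`.
pub struct VecZnxBackendRef<'a, BE: Backend> {
    pub data: BE::BufRef<'a>,
    pub n: usize,
    pub size: usize,
}

/// Adapter trait: anything that can expose itself as a backend-native view.
pub trait VecZnxToBackendRef<BE: Backend> {
    fn to_backend_ref(&self) -> VecZnxBackendRef<'_, BE>;
}

impl<BE: Backend> VecZnxToBackendRef<BE> for VecZnx<BE::OwnedBuf> {
    fn to_backend_ref(&self) -> VecZnxBackendRef<'_, BE> {
        VecZnxBackendRef { data: BE::view(&self.data), n: self.n, size: self.size }
    }
}

/// Generic code only sees the backend view, never a raw host slice.
pub fn byte_len<BE: Backend, V: VecZnxToBackendRef<BE>>(v: &V) -> usize {
    let view = v.to_backend_ref();
    BE::len(&view.data)
}
```

In this sketch the generic helper byte_len compiles for any backend, including one whose BufRef is a device handle rather than a slice, which is the reason for routing generic code through the adapter trait even when the backend is host-resident.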
Core Layouts
- Module: stores backend-specific precomputations such as DFT tables and handles.
- ScalarZnx: front-end scalar polynomial layout, mainly used for secrets and small plaintexts. Generic code typically consumes it through ScalarZnxToBackendRef<BE> / ScalarZnxToBackendMut<BE>.
- VecZnx: front-end vector-of-polynomials layout used for LWE/GLWE plaintexts and ciphertexts. Precision is represented by limbs in base 2^k. Generic execution uses VecZnxBackendRef / VecZnxBackendMut via VecZnxToBackendRef<BE> / VecZnxToBackendMut<BE>.
- MatZnx: front-end matrix-of-polynomials layout, used for GGLWE and GGSW-style objects. Generic backends consume it through MatZnxToBackendRef<BE> / MatZnxToBackendMut<BE>.
- VecZnxDft: backend-specific prepared-domain representation of VecZnx. Its storage layout is backend-defined.
- VecZnxBig: backend-specific big-coefficient representation, typically used after multiplication or convolution and later normalized back into VecZnx.
- SvpPPol: backend-specific prepared form of ScalarZnx for scalar-vector products.
- VmpPMat: backend-specific prepared form of MatZnx for vector-matrix products.
- ScratchArena: backend-native scratch view over a ScratchOwned buffer, used to carve typed temporary storage during execution.
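As a rough illustration of what "limbs in base 2^k" means, the following sketch splits a single integer coefficient into balanced base-2^k digits, most significant limb first, and recombines them. This is an integer toy under assumptions: the function names decompose and recompose are hypothetical, and the crate's actual normalization, limb ordering and torus scaling are not described in this section and may differ.

```rust
/// Split `x` into `size` balanced digits in base 2^k, most significant first.
/// Each digit lies in [-2^(k-1), 2^(k-1)). Hypothetical helper, not a poulpy API.
/// Assumes 1 <= k < 63 and that `x` fits in `size` limbs.
pub fn decompose(mut x: i64, k: u32, size: usize) -> Vec<i64> {
    let base = 1i64 << k;
    let half = base >> 1;
    let mut limbs = vec![0i64; size];
    // Fill from the least significant limb (last slot) upwards.
    for limb in limbs.iter_mut().rev() {
        let mut d = x & (base - 1); // low k bits, in [0, 2^k)
        x >>= k;                    // arithmetic shift: floor division by 2^k
        if d >= half {
            d -= base;              // move the digit into the balanced range
            x += 1;                 // and carry into the next limb
        }
        *limb = d;
    }
    limbs
}

/// Inverse of `decompose`: evaluate the limbs as a base-2^k number (Horner form).
pub fn recompose(limbs: &[i64], k: u32) -> i64 {
    limbs.iter().fold(0i64, |acc, &d| (acc << k) + d)
}
```

With k = 4 and three limbs, 1000 decomposes to [4, -1, -8], since 4·256 − 1·16 − 8 = 1000. Adding limbs extends the representable range, which is the sense in which precision is carried by the number of limbs.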
poulpy-hal/api
This module provides the user-facing, trait-based API of the hardware acceleration layer. These are the traits used to implement poulpy-core, poulpy-ckks, poulpy-bin-fhe, and any other crate built on Poulpy. They currently cover module instantiation, arithmetic over vec_znx, vec_znx_big, vec_znx_dft, svp_ppol and vmp_pmat, and scratch space management.
At this layer, APIs are expected to be backend-generic. In practice that means:
- inputs and outputs are described via *ToBackendRef / *ToBackendMut
- prepared-domain objects (VecZnxDft, SvpPPol, VmpPMat, convolution prepared types) are treated as opaque backend-owned storage
- host-visible byte access is only required for explicitly host-side operations such as serialization, encoding, stats, or test/reference paths
poulpy-hal/oep
This module provides open extension points that can be implemented to provide a concrete backend to any crate built on poulpy-hal/api and poulpy-hal/layouts — including poulpy-core, poulpy-ckks, poulpy-bin-fhe, or any external project. Poulpy-HAL itself is dispatch-only: portable default implementations live in poulpy-cpu-ref, and accelerated backends (e.g. poulpy-cpu-avx) selectively override hot paths while inheriting everything else.
poulpy-hal/delegates
This module provides a link between the open extension points and public API, forwarding trait calls on Module<BE> to the matching per-family OEP trait implemented by BE (for example HalVecZnxImpl<BE>, HalVmpImpl<BE>, or HalConvolutionImpl<BE>).
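The api / oep / delegates split can be sketched with a small self-contained model. Everything below is an assumption made for illustration: plain &[i64] slices stand in for backend views to keep it short, the method names and signatures are hypothetical, ref_add and ref_negate stand in for the portable implementations that the text places in poulpy-cpu-ref, and CpuChunked is a made-up "accelerated" backend. Only Module<BE> and the HalVecZnxImpl<BE> name come from the text above.

```rust
use std::marker::PhantomData;

pub trait Backend: Sized {}

/// Handle the public API is implemented on (toy version of `Module<BE>`).
pub struct Module<BE: Backend> { _backend: PhantomData<BE> }

impl<BE: Backend> Module<BE> {
    pub fn new() -> Self { Module { _backend: PhantomData } }
}

// Portable reference routines (stand-in for a reference crate).
pub fn ref_add(res: &mut [i64], a: &[i64], b: &[i64]) {
    for ((r, x), y) in res.iter_mut().zip(a).zip(b) { *r = x.wrapping_add(*y); }
}
pub fn ref_negate(res: &mut [i64], a: &[i64]) {
    for (r, x) in res.iter_mut().zip(a) { *r = x.wrapping_neg(); }
}

/// OEP side: what a backend must provide (hypothetical signatures).
pub trait HalVecZnxImpl<BE: Backend> {
    fn vec_znx_add_impl(module: &Module<BE>, res: &mut [i64], a: &[i64], b: &[i64]);
    fn vec_znx_negate_impl(module: &Module<BE>, res: &mut [i64], a: &[i64]);
}

/// API side: what generic scheme code calls.
pub trait VecZnxOps {
    fn vec_znx_add(&self, res: &mut [i64], a: &[i64], b: &[i64]);
    fn vec_znx_negate(&self, res: &mut [i64], a: &[i64]);
}

/// Delegate: forwards API calls on `Module<BE>` to the backend's OEP impl.
impl<BE: Backend + HalVecZnxImpl<BE>> VecZnxOps for Module<BE> {
    fn vec_znx_add(&self, res: &mut [i64], a: &[i64], b: &[i64]) {
        BE::vec_znx_add_impl(self, res, a, b)
    }
    fn vec_znx_negate(&self, res: &mut [i64], a: &[i64]) {
        BE::vec_znx_negate_impl(self, res, a)
    }
}

/// Reference backend: every entry point uses the portable routine.
pub struct CpuRef;
impl Backend for CpuRef {}
impl HalVecZnxImpl<CpuRef> for CpuRef {
    fn vec_znx_add_impl(_m: &Module<CpuRef>, res: &mut [i64], a: &[i64], b: &[i64]) {
        ref_add(res, a, b)
    }
    fn vec_znx_negate_impl(_m: &Module<CpuRef>, res: &mut [i64], a: &[i64]) {
        ref_negate(res, a)
    }
}

/// Made-up "accelerated" backend: overrides add, reuses the reference negate.
pub struct CpuChunked;
impl Backend for CpuChunked {}
impl HalVecZnxImpl<CpuChunked> for CpuChunked {
    fn vec_znx_add_impl(_m: &Module<CpuChunked>, res: &mut [i64], a: &[i64], b: &[i64]) {
        // Process 4 coefficients per step, mimicking a vectorized hot path.
        for ((r, x), y) in res.chunks_mut(4).zip(a.chunks(4)).zip(b.chunks(4)) {
            for ((ri, xi), yi) in r.iter_mut().zip(x).zip(y) { *ri = xi.wrapping_add(*yi); }
        }
    }
    fn vec_znx_negate_impl(_m: &Module<CpuChunked>, res: &mut [i64], a: &[i64]) {
        ref_negate(res, a)
    }
}

/// Scheme-level code is written once against the API trait.
pub fn add_then_negate<M: VecZnxOps>(m: &M, a: &[i64], b: &[i64]) -> Vec<i64> {
    let mut sum = vec![0i64; a.len()];
    m.vec_znx_add(&mut sum, a, b);
    let mut out = vec![0i64; a.len()];
    m.vec_znx_negate(&mut out, &sum);
    out
}
```

Calling add_then_negate with Module::<CpuRef>::new() or Module::<CpuChunked>::new() runs the same generic code over different backend implementations; in this toy both produce identical results.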
Pipeline Example
flowchart TD
A[VecZnx] -->|DFT|B[VecZnxDft]-->E
C[ScalarZnx] -->|prepare|D[SvpPPol]-->E
E{SvpApply}-->VecZnxDft-->|IDFT|VecZnxBig-->|Normalize|VecZnx
E2E Dispatch Example
User-facing backend-native call:
use ;
use FFT64Avx;
let module = new;
module.vec_znx_add_into_backend;
Delegate in poulpy-hal:
Backend implementation (AVX keeps defaults unless it overrides):
unsafe
Default in poulpy-cpu-ref:
Host Views vs Backend Views
As a rule of thumb:
- use *ToBackendRef / *ToBackendMut in public HAL-facing compute APIs, including when the backend itself is host-resident
- treat to_ref() / to_mut() as host-view escape hatches, not as the normal API for generic backend code
Examples of legitimate host-side use:
- serialization and deserialization
- encoding / decoding helpers
- reference arithmetic that directly manipulates &[i64]
- tests that compare host-materialized values
Interfacing a device backend with the host should happen through backend transfer hooks such as from_host_bytes, to_host_bytes, copy_from_host, and copy_to_host, or through higher-level upload_* / download_* APIs built on top of them.
Examples of backend-native use:
- VecZnx -> VecZnxDft
- ScalarZnx -> SvpPPol
- MatZnx -> VmpPMat
- pointwise ops in prepared domains
- backend scratch allocation and subview carving
Backend Interoperability
Backends are also expected to define how values move between host memory and backend-owned storage.
At the raw buffer level, every backend implements:
- Backend::from_host_bytes
- Backend::to_host_bytes
- Backend::copy_from_host
- Backend::copy_to_host
These are the fundamental upload/download hooks used to move layout storage across the host/backend boundary. For example:
let gpu_buf = from_host_bytes;
let roundtrip = to_host_bytes;
For cross-backend buffer transfer, poulpy-hal provides TransferFrom<From>. This is destination-owned: the destination backend declares how to import a source backend buffer.
The default implementation only covers simple host-resident Vec<u8> backends. Device backends are expected to add explicit impls for the source backends they support.
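The destination-owned shape of such a transfer trait can be sketched as follows. This is a toy model, not the crate's definition: the method name transfer_buf, the Src parameter name, the Host and FakeDevice backends, the DeviceBuf type and the free function transfer are all hypothetical, and the simulated device simply wraps a byte vector. Only the TransferFrom, from_host_bytes and to_host_bytes names come from the text above, and their signatures here are assumptions.

```rust
/// Toy backend with the two raw upload/download hooks (assumed signatures).
pub trait Backend: 'static {
    type OwnedBuf;
    fn from_host_bytes(bytes: &[u8]) -> Self::OwnedBuf;
    fn to_host_bytes(buf: &Self::OwnedBuf) -> Vec<u8>;
}

/// Destination-owned transfer: the destination backend (Self) declares
/// how to import a buffer owned by the source backend `Src`.
pub trait TransferFrom<Src: Backend>: Backend {
    fn transfer_buf(src: &Src::OwnedBuf) -> Self::OwnedBuf;
}

/// Host-resident backend: storage is a plain `Vec<u8>`.
pub struct Host;
impl Backend for Host {
    type OwnedBuf = Vec<u8>;
    fn from_host_bytes(bytes: &[u8]) -> Vec<u8> { bytes.to_vec() }
    fn to_host_bytes(buf: &Vec<u8>) -> Vec<u8> { buf.clone() }
}

/// Opaque storage of a simulated device; callers cannot index into it.
pub struct DeviceBuf { bytes: Vec<u8> }

pub struct FakeDevice;
impl Backend for FakeDevice {
    type OwnedBuf = DeviceBuf;
    fn from_host_bytes(bytes: &[u8]) -> DeviceBuf { DeviceBuf { bytes: bytes.to_vec() } }
    fn to_host_bytes(buf: &DeviceBuf) -> Vec<u8> { buf.bytes.clone() }
}

/// The device declares how it imports host buffers (an upload).
impl TransferFrom<Host> for FakeDevice {
    fn transfer_buf(src: &Vec<u8>) -> DeviceBuf { FakeDevice::from_host_bytes(src) }
}

/// The host declares how it imports device buffers (a download).
impl TransferFrom<FakeDevice> for Host {
    fn transfer_buf(src: &DeviceBuf) -> Vec<u8> { FakeDevice::to_host_bytes(src) }
}

/// Generic helper: move a buffer from `Src` to any `Dst` that can import it.
pub fn transfer<Src: Backend, Dst: TransferFrom<Src>>(
    src: &Src::OwnedBuf,
) -> <Dst as Backend>::OwnedBuf {
    Dst::transfer_buf(src)
}
```

Because the impl lives with the destination, a pair of backends with no TransferFrom impl between them fails at compile time rather than at run time, at least in this sketch.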
At the structured layout level, the canonical upload_* / download_* APIs live one layer above, in poulpy-core::api::ModuleTransfer. Those methods are built on top of TransferFrom and let modules move typed values such as GLWE, LWE, GGLWE, GGSW, and prepared keys between backends.
In practice:
- use from_host_bytes / to_host_bytes when you need a low-level buffer bridge
- use TransferFrom when implementing backend-to-backend storage movement
- use ModuleTransfer::upload_* / download_* in higher-level code that moves full typed objects between backends
Tests
A fully generic cross-backend test suite is available in src/test_suite.