Crate uni_plugin_pyo3

PyO3 live-callable plugin loader for uni-db.

This crate bridges Python callables held in the host process to the uni_plugin trait surfaces. Plugins are session-scoped by default (Scope::Session) and run at host process privilege (no sandbox). Capabilities are declared metadata and enforced at the PluginRegistrar gate; there is no structural sandbox layer for PyO3 because the callable is a live Python object in the host process.

§Crate layout

  • error: PyPluginError, the structured error type. From<PyErr> captures Python tracebacks under the GIL.
  • arrow_bridge: Arrow ↔ PyArrow zero-copy via the Arrow PyCapsule Interface. No pyo3-arrow dependency.
  • Manifest / loader / adapters land in later M8 sub-milestones.
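The error conversion described for the error module can be pictured with a small standard-library sketch. Everything named here is hypothetical: FakePyErr stands in for a Python exception, SketchPluginError stands in for the crate's error enum (whose real variants are not shown on this page), and call_plugin is an illustrative helper. The point is only that a From impl lets the ? operator carry the traceback text across the boundary.

```rust
use std::fmt;

/// Hypothetical stand-in for a Python exception as seen from Rust.
/// A real loader would receive a PyO3 error value instead.
pub struct FakePyErr {
    pub type_name: String,
    pub message: String,
    pub traceback: Option<String>,
}

/// Hypothetical structured error; not the crate's actual PyPluginError.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchPluginError {
    Python {
        type_name: String,
        message: String,
        traceback: Option<String>,
    },
}

impl From<FakePyErr> for SketchPluginError {
    /// Copies the exception details, including the traceback text,
    /// into an owned value that no longer depends on the interpreter.
    fn from(err: FakePyErr) -> Self {
        SketchPluginError::Python {
            type_name: err.type_name,
            message: err.message,
            traceback: err.traceback,
        }
    }
}

impl fmt::Display for SketchPluginError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchPluginError::Python { type_name, message, .. } => {
                write!(f, "{}: {}", type_name, message)
            }
        }
    }
}

impl std::error::Error for SketchPluginError {}

/// Illustrative call site: `?` applies the From impl automatically.
pub fn call_plugin(fail: bool) -> Result<f64, SketchPluginError> {
    let raw: Result<f64, FakePyErr> = if fail {
        Err(FakePyErr {
            type_name: "ValueError".to_string(),
            message: "bad input".to_string(),
            traceback: Some("File \"udf.py\", line 3, in f".to_string()),
        })
    } else {
        Ok(1.0)
    };
    Ok(raw?)
}
```

Capturing the traceback as an owned String at conversion time means the error can be logged or returned later without touching the interpreter again.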

§Loader execution model

Two GIL strategies map onto the proposal’s two modes:

  • Vectorized (vectorized=True): one Python::with_gil per RecordBatch. Each input column is marshaled to a pyarrow Array via the PyCapsule protocol (zero-copy); the user fn runs once per batch and returns a pyarrow Array; the result is marshaled back to Arrow. Recommended ceiling: ~5M+ rows/sec on trivial Float64 fns over 8192-row batches.

  • Row-by-row (vectorized=False): still one Python::with_gil per batch, because the GIL is held across all rows in a batch (design decision #6 in plans/magical-rolling-pinwheel.md). Inside the closure, the host iterates over the rows and calls the Python fn once per row with native PyObject args. Approximate ceiling: ~100k rows/sec.
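The difference between the two modes is how often the user fn is called, not how often the GIL is taken. The following standard-library sketch models that, under stated assumptions: FAKE_GIL is a plain Mutex standing in for the GIL (no PyO3 calls are made), Rust closures stand in for the Python callables, and Stats, run_vectorized and run_row_by_row are hypothetical names, not part of this crate.

```rust
use std::sync::Mutex;

/// Stand-in for the Python GIL. A real loader would acquire the GIL
/// through PyO3; a Mutex is used here so the sketch needs no dependencies.
static FAKE_GIL: Mutex<()> = Mutex::new(());

/// Counters used to compare the two strategies.
#[derive(Debug, Default, PartialEq)]
pub struct Stats {
    pub gil_acquisitions: usize,
    pub fn_calls: usize,
}

/// Vectorized mode: one lock acquisition and one fn call per batch.
pub fn run_vectorized<F>(batches: &[Vec<f64>], f: F) -> (Vec<Vec<f64>>, Stats)
where
    F: Fn(&[f64]) -> Vec<f64>,
{
    let mut stats = Stats::default();
    let mut out = Vec::with_capacity(batches.len());
    for batch in batches {
        let _gil = FAKE_GIL.lock().unwrap();
        stats.gil_acquisitions += 1;
        // The whole column is handed to the fn in a single call.
        out.push(f(batch));
        stats.fn_calls += 1;
    }
    (out, stats)
}

/// Row-by-row mode: still one lock acquisition per batch,
/// but one fn call per row while the lock is held.
pub fn run_row_by_row<F>(batches: &[Vec<f64>], f: F) -> (Vec<Vec<f64>>, Stats)
where
    F: Fn(f64) -> f64,
{
    let mut stats = Stats::default();
    let mut out = Vec::with_capacity(batches.len());
    for batch in batches {
        let _gil = FAKE_GIL.lock().unwrap();
        stats.gil_acquisitions += 1;
        let mut column = Vec::with_capacity(batch.len());
        for &row in batch {
            column.push(f(row));
            stats.fn_calls += 1;
        }
        out.push(column);
    }
    (out, stats)
}
```

For two batches of three and two rows, both functions take the lock twice; the vectorized path makes two fn calls and the row-by-row path makes five. In the real loader each of those calls crosses into Python, which is consistent with the gap between the two throughput ceilings quoted above.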

Both modes serialize on the GIL, so a multi-partition DataFusion scan with a PyO3 UDF collapses to single-core throughput. This is the dominant operational concern with PyO3 UDFs and is documented in the proposal at §5.4.1. Mitigations (sub-interpreter parallelism, free-threading) are deferred to follow-up milestones.
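The serialization effect can be observed with a small probe. This is a sketch under assumptions, not the loader's code: a shared Mutex stands in for the GIL, each thread stands in for one scan partition, and max_concurrency_under_shared_lock is a hypothetical helper. It records how many threads are ever inside the locked section at the same time.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `partitions` worker threads that each process
/// `batches_per_partition` batches under one shared lock.
/// Returns (maximum number of threads observed inside the locked
/// section at once, total number of batches processed).
pub fn max_concurrency_under_shared_lock(
    partitions: usize,
    batches_per_partition: usize,
) -> (usize, usize) {
    let fake_gil = Arc::new(Mutex::new(())); // stand-in for the GIL
    let inside = Arc::new(AtomicUsize::new(0));
    let max_inside = Arc::new(AtomicUsize::new(0));
    let processed = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..partitions)
        .map(|_| {
            let fake_gil = Arc::clone(&fake_gil);
            let inside = Arc::clone(&inside);
            let max_inside = Arc::clone(&max_inside);
            let processed = Arc::clone(&processed);
            thread::spawn(move || {
                for _ in 0..batches_per_partition {
                    let _guard = fake_gil.lock().unwrap();
                    // Track how many workers are in the critical section.
                    let now = inside.fetch_add(1, Ordering::SeqCst) + 1;
                    max_inside.fetch_max(now, Ordering::SeqCst);
                    // The per-batch UDF work would run here.
                    thread::yield_now();
                    processed.fetch_add(1, Ordering::SeqCst);
                    inside.fetch_sub(1, Ordering::SeqCst);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    (
        max_inside.load(Ordering::SeqCst),
        processed.load(Ordering::SeqCst),
    )
}
```

However many partitions are spawned, the first returned value stays at 1: all batches are processed, but never more than one at a time, which is the single-core collapse described above.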

Modules§

error
Error types for the PyO3 loader.

Enums§

PyPluginError
Errors specific to the PyO3 loader.