☁ Puff ☁
The deep stack framework.
What is Puff?
Puff is a batteries-included "deep stack" for Python. It's an experiment to minimize the barrier between Python and Rust and unlock the full potential of high-level languages. Build your own runtime using standard CPython and extend it with Rust. Imagine if GraphQL, Postgres, Redis and PubSub were part of the standard library. That's Puff.
The old approach to integrating Rust with Python is to make a Python package that uses Rust and import it from Python. This approach has a flaw: the Rust packages can't cooperate with each other. Puff gives Rust its own layer, so you can build a cohesive set of tools in Rust that all work flawlessly together without having to re-enter Python.
At a high level, Puff gives Python:
- Greenlets on Rust's Tokio.
- High performance HTTP Server - combine Axum with Python WSGI apps (Flask, Django, etc.)
- Rust / Python natively in the same process, no sockets or serialization.
- AsyncIO / uvloop / ASGI integration with Rust
- An easy-to-use GraphQL service
- Multi-node pub-sub
- Rust level Redis Pool
- Rust level Postgres Pool
- Websockets
- HTTP Client
- Distributed, at-least-once, priority and scheduled task queue
- Semi-compatible with Psycopg2 (hopefully good enough for most of Django)
- A safe, convenient way to drop into Rust for maximum performance
The idea is that Rust and Python are near-perfect complements to each other, and building a framework that lets them talk leads to greater efficiency in terms of productivity, scalability and performance.
| Python | Rust |
|---|---|
| ✅ High-Level | ✅ Low-Level |
| ✅ Lots of tools and packages | ✅ Lots of tools and packages |
| ✅ Easy to get started | ✅ Easy to get started |
| 🟡 Interpreted (productivity at the cost of speed) | 🟡 Compiled (speed at the cost of productivity) |
| ✅ Easy to master | ❌ The learning curve gets steep quickly. |
| ✅ Fast iteration to prototype | ❌ Requires planning for correctness |
| ✅ Google a problem, copy paste, it works. | ❌ Fewer examples floating in the wild |
| ❌ Weak type system | ✅ Great Type System |
| ❌ GIL prevents parallel threads | ✅ High Performance |
| ❌ Not so safe | ✅ Safe |
The Zen of deepstack is recognizing that no language is the ultimate answer. Seek progress instead of perfection by using Python for rapid development and Rust to optimize the most critical paths once you find them later. Find the balance.
Quick Start
Puff requires Python >= 3.10 and Rust / Cargo. Python's Poetry is optional.
Your Rust Puff Project needs to find your Python project. Even if they are in the same folder, they need to be added to the PYTHONPATH.
One way to set up a Puff project is like this:
Then add the Puff plugin for Poetry:
[tool.poetry.scripts]
run_cargo = "puff.poetry_plugins:run_cargo"
Now from my_puff_proj_py you can run your project with poetry run run_cargo to access cargo from poetry and expose the virtual environment to Puff.
The Python project doesn't need to be inside of your Rust package. It only needs to be on the PYTHONPATH. If you don't want to use Poetry, you will have to set up a virtual environment if needed and set PYTHONPATH when running Puff.
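As a quick way to see what "being on the PYTHONPATH" means in practice, here is a minimal sketch in plain Python. It does not use Puff at all, and the module name `puff_demo_app` is a hypothetical placeholder. Entries from PYTHONPATH end up on `sys.path` at interpreter startup, so inserting a directory into `sys.path` mimics that effect.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def is_importable(module_name):
    """Return True when the interpreter can locate the module on sys.path."""
    return importlib.util.find_spec(module_name) is not None


with tempfile.TemporaryDirectory() as project_dir:
    # Hypothetical Python project containing a single module.
    Path(project_dir, "puff_demo_app.py").write_text(
        "def hello_world():\n    return 'hi'\n"
    )

    before = is_importable("puff_demo_app")  # directory not on the path yet

    sys.path.insert(0, project_dir)  # mimics adding the directory to PYTHONPATH
    importlib.invalidate_caches()
    after = is_importable("puff_demo_app")

    sys.path.remove(project_dir)  # leave the interpreter as we found it
```

Running the same kind of check against your own module name inside the environment that launches Puff can help confirm that the Rust program will be able to find the Python project.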
Puff ♥ Python
Python programs in Puff are run by building a Program in Rust and registering the Python function there.
The Python method is bootstrapped and run as a greenlet in the Puff runtime.
use *;
use PythonCommand;
Python:
# Standard python functions run on Puff greenlets. You can only use special Puff async functions without pausing other greenlets.
=
= # Puff async function that runs in Tokio.
= # Python blocking that spawns a thread to prevent pausing the greenlet thread.
# 100% of Python packages are compatible by wrapping them in the blocking decorator.
return
Puff ♥ Rust
The primary feature of Puff is the seamless ability to go from Python into Rust with little configuration.
This makes the full Rust ecosystem available to your Python program with very little integration overhead or performance degradation.
use *;
use PythonCommand;
// Use pyo3 to generate Python compatible Rust classes.
;
Python:
=
# Rust functions that execute async code need to be passed a special return function.
Puff ♥ Django
While it can run any WSGI app, Puff has a special affection for Django. Puff believes that business logic should be implemented on a higher level layer and Rust should be used as an optimization. Django is a perfect high level framework to use with Puff as it handles migrations, admin, etc. Puff mimics the psycopg2 drivers and cache so that Django uses the Puff Database and Redis pool.
Transform your sync Django project into a highly concurrent Puff program with a few lines of code. Puff wraps the management commands so migrate, etc. all work as expected. Simply run poetry run run_cargo django [command] instead of using ./manage.py [command]. For example, poetry run run_cargo django migrate. Don't use Django's dev server; instead use Puff's with poetry run run_cargo runserver.
use DjangoManagementCommand;
use PytestCommand;
use WSGIServerCommand;
use *;
Use Puff everywhere in your Django app. Even create Django management commands that use Rust!
See puff-py repo for a more complete Django example.
Puff ♥ GraphQL
Puff exposes GraphQL Mutations, Queries and Subscriptions based on Python class definitions. A core "killer feature" of the Puff GraphQL engine is that it works on a "layer basis" instead of a node basis. This allows each step of GraphQL to gather the complete set of keys it needs and query all of its data at once. This avoids the dreaded n+1 problem and the dataloader overhead traditionally associated with GraphQL.
GraphQL Python functions can pass off pure SQL queries to Puff, and Puff will render and transform the query without needing to return to Python. This allows the Python GraphQL interface to be largely IO free, but still flexible enough to access Puff resources when needed.
=
=
:
:
:
:
:
:
# Extract column values from the previous layer to use in this one.
=
=
# Returning a SQL query along with two lists allows you to correlate and join the parent with the child.
return ..., , , ,
# Return a Raw query for Puff to execute in Postgres.
# The ellipsis is a placeholder allowing the Python type system to know which Field type it should transform into.
return ..., ,
=
# Return some normal Python objects.
return ...,
# Get a new connection identifier for pubsub.
return
# Authorization bearer token passed in the context
return
:
:
:
=
=
= 0
=
# Filter out messages from yourself.
yield
+= 1
:
:
:
Rust:
use ServerCommand;
use ;
use *;
Produces a GraphQL schema like so:

In addition to making it easier to write the fastest queries, a layer-based design allows Puff to fully exploit the multithreaded async Rust runtime and solve branches independently. This gives you a performance advantage out of the box.
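To make the layer idea concrete, here is a minimal sketch in plain Python. It is not Puff's API: the in-memory `COMMENTS` table, `fetch_comments_for` and `resolve_layer` are hypothetical stand-ins for Postgres and the GraphQL engine. It shows one batched lookup serving an entire layer of parents, with a parent key and a child key used to correlate the rows, instead of one lookup per parent node.

```python
from collections import defaultdict

# Hypothetical in-memory "table" standing in for Postgres.
COMMENTS = [
    {"id": 1, "post_id": 10, "text": "first"},
    {"id": 2, "post_id": 11, "text": "second"},
    {"id": 3, "post_id": 10, "text": "third"},
]

QUERY_COUNT = 0  # counts how many "queries" the sketch issues


def fetch_comments_for(post_ids):
    """One batched lookup for a whole layer (think: WHERE post_id = ANY(...))."""
    global QUERY_COUNT
    QUERY_COUNT += 1
    wanted = set(post_ids)
    return [row for row in COMMENTS if row["post_id"] in wanted]


def resolve_layer(parents, parent_key, child_key):
    """Attach children to every parent in the layer using a single query."""
    children = fetch_comments_for([parent[parent_key] for parent in parents])
    by_parent = defaultdict(list)
    for child in children:
        by_parent[child[child_key]].append(child)
    return [{**parent, "comments": by_parent[parent[parent_key]]} for parent in parents]


posts = [{"id": 10}, {"id": 11}, {"id": 12}]
resolved = resolve_layer(posts, parent_key="id", child_key="post_id")
```

A node-based resolver would have issued one lookup per post here (three in total); the layer-based version issues one regardless of how many parents are in the layer.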
Puff ♥ Pytest
Integrate with pytest to easily test your GraphQL and Puff apps. Simply add the PytestCommand to your Program and write tests as normal, but run them with cargo run pytest.
=
assert ==
=
=
assert in
assert not in
assert ==
assert == 3
Puff ♥ AsyncIO
Puff has built-in integrations for ASGI and asyncio. You first need to configure the RuntimeConfig to use it. Puff will automatically use uvloop if installed when starting the event loop.
asgiref.sync.async_to_sync and asgiref.sync.sync_to_async have both been patched so that you can easily call Puff greenlets from async code, or async code from Puff greenlets.
=
=
= await
return
Rust
use *;
use ASGIServerCommand;
// Use pyo3 to generate Python compatible Rust classes.
;
// Basic handler that responds with a static string
async
Puff ♥ Django + GraphQL
Puff GraphQL integrates seamlessly with Django. Convert Django querysets to SQL to offload all computation to Rust. Or decorate with borrow_db_context and let Django have access to the GraphQL connection, allowing you to fall back on the robustness of Django for complicated lookups.
:
:
:
:
:
:
:
# Extract column values from the previous layer to use in this one.
=
# Convert a Django queryset to sql and params to pass off to Puff. This function does 0 IO in Python.
=
, =
return ..., , , ,
# Convert a Django queryset to sql and params to pass off to Puff. This function does 0 IO in Python.
=
, =
return ..., ,
# Decorate with borrow_db_context to use same DB connection in Django as the rest of GQL
# You can also compute the python values with Django and hand them off to Puff.
# This version of the same `questions` field, is slower since Django is constructing the objects.
=
return ...,
# Decorate with borrow_db_context to use same DB connection in Django as the rest of GQL
=
return
pass
:
:
:
Puff ♥ Distributed Tasks
Sometimes you need to execute a function in the future, or you need to execute it, but you don't care about the result right now. For example, you might have a webhook or an email to send.
Puff provides a distributed queue abstraction as part of the standard library. It is powered by Redis and has the ability to distribute tasks across nodes with priorities, delays and retries. Jobs submitted to the queue can be persisted (additionally so if Redis is configured to persist to disk), so you can shut down and restart your server without worrying about losing your queued functions.
The distributed queue runs in the background of every Puff instance. In order to have a worker instance, use the WaitForever command. Your HTTP server can also handle distributing, processing and running background tasks which is handy for small projects and scales out well by using wait_forever to add more processing power if needed.
A task is a Python function that takes a JSONable payload. You care that it executes, but you don't care exactly when or where. JSONable types are simple Python structures (dicts, lists, strings, etc.) that can be serialized to JSON. Queues monitor tasks and retry them if they don't produce a result within timeout_ms. Beware that the same task might run multiple times if you don't configure timeouts correctly, so if a task sends HTTP requests or does other work that might take a while to respond, configure its timeout accordingly. Tasks should return a JSONable result, which will be kept for keep_results_for_ms milliseconds.
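The two constraints above can be illustrated with a small sketch in plain Python. Nothing here is Puff's API: `is_jsonable` is a hypothetical helper, and `attempts_started` is a deliberately simplified model (a new attempt starts every `timeout_ms` until the first attempt has finished, capped by a made-up `max_attempts`) of how an at-least-once queue ends up running the same task several times when the timeout is shorter than the task.

```python
import json


def is_jsonable(payload):
    """Return True when the payload survives JSON serialization."""
    try:
        json.dumps(payload)
    except (TypeError, ValueError):
        return False
    return True


def attempts_started(task_duration_ms, timeout_ms, max_attempts=5):
    """Simplified model: count the attempts started before the first one finishes.

    Assumption for illustration only: the queue starts a fresh attempt every
    timeout_ms while no result has arrived, up to max_attempts.
    """
    attempts = 0
    next_start_ms = 0
    while attempts < max_attempts and next_start_ms < task_duration_ms:
        attempts += 1
        next_start_ms += timeout_ms
    return attempts


good_payload = is_jsonable({"to": "user@example.com", "ids": [1, 2, 3]})
bad_payload = is_jsonable({"ids": {1, 2, 3}})  # sets are not JSON serializable

well_configured = attempts_started(task_duration_ms=500, timeout_ms=1000)
too_short = attempts_started(task_duration_ms=2500, timeout_ms=1000)
```

In this model a 2.5 second task with a 1 second timeout is started three times, which is why slow tasks such as outbound HTTP calls need a timeout comfortably above their expected duration.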
Only pass top-level functions that can be imported into schedule_function (no lambdas or closures). The function must be accessible on all Puff instances.
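One way to reason about this rule is that a worker on another node has to find the function again by name. The sketch below is a hypothetical pre-flight check in plain Python, not something Puff provides: `is_schedulable` and `dotted_path` are illustrative names, and the check simply verifies that the function can be looked up again from its module by its qualified name.

```python
import importlib
import json  # used only as a known top-level, importable function source


def dotted_path(func):
    """Name another process could use to find the function again."""
    return f"{func.__module__}.{func.__qualname__}"


def is_schedulable(func):
    """True when func is a top-level function that can be re-imported by name."""
    qualname = getattr(func, "__qualname__", "")
    module_name = getattr(func, "__module__", None)
    if not module_name or "<lambda>" in qualname or "<locals>" in qualname:
        return False
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return getattr(module, qualname, None) is func


def outer():
    # A closure: its qualified name contains "<locals>", so it cannot be imported.
    def inner(payload):
        return payload

    return inner


checks = {
    "top_level": is_schedulable(json.dumps),
    "lambda": is_schedulable(lambda payload: payload),
    "closure": is_schedulable(outer()),
}
path = dotted_path(json.dumps)
```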
Implement priorities with scheduled_time_unix_ms. The worker sorts all tasks by this value and executes the earliest one whose value is not later than the current time. So if you schedule a function with scheduled_time_unix_ms=1, it will be the next to execute at the first availability. Use scheduled_time_unix_ms=1, scheduled_time_unix_ms=2, scheduled_time_unix_ms=3, etc. for different high-priority task types. Be careful not to starve the other tasks if you aren't processing these high-priority tasks fast enough. By default, Puff schedules new tasks with the current unix time to be "fair" and provide a sense of "FIFO" order. You can also set this value to a unix timestamp in the future to delay execution of a task.
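The ordering rule can be sketched with a small in-memory queue. This is an illustration under assumptions, not Puff's implementation (which is backed by Redis): `ScheduledQueue`, `schedule` and `pop_due` are hypothetical names, and ties are broken by insertion order only to keep the sketch deterministic, whereas equal-priority tasks in Puff may run out of order.

```python
import heapq
import itertools


class ScheduledQueue:
    """Toy queue ordered by scheduled_time_unix_ms (hypothetical, in-memory)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal timestamps

    def schedule(self, name, scheduled_time_unix_ms):
        entry = (scheduled_time_unix_ms, next(self._counter), name)
        heapq.heappush(self._heap, entry)

    def pop_due(self, now_unix_ms):
        """Return the earliest task that is due at now_unix_ms, or None."""
        if self._heap and self._heap[0][0] <= now_unix_ms:
            return heapq.heappop(self._heap)[2]
        return None


now = 1_700_000_000_000  # an arbitrary "current" unix time in milliseconds

queue = ScheduledQueue()
queue.schedule("send_email", now)               # default: current time, FIFO-like
queue.schedule("delayed_report", now + 60_000)  # delayed by one minute
queue.schedule("urgent_webhook", 1)             # tiny timestamp jumps the queue

order = [queue.pop_due(now), queue.pop_due(now), queue.pop_due(now)]
later = queue.pop_due(now + 60_000)
```

The urgent task runs first even though it was scheduled last, the delayed task is skipped until its timestamp has passed, and a steady stream of timestamp-1 tasks would keep `send_email` waiting, which is the starvation risk mentioned above.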
You can have as many tasks running as you want (use set_task_queue_concurrent_tasks); however, increasing this value adds a small overhead for monitoring and finding new tasks. The default is num_cpu x 4.
See additional design patterns in Building RPC with Puff.
=
=
# Schedule some tasks on any coroutine thread of any Puff instance connected through Redis.
=
# Override `scheduled_time_unix_ms` so that async tasks execute with priority over the coroutine tasks.
# Notice that since all of these tasks have the same priority, they may be executed out of the order they were scheduled.
=
# These tasks will keep their order since their priorities as defined by `scheduled_time_unix_ms` match the order scheduled.
=
=
return
return
Rust
use *;
use ;
Puff ♥ HTTP
Puff has a built-in asynchronous HTTP client based on reqwest that can handle HTTP2 (also served by the Puff WSGI/ASGI integrations) and reuse connections. It uses Rust to encode and decode JSON ultra-fast.
=
= await
return await
"""greenlets can use the same async functions. Puff will automatically handle awaiting and context switching."""
=
return
You can set the HTTP client options through RuntimeConfig. If your program is only talking to other Puff instances or HTTP2 services, it can make sense to turn on HTTP2 only. You can also configure user-agents as well as many other HTTP options through this method.
use RuntimeConfig;
use ClientBuilder;
// Force HTTP2
default
.set_http_client_builder_fn;
FAQ
Why a monolithic project?
Puff follows the Django model of having everything you need built in. A modern SaaS app expects an HTTP server, Postgres, Redis, PubSub and an API. Instead of saying that the default is an environment with none of these, Puff takes a practical approach and says that it is actually more of an edge case not to need those resources. Eventually these dependencies should be configurable via feature flags.
While it has a heavy upfront compilation cost, with the following config the binary for the Puff runtime ends up being around 4 MB.
[profile.release]
opt-level = 3
strip = true
debug = false
codegen-units = 1
lto = true
Architecture
Puff consists of a multithreaded Tokio runtime and a single thread which runs all Python computations on greenlets. Python offloads the IO to Tokio, which schedules it and returns the result if necessary.

Why is the greenlet environment single-threaded?
Only one thread can hold the GIL at any particular time. All IO is done outside the GIL in the Rust layer, so the greenlets use the GIL efficiently most of the time. Adding more Python threads will not increase performance; however, you can dispatch blocking greenlets, which run on their own thread, if you need to do blocking IO work with the Python standard library or 3rd party packages.
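As an analogy only, the sketch below uses plain asyncio rather than Puff greenlets, with `asyncio.to_thread` standing in for Puff's blocking dispatch. It shows the same pattern: cooperative code stays on one thread, and a blocking call is pushed to a separate worker thread so it does not stall everything else. `blocking_io` and `ticker` are hypothetical names for illustration.

```python
import asyncio
import threading
import time


def blocking_io():
    """Blocking call (e.g. a library that is not async-aware)."""
    time.sleep(0.05)
    return threading.current_thread().name


async def ticker(ticks):
    """Cooperative work that should keep running while the blocking call waits."""
    for _ in range(5):
        ticks.append(threading.current_thread().name)
        await asyncio.sleep(0.005)


async def main():
    ticks = []
    # The blocking call runs on a worker thread; the ticker keeps the loop busy.
    worker_thread, _ = await asyncio.gather(
        asyncio.to_thread(blocking_io),
        ticker(ticks),
    )
    return worker_thread, ticks


worker_thread, ticks = asyncio.run(main())
loop_thread = threading.current_thread().name
```

All ticks are recorded on the thread running the event loop, while the blocking call reports a different thread name, which mirrors the split between the single greenlet thread and dedicated blocking threads described above.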
What is a Deep Stack Framework?
Currently, you have the frontend and backend that make up your "full stack". Deep stack is about safely controlling the runtime that your full stack app executes on. Think of an ASGI or WSGI server, probably written in C or another low-level language, that executes your higher-level Python backend code. Deep stack is about giving you full (and safe) control over that lower-level server to run your higher-level operations. It's about aggressively embracing that different levels of languages have different pros and cons.

Deep Stack Teams
The thesis of the deepstack project is to have two backend engineering roles: Scale and Deepstack. Deepstack engineers' primary goal should be to optimize and abstract the hot paths of the application into Rust to enable the Scale engineers who implement business logic in Python to achieve an efficient ratio between productivity, safety, and performance. Nothing is absolute and the decision to move code between Rust and Python is a gradient rather than binary.
Benefits of Deep Stack
- Control High performance async Rust Computations with Flexible Python Abstractions.
- Maximum Performance: Only enter the Python GIL to get the query and parameters and exit to execute the query and compute the results in Rust.
- Django compatible: Full Admin, Migrations, Views, Tests, etc...
- Axum Compatible: All extractors are ready to be used.
- Rapid iteration on the data control layer (Python / Django / Flask) without total recompilation of the deep stack layer.
- Quickly scale the Flexibility of Python with the Performance and Safety of Rust.
Performance
Right now there hasn't been much focus on raw performance in GraphQL, because ultimately performance comes from SQL query optimizations (indexes, no n+1, etc.). Puff's structure encourages you to write your queries on a layer basis without having to rely on dataloaders or complicated optimizers, allowing you to directly express the proper SQL. Ultimately the performance of the GQL server depends on how well your queries fit the indexes and structure of your DB.
Puff won't magically make WSGI faster, but where Puff really excels is pushing down tight loops or iterations into the Rust Layer. Think about if you wanted to run multiple queries in parallel and perform some computation on them, resulting in Bytes. It's better to push this function into Puff. See the /examples/wsgi.rs::get_many function for an example.
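To show the shape of work being described (several queries run in parallel, then a computation that produces bytes), here is a minimal sketch in plain asyncio. It is not the Rust example referenced above and does not use Puff: `fake_query` and `combine_queries` are hypothetical names, and the "queries" are simulated with short sleeps.

```python
import asyncio


async def fake_query(n):
    """Stand-in for a database query that returns n rows."""
    await asyncio.sleep(0.01)
    return list(range(n))


async def combine_queries(sizes):
    """Run all queries concurrently, then reduce the results to bytes."""
    results = await asyncio.gather(*(fake_query(n) for n in sizes))
    total = sum(sum(rows) for rows in results)
    return str(total).encode()


payload = asyncio.run(combine_queries([3, 4, 5]))
```

In Python, the reduction step over the result rows is a tight loop that holds the GIL; this fan-out-then-compute pattern is the kind of function the paragraph above suggests moving into the Rust layer.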
AsyncIO / uvloop performance is about the same as, if not 10-25% faster than, hypercorn or similar servers when using 1 worker.
When thinking about performance on Deep Stack, remember the extra dimension it makes possible. For example, if you were making a micro-benchmark comparing a static JSON response from FastAPI, it would be more idiomatic for Puff to write the static JSON response as an Axum handler in Rust instead of a FastAPI endpoint. This can deliver a 2-80x performance boost. Specializing a route in Rust is not cheating; it's how it's designed.
Status
This is extremely early in development. The scope of the project is ambitious. Expect things to break.
The end game of Puff is probably to have something like gevent's monkeypatch to automatically make existing projects compatible.