# coren
Compute and resource normalization.
Measures what your machine can do. Built for
[irohds](https://codeberg.org/cab/irohds) -- to tell you whether a computation
is faster to run locally or fetch from the network.
Any two machines looking at the same function can independently agree on
how much work it requires (an **_almost_** deterministic op count). Each machine knows
its own capabilities (benchmarked once at startup). The verdict is local
arithmetic: compare estimated compute time against estimated fetch time.
Integration into [irohds](https://codeberg.org/cab/irohds), a decentralized
memoization system for scientific computing, was the initial reason
for making this package. It is also useful as a standalone tool for
roofline analysis, ETL buffer sizing, and build parallelism decisions.
## Install
```sh
uv add coren
```
Or from source:
```sh
uv run maturin develop --features python
```
## Usage
```python
from coren import FnCost, MachCap
# Describe the work (deterministic, same on every machine)
cost = FnCost.sort(1_000_000, 64, 64_000_000)
# Measure this machine (benchmarks run once, cached)
cap = MachCap.read()
# Get the answer
v = cap.verdict(cost)
print(v) # "compute (saves 0.712s)" or "fetch (saves 2.3s)"
if v.should_fetch():
download_result()
else:
compute_locally()
```
## How it works
Two layers:
**FnCost** describes a function's resource requirements in absolute
physical units. Four integers: `ops` (total arithmetic operations),
`mem_bytes` (memory traffic), `peak_mem` (peak RAM footprint),
`result_bytes` (output size). These are properties of the algorithm
and its inputs. Bitwise identical on every machine.
**MachCap** describes what this machine can do. Measured via
micro-benchmarks (FMA throughput, STREAM triad, disk sequential I/O)
and OS queries (NIC link speed, battery state, core count, RAM).
Produces a roofline model: peak ops/s and memory bandwidth.
The **verdict** compares estimated compute time (from the roofline
model) against estimated fetch time (result_bytes / NIC bandwidth).
The score is the difference in seconds: positive means fetch is
faster, negative means compute is faster. Infinity means one option
is impossible (no RAM, or no network).
## FnCost constructors
```
FnCost.new(ops, mem_bytes, peak_mem, result_bytes) raw values
FnCost.scan(n_bytes, result_bytes) linear scan
FnCost.sort(n, item_bytes, result_bytes) merge sort
FnCost.hash(n_bytes) crypto hash
FnCost.matmul(m, n, k, result_bytes) dense GEMM
FnCost.etl(rows, row_bytes, ops_per_row, result_bytes) row processing
FnCost.copy(size) file copy (ops=0)
```
## Combinators
```python
a.then(b) # sequential: ops sum, peak_mem = max, result = b's output
a + b # same as then
a.par(b) # parallel: ops = max, peak_mem sums, result sums
a.repeat(k) # k iterations, peak_mem unchanged
```
## Normalizing wall-clock measurements
When a function is executed and you only know the wall-clock time (not
the algorithmic complexity), `MachCap.normalize()` converts the
measurement into a FnCost suitable for local verdict computation.
**WARNING: `normalize()` output is NOT deterministic across machines.**
Different machines produce different ops/mem_bytes values for the same
function. Do NOT use `normalize()`-produced FnCost as cache keys,
content addresses, or any identifier that must match across peers.
For cache keys, use the static constructors (`sort`, `hash`, `matmul`,
etc.) or `FnCost.new()` with values derived from the function's
definition and parameters.
```python
cost = cap.normalize(compute_ns=15_000_000_000, peak_mem=4_000_000_000, result_bytes=500_000_000)
# cost is a safe overestimate, suitable for verdict() but NOT for cache keys
```
## CLI
```sh
$ coren
coren [desktop]
cores 8p / 16l
frequency 3600 MHz
roofline (measured)
peak (all cores) 89.2 GFLOPS
mem bandwidth 38.1 GB/s
disk bandwidth 1.52 GB/s
ridge point 2.34 ops/byte
verdicts (what should this machine do?)
task ops Q R score neck action
sort 1M x 64B 200.0M 1.2GB 10.0MB -0.712 memory compute
matmul 1k^3 2.0G 22.9MB 10.0MB -0.779 compute compute
$ coren --json # machine-readable output
```
## Rust
```rust
use coren::{FnCost, MachCap};
let cost = FnCost::sort(1_000_000, 64, 64_000_000);
let cap = MachCap::read(".");
let v = cap.verdict(&cost);
if v.should_fetch() {
// download from peer
} else {
// compute locally
}
```
## License
MIT OR Apache-2.0