---
description: Throughput and latency benchmarks across routing configurations.
icon: chart-line
layout:
  width: default
  title:
    visible: true
  description:
    visible: true
  tableOfContents:
    visible: true
  outline:
    visible: true
  pagination:
    visible: true
  metadata:
    visible: true
  tags:
    visible: true
---
# Performance
This page documents Fynd solver performance benchmarks across different configurations. All benchmarks use the `fynd-benchmark scale` subcommand, which builds a solver in-process for each worker count, runs a sustained load test, and reports throughput and latency statistics. See [benchmarking.md](guides/benchmarking.md "mention") for how to run these yourself.
All results below were produced using `scripts/bench-remote.sh`, which provisions an EC2 instance, builds the solver from source, and runs the full scaling sweep automatically. Pool configuration files used by each benchmark are in [`tools/benchmark/`](../tools/benchmark). To reproduce the `most_liquid` 2-hop results:
```bash
WORKER_COUNTS="1,2,3,4,6,8" \
NUM_REQUESTS=10000 \
POOL_CONFIG="tools/benchmark/most_liquid_2hop.toml" \
TYCHO_URL="$TYCHO_URL" \
TYCHO_API_KEY="$TYCHO_API_KEY" \
RPC_URL="$RPC_URL" \
bash scripts/bench-remote.sh
```
## CPU Scaling: 2-Hop Routing
Measures how throughput scales with worker thread count for 2-hop route finding using the `most_liquid` algorithm.
### Setup
| Parameter | Value |
| --- | --- |
| Instance | AWS `c7a.8xlarge` (32 vCPU, AMD EPYC) |
| Algorithm | `most_liquid` |
| Max hops | 2 |
| Protocols | `uniswap_v2`, `uniswap_v3`, `uniswap_v4`, `sushiswap_v2`, `pancakeswap_v2`, `pancakeswap_v3`, `ekubo_v2`, `fluid_v1` |
| Requests per iteration | 10,000 |
| Concurrency | `fixed:48` |
| Warmup | 30s after health check |
| Config | [`tools/benchmark/most_liquid_2hop.toml`](../tools/benchmark/most_liquid_2hop.toml) |
### Results
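| Workers | Throughput (req/s) | Median latency (ms) | P99 latency (ms) | Throughput per worker (req/s) |
| --- | --- | --- | --- | --- |
| 1 | 397.19 | 120 | 129 | 397.19 |
| 2 | 743.16 | 64 | 69 | 371.58 |
| 3 | 1035.84 | 46 | 50 | 345.28 |
| 4 | 1444.04 | 33 | 36 | 361.01 |
| 6 | 2109.26 | 22 | 25 | 351.54 |
| 8 | 2820.08 | 16 | 18 | 352.51 |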
### Analysis
Throughput scales nearly linearly across all tested worker counts, at roughly 345-397 req/s per worker. The solver crosses 1000 req/s at **3 workers** (1036 req/s). Latency stays tight throughout: P99 falls from 129ms at 1 worker to 18ms at 8 workers.
**Recommendation.** For `most_liquid` 2-hop routing at 1000 req/s sustained throughput, provision at least 3 CPU cores. Use 4 cores for comfortable headroom.
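The per-worker column and the 3-worker threshold can be recomputed from the throughput figures alone. The following is a minimal sketch, not part of `fynd-benchmark`; the names `per_worker` and `min_workers_for` are hypothetical helpers, and the numbers are copied from the results table above.

```python
# Throughput (req/s) by worker count, copied from the 2-hop most_liquid results.
RESULTS = {1: 397.19, 2: 743.16, 3: 1035.84, 4: 1444.04, 6: 2109.26, 8: 2820.08}


def per_worker(results):
    """Return throughput divided by worker count for each row, rounded to 2 places."""
    return {workers: round(rps / workers, 2) for workers, rps in results.items()}


def min_workers_for(results, target_rps):
    """Return the smallest tested worker count that meets target_rps, or None."""
    meeting = [workers for workers, rps in results.items() if rps >= target_rps]
    return min(meeting) if meeting else None


print(per_worker(RESULTS))
print(min_workers_for(RESULTS, 1000))
```

The lookup only considers worker counts that were actually benchmarked, so it returns `None` rather than extrapolating when no tested configuration reaches the target.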
## CPU Scaling: 3-Hop Routing
Measures how throughput scales with worker thread count for 3-hop route finding using the `most_liquid` algorithm.
### Setup
| Parameter | Value |
| --- | --- |
| Instance | AWS `c7a.8xlarge` (32 vCPU, AMD EPYC) |
| Algorithm | `most_liquid` |
| Max hops | 3 |
| Protocols | `uniswap_v2`, `uniswap_v3`, `uniswap_v4`, `sushiswap_v2`, `pancakeswap_v2`, `pancakeswap_v3`, `ekubo_v2`, `fluid_v1` |
| Requests per iteration | 10,000 |
| Concurrency | `fixed:48` |
| Warmup | 30s after health check |
| Config | [`tools/benchmark/most_liquid_3hop.toml`](../tools/benchmark/most_liquid_3hop.toml) |
### Results
| Workers | Throughput (req/s) | Median latency (ms) | P99 latency (ms) | Throughput per worker (req/s) |
| --- | --- | --- | --- | --- |
| 1 | 22.30 | 2138 | 2531 | 22.30 |
| 2 | 39.46 | 1208 | 1465 | 19.73 |
| 4 | 75.95 | 627 | 784 | 18.99 |
| 8 | 146.52 | 319 | 440 | 18.32 |
| 12 | 177.23 | 260 | 386 | 14.77 |
| 16 | 298.22 | 146 | 243 | 18.64 |
| 20 | 352.63 | 117 | 226 | 17.63 |
| 24 | 243.00 | 163 | 320 | 10.13 |
| 28 | 384.13 | 92 | 251 | 13.72 |
| 32 | 366.06 | 86 | 272 | 11.44 |
### Analysis
Throughput is non-monotonic across worker counts: it drops from 353 req/s at 20 workers to 243 req/s at 24 workers, and from 384 req/s at 28 workers to 366 req/s at 32 workers. The solver peaks at **384 req/s at 28 workers** and does not reach 1000 req/s on this instance. From 8 workers upward, P99 stays bounded (≤440ms), a significant improvement over the pre-lock-PR results where P99 spiked to 794ms at 24 workers.
**Recommendation.** For `most_liquid` 3-hop routing with this full protocol set, provision at least 28 CPU cores to reach the measured peak of ~384 req/s. The increased computational complexity of 3-hop search means some throughput variability is expected.
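A quick way to spot non-monotonic scaling in a sweep is to compare each row with the previous one. The sketch below is illustrative only; `regressions` and `peak` are hypothetical helper names, and the data is copied from the 3-hop `most_liquid` results table.

```python
# Throughput (req/s) by worker count, copied from the 3-hop most_liquid results.
RESULTS = {
    1: 22.30, 2: 39.46, 4: 75.95, 8: 146.52, 12: 177.23,
    16: 298.22, 20: 352.63, 24: 243.00, 28: 384.13, 32: 366.06,
}


def regressions(results):
    """Return (workers, throughput) rows whose throughput fell versus the previous row."""
    rows = sorted(results.items())
    return [cur for prev, cur in zip(rows, rows[1:]) if cur[1] < prev[1]]


def peak(results):
    """Return the (workers, throughput) row with the highest throughput."""
    return max(results.items(), key=lambda row: row[1])


print(regressions(RESULTS))
print(peak(RESULTS))
```

Rows flagged by `regressions` are candidates for a re-run before drawing capacity conclusions, since a single sweep cannot separate contention from measurement noise.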
## Comparison: 2-Hop vs 3-Hop
| Target throughput (req/s) | 2-hop workers | 3-hop workers | Notes |
| --- | --- | --- | --- |
| 1,000 | 3 | not reached | 3-hop peaks at ~384 req/s at 28 workers; 1000 req/s not reached |

`most_liquid` 3-hop does not reach 1000 req/s on a 32-vCPU instance with 8 protocols. The combinatorial growth in the 3-hop search space creates a hard throughput ceiling for this algorithm.
## CPU Scaling: Bellman-Ford 2-Hop
Measures how throughput scales with worker thread count for 2-hop route finding using the `bellman_ford` algorithm.
### Setup
| Parameter | Value |
| --- | --- |
| Instance | AWS `c7a.8xlarge` (32 vCPU, AMD EPYC) |
| Algorithm | `bellman_ford` |
| Max hops | 2 |
| Protocols | `uniswap_v2`, `uniswap_v3`, `uniswap_v4`, `sushiswap_v2`, `pancakeswap_v2`, `pancakeswap_v3`, `ekubo_v2`, `fluid_v1` |
| Requests per iteration | 10,000 |
| Concurrency | `fixed:48` |
| Warmup | 30s after health check |
| Config | [`tools/benchmark/bellman_ford_2hop.toml`](../tools/benchmark/bellman_ford_2hop.toml) |

To reproduce:
```bash
WORKER_COUNTS="1,2,3,4,6,8" \
NUM_REQUESTS=10000 \
POOL_CONFIG="tools/benchmark/bellman_ford_2hop.toml" \
PROTOCOLS="uniswap_v2,uniswap_v3,uniswap_v4,sushiswap_v2,pancakeswap_v2,pancakeswap_v3,ekubo_v2,fluid_v1" \
TYCHO_URL="$TYCHO_URL" \
TYCHO_API_KEY="$TYCHO_API_KEY" \
RPC_URL="$RPC_URL" \
bash scripts/bench-remote.sh
```
### Results
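| Workers | Throughput (req/s) | Median latency (ms) | P99 latency (ms) | Throughput per worker (req/s) |
| --- | --- | --- | --- | --- |
| 1 | 85.31 | 562 | 586 | 85.31 |
| 2 | 154.58 | 310 | 328 | 77.29 |
| 3 | 220.12 | 217 | 228 | 73.37 |
| 4 | 290.93 | 164 | 173 | 72.73 |
| 6 | 406.87 | 117 | 124 | 67.81 |
| 8 | 518.54 | 92 | 99 | 64.82 |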
### Analysis
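Throughput scales near-linearly across all tested worker counts. Per-worker efficiency declines gradually from ~85 req/s at 1 worker to ~65 req/s at 8 workers. The solver does not cross 1000 req/s within the tested 8-worker range; linear extrapolation from the 8-worker efficiency (~65 req/s per worker) places that threshold at approximately **16 workers**. Because this configuration was not measured beyond 8 workers and per-worker efficiency is still declining at that point, treat 16 as an estimate rather than a measured result.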
**Recommendation.** For Bellman-Ford 2-hop routing at 1000 req/s sustained throughput, provision at least 16 CPU cores.
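The 16-worker figure comes from a simple linear extrapolation. The following sketch shows the arithmetic under the assumption that per-worker throughput stays at a measured level; `workers_needed` is a hypothetical helper, and the two data points are copied from the results table above.

```python
import math

# Measured bellman_ford 2-hop throughput (req/s) at both ends of the tested range.
MEASURED = {1: 85.31, 8: 518.54}


def workers_needed(target_rps, workers, throughput_rps):
    """Workers needed for target_rps if per-worker throughput stays at the measured level."""
    per_worker = throughput_rps / workers
    return math.ceil(target_rps / per_worker)


optimistic = workers_needed(1000, 1, MEASURED[1])    # assumes 1-worker efficiency holds
from_8_workers = workers_needed(1000, 8, MEASURED[8])  # assumes 8-worker efficiency holds
print(optimistic, from_8_workers)
```

Extrapolating from the 1-worker efficiency gives a lower count than extrapolating from the 8-worker efficiency, which illustrates why the estimate should be based on the largest measured configuration.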
## CPU Scaling: Bellman-Ford 3-Hop
Measures how throughput scales with worker thread count for 3-hop route finding using the `bellman_ford` algorithm.
### Setup
| Parameter | Value |
| --- | --- |
| Instance | AWS `c7a.8xlarge` (32 vCPU, AMD EPYC) |
| Algorithm | `bellman_ford` |
| Max hops | 3 |
| Protocols | `uniswap_v2`, `uniswap_v3`, `uniswap_v4`, `sushiswap_v2`, `pancakeswap_v2`, `pancakeswap_v3`, `ekubo_v2`, `fluid_v1` |
| Requests per iteration | 10,000 |
| Concurrency | `fixed:48` |
| Warmup | 30s after health check |
| Config | [`tools/benchmark/bellman_ford_3hop.toml`](../tools/benchmark/bellman_ford_3hop.toml) |

To reproduce:
```bash
WORKER_COUNTS="1,2,4,8,12,16,20,24,28,32" \
NUM_REQUESTS=10000 \
POOL_CONFIG="tools/benchmark/bellman_ford_3hop.toml" \
PROTOCOLS="uniswap_v2,uniswap_v3,uniswap_v4,sushiswap_v2,pancakeswap_v2,pancakeswap_v3,ekubo_v2,fluid_v1" \
TYCHO_URL="$TYCHO_URL" \
TYCHO_API_KEY="$TYCHO_API_KEY" \
RPC_URL="$RPC_URL" \
bash scripts/bench-remote.sh
```
### Results
| Workers | Throughput (req/s) | Median latency (ms) | P99 latency (ms) | Throughput per worker (req/s) |
| --- | --- | --- | --- | --- |
| 1 | 65.36 | 735 | 780 | 65.36 |
| 2 | 121.68 | 394 | 416 | 60.84 |
| 4 | 233.09 | 205 | 221 | 58.27 |
| 8 | 429.44 | 111 | 121 | 53.68 |
| 12 | 584.86 | 81 | 90 | 48.74 |
| 16 | 760.92 | 62 | 71 | 47.56 |
| 20 | 874.51 | 54 | 65 | 43.73 |
| 24 | 974.94 | 48 | 57 | 40.62 |
| 28 | 1201.92 | 39 | 49 | 42.93 |
| 32 | 1219.96 | 38 | 49 | 38.12 |
### Analysis
Throughput scales near-linearly up to 8 workers (~54 req/s per worker). Beyond that, per-worker efficiency gradually declines, from ~49 req/s at 12 workers to ~38 req/s at 32 workers, as the instance approaches its CPU ceiling. The solver crosses 1000 req/s at **28 workers** (1202 req/s); 24 workers falls just short at 975 req/s. Median latency falls from 735ms at 1 worker to 38ms at 32 workers; P99 stabilises at 49ms from 28 workers onward.
**Recommendation.** For Bellman-Ford 3-hop routing at 1000 req/s sustained throughput, provision at least 28 CPU cores. Use 32 cores for headroom under variable load.
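The drop in latency as workers are added follows from running at fixed concurrency. As a rough sanity check, Little's law relates the two: mean latency is roughly in-flight requests divided by throughput. The sketch below assumes that `fixed:48` means 48 requests are kept in flight at all times, which this page does not state explicitly, and it compares against the reported median rather than the mean, so only approximate agreement should be expected. `implied_latency_ms` is a hypothetical helper name.

```python
# Assumption: `fixed:48` in the Setup table means 48 requests are always in flight.
CONCURRENCY = 48

# (workers, throughput in req/s, reported median latency in ms) from the table above.
ROWS = [(1, 65.36, 735), (8, 429.44, 111), (28, 1201.92, 39), (32, 1219.96, 38)]


def implied_latency_ms(throughput_rps, concurrency=CONCURRENCY):
    """Little's law: mean latency = in-flight requests / throughput, in milliseconds."""
    return 1000.0 * concurrency / throughput_rps


for workers, rps, median_ms in ROWS:
    # Print the implied mean next to the reported median for comparison.
    print(workers, round(implied_latency_ms(rps), 1), median_ms)
```

If the implied and reported latencies diverge widely in a new run, that may indicate the load generator is not sustaining the configured concurrency.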
## Comparison: Bellman-Ford 2-Hop vs 3-Hop
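| Target throughput (req/s) | 2-hop workers | 3-hop workers | Ratio |
| --- | --- | --- | --- |
| 1,000 | ~16 | 28 | ~1.8x |

Bellman-Ford 3-hop requires roughly **1.8× as many CPU cores** as 2-hop to reach the same throughput target (28 measured versus ~16 extrapolated). This is a much smaller penalty than seen with `most_liquid`, whose 3-hop configuration never reaches 1000 req/s on this instance. The difference reflects Bellman-Ford's more uniform search cost growth across hop counts: it already explores the full path space at 2 hops, so adding a third hop grows the search space less dramatically relative to the base cost.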
## Algorithm Comparison: most_liquid vs bellman_ford
All results in this section use identical hardware, protocol set, and request load.
### 2-Hop
| Workers | `most_liquid` (req/s) | `bellman_ford` (req/s) | `most_liquid` speedup |
| --- | --- | --- | --- |
| 1 | 397.19 | 85.31 | 4.7x |
| 2 | 743.16 | 154.58 | 4.8x |
| 3 | 1035.84 | 220.12 | 4.7x |
| 4 | 1444.04 | 290.93 | 5.0x |
| 6 | 2109.26 | 406.87 | 5.2x |
| 8 | 2820.08 | 518.54 | 5.4x |

`most_liquid` is consistently **~5× faster** for 2-hop routing (4.7× to 5.4× across the tested worker counts). Its greedy liquidity-ranked search terminates early once the best path is found, while Bellman-Ford explores all paths exhaustively.
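The speedup column is the ratio of the two throughput columns at each worker count. A minimal sketch of that calculation follows; `speedups` is a hypothetical helper, and both data sets are copied from the 2-hop results on this page.

```python
# Throughput (req/s) by worker count for each algorithm at 2 hops.
MOST_LIQUID = {1: 397.19, 2: 743.16, 3: 1035.84, 4: 1444.04, 6: 2109.26, 8: 2820.08}
BELLMAN_FORD = {1: 85.31, 2: 154.58, 3: 220.12, 4: 290.93, 6: 406.87, 8: 518.54}


def speedups(numerator, denominator):
    """Return the throughput ratio per worker count, for counts present in both runs."""
    return {
        workers: round(numerator[workers] / denominator[workers], 1)
        for workers in numerator
        if workers in denominator
    }


print(speedups(MOST_LIQUID, BELLMAN_FORD))
```

Restricting the comparison to worker counts present in both runs avoids mixing configurations that were benchmarked with different sweeps.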
### 3-Hop
| Workers | `most_liquid` (req/s) | `bellman_ford` (req/s) | Faster algorithm |
| --- | --- | --- | --- |
| 8 | 146.52 | 429.44 | bellman_ford (2.9x) |
| 16 | 298.22 | 760.92 | bellman_ford (2.6x) |
| 20 | 352.63 | 874.51 | bellman_ford (2.5x) |
| 24 | 243.00 | 974.94 | bellman_ford (4.0x) |
| 28 | 384.13 | 1201.92 | bellman_ford (3.1x) |
| 32 | 366.06 | 1219.96 | bellman_ford (3.3x) |

Both algorithms use the same 8-protocol set. At 3 hops, the results **reverse**: `bellman_ford` is **2.5–4× faster** than `most_liquid` at every worker count shown, and it keeps scaling while `most_liquid` plateaus. The advantage reflects the increased computational complexity of `most_liquid`'s 3-hop search. The `most_liquid` greedy search requires pre-computed edge weights (spot price × depth) that are recomputed on every block, creating a periodic pause that limits scaling; Bellman-Ford carries no pre-computed edge state, so its per-block update is a no-op.