hinge 0.1.0

SQL-native ELT engine — dependency graph resolved automatically from FROM/JOIN clauses, parallel execution, single binary
Documentation
# Hinge

[![CI](https://github.com/lsalenson/hinge/actions/workflows/ci.yml/badge.svg)](https://github.com/lsalenson/hinge/actions/workflows/ci.yml)
[![Crates.io](https://img.shields.io/crates/v/hinge)](https://crates.io/crates/hinge)
[![docs.rs](https://img.shields.io/docsrs/hinge)](https://docs.rs/hinge)
[![License](https://img.shields.io/crates/l/hinge)](LICENSE)

An ELT engine written in Rust. You write SQL, Hinge resolves the dependency graph and runs everything in order — fast, with no runtime other than the binary itself.

![hinge plan demo](demo.svg)

---

## The problem with Python-based data tooling

Most ELT frameworks run on Python. For pure SQL transformation work, that means:

- A virtual environment to maintain per project
- `pip install` conflicts that silently break pipelines
- Slow interpreter startup on every run
- A Python process mediating every query to your database
- Dependency hell when two tools need incompatible package versions

Hinge removes that layer entirely. It's a single binary. No interpreter, no virtual environment, no package manager involved at runtime.

---

## hinge vs dbt-core

| | hinge | dbt Core |
|---|---|---|
| **Runtime** | Single binary (~5 MB) | Python + venv (~500 MB) |
| **Setup** | `cargo install hinge_cli` | `pip install dbt-core dbt-postgres` + adapter |
| **Cold start** | < 10 ms | ~1–3 s |
| **SQL syntax** | Plain SQL | Jinja-templated SQL |
| **Dependency declaration** | Automatic (parsed from `FROM`/`JOIN`) | Manual `{{ ref() }}` calls |
| **Parallel execution** | Yes (wave-based) | Yes (dbt 1.x+) |
| **Dry run** | `hinge plan` | `dbt compile` + `dbt ls` |
| **ClickHouse atomic swap** | Yes (`EXCHANGE TABLES`) | Third-party adapter |
| **DuckDB support** | Yes (embedded, no server) | No |

---

## Benchmarks

Measured on macOS Apple M-series with 50 SQL models (3-layer graph):

| Operation | hinge | Notes |
|---|---|---|
| `hinge plan` (50 models) | **~11 ms** | Graph resolution + sorted output, no DB connection |
| Python 3 cold start | ~15 ms | Just `python3 -c "pass"` — before importing anything |
| dbt-core cold start | ~1–3 s | Community benchmarks; imports Click, networkx, Jinja2 |

`hinge plan` resolves a 50-model dependency graph, computes the topological sort, and prints the full execution order in about the same time it takes Python to start its interpreter — before loading a single package.

---

## Benefits

**Zero runtime dependencies**
The compiled binary is the runtime. Ship it, run it anywhere.

**Dependency resolution from plain SQL**
Hinge parses `FROM` and `JOIN` clauses in your `.sql` files to build the dependency graph automatically. No `ref()` calls, no manifest files to maintain — your SQL is the source of truth.

**Parallel execution**
Assets with no dependency between them run concurrently in the same wave. Sequential only where it's actually required.

**Precise rebuild modes**
Run only what changed — upstream, downstream, or both — instead of rebuilding the full graph every time.

**Atomic table replacement**
On ClickHouse, Hinge uses `EXCHANGE TABLES` to swap a rebuilt table with the live one atomically. No query downtime during a refresh.

**Structured observability**
Every asset run emits a structured `tracing` event with kind, rows affected, and duration. Slow queries are surfaced at `WARN` automatically.

---

## Installation

### Prebuilt binary (recommended)

Download the latest release for your platform from the [Releases page](https://github.com/lsalenson/hinge/releases).

```bash
# macOS / Linux
tar xzf hinge-v*.tar.gz
mv hinge /usr/local/bin/
```

### Via cargo

```bash
cargo install hinge_cli
```

### As a library

```toml
# Cargo.toml
[dependencies]
hinge = "0.1"
```

---

## Quick start (CLI)

Organize your SQL files by schema:

```
models/
  raw/
    events.sql
  staging/
    orders.sql          ← SELECT id FROM raw.events
  mart/
    revenue.sql         ← SELECT sum FROM staging.orders
```

Create a `hinge.yaml` at the project root:

```yaml
connection: postgresql://user:pass@localhost/mydb
models: models
```

Preview the execution plan without touching the database:

```bash
hinge plan
# 1.  raw.events
# 2.  staging.orders
# 3.  mart.revenue
```

Run everything:

```bash
hinge run
```

Run a subset:

```bash
# Everything staging.orders depends on, then staging.orders itself
hinge run --upstream staging.orders

# staging.orders and everything that depends on it
hinge run --downstream staging.orders

# Both directions from a node
hinge run --bidirectional staging.orders
```

The connection string can also come from an environment variable:

```bash
HINGE_CONNECTION=postgresql://... hinge run
```

See [`examples/quickstart/`](examples/quickstart/) for a complete working example.

---

## Quick start (library)

```rust
use hinge::{BuildGraph, RunAll, PostgresExecutor};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let graph    = BuildGraph::from_dir("models")?;
    let executor = PostgresExecutor::new("postgresql://user:pass@localhost/mydb").await?;
    RunAll::new(graph, executor).execute().await?;
    // Runs: raw.events → staging.orders → mart.revenue
    Ok(())
}
```

---

## Asset types

The default kind is `VIEW`. Override with a header comment at the top of the file:

```sql
-- @kind: table
SELECT ...
```

```sql
-- @kind: materialized_view
SELECT ...
```

ClickHouse table options are passed through as headers:

```sql
-- @kind: table
-- @clickhouse.order_by: (event_date, user_id)
-- @clickhouse.partition_by: toYYYYMM(event_date)
SELECT ...
```

---

## Run modes

| API | CLI flag | Behaviour |
|---|---|---|
| `RunAll` | *(default)* | Full graph in dependency order |
| `RunUpstream` | `--upstream schema.model` | All ancestors of a node, then the node itself |
| `RunDownstream` | `--downstream schema.model` | The node, then all its descendants |
| `RunBidirectional` | `--bidirectional schema.model` | Both directions from a node |

Each use case also exposes `.plan()` to preview the execution order without running anything.

---

## Connectors

| Database | Status |
|---|---|
| PostgreSQL | stable |
| ClickHouse | stable |
| DuckDB | stable |
| Snowflake | stable |

---

## Roadmap

- [ ] Config file for connection strings *(in progress)*
- [ ] Delta / incremental runs
- [ ] Homebrew tap
- [ ] Watch mode (`hinge watch`)

---

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).

---

## License

Apache-2.0 — see [LICENSE](LICENSE).