Cairn
Offline, airgap-ready geocoder written in Rust.
Cairn (n.) — a pile of stones marking a trail. Each tile is a stone. Drop the pile on disk. The geocoder reads it.
Status
Alpha. Forward search, autocomplete, fuzzy, layer filter, focus bias, structured search, and reverse geocoding all working end-to-end on a Liechtenstein dataset (OSM PBF + WhosOnFirst SQLite).
Goals
- Forward + reverse geocoding, autocomplete, structured search
- Single static binary + single bundle artifact (tar)
- Zero network at runtime — full airgap deploy
- Region extracts via tile-tree subset (Valhalla-style 3-level grid)
- Single-machine commodity hardware, no cluster
Non-goals
- Multi-tenant SaaS
- Live OSM diff replication (planned post-MVP)
- Cloud-native horizontal scaling
Architecture
Three layers:
- Builder (`cairn-build`): ingests OSM PBF, WhosOnFirst SQLite, OpenAddresses CSV. Emits per-tile `rkyv` blobs, a tantivy text index, an admin polygon layer (bincode), and a centroid layer for nearest fallback. Writes a `manifest.toml` with blake3 hashes.
- Bundle: flat directory of immutable mmap-ready files.
- Server (`cairn-serve`): `axum` HTTP API. Loads the bundle once at startup; no DB, no daemon dependencies.
Tile model
64-bit PlaceId: [level: 3 | tile_id: 22 | local_id: 39]
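The bit layout can be sketched as a pack/unpack pair. This is a minimal illustration, not the code in `cairn-place`: the `PlaceId` newtype, its method names, and the choice to put `level` in the most significant bits (reading the layout left to right) are assumptions.

```rust
/// Field widths from the layout above: [level: 3 | tile_id: 22 | local_id: 39].
const TILE_BITS: u32 = 22;
const LOCAL_BITS: u32 = 39;
const LEVEL_BITS: u32 = 3;

/// Hypothetical newtype; the real `PlaceId` may be shaped differently.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PlaceId(pub u64);

impl PlaceId {
    /// Packs the three fields; returns `None` if a value overflows its width.
    /// ASSUMPTION: `level` sits in the top 3 bits, `local_id` in the low 39.
    pub fn new(level: u8, tile_id: u32, local_id: u64) -> Option<Self> {
        if u64::from(level) >= (1u64 << LEVEL_BITS)
            || u64::from(tile_id) >= (1u64 << TILE_BITS)
            || local_id >= (1u64 << LOCAL_BITS)
        {
            return None;
        }
        let packed = (u64::from(level) << (TILE_BITS + LOCAL_BITS))
            | (u64::from(tile_id) << LOCAL_BITS)
            | local_id;
        Some(PlaceId(packed))
    }

    /// Top 3 bits.
    pub fn level(self) -> u8 {
        (self.0 >> (TILE_BITS + LOCAL_BITS)) as u8
    }

    /// Middle 22 bits.
    pub fn tile_id(self) -> u32 {
        ((self.0 >> LOCAL_BITS) & ((1u64 << TILE_BITS) - 1)) as u32
    }

    /// Low 39 bits.
    pub fn local_id(self) -> u64 {
        self.0 & ((1u64 << LOCAL_BITS) - 1)
    }
}
```

The three widths sum to 64, so every valid combination round-trips through a single `u64`.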
| Level | Cell size | Contents |
|---|---|---|
| 0 | 4° × 4° | Countries, regions |
| 1 | 1° × 1° | Cities, counties, postcodes |
| 2 | 0.25° × 0.25° | Streets, addresses, POIs, neighborhoods |
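As a rough sketch of how a coordinate maps onto this grid: the cell sizes come from the table, but the grid origin at (-90, -180) and the row-major tile numbering are assumptions borrowed from Valhalla's scheme, and the function names are hypothetical.

```rust
/// Cell size in degrees for each level, from the table above.
pub fn cell_size_deg(level: u8) -> Option<f64> {
    match level {
        0 => Some(4.0),
        1 => Some(1.0),
        2 => Some(0.25),
        _ => None,
    }
}

/// Returns the (row, col) of the cell containing the point.
/// ASSUMPTION: origin at lat -90, lon -180; rows grow north, cols grow east.
pub fn tile_row_col(level: u8, lat: f64, lon: f64) -> Option<(u32, u32)> {
    let size = cell_size_deg(level)?;
    // Rejects out-of-range and NaN coordinates.
    if !(-90.0..=90.0).contains(&lat) || !(-180.0..=180.0).contains(&lon) {
        return None;
    }
    let rows = (180.0 / size) as u32;
    let cols = (360.0 / size) as u32;
    // Clamp so lat = 90 and lon = 180 land in the last cell, not one past it.
    let row = (((lat + 90.0) / size) as u32).min(rows - 1);
    let col = (((lon + 180.0) / size) as u32).min(cols - 1);
    Some((row, col))
}

/// ASSUMPTION: row-major numbering, id = row * cols + col.
pub fn tile_id(level: u8, lat: f64, lon: f64) -> Option<u32> {
    let size = cell_size_deg(level)?;
    let cols = (360.0 / size) as u32;
    let (row, col) = tile_row_col(level, lat, lon)?;
    Some(row * cols + col)
}
```

Under these assumptions level 2 has 1440 × 720 = 1,036,800 cells, which fits in the 22-bit `tile_id` field (2^22 = 4,194,304).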
Workspace
crates/
cairn-geocoder/ umbrella, re-exports
cairn-place/ Place, PlaceId, schema (rkyv-archived)
cairn-tile/ tile coords, manifest, blob IO, blake3 verify
cairn-text/ tantivy index + autocomplete + fuzzy + geo-bias
cairn-spatial/ R*-tree PIP for admin polygons + nearest centroids
cairn-parse/ address parsing (libpostal FFI deferred)
cairn-import-osm/ OSM PBF: place / POI nodes + named highway ways
cairn-import-wof/ WhosOnFirst SPR + multilingual names + polygons
cairn-import-oa/ OpenAddresses CSV
cairn-import-geonames/ Geonames TSV (stub)
cairn-api/ axum handlers
bins/
cairn-build/ CLI build / extract / verify / info
cairn-serve/ HTTP runtime
Quick start
# 1. Fetch source data (one-time, can be mirrored offline after)
# 2. Build the workspace
# 3. Build a bundle
# 4. Verify integrity
# OK: 6 tiles verified, text=ok, admin=ok, points=ok
# 5. Inspect
# 6. Serve
Endpoints
GET /healthz
GET /readyz 200 ready / 503 if no text index
GET /v1/search forward + autocomplete
GET /v1/structured field-by-field search
GET /v1/reverse PIP + nearest fallback
/v1/search
| Param | Type | Notes |
|---|---|---|
| `q` | string (required) | Free-text query. |
| `mode` | `search` \| `autocomplete` | Default `search`. |
| `limit` | int (1–100) | Default 10. |
| `fuzzy` | int 0–2 | Edit distance. Forward mode only. |
| `layer` | csv | Restrict to kinds (e.g. `country,city,street`). |
| `focus.lat`, `focus.lon` | float | Focus point for distance-biased rerank. |
| `focus.weight` | float | Distance penalty weight (default 0.5). |
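
To illustrate what a distance-biased rerank with a penalty weight can look like, here is a small sketch. The penalty formula `score / (1 + weight * ln(1 + km))` is an assumption chosen for illustration, not the formula `cairn-text` uses, and `Hit` and `rerank_by_focus` are hypothetical names.

```rust
/// Great-circle distance in kilometres (haversine, mean Earth radius 6371 km).
pub fn haversine_km(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (p1, p2) = (lat1.to_radians(), lat2.to_radians());
    let dphi = (lat2 - lat1).to_radians();
    let dlam = (lon2 - lon1).to_radians();
    let a = (dphi / 2.0).sin().powi(2) + p1.cos() * p2.cos() * (dlam / 2.0).sin().powi(2);
    // Guard against rounding pushing sqrt(a) slightly above 1.
    2.0 * 6371.0 * a.sqrt().min(1.0).asin()
}

/// Hypothetical search hit: a text-relevance score plus a location.
#[derive(Debug, Clone)]
pub struct Hit {
    pub name: String,
    pub score: f64,
    pub lat: f64,
    pub lon: f64,
}

/// Sorts hits by distance-penalised score, best first.
/// ASSUMPTION: adjusted = score / (1 + weight * ln(1 + km)).
/// With weight = 0 the original score order is kept.
pub fn rerank_by_focus(hits: &mut [Hit], focus_lat: f64, focus_lon: f64, weight: f64) {
    let adjusted = |h: &Hit| {
        let km = haversine_km(focus_lat, focus_lon, h.lat, h.lon);
        h.score / (1.0 + weight * km.ln_1p())
    };
    hits.sort_by(|a, b| adjusted(b).total_cmp(&adjusted(a)));
}
```

A logarithmic penalty keeps nearby results close to their text score while still pushing distant matches down; a linear penalty would be an equally plausible choice.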
/v1/structured
| Param | Type | Notes |
|---|---|---|
| `house_number` / `road` / `unit` | string | Address parts. |
| `postcode` / `city` / `district` / `region` / `country` | string | Admin parts. |
| `limit`, `focus.*` | as above | |
Builds a concatenated query, picks a layer hint based on the finest non-empty field (address → street → city → region → country).
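
A minimal sketch of that logic, under stated assumptions: the struct and method names are hypothetical, the concatenation order is a guess, and `unit`, `postcode`, and `district` are assumed to contribute only to the query text, since the hint chain does not say which layer they map to.

```rust
/// Hypothetical mirror of the /v1/structured parameters.
#[derive(Debug, Default, Clone)]
pub struct StructuredQuery {
    pub house_number: Option<String>,
    pub road: Option<String>,
    pub unit: Option<String>,
    pub postcode: Option<String>,
    pub city: Option<String>,
    pub district: Option<String>,
    pub region: Option<String>,
    pub country: Option<String>,
}

/// Treats missing and whitespace-only fields as empty.
fn non_empty(field: &Option<String>) -> Option<&str> {
    field.as_deref().map(str::trim).filter(|s| !s.is_empty())
}

impl StructuredQuery {
    /// Joins the non-empty fields with spaces, finest part first.
    /// ASSUMPTION: field order within the concatenated query.
    pub fn to_query(&self) -> String {
        [
            &self.house_number, &self.road, &self.unit, &self.postcode,
            &self.city, &self.district, &self.region, &self.country,
        ]
        .iter()
        .copied()
        .filter_map(non_empty)
        .collect::<Vec<_>>()
        .join(" ")
    }

    /// Picks the layer of the finest non-empty field:
    /// address -> street -> city -> region -> country.
    pub fn layer_hint(&self) -> Option<&'static str> {
        if non_empty(&self.house_number).is_some() {
            Some("address")
        } else if non_empty(&self.road).is_some() {
            Some("street")
        } else if non_empty(&self.city).is_some() {
            Some("city")
        } else if non_empty(&self.region).is_some() {
            Some("region")
        } else if non_empty(&self.country).is_some() {
            Some("country")
        } else {
            None
        }
    }
}
```

For example, a request with only `road`, `city`, and `country` set would yield a `street` hint here, because `road` is the finest field present.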
/v1/reverse
| Param | Type | Notes |
|---|---|---|
| `lat`, `lon` | float (required) | |
| `limit` | int 1–50 | Default 10. |
| `nearest` | int 0–50 | Fallback K-nearest centroids when PIP empty. |
The response includes `source: "pip" | "nearest"`. PIP results are sorted with the finest containing polygon first; the admin chain is available via `admin_path`.
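
The PIP-then-nearest flow can be sketched as follows. This is an illustration only: it scans linearly where `cairn-spatial` uses an R*-tree, handles single-ring polygons without holes, measures nearest distance in plain degrees, and every type and function name is hypothetical.

```rust
/// Hypothetical admin polygon: one outer ring of (lon, lat) vertices.
#[derive(Debug, Clone)]
pub struct AdminPolygon {
    pub name: String,
    /// ASSUMPTION: higher rank means a finer admin level.
    pub rank: u8,
    pub ring: Vec<(f64, f64)>,
}

/// Hypothetical centroid used for the nearest fallback.
#[derive(Debug, Clone)]
pub struct Centroid {
    pub name: String,
    pub lon: f64,
    pub lat: f64,
}

#[derive(Debug, PartialEq)]
pub enum Source {
    Pip,
    Nearest,
}

/// Even-odd ray-casting test against a single ring.
pub fn contains(ring: &[(f64, f64)], lon: f64, lat: f64) -> bool {
    if ring.is_empty() {
        return false;
    }
    let mut inside = false;
    let mut prev = ring[ring.len() - 1];
    for &cur in ring {
        let ((xi, yi), (xj, yj)) = (cur, prev);
        // Toggle when a horizontal ray from the point crosses this edge.
        if (yi > lat) != (yj > lat) && lon < (xj - xi) * (lat - yi) / (yj - yi) + xi {
            inside = !inside;
        }
        prev = cur;
    }
    inside
}

/// PIP first (finest polygon first); K nearest centroids if nothing contains the point.
pub fn reverse(
    polys: &[AdminPolygon],
    points: &[Centroid],
    lon: f64,
    lat: f64,
    limit: usize,
    nearest: usize,
) -> (Source, Vec<String>) {
    let mut hits: Vec<&AdminPolygon> =
        polys.iter().filter(|p| contains(&p.ring, lon, lat)).collect();
    if !hits.is_empty() {
        hits.sort_by(|a, b| b.rank.cmp(&a.rank));
        let names = hits.into_iter().take(limit).map(|p| p.name.clone()).collect();
        return (Source::Pip, names);
    }
    // Simplification: squared planar distance in degrees, not great-circle distance.
    let d2 = |c: &Centroid| (c.lon - lon).powi(2) + (c.lat - lat).powi(2);
    let mut by_dist: Vec<&Centroid> = points.iter().collect();
    by_dist.sort_by(|a, b| d2(*a).total_cmp(&d2(*b)));
    let names = by_dist.into_iter().take(nearest).map(|c| c.name.clone()).collect();
    (Source::Nearest, names)
}
```

With `nearest` set to 0 the fallback returns an empty list, which matches the parameter's documented lower bound.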
Bundle layout
bundle/
├── manifest.toml schema, source hashes, per-tile blake3
├── tiles/<level>/<row>/<col>/<id>.bin rkyv-archived Place blobs
├── index/text/ tantivy segments (mmap'd at runtime)
└── spatial/
├── admin.bin bincode AdminLayer (polygons + metadata)
└── points.bin bincode PointLayer (centroids for nearest fallback)
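
For orientation, the tile path pattern above could be resolved like this. The helper name is hypothetical, and it assumes each path component is an unpadded decimal number, which the layout does not specify.

```rust
use std::path::{Path, PathBuf};

/// Builds `<root>/tiles/<level>/<row>/<col>/<id>.bin`.
/// ASSUMPTION: components are unpadded decimal integers.
pub fn tile_blob_path(bundle_root: &Path, level: u8, row: u32, col: u32, id: u32) -> PathBuf {
    bundle_root
        .join("tiles")
        .join(level.to_string())
        .join(row.to_string())
        .join(col.to_string())
        .join(format!("{id}.bin"))
}
```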
Build sources
| Source | Format | Coverage | Loaded by |
|---|---|---|---|
| OpenStreetMap | `*.osm.pbf` | Global | `--osm` |
| WhosOnFirst | SQLite | Per-country admin bundles | --wof |
| OpenAddresses | CSV | Per-region authoritative addresses | --oa |
| Geonames | TSV | Global populated places | --geonames (stub) |
Quality gates
26 unit tests cover Place ID encoding, tile blob roundtrip, tile blake3 corruption detection, OSM tag classification, OA row validation, WoF parent-chain walking, tantivy search/autocomplete/fuzzy/layer/focus, admin PIP ordering, and nearest-K queries.
Roadmap
See ROADMAP.md for deferred phases (libpostal FFI, address interpolation, OSM admin relations, per-tile spatial partitioning, distribution tooling).
License
Dual-licensed: MIT OR Apache-2.0. Pick whichever fits.