crw-server
Firecrawl-compatible API server for the CRW web scraper.
Overview
crw-server is the main CRW binary — an Axum-based HTTP server that provides a Firecrawl-compatible REST API and built-in MCP transport. Single binary, ~6 MB idle RAM, no Redis, no Node.js.
- Firecrawl-compatible API: `/v1/scrape`, `/v1/crawl`, `/v1/map` with identical request/response format
- MCP transport: Built-in Streamable HTTP MCP endpoint at `/mcp` for Claude Code, Cursor, Windsurf
- Auth middleware: Optional Bearer token auth with constant-time comparison (no timing leaks)
- JS rendering: Auto-detect SPAs, render via LightPanda/Playwright/Chrome (CDP)
- LLM extraction: JSON schema → structured data via Anthropic tool_use or OpenAI function calling
- One-command setup: `crw-server setup` downloads LightPanda and configures JS rendering
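To illustrate what constant-time comparison means for Bearer token auth, here is a minimal sketch using only the standard library. It is not the crate's actual middleware: `constant_time_eq` and `is_authorized` are hypothetical names chosen for this example, and the key value is the placeholder from the configuration section.

```rust
/// Compare two byte strings without stopping at the first mismatching byte.
/// Hypothetical helper for illustration; not taken from the crw-server source.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // The length is not treated as secret here, so an early return is acceptable.
    if a.len() != b.len() {
        return false;
    }
    // OR together the XOR of every byte pair; the loop always runs to the end.
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Check an `Authorization` header value of the form `Bearer <token>`
/// against a list of configured keys (hypothetical helper).
fn is_authorized(header_value: &str, api_keys: &[&str]) -> bool {
    let token = match header_value.strip_prefix("Bearer ") {
        Some(t) => t,
        None => return false,
    };
    // Compare against every key instead of returning on the first match.
    let mut ok = false;
    for key in api_keys {
        ok |= constant_time_eq(token.as_bytes(), key.as_bytes());
    }
    ok
}
```

For example, `is_authorized("Bearer fc-key-1234", &["fc-key-1234"])` returns `true`, while a header without the `Bearer ` prefix is rejected. An optimizing compiler is free to transform a hand-written loop like this, so production code usually relies on an audited crate such as `subtle` rather than a local helper.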
Installation
Quick start
# Start the server
# Enable JS rendering (downloads LightPanda)
API endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/scrape | Scrape a single URL, optionally with LLM extraction |
| POST | /v1/crawl | Start async BFS crawl (returns job ID) |
| GET | /v1/crawl/:id | Check crawl status and retrieve results |
| DELETE | /v1/crawl/:id | Cancel a running crawl job |
| POST | /v1/map | Discover all URLs on a site |
| GET | /health | Health check (no auth required) |
| POST | /mcp | Streamable HTTP MCP transport |
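The crawl endpoint is described as a BFS (breadth-first) crawl. As a rough sketch of that traversal order, the function below walks an in-memory link graph level by level. It is synchronous and does no network I/O, unlike the real crawler; `bfs_crawl`, `max_depth`, and `max_pages` are hypothetical names for this example, and the `links` map stands in for "fetch a page and extract its links".

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first traversal of a link graph, bounded by depth and page count.
/// Illustrative only; not the crw-crawl implementation.
fn bfs_crawl(
    links: &HashMap<&str, Vec<&str>>,
    start: &str,
    max_depth: usize,
    max_pages: usize,
) -> Vec<String> {
    let mut visited: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<(String, usize)> = VecDeque::new();
    let mut order = Vec::new();

    visited.insert(start.to_string());
    queue.push_back((start.to_string(), 0));

    // A FIFO queue means every page at depth d is visited before depth d + 1.
    while let Some((url, depth)) = queue.pop_front() {
        if order.len() >= max_pages {
            break;
        }
        order.push(url.clone());
        if depth == max_depth {
            continue; // do not follow links beyond the depth limit
        }
        if let Some(children) = links.get(url.as_str()) {
            for child in children {
                // `insert` returns false for URLs that were already queued.
                if visited.insert(child.to_string()) {
                    queue.push_back((child.to_string(), depth + 1));
                }
            }
        }
    }
    order
}
```

The `visited` set is what keeps cyclic links (a page linking back to the start URL) from being crawled twice.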
Usage examples
Scrape a page:
Start a crawl:
LLM structured extraction:
Discover URLs on a site:
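As a hedged illustration of what a scrape call looks like on the wire, the sketch below builds a raw HTTP/1.1 request for `POST /v1/scrape` with the standard library only. The body fields `url` and `formats` are assumptions based on the Firecrawl v1 request format that the server states it is compatible with; `json_escape`, `scrape_request`, and `send` are hypothetical helper names, and `localhost:3000` assumes the default port shown in the configuration section.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Escape characters that must not appear raw inside a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the request text for POST /v1/scrape.
/// The body shape (`url`, `formats`) is assumed from the Firecrawl v1 API.
fn scrape_request(host: &str, api_key: Option<&str>, target_url: &str) -> String {
    let body = format!(
        "{{\"url\":\"{}\",\"formats\":[\"markdown\"]}}",
        json_escape(target_url)
    );
    let mut req = String::new();
    req.push_str("POST /v1/scrape HTTP/1.1\r\n");
    req.push_str(&format!("Host: {}\r\n", host));
    req.push_str("Content-Type: application/json\r\n");
    // The Authorization header is only needed when API keys are configured.
    if let Some(key) = api_key {
        req.push_str(&format!("Authorization: Bearer {}\r\n", key));
    }
    req.push_str(&format!("Content-Length: {}\r\n", body.len()));
    req.push_str("Connection: close\r\n\r\n");
    req.push_str(&body);
    req
}

/// Send the request to a running server and return the raw response text.
fn send(addr: &str, request: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(request.as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

With a server running locally, `send("localhost:3000", &scrape_request("localhost:3000", None, "https://example.com"))` would return the status line, headers, and JSON body as one string. In practice an HTTP client such as `curl` or a crate like `reqwest` is the more convenient choice; the raw form is shown only to make the request structure explicit.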
Configuration
CRW uses layered TOML configuration with environment variable overrides:
[]
= "0.0.0.0"
= 3000
= 120
= 10 # Max requests/second (global). 0 = unlimited.
[]
= "auto" # auto | lightpanda | playwright | chrome | none
[]
= 10
= 10.0
= true
[]
# api_keys = ["fc-key-1234"]
[]
= "anthropic" # "anthropic" or "openai"
# api_key = "sk-..." # or CRW_EXTRACTION__LLM__API_KEY env var
Override with environment variables:
CRW_SERVER__PORT=8080 CRW_CRAWLER__MAX_CONCURRENCY=20
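The variable names above suggest a naming scheme: a `CRW_` prefix, then the config path with `__` separating nesting levels. The sketch below shows that mapping and how an override could be layered on top of file values. It is an assumption inferred from the three variable names in this README, not the crate's actual config loader; `env_to_config_path` and `apply_overrides` are hypothetical names.

```rust
use std::collections::HashMap;

/// Map a name such as `CRW_SERVER__PORT` to a config path such as
/// ["server", "port"]. Returns None for names without the `CRW_` prefix.
fn env_to_config_path(name: &str) -> Option<Vec<String>> {
    let rest = name.strip_prefix("CRW_")?;
    if rest.is_empty() {
        return None;
    }
    // A double underscore separates nesting levels; single underscores
    // stay inside a key, e.g. MAX_CONCURRENCY -> max_concurrency.
    Some(rest.split("__").map(|part| part.to_lowercase()).collect())
}

/// Layer environment values over values loaded from the TOML file.
/// Keys are stored as dotted paths, e.g. "server.port".
fn apply_overrides(file_values: &mut HashMap<String, String>, env: &[(&str, &str)]) {
    for (name, value) in env {
        if let Some(path) = env_to_config_path(name) {
            file_values.insert(path.join("."), value.to_string());
        }
    }
}
```

Under this scheme `CRW_EXTRACTION__LLM__API_KEY` maps to the path `extraction.llm.api_key`, and unrelated variables such as `HOME` are ignored.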
Docker
# Pre-built image
# With JS rendering sidecar
Using as a library
use create_app;
use AppState;
async
Part of CRW
This crate is part of the CRW workspace — a fast, lightweight, Firecrawl-compatible web scraper built in Rust.
| Crate | Description |
|---|---|
| crw-core | Core types, config, and error handling |
| crw-renderer | HTTP + CDP browser rendering engine |
| crw-extract | HTML → markdown/plaintext extraction |
| crw-crawl | Async BFS crawler with robots.txt & sitemap |
| crw-server | Firecrawl-compatible API server (this crate) |
| crw-cli | Standalone CLI (crw binary) |
| crw-mcp | MCP stdio proxy binary |
License
AGPL-3.0 — see LICENSE.