<p align="center">
<a href="https://cascadinglabs.com/voidcrawl/">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="media/logo-dark.svg">
<source media="(prefers-color-scheme: light)" srcset="media/logo-light.svg">
<img src="media/logo-dark.svg" alt="VoidCrawl" width="200">
</picture>
</a>
</p>
<p align="center">
<a href="https://discord.gg/ftykDhmAQN"><img src="https://img.shields.io/badge/Discord-Join-b07adf?labelColor=120a24&logo=discord&logoColor=white" alt="Discord"></a>
<a href="https://opensource.org/licenses/Apache-2.0"><img src="https://img.shields.io/badge/License-Apache_2.0-b07adf?labelColor=120a24" alt="License"></a>
<a href="https://github.com/CascadingLabs/VoidCrawl/actions"><img src="https://img.shields.io/github/actions/workflow/status/CascadingLabs/VoidCrawl/CI.yaml?label=CI&labelColor=120a24&color=b07adf" alt="CI"></a>
<a href="https://pypi.python.org/pypi/voidcrawl"><img src="https://img.shields.io/pypi/v/voidcrawl?labelColor=120a24&color=b07adf" alt="PyPI"></a>
<a href="https://pypi.python.org/pypi/voidcrawl"><img src="https://img.shields.io/pypi/pyversions/voidcrawl?labelColor=120a24&color=b07adf" alt="Python versions"></a>
<a href="https://cascadinglabs.com/voidcrawl/"><img src="https://img.shields.io/badge/docs-cascadinglabs.com%2Fvoidcrawl-b07adf?labelColor=120a24" alt="docs"></a>
<a href="https://github.com/CascadingLabs/VoidCrawl/pkgs/container/voidcrawl"><img src="https://img.shields.io/badge/ghcr.io-cascadinglabs%2Fvoidcrawl-b07adf?labelColor=120a24&logo=docker&logoColor=white" alt="GHCR"></a>
</p>
# VoidCrawl
**CDP browser automation for Python** — a Rust-native Chrome DevTools Protocol client exposed to Python via PyO3.
`void_crawl` replaces Playwright/Selenium with a permissively-licensed (Apache-2.0) stack for rendering JavaScript-heavy pages. Built on [chromiumoxide](https://github.com/mattsse/chromiumoxide) with a shared Tokio runtime.
> **Used by [Yosoi](https://github.com/CascadingLabs/Yosoi)** — an AI-powered selector discovery tool for resilient web scraping.
## Requirements
- **Rust** ≥ 1.86 (edition 2024)
- **Python** ≥ 3.10
- **Chrome/Chromium** installed on the system
- **maturin** ≥ 1.7 (`cargo install maturin`)
## Installation
```bash
# Build and install into your venv
./build.sh
# Or manually:
maturin develop --release --manifest-path crates/pyo3_bindings/Cargo.toml
```
## Quick Start
### BrowserPool (recommended)
Tabs are recycled, not closed — near-instant reuse across requests.
```python
import asyncio
from void_crawl import BrowserPool
async def main():
async with BrowserPool.from_env() as pool:
async with pool.acquire() as tab:
await tab.navigate("https://example.com")
print(await tab.title())
print(len(await tab.content()))
asyncio.run(main())
```
### BrowserSession (low-level)
```python
import asyncio
from void_crawl import BrowserSession
async def main():
async with BrowserSession(headless=True) as browser:
page = await browser.new_page("https://example.com")
print(await page.title())
await page.close()
asyncio.run(main())
```
## MCP server for Claude Code
`voidcrawl-mcp` is a stdio MCP server that exposes the full pool + session API as tools any MCP-speaking agent can call. Point Claude Code at it and the agent can fetch, screenshot, click, type, eval JS, detect captchas, and drive multi-step sessions — all via the same stealth-patched Chrome pool.
Install (pick one):
```bash
# Claude Code user — binary only, isolated venv:
uv tool install voidcrawl-mcp
# or
pipx install voidcrawl-mcp
# Python lib + MCP server together:
pip install 'voidcrawl[mcp]'
# From crates.io (builds from source):
cargo install voidcrawl-mcp
# From git HEAD (builds from source):
cargo install --git https://github.com/CascadingLabs/VoidCrawl voidcrawl-mcp
```
Wire it into Claude Code via `.mcp.json`:
```json
{
"mcpServers": {
"voidcrawl": {
"command": "voidcrawl-mcp"
}
}
}
```
Optional: pin the whole server to a warm Chrome profile (so every session inherits its cookies/logins):
```bash
voidcrawl-mcp --profile "Default"
# or: VOIDCRAWL_PROFILE="Default" voidcrawl-mcp
```
A ready-made Claude Code skill at `.claude/skills/voidcrawl/SKILL.md` teaches the agent when to pick voidcrawl over manual browsing, the `click_visual_coords` recipe for React forms, and how to react to typed `CaptchaDetected` errors. Claude Code picks it up automatically.
See [`docs/mcp-server.md`](docs/mcp-server.md) for the full tool list and [`docs/profiles.md`](docs/profiles.md) for profile leasing.
## Docker
Pre-built multi-arch images (`linux/amd64`, `linux/arm64`) are published to GHCR on every push to `main` and every tagged release:
```bash
docker run -d --network host --shm-size=2g \
ghcr.io/cascadinglabs/voidcrawl:headless-latest
export CHROME_WS_URLS="http://localhost:9222,http://localhost:9223"
python examples/basic_navigation.py
```
Or via compose:
```bash
docker compose -f docker/docker-compose.yml up -d
```
Available tags: `headless-latest`, `headless-<version>`, `headless-<sha>`, and the same set prefixed `headful-` for GPU + VNC (linux/amd64 only). See the [Docker & VNC guide](https://cascadinglabs.com/voidcrawl/guides/docker/) and the [Docker Config reference](https://cascadinglabs.com/voidcrawl/reference/docker-config/) for every runtime knob.
## Testing
```bash
# Rust integration tests (serial — Chrome singleton lock)
cargo test -p void_crawl_core -- --test-threads=1
# Python integration tests (requires built extension + Chrome)
uv run pytest tests/ -v
```
## Documentation
- [Full API reference](docs/api-reference.md)
- [Anti-bot / CDN detection](docs/antibot.md)
- [Cross-origin & closed-shadow frames](docs/cross-origin-frames.md)
- [Examples](examples/)
## Contact
[contact@cascadinglabs.com](mailto:contact@cascadinglabs.com)
## License
Apache-2.0