What is Stagehand?
Stagehand is a browser automation framework used to control web browsers with natural language and code. By combining the power of AI with the precision of code, Stagehand makes web automation flexible, maintainable, and actually reliable.
Why Stagehand?
Most existing browser automation tools either require you to write low-level code in a framework like Selenium, Playwright, or Puppeteer, or rely on high-level agents that can be unpredictable in production. Stagehand lets developers choose what to write in code vs. natural language, and bridges the gap between the two, which makes it a natural choice for browser automations in production.
- Choose when to write code vs. natural language: use AI when you want to navigate unfamiliar pages, and use code when you know exactly what you want to do.
- Go from AI-driven to repeatable workflows: Stagehand lets you preview AI actions before running them, and helps you cache repeatable actions to save time and tokens.
- Write once, run forever: Stagehand's auto-caching combined with self-healing remembers previous actions and replays them without LLM inference, involving AI again only when the website changes and your automation breaks.
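The caching and self-healing idea in the last bullet can be illustrated with a small conceptual sketch. This is not Stagehand's implementation: ActionCache, resolve, page_has, and infer are all hypothetical names, and the infer closure merely stands in for an LLM call. The sketch only shows the control flow of "reuse a cached action while it still works, fall back to inference when it does not".

```rust
use std::collections::HashMap;

/// Conceptual cache from a natural-language instruction to a resolved selector.
/// Hypothetical type for illustration; not part of any Stagehand SDK.
struct ActionCache {
    selectors: HashMap<String, String>,
    /// Counts how often the (stand-in) inference step had to run.
    llm_calls: u32,
}

impl ActionCache {
    fn new() -> Self {
        ActionCache { selectors: HashMap::new(), llm_calls: 0 }
    }

    /// Resolve an instruction to a selector.
    /// `page_has` reports whether a selector still exists on the page;
    /// `infer` stands in for an LLM call that finds a fresh selector.
    fn resolve(
        &mut self,
        instruction: &str,
        page_has: impl Fn(&str) -> bool,
        infer: impl Fn(&str) -> String,
    ) -> String {
        if let Some(cached) = self.selectors.get(instruction) {
            if page_has(cached) {
                // Cache hit and the selector is still valid: no inference needed.
                return cached.clone();
            }
        }
        // Cache miss or stale selector: run inference and remember the result.
        self.llm_calls += 1;
        let fresh = infer(instruction);
        self.selectors.insert(instruction.to_string(), fresh.clone());
        fresh
    }
}

fn main() {
    let mut cache = ActionCache::new();
    let instruction = "click the login button";

    // First run: nothing cached, so inference is used.
    cache.resolve(instruction, |_| true, |_| "#login".to_string());
    // Second run: the cached selector is still on the page, so it is reused.
    cache.resolve(instruction, |s| s == "#login", |_| "#login".to_string());
    // The site changed: the cached selector is gone, so inference runs again.
    let healed = cache.resolve(instruction, |s| s == "#sign-in", |_| "#sign-in".to_string());

    println!("{healed} after {} inference calls", cache.llm_calls);
}
```

In this sketch, three runs cost two inference calls: one to populate the cache and one to repair it after the page changed.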
Stagehand Rust SDK [ALPHA]
A Rust client library for Stagehand, the AI-powered browser automation framework. This SDK provides an async-first, type-safe interface for controlling Browserbase browsers and performing AI-driven web interactions.
[!CAUTION] This is an ALPHA release and is not production-ready. Please provide feedback and let us know if you have feature requests / bug reports!
Features
- Browserbase Cloud Support: Drive Browserbase cloud browser sessions (local coming soon)
- AI-Driven Actions: Use natural language instructions to interact with web pages
- Structured Data Extraction: Extract typed data from pages using Serde schemas
- Element Observation: Identify and analyze interactive elements on pages
- Agent Execution: Run multi-step AI agents with the execute method
- Streaming Responses: Real-time progress updates via Server-Sent Events (SSE)
- CDP Access: Get the CDP WebSocket URL to connect external tools like chromiumoxide
Installation
Add to your Cargo.toml:
[]
= "0.3"
= { = "1", = ["macros", "rt-multi-thread"] }
= "0.3"
= { = "1", = ["derive"] }
= "1"
= "0.15"
Runtime Support
The SDK supports both tokio and async-std runtimes. Tokio is enabled by default.
Using tokio (default):
[]
= "0.3"
= { = "1", = ["macros", "rt-multi-thread"] }
Using async-std:
[]
= { = "0.3", = false, = ["async-std-runtime"] }
= { = "1", = ["attributes"] }
Quick Start
use ;
use ;
use StreamExt;
use ;
use HashMap;
async
Configuration
Environment Variables
Create a .env file in your project root:
# Browserbase API credentials (required)
BROWSERBASE_API_KEY=your_browserbase_api_key_here
BROWSERBASE_PROJECT_ID=your_browserbase_project_id_here
# Model API key
MODEL_API_KEY=your_api_key # OpenAI, Anthropic, Gemini, etc. key
# Optional: Custom API URLs
STAGEHAND_BASE_URL=https://api.stagehand.browserbase.com/v1 # Stagehand API (default)
BROWSERBASE_API_URL=https://api.browserbase.com/v1 # Browserbase API (default)
The SDK checks for model API keys in the order listed above and uses the first one found.
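As a rough sketch of how an application might validate these variables before connecting, the snippet below reads them with the standard library only. The helper names resolve_base_url and require_var are hypothetical and not part of the SDK, and the default URL is simply copied from the .env example above. Note that std::env reads the process environment only; loading the .env file itself is a separate step that this sketch leaves out.

```rust
use std::env;

/// Default Stagehand API URL, copied from the .env example above.
const DEFAULT_STAGEHAND_BASE_URL: &str = "https://api.stagehand.browserbase.com/v1";

/// Hypothetical helper (not part of the SDK): use the configured URL,
/// falling back to the default when the variable is unset or blank.
fn resolve_base_url(configured: Option<String>) -> String {
    configured
        .map(|v| v.trim().to_string())
        .filter(|v| !v.is_empty())
        .unwrap_or_else(|| DEFAULT_STAGEHAND_BASE_URL.to_string())
}

/// Hypothetical helper: return a required value or name the missing variable.
fn require_var(name: &str, value: Option<String>) -> Result<String, String> {
    value
        .filter(|v| !v.trim().is_empty())
        .ok_or_else(|| format!("missing required environment variable {name}"))
}

fn main() {
    let base_url = resolve_base_url(env::var("STAGEHAND_BASE_URL").ok());
    println!("Stagehand API: {base_url}");

    // The two Browserbase credentials are marked as required above.
    for name in ["BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID"] {
        match require_var(name, env::var(name).ok()) {
            Ok(_) => println!("{name} is set"),
            Err(msg) => eprintln!("{msg}"),
        }
    }
}
```

Checking the required values up front gives a clear message at startup instead of a failure partway through a session.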
V3Options
The main configuration struct for initializing Stagehand:
Model Configuration
Specify AI models in two ways:
// Simple string format (recommended)
let model = String;
// Detailed configuration with custom API key/base URL
let model = Config ;
API Reference
Stagehand::connect
Establishes a connection to the Stagehand service.
pub async
Parameters:
- transport_choice - TransportChoice::Rest(base_url) for the REST API with an explicit URL, or use TransportChoice::default_rest() to use the STAGEHAND_BASE_URL env var (falls back to the default)
Example:
// Using default (recommended) - checks STAGEHAND_BASE_URL env var, falls back to default
let stagehand = connect.await?;
// Or with explicit URL
let stagehand = connect.await?;
start
Starts a browser session.
pub async
Example:
let opts = V3Options ;
stagehand.start.await?;
println!;
act
Performs browser actions based on natural language instructions.
pub async
Parameters:
- instruction - Natural language instruction (e.g., "Click the login button")
- model - Override the default AI model
- variables - Variable substitution map for the instruction
- timeout - Operation timeout in milliseconds
- frame_id - Target a specific iframe
Response Events:
- ActResponseEvent::Log(LogLine) - Progress logs
- ActResponseEvent::Success(bool) - Action completion status
Example:
let mut stream = stagehand.act.await?;
while let Some = stream.next.await
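The event-handling pattern behind this loop can be sketched without the SDK. The types below are local stand-ins whose variant names follow the Response Events list above; the real LogLine probably carries more than a single message field, and the real stream is asynchronous and may wrap each item in a Result. The helper final_status is hypothetical.

```rust
/// Local stand-in for the SDK's log type, for illustration only.
#[derive(Debug)]
struct LogLine {
    message: String,
}

/// Local stand-in mirroring the two documented act events.
#[derive(Debug)]
enum ActResponseEvent {
    Log(LogLine),
    Success(bool),
}

/// Drain a sequence of events, printing logs and returning the last status seen.
/// Returns None if no Success event arrived.
fn final_status(events: impl IntoIterator<Item = ActResponseEvent>) -> Option<bool> {
    let mut status = None;
    for event in events {
        match event {
            ActResponseEvent::Log(line) => println!("[log] {}", line.message),
            ActResponseEvent::Success(ok) => status = Some(ok),
        }
    }
    status
}

fn main() {
    // A synchronous Vec replaces the async stream so the sketch runs on its own.
    let events = vec![
        ActResponseEvent::Log(LogLine { message: "locating button".into() }),
        ActResponseEvent::Success(true),
    ];
    println!("succeeded: {:?}", final_status(events));
}
```

Matching exhaustively on the enum means the compiler flags any event variant the handler forgets, and the same shape applies to the extract, observe, and execute event types.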
extract
Extracts structured data from web pages using a schema.
pub async
Parameters:
- instruction - What data to extract
- schema - A Serde-serializable struct defining the expected shape
- model - Override the default AI model
- timeout - Operation timeout
- selector - CSS selector to narrow extraction scope
- frame_id - Target a specific iframe
Response Events:
- ExtractResponseEvent::Log(LogLine) - Progress logs
- ExtractResponseEvent::DataJson(String) - JSON string matching the schema
Example:
let schema = ProductInfo ;
let mut stream = stagehand.extract.await?;
while let Some = stream.next.await
observe
Identifies interactive elements on a page.
pub async
Parameters:
- instruction - Optional AI instruction for analysis
- model - Override the default AI model
- timeout - Operation timeout
- selector - CSS selector to narrow observation scope
- frame_id - Target a specific iframe
Response Events:
- ObserveResponseEvent::Log(LogLine) - Progress logs
- ObserveResponseEvent::ElementsJson(String) - JSON array of observed elements
Example:
let mut stream = stagehand.observe.await?;
while let Some = stream.next.await
execute
Executes an AI agent with multi-step capabilities.
pub async
Parameters:
- agent_config - Agent configuration (provider, model, system prompt, CUA mode)
- execute_options - Execution options (instruction, max steps, highlight cursor)
- frame_id - Target a specific iframe
Response Events:
- ExecuteResponseEvent::Log(LogLine) - Execution progress
- ExecuteResponseEvent::ResultJson(String) - Final result
Example:
use ;
let agent_config = AgentConfig ;
let execute_options = AgentExecuteOptions ;
let mut stream = stagehand.execute.await?;
while let Some = stream.next.await
end
Ends the browser session.
pub async
Example:
stagehand.end.await?;
browserbase_cdp_url
Returns the CDP WebSocket URL for connecting external tools like chromiumoxide.
The URL format is: wss://connect.browserbase.com?sessionId={sessionId}&apiKey={apiKey}
Example:
// After start(), get the CDP URL to connect chromiumoxide
let cdp_url = stagehand.browserbase_cdp_url
.expect;
// Connect chromiumoxide to the remote browser
let = connect.await?;
See tests/chromiumoxide_integration.rs for a complete example.
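To make the URL format above concrete, here is a minimal sketch that assembles it by hand and masks the key before logging. Both helpers, cdp_url and redact_api_key, are hypothetical; in practice browserbase_cdp_url returns the URL for you. The sketch assumes the session ID and API key contain only URL-safe characters, so no percent-encoding is applied.

```rust
/// Build the CDP WebSocket URL in the format quoted above.
/// Hypothetical helper; assumes both values are URL-safe.
fn cdp_url(session_id: &str, api_key: &str) -> String {
    format!("wss://connect.browserbase.com?sessionId={session_id}&apiKey={api_key}")
}

/// Mask the API key before logging, since the key is part of the query string.
fn redact_api_key(url: &str) -> String {
    match url.split_once("apiKey=") {
        Some((prefix, rest)) => {
            // Keep any query parameters that follow the key.
            let tail = rest.find('&').map(|i| &rest[i..]).unwrap_or("");
            format!("{prefix}apiKey=***{tail}")
        }
        // No key in the URL: nothing to redact.
        None => url.to_string(),
    }
}

fn main() {
    let url = cdp_url("session-abc", "bb_secret");
    // Log only the redacted form.
    println!("connecting to {}", redact_api_key(&url));
}
```

Because the API key is embedded in the URL, it is worth treating the whole string as a secret and logging only a redacted form.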
Examples
Full Integration Example
See tests/browserbase_live.rs for a complete working example that demonstrates act, extract, and execute.
Chromiumoxide Integration
See tests/chromiumoxide_integration.rs for connecting chromiumoxide to a Browserbase session:
use Browser;
use ;
async
Error Handling
The SDK uses StagehandError for all error cases:
All errors implement std::error::Error and Display.
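Because the errors implement std::error::Error, they can be propagated with the ? operator into a Box<dyn Error> and printed via Display. The sketch below shows that pattern with a hypothetical DemoError; it does not reproduce the real StagehandError variants, and start_session is a made-up function used only to produce errors.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for an SDK error type; not the real StagehandError.
#[derive(Debug)]
enum DemoError {
    MissingApiKey,
    Api(String),
}

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DemoError::MissingApiKey => write!(f, "missing API key"),
            DemoError::Api(msg) => write!(f, "API error: {msg}"),
        }
    }
}

// An empty impl is enough: Debug and Display provide the required behavior.
impl Error for DemoError {}

/// Made-up function that fails in two different ways.
fn start_session(api_key: Option<&str>) -> Result<String, DemoError> {
    match api_key {
        None => Err(DemoError::MissingApiKey),
        Some("") => Err(DemoError::Api("empty key rejected".into())),
        Some(_) => Ok("session-123".to_string()),
    }
}

/// `?` converts DemoError into Box<dyn Error> automatically.
fn run(api_key: Option<&str>) -> Result<String, Box<dyn Error>> {
    let session_id = start_session(api_key)?;
    Ok(format!("started {session_id}"))
}

fn main() {
    match run(None) {
        Ok(msg) => println!("{msg}"),
        // Display gives the human-readable message.
        Err(err) => eprintln!("error: {err}"),
    }
}
```

Callers that need to react to a specific failure can still recover the concrete type from the boxed error with downcast_ref.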
Running Tests
# Set up environment variables
# Edit .env with your credentials
# Run all tests
# Run specific integration test with output
# Run chromiumoxide integration test
License
Apache-2.0