# Browser Emulation Design
This document is the primary design and maintenance reference for browser-like
network emulation in `ugi`.
## Executive Summary
- For `ugi`'s current goal, a BoringSSL-class backend is the correct primary
strategy for the TLS layer.
- `rustls` should remain the default general-purpose backend, but not the
primary backend for high-fidelity browser TLS impersonation.
- TLS alone is not sufficient. HTTP/2 fingerprinting and connection-pool
isolation are mandatory if we want coherent browser-like network behavior.
- Request-context headers such as navigation vs XHR vs image can be deferred
for now because the current focus is API traffic, not full-page navigation.
- Browser emulation should live behind a dedicated Cargo feature so that extra
code paths and dependencies stay out of ordinary builds.
- The public API can remain simple for ordinary users:
```rust
let client = ugi::Client::builder()
    .emulation(ugi::Emulation::Chrome136)
    .build()?;
```
That is feasible, but only if the library handles protocol preference, backend
enforcement, connection isolation, and merge rules automatically.
## Scope
This design only covers network-observable behavior:
- TLS / ClientHello behavior
- ALPN and protocol negotiation
- HTTP/1 request serialization behavior
- HTTP/2 connection fingerprinting
- HTTP/3 / QUIC entry points
- connection reuse and pool isolation
This design does not cover:
- site-specific cookies or business tokens
- challenge solvers
- CAPTCHA
- JavaScript runtime fingerprinting
- DOM / Canvas / WebGL / Audio / Navigator APIs
For the current anti-bot goal, that boundary is intentional.
## Why BoringSSL Is the Right TLS Strategy
### The real requirement is not "TLS support"
The real requirement is control over browser-relevant handshake behavior:
- ALPN list and ordering
- cipher list
- curves / key shares
- signature algorithms
- GREASE
- extension permutation / ordering
- certificate compression
- OCSP stapling
- signed certificate timestamps
- ECH grease / ECH config handling
`cloudflare/boring` exposes a significant part of that control surface.
Examples of relevant APIs include:
- `set_alpn_protos`
- `set_cipher_list`
- `set_sigalgs_list`
- `set_curves_list`
- `set_grease_enabled`
- `set_permute_extensions`
- `add_certificate_compression_algorithm`
- `enable_ocsp_stapling`
- `enable_signed_cert_timestamps`
- `set_enable_ech_grease`
That is the core reason to use a BoringSSL-based backend.
### Why not rely on upstream `rustls` for this
This is not a statement that `rustls` is weak or low quality. It is a statement
about API surface and maintenance economics.
For browser TLS impersonation, upstream `rustls` is a poor primary foundation
because:
- it exposes fewer of the necessary control points
- some relevant internals are not meant to be long-term public tuning knobs
- closing the gap usually pushes the project toward patches or forks
- maintaining a patched TLS + HTTP/2 stack is expensive
Therefore:
- `rustls` remains the default compatibility and portability backend
- `btls-backend` becomes the high-fidelity emulation backend
## `ugi` Design Constraints
`ugi` is not a Tokio-first client. That matters.
The repository already contains:
- an optional `boring-backend` Cargo feature
- `TlsBackend::Boring`
- a custom `BoringTlsStream` built on top of `boring::ssl::SslStream`
- an `async-io`-driven handshake loop using `WouldBlock` plus
`poll_readable` / `poll_writable`
This means:
- `ugi` does **not** need to switch to `tokio-boring` as a prerequisite
- `ugi` can continue to own its runtime-agnostic I/O abstraction
- Tokio adapters remain a fallback option, not the preferred first move
That is important because a runtime migration would be a much larger change
than what the current goal actually requires.
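
The retry structure of such a handshake loop can be modeled without any runtime. The sketch below is a synchronous stand-in, not `ugi`'s actual code: `HandshakeStep`, `drive_handshake`, and the `wait` callback are hypothetical names, and in the real loop the wait step would await `poll_readable` / `poll_writable` on the socket instead of calling a closure.

```rust
use std::io;

/// What the TLS engine reports after one non-blocking handshake attempt.
/// (Hypothetical type: the real code inspects a `WouldBlock` error instead.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HandshakeStep {
    Done,
    WantRead,
    WantWrite,
}

/// Calls `attempt` until it reports `Done`. Whenever the engine would block,
/// `wait` is called with the direction it is blocked on before retrying.
/// Returns the number of attempts made, or an error after `max_attempts`.
pub fn drive_handshake<A, W>(
    mut attempt: A,
    mut wait: W,
    max_attempts: usize,
) -> io::Result<usize>
where
    A: FnMut() -> io::Result<HandshakeStep>,
    W: FnMut(HandshakeStep) -> io::Result<()>,
{
    for n in 1..=max_attempts {
        match attempt()? {
            HandshakeStep::Done => return Ok(n),
            // WantRead / WantWrite: wait for readiness, then retry.
            step => wait(step)?,
        }
    }
    Err(io::Error::new(
        io::ErrorKind::TimedOut,
        "handshake did not complete",
    ))
}
```

Because the loop only needs "try, wait for readiness, try again", it does not depend on a particular async runtime, which is why a Tokio migration is not a prerequisite.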
## Design Goals
### Primary goals
- make API traffic less distinguishable from browser-originated traffic at the
TLS and HTTP/2 layers
- keep the ordinary-user API simple
- keep advanced tuning available for maintainers and power users
- keep feature gating explicit
- keep connection reuse coherent with the selected emulation profile
### Secondary goals
- preserve the existing `ugi` builder model
- keep `rustls` as the default backend outside emulation
- make the capability boundary explicit instead of implicit
## Non-Goals for the Current Phase
- full browser runtime impersonation
- navigation-oriented request-context automation
- page-load sequencing
- WebSocket fingerprinting beyond baseline TLS behavior
- complete QUIC / HTTP/3 browser fingerprint parity
These may be future work, but they are not blockers for the current API traffic
use case.
## Proposed Architecture
The current `BrowserProfile` shape is too coarse. It mixes:
- request defaults
- cookies
- TLS fingerprinting
- HTTP/2 fingerprinting
That should be replaced with a layered model.
```rust
pub struct EmulationProfile {
    pub metadata: ProfileMetadata,
    pub request: RequestProfile,
    pub connection: ConnectionProfile,
}

pub struct ProfileMetadata {
    pub profile_id: String,
    pub browser_family: BrowserFamily,
    pub browser_version: String,
    pub platform: String,
    pub captured_at: Option<String>,
    pub source: Option<String>,
    pub fidelity: EmulationFidelity,
}

pub struct RequestProfile {
    pub default_headers: Vec<(String, String)>,
    pub http1: Option<Http1Fingerprint>,
}

pub struct ConnectionProfile {
    pub tls: Option<TlsFingerprint>,
    pub boring_tls: Option<BoringTlsFingerprint>,
    pub http2: Option<Http2Fingerprint>,
    pub http3: Option<Http3Fingerprint>,
}
```
### Important rule
Cookies must not live inside the generic browser emulation profile.
Reason:
- cookies are site- and account-state
- they do not belong to a reusable browser network fingerprint
- they make profile reuse brittle and misleading
## Proposed Cargo Feature Model
`ugi` already has backend and protocol features such as:
- `rustls`
- `native-tls`
- `boring-backend`
- `h2`
- `h3`
The browser emulation system should sit above those as a dedicated feature.
### Locked decision
The public browser emulation system should be gated behind a top-level feature
named `emulation`.
Reason:
- it isolates extra code and dependencies
- it keeps the risk boundary explicit
- it matches the feature's actual purpose better than a backend-specific name
### Recommended features
#### `emulation`
Purpose:
- exposes the public emulation API
- enables preset profiles
- enables automatic protocol handling
- enables automatic connection isolation
- enables the required high-fidelity backend path
Suggested shape:
```toml
emulation = ["btls-backend", "h2"]
```
Enabling `emulation` should never silently degrade to `rustls`.
If the user enables emulation, they are explicitly opting into anti-detection
behavior. A transparent downgrade would be a dangerous design because it can
create a false sense of safety.
#### `btls-backend`
Purpose:
- enables the `btls`-based TLS backend
Suggested shape:
```toml
btls-backend = ["dep:btls"]
```
This may remain available as a lower-level backend feature, but ordinary users
should normally use `emulation`.
When `emulation` is not enabled, the browser-profile preset/model layer should
not compile. The `btls-backend` feature should remain usable on its own as a
transport/backend choice without pulling in preset emulation behavior.
#### Optional future feature: `emulation-h3`
Purpose:
- reserves a dedicated switch for future QUIC / HTTP/3 emulation work
This should not be part of the first milestone.
### Why separate the features
Because these are distinct concerns:
- public emulation API
- backend implementation
- future protocol extensions
The stable ordinary-user entry point should be `emulation`. Lower-level backend
features remain advanced knobs or implementation details.
## Backend Capability Matrix
This matrix should stay in sync with the actual feature gates and validation
matrix in CI.

| Feature set | Public emulation API | `btls` backend | Fingerprint controls | Preset profiles | Intended users |
| --- | --- | --- | --- | --- | --- |
| default features | no | no | generic only | no | ordinary HTTP client users |
| `btls-backend` | no | yes | generic only | no | advanced users choosing the backend explicitly |
| `emulation` | yes | yes | yes | yes | ordinary anti-bot API users |
| `emulation,h3` | yes | yes | yes, plus current H3 entry path | yes | maintainers / early adopters |

Notes:
- default builds should not compile the preset/profile implementation
- `btls-backend` should compile without dragging in the public emulation API
- `emulation` should be the only feature set that promises preset-driven
browser-like behavior
- `h3` is still not a claim of browser-grade QUIC impersonation
## Proposed Public API
The public API should have a simple path and an advanced path.
### Simple path
For most users:
```rust
let client = ugi::Client::builder()
    .emulation(ugi::Emulation::Chrome136)
    .build()?;
```
This should be enough to trigger all automatic behavior that is safe and
deterministic.
### Advanced path
For power users:
```rust
let profile = ugi::EmulationProfile::builder()
    .http1_fingerprint(...)
    .http2_fingerprint(...)
    .boring_tls_fingerprint(...)
    .default_header("user-agent", "...")?
    .build();

let client = ugi::Client::builder()
    .emulation(profile)
    .build()?;
```
### Recommended type layout
```rust
pub enum Emulation {
    Chrome136,
    Firefox128,
    Safari18_4,
    Custom(EmulationProfile),
}

pub enum EmulationFidelity {
    BestEffort,
    High,
}
```
And builder methods:
```rust
impl ClientBuilder {
    pub fn emulation(self, profile: impl Into<EmulationProfile>) -> Self;
}
```
And the same pattern should exist on `RequestBuilder`:
```rust
impl RequestBuilder {
    pub fn emulation(self, profile: impl Into<EmulationProfile>) -> Self;
}
```
`Emulation` presets and custom `EmulationProfile` values should both flow
through the same `.emulation(..)` entrypoint. Compatibility aliases such as
`.emulation_profile(..)` may exist temporarily, but they are not the preferred
API shape.
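
One way to let presets and custom profiles share a single entry point is a `From<Emulation>` conversion, which makes both types satisfy `impl Into<EmulationProfile>`. The sketch below is a minimal illustration under assumptions: `EmulationProfile` is reduced to one identity field, the `profile_id` strings are invented placeholders, and `effective_profile_id` is a hypothetical helper for inspection only.

```rust
/// Simplified stand-in for the layered profile: only identity is modeled.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EmulationProfile {
    pub profile_id: String,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Emulation {
    Chrome136,
    Firefox128,
    Safari18_4,
    Custom(EmulationProfile),
}

impl From<Emulation> for EmulationProfile {
    fn from(preset: Emulation) -> Self {
        // The ids below are placeholders, not real captured profile names.
        let id = match preset {
            Emulation::Chrome136 => "chrome-136",
            Emulation::Firefox128 => "firefox-128",
            Emulation::Safari18_4 => "safari-18.4",
            // A custom profile passes through unchanged.
            Emulation::Custom(profile) => return profile,
        };
        EmulationProfile { profile_id: id.to_string() }
    }
}

#[derive(Debug, Default)]
pub struct ClientBuilder {
    emulation: Option<EmulationProfile>,
}

impl ClientBuilder {
    /// Accepts a preset, a custom profile, or `Emulation::Custom(..)`.
    pub fn emulation(mut self, profile: impl Into<EmulationProfile>) -> Self {
        self.emulation = Some(profile.into());
        self
    }

    /// Hypothetical inspection helper used only in this sketch.
    pub fn effective_profile_id(&self) -> Option<&str> {
        self.emulation.as_ref().map(|p| p.profile_id.as_str())
    }
}
```

With this shape, `.emulation(Emulation::Chrome136)` and `.emulation(my_profile)` compile against the same method, so no separate `.emulation_profile(..)` is needed.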
### Request-level override semantics
Locked decision:
- `RequestBuilder` may override `ClientBuilder`
- but the request-level override must replace the effective emulation profile
for that request, not partially merge connection fingerprints
That keeps the API aligned with the general `ugi` style while avoiding mixed
and ambiguous connection state.
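
The replace-not-merge rule can be stated in a few lines. This is a sketch under assumptions: `ConnectionProfile` is reduced to two hypothetical identity fields, and `effective_connection_profile` is an illustrative name rather than an existing `ugi` function.

```rust
/// Reduced stand-in for the connection half of a profile.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ConnectionProfile {
    pub tls_id: Option<String>,
    pub http2_id: Option<String>,
}

/// Picks the profile that governs one request.
///
/// A request-level profile replaces the client-level one as a whole.
/// Fields are never combined, so a request profile without an HTTP/2
/// fingerprint does not inherit the client's HTTP/2 fingerprint.
pub fn effective_connection_profile<'a>(
    client: Option<&'a ConnectionProfile>,
    request: Option<&'a ConnectionProfile>,
) -> Option<&'a ConnectionProfile> {
    request.or(client)
}
```

Whole-profile replacement keeps the pool key derivable from a single profile, which is what makes connection reuse decisions unambiguous.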
### Header override note
Ordinary `ugi` header overrides should continue to work on top of emulation.
For example, overriding `user-agent` after selecting a preset is allowed.
However, once the caller does that, the request is no longer a canonical
representation of that preset. The library should permit it, but the fidelity
risk belongs to the caller.
### Preset-version note
Preset names should stay versioned, such as `Chrome136`, not generic labels
such as `ChromeStable`.
Reason:
- the preset is meant to represent a captured network fingerprint bundle
- that bundle includes more than `user-agent`
- changing `user-agent` alone does not transform one captured profile into
another browser version
Therefore:
- preset selection identifies the intended captured fingerprint bundle
- header overrides remain allowed
- a caller who needs a different browser version should select a different
preset or provide a custom profile, not only override `user-agent`
## What Should Be Automated
For ordinary users, a large part of the system can and should be automated.
### Safe and recommended automation
When `.emulation(...)` is used, `ugi` should automatically:
- select the `btls` backend
- upgrade `ProtocolPolicy::Auto` to `PreferHttp2` when `h2` is enabled
- apply ALPN overrides only to protocol-specific clones when needed
- inject default request headers without overriding explicit user headers
- isolate connection pools by emulation connection fingerprint
- reject unsafe cross-profile connection reuse
- reject explicitly incompatible backend choices when emulation is active
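
Two of the automatic steps above are small enough to sketch directly: the protocol-policy upgrade and default-header injection. The code is illustrative only. `ProtocolPolicy::Auto` and `PreferHttp2` come from this document, while the `Http1Only` variant and both function names are assumptions made for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProtocolPolicy {
    Auto,
    PreferHttp2,
    /// Hypothetical variant standing in for any explicit user choice.
    Http1Only,
}

/// Upgrades `Auto` to `PreferHttp2` only when emulation is active and the
/// `h2` feature is compiled in. Explicit user choices pass through unchanged.
pub fn effective_policy(
    requested: ProtocolPolicy,
    emulation_active: bool,
    h2_enabled: bool,
) -> ProtocolPolicy {
    match requested {
        ProtocolPolicy::Auto if emulation_active && h2_enabled => ProtocolPolicy::PreferHttp2,
        other => other,
    }
}

/// Appends profile defaults for header names the caller has not set.
/// Name comparison is ASCII case-insensitive; explicit values are kept.
pub fn inject_default_headers(
    explicit: &mut Vec<(String, String)>,
    defaults: &[(String, String)],
) {
    for (name, value) in defaults {
        let already_set = explicit.iter().any(|(n, _)| n.eq_ignore_ascii_case(name));
        if !already_set {
            explicit.push((name.clone(), value.clone()));
        }
    }
}
```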
### What should not be automated blindly
- site-specific cookies
- CSRF tokens
- business headers with rotating semantics
- challenge/clearance token generation
- request-context heuristics that depend on page state
Those either belong outside emulation or should be added later behind explicit
APIs.
## Merge Rules
The merge rules must be deterministic and documented.
Recommended rules:
- explicit user configuration wins over emulation defaults
- request-level emulation replaces client-level emulation for that request
- no silent overwriting of explicit user TLS settings
- no silent backend downgrade
- incompatible combinations should fail deterministically
This applies to:
- request headers
- backend selection
- TLS fingerprint fields
- HTTP/2 fingerprint fields
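
For backend selection, "no silent downgrade" and "fail deterministically" might look like the following sketch. `TlsBackend::Boring` is named in this document; the other variants, `ConfigError`, and `resolve_backend` are hypothetical names chosen for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TlsBackend {
    Rustls,
    NativeTls,
    Boring,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ConfigError {
    /// The caller explicitly chose a backend that cannot carry emulation.
    IncompatibleBackend { requested: TlsBackend },
    /// Emulation was requested but the Boring-family backend is not built in.
    BackendNotCompiled,
}

/// Resolves the TLS backend for a client or request.
///
/// Without emulation, the explicit choice wins and `Rustls` is the default.
/// With emulation, only the Boring-family backend is acceptable; anything
/// else is an error rather than a quiet fallback.
pub fn resolve_backend(
    explicit: Option<TlsBackend>,
    emulation_active: bool,
    boring_compiled: bool,
) -> Result<TlsBackend, ConfigError> {
    if !emulation_active {
        return Ok(explicit.unwrap_or(TlsBackend::Rustls));
    }
    match explicit {
        Some(TlsBackend::Boring) | None => {
            if boring_compiled {
                Ok(TlsBackend::Boring)
            } else {
                Err(ConfigError::BackendNotCompiled)
            }
        }
        Some(other) => Err(ConfigError::IncompatibleBackend { requested: other }),
    }
}
```

Returning an error for `Some(Rustls)` under emulation is the point of the design: the caller learns at build time that the combination cannot produce the requested fingerprint.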
## Connection-Pool Isolation
Pool isolation is mandatory.
Without strict pool isolation, different profiles may reuse the same already
established TLS or HTTP/2 connection. If that happens:
- the visible request headers may look like browser A
- the underlying TLS or HTTP/2 connection may still belong to browser B
That makes the emulation incoherent and easy to detect.
### Required pool-key dimensions
The pool key should include, directly or via a stable hash:
- TLS backend
- emulation profile identity
- TLS fingerprint identity
- HTTP/2 fingerprint identity
- relevant ALPN identity
A sketch:
```rust
pub struct EmulationConnectionKey {
    pub tls_backend: &'static str,
    pub emulation_profile_hash: Option<[u8; 32]>,
    pub tls_fingerprint_hash: Option<[u8; 32]>,
    pub http2_fingerprint_hash: Option<[u8; 32]>,
}
```
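
To show how such a key keeps profiles apart, here is a self-contained toy model. It makes several assumptions: fingerprints are represented as canonical strings, the hash is 64-bit FNV-1a standing in for the 32-byte digests in the sketch above (the real hash function is not specified here), and `PoolKey`, `fnv1a64`, and `connections_needed` are hypothetical names.

```rust
use std::collections::HashMap;

/// 64-bit FNV-1a. Chosen only because it is stable across runs and needs no
/// dependencies; a production key would likely use a cryptographic digest.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PoolKey {
    pub authority: String,
    pub tls_backend: &'static str,
    pub tls_fingerprint_hash: Option<u64>,
    pub http2_fingerprint_hash: Option<u64>,
}

impl PoolKey {
    /// `tls_fp` and `h2_fp` are canonical serializations of the fingerprints.
    pub fn new(
        authority: &str,
        tls_backend: &'static str,
        tls_fp: Option<&str>,
        h2_fp: Option<&str>,
    ) -> Self {
        PoolKey {
            authority: authority.to_string(),
            tls_backend,
            tls_fingerprint_hash: tls_fp.map(|s| fnv1a64(s.as_bytes())),
            http2_fingerprint_hash: h2_fp.map(|s| fnv1a64(s.as_bytes())),
        }
    }
}

/// Toy pool: counts how many distinct connections the given requests need.
pub fn connections_needed(keys: &[PoolKey]) -> usize {
    let mut pool: HashMap<&PoolKey, usize> = HashMap::new();
    for key in keys {
        *pool.entry(key).or_insert(0) += 1;
    }
    pool.len()
}
```

Two requests to the same authority share a connection only when every fingerprint dimension matches; changing just the HTTP/2 fingerprint yields a second connection.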
## HTTP/1 Scope for the Current Goal
For API traffic, HTTP/1 matters less than TLS and HTTP/2, but it still matters
in some environments.
Recommended minimum target:
- stable header output ordering
- explicit separation between HTTP/1 order and HTTP/2 order
- configured original header casing for selected field names
- configured field-name serialization for generated headers such as `Host` and
`Content-Length`
The internal generic `HeaderMap` still lowercases names for lookup and merge
semantics, but the HTTP/1 encoder now restores configured wire names through
`Http1Fingerprint.original_header_case`. This means profile-driven HTTP/1
serialization is covered for selected headers without requiring a second header
map implementation.
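
A minimal sketch of that restore step follows. It assumes `original_header_case` holds `(lowercase name, wire name)` pairs and that headers not listed in `header_order` keep their insertion order after the listed ones; both are assumptions about semantics this document does not spell out, and `serialize_headers` is a hypothetical function name.

```rust
pub struct Http1Fingerprint {
    pub header_order: Vec<String>,
    /// Assumed layout: (lowercase lookup name, name to write on the wire).
    pub original_header_case: Vec<(String, String)>,
}

/// Returns the configured wire spelling, or the name unchanged if none is set.
fn wire_name<'a>(fp: &'a Http1Fingerprint, name: &'a str) -> &'a str {
    fp.original_header_case
        .iter()
        .find(|(key, _)| key.eq_ignore_ascii_case(name))
        .map(|(_, wire)| wire.as_str())
        .unwrap_or(name)
}

/// Serializes header lines in profile order with profile casing.
pub fn serialize_headers(fp: &Http1Fingerprint, headers: &[(String, String)]) -> String {
    let mut out = String::new();
    let mut emitted = vec![false; headers.len()];

    // First pass: headers named in `header_order`, in that order.
    for wanted in &fp.header_order {
        for (i, (name, value)) in headers.iter().enumerate() {
            if !emitted[i] && name.eq_ignore_ascii_case(wanted) {
                out.push_str(&format!("{}: {}\r\n", wire_name(fp, name), value));
                emitted[i] = true;
            }
        }
    }
    // Second pass: everything else, in original insertion order.
    for (i, (name, value)) in headers.iter().enumerate() {
        if !emitted[i] {
            out.push_str(&format!("{}: {}\r\n", wire_name(fp, name), value));
        }
    }
    out
}
```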
## HTTP/2 Scope for the Current Goal
HTTP/2 is first-class for this work. It cannot be treated as an afterthought.
### Minimum target for anti-bot API traffic
- SETTINGS values
- SETTINGS order
- pseudo-header order
- regular-header order
- initial stream window
- initial connection window
- optional priority / dependency frames
- pool isolation keyed by HTTP/2 fingerprint
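
SETTINGS order is observable because each setting is written to the wire as a 16-bit identifier followed by a 32-bit value, in the order the client chooses. The sketch below encodes a SETTINGS payload in caller-supplied order and computes the WINDOW_UPDATE increment needed to raise the connection window above the protocol default of 65,535 bytes. The function names are hypothetical and the values in any usage are illustrative, not a captured browser profile.

```rust
/// Standard HTTP/2 SETTINGS identifiers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SettingId {
    HeaderTableSize = 0x1,
    EnablePush = 0x2,
    MaxConcurrentStreams = 0x3,
    InitialWindowSize = 0x4,
    MaxFrameSize = 0x5,
    MaxHeaderListSize = 0x6,
}

/// Encodes the SETTINGS frame payload (without the 9-byte frame header).
/// Entries are written exactly in slice order, which is what makes the
/// order part of the fingerprint.
pub fn encode_settings_payload(settings: &[(SettingId, u32)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(settings.len() * 6);
    for &(id, value) in settings {
        out.extend_from_slice(&(id as u16).to_be_bytes());
        out.extend_from_slice(&value.to_be_bytes());
    }
    out
}

/// Returns the WINDOW_UPDATE increment for stream 0 that lifts the
/// connection window from the default to `target`, or `None` when no
/// increase is needed.
pub fn connection_window_increment(target: u32) -> Option<u32> {
    const DEFAULT_CONNECTION_WINDOW: u32 = 65_535;
    target
        .checked_sub(DEFAULT_CONNECTION_WINDOW)
        .filter(|increment| *increment > 0)
}
```

Swapping two entries in the input slice produces a different payload with identical semantics, which is exactly the kind of difference a SETTINGS fingerprint picks up.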
### Why this matters
In practice, many anti-bot systems correlate:
- TLS fingerprint
- HTTP/2 SETTINGS fingerprint
- request header shape
Doing only TLS and ignoring HTTP/2 leaves the job half-finished.
## HTTP/3 / QUIC
HTTP/3 should remain outside the first implementation milestone.
Reason:
- QUIC and HTTP/3 fingerprinting is a distinct problem
- it should not be conflated with the TLS-over-TCP work
- the current API traffic goal can make meaningful progress with HTTPS + H2
Current guidance:
- keep the existing `h3` protocol entry points
- do not claim browser-grade HTTP/3 impersonation
- keep the current `Http3Fingerprint` minimal until the QUIC design is ready
### What exists today
The repository already has a minimal `Http3Fingerprint`:
```rust
pub struct Http3Fingerprint {
    pub alpn_protocols: Vec<String>,
}
```
That is enough to keep request/profile/pool identity shapes coherent, but it is
not enough to claim browser-grade QUIC impersonation.
### Recommended future split
When this work is resumed, the model should split QUIC transport behavior from
HTTP/3 application behavior instead of treating them as one flat struct.
```rust
pub struct QuicFingerprint {
    pub versions: Vec<String>,
    pub transport_parameter_order: Vec<String>,
    pub initial_max_data: Option<u64>,
    pub initial_max_stream_data_bidi_local: Option<u64>,
    pub initial_max_stream_data_bidi_remote: Option<u64>,
    pub initial_max_stream_data_uni: Option<u64>,
    pub initial_max_streams_bidi: Option<u64>,
    pub initial_max_streams_uni: Option<u64>,
    pub max_udp_payload_size: Option<u64>,
    pub active_connection_id_limit: Option<u64>,
    pub ack_delay_exponent: Option<u64>,
    pub max_ack_delay_millis: Option<u64>,
    pub disable_active_migration: Option<bool>,
}
```
```rust
pub struct Http3Fingerprint {
    pub alpn_protocols: Vec<String>,
    pub settings_order: Vec<String>,
    pub qpack_max_table_capacity: Option<u64>,
    pub qpack_blocked_streams: Option<u64>,
    pub priority_update_strategy: Option<String>,
}
```
### Minimum design rules for future QUIC/H3 emulation
- pool identity must include the full QUIC transport fingerprint, not just ALPN
- QUIC transport parameters and HTTP/3 SETTINGS must be modeled separately
- any future browser-grade claim must be backed by live QUIC / H3 probe tests,
not only unit tests
- do not silently reuse an h3 connection across different QUIC fingerprints
- do not promise browser-grade parity until the transport-parameter, SETTINGS,
and lifecycle probes exist together
## Proposed Fingerprint Types
### Generic TLS fingerprint
This type should remain backend-agnostic and limited to fields that make sense
across backends.
```rust
pub struct TlsFingerprint {
    pub alpn_protocols: Vec<String>,
    pub min_tls_version: Option<String>,
    pub max_tls_version: Option<String>,
}
```
### Boring-specific TLS fingerprint
This is where browser TLS fidelity should live.
```rust
pub struct BoringTlsFingerprint {
    pub cipher_list: Option<String>,
    pub curves_list: Option<String>,
    pub sigalgs_list: Option<String>,
    pub grease_enabled: Option<bool>,
    pub permute_extensions: Option<bool>,
    pub enable_ocsp_stapling: bool,
    pub enable_signed_cert_timestamps: bool,
    pub enable_ech_grease: bool,
    pub certificate_compression: Vec<BoringCertCompression>,
}
```
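
One way to keep this type testable offline is to translate it into an ordered plan of backend calls before touching any TLS context. The sketch below is an assumption about how such planning could look, not the repository's implementation. `BoringCall` and `plan_boring_calls` are hypothetical; the variant names mirror the setter names listed earlier in this document, and `certificate_compression` is left out to keep the example dependency-free.

```rust
#[derive(Debug, Clone, Default)]
pub struct BoringTlsFingerprint {
    pub cipher_list: Option<String>,
    pub curves_list: Option<String>,
    pub sigalgs_list: Option<String>,
    pub grease_enabled: Option<bool>,
    pub permute_extensions: Option<bool>,
    pub enable_ocsp_stapling: bool,
    pub enable_signed_cert_timestamps: bool,
    pub enable_ech_grease: bool,
}

/// One planned builder call; a real backend would map each to its setter.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BoringCall {
    SetCipherList(String),
    SetCurvesList(String),
    SetSigalgsList(String),
    SetGreaseEnabled(bool),
    SetPermuteExtensions(bool),
    EnableOcspStapling,
    EnableSignedCertTimestamps,
    SetEnableEchGrease(bool),
}

/// Builds the call plan. Unset `Option` fields produce no call, so backend
/// defaults stay untouched unless the profile says otherwise.
pub fn plan_boring_calls(fp: &BoringTlsFingerprint) -> Vec<BoringCall> {
    let mut calls = Vec::new();
    if let Some(ciphers) = &fp.cipher_list {
        calls.push(BoringCall::SetCipherList(ciphers.clone()));
    }
    if let Some(curves) = &fp.curves_list {
        calls.push(BoringCall::SetCurvesList(curves.clone()));
    }
    if let Some(sigalgs) = &fp.sigalgs_list {
        calls.push(BoringCall::SetSigalgsList(sigalgs.clone()));
    }
    if let Some(enabled) = fp.grease_enabled {
        calls.push(BoringCall::SetGreaseEnabled(enabled));
    }
    if let Some(enabled) = fp.permute_extensions {
        calls.push(BoringCall::SetPermuteExtensions(enabled));
    }
    if fp.enable_ocsp_stapling {
        calls.push(BoringCall::EnableOcspStapling);
    }
    if fp.enable_signed_cert_timestamps {
        calls.push(BoringCall::EnableSignedCertTimestamps);
    }
    if fp.enable_ech_grease {
        calls.push(BoringCall::SetEnableEchGrease(true));
    }
    calls
}
```

A plan like this can be compared against a golden list in unit tests without performing a handshake.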
### HTTP/1 fingerprint
```rust
pub struct Http1Fingerprint {
    pub header_order: Vec<String>,
    pub original_header_case: Vec<(String, String)>,
}
```
### HTTP/2 fingerprint
```rust
pub enum Http2PriorityPhase {
    BeforeHeaders,
    AfterHeaders,
}
```
```rust
pub struct Http2PrioritySpec {
    pub stream_id: Option<u32>,
    pub phase: Http2PriorityPhase,
    pub stream_dependency: u32,
    pub weight: u16,
    pub exclusive: bool,
}
```
```rust
pub struct Http2Fingerprint {
    pub settings_order: Vec<String>,
    pub pseudo_header_order: Vec<String>,
    pub regular_header_order: Vec<String>,
    pub header_table_size: Option<u32>,
    pub initial_window_size: Option<u32>,
    pub initial_connection_window_size: Option<u32>,
    pub max_frame_size: Option<u32>,
    pub priorities: Vec<Http2PrioritySpec>,
}
```
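
As an illustration of what a priority spec turns into on the wire, the following sketch encodes one HTTP/2 PRIORITY frame (type `0x2`, 5-byte payload). It assumes `weight` holds the logical weight in the range 1 to 256, which would explain the `u16` field; the frame itself carries `weight - 1` in one byte. The `phase` field is omitted because it affects when the frame is sent, not its bytes, and `encode_priority_frame` is a hypothetical name.

```rust
/// Reduced copy of the spec: only the fields that affect frame bytes.
pub struct Http2PrioritySpec {
    pub stream_id: Option<u32>,
    pub stream_dependency: u32,
    pub weight: u16,
    pub exclusive: bool,
}

/// Encodes a complete PRIORITY frame (9-byte header + 5-byte payload).
/// `request_stream_id` is used when the spec does not pin a stream id.
/// Returns `None` for values that cannot be represented on the wire.
pub fn encode_priority_frame(
    spec: &Http2PrioritySpec,
    request_stream_id: u32,
) -> Option<Vec<u8>> {
    const MAX_STREAM_ID: u32 = 0x7fff_ffff;
    let stream_id = spec.stream_id.unwrap_or(request_stream_id);
    if stream_id == 0
        || stream_id > MAX_STREAM_ID
        || spec.stream_dependency > MAX_STREAM_ID
        || !(1..=256).contains(&spec.weight)
    {
        return None;
    }

    let exclusive_bit: u32 = if spec.exclusive { 0x8000_0000 } else { 0 };
    let mut frame = Vec::with_capacity(14);
    frame.extend_from_slice(&[0, 0, 5]); // 24-bit payload length
    frame.push(0x2); // frame type: PRIORITY
    frame.push(0); // no flags are defined for PRIORITY
    frame.extend_from_slice(&stream_id.to_be_bytes());
    frame.extend_from_slice(&(spec.stream_dependency | exclusive_bit).to_be_bytes());
    frame.push((spec.weight - 1) as u8); // wire weight is logical weight minus one
    Some(frame)
}
```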
## Coverage Review Checklist
Status legend:
- `[x]` covered
- `[~]` partially covered
- `[ ]` missing
### Public API and feature gating
- `[x]` `ClientBuilder::emulation(..)` exists
- `[x]` `RequestBuilder::emulation(..)` exists
- `[x]` `emulation` Cargo feature
- `[x]` `btls-backend` backend feature
- `[x]` `btls-backend` can compile without `emulation`
- `[x]` request-level override semantics rewritten to be safe
- `[x]` preset/profile implementation excluded from non-`emulation` builds
- `[x]` backend capability matrix visible to maintainers and users
### Profile model
- `[x]` preset enum exists
- `[x]` `EmulationProfile` split into metadata / request / connection
- `[x]` metadata for profile identity and capture provenance
- `[x]` remove cookies from the generic profile model
- `[x]` versioned preset naming that matches actual captured profiles
### TLS backend integration
- `[x]` a Boring-family backend feature exists in current code (`btls-backend`)
- `[x]` `TlsBackend::Boring` exists
- `[x]` custom `async-io` wrapper for `btls::ssl::SslStream` exists
- `[x]` ALPN mapping exists
- `[x]` TLS min/max version mapping exists
- `[x]` cipher-list mapping exists
- `[x]` curves / key shares mapping
- `[x]` signature-algorithm mapping
- `[x]` GREASE controls
- `[x]` extension permutation controls
- `[x]` certificate compression controls
- `[x]` OCSP stapling controls
- `[x]` signed certificate timestamp controls
- `[x]` ECH grease controls
- `[x]` strict fidelity failure mode
### Connection isolation
- `[x]` pool key includes TLS backend name
- `[x]` H2 pool key includes SETTINGS payload bytes
- `[x]` pool key includes TLS fingerprint hash
- `[x]` pool key includes effective emulation connection hash
- `[x]` pool key includes full HTTP/2 fingerprint hash
- `[x]` request-level emulation semantics tightened to match connection reuse
### HTTP/1
- `[x]` request header order can be influenced explicitly
- `[x]` HTTP/1-specific fingerprint type
- `[x]` original header case preservation for configured headers
- `[x]` configured original field-name serialization, including generated headers
### HTTP/2
- `[x]` SETTINGS fingerprint support
- `[x]` header ordering support
- `[x]` SETTINGS order control
- `[x]` split pseudo-header vs regular-header order
- `[x]` initial connection window fingerprinting
- `[x]` priority / dependency fingerprinting
- `[~]` richer H2 lifecycle fingerprint behavior
Current partial coverage:
- arbitrary startup PRIORITY frames can now target placeholder stream IDs
- lifecycle tests cover both:
  - placeholder priority-tree frames before the first request HEADERS
  - explicit post-HEADERS reprioritization on the request stream
- more browser-specific multi-frame connection choreography is still not modeled
### HTTP/3
- `[~]` protocol entry points exist
- `[x]` QUIC / H3 fingerprint model
- `[~]` browser-grade QUIC fingerprinting design
Current partial coverage:
- the document now defines a concrete split between future QUIC transport and
HTTP/3 application fingerprints
- current code still only implements the minimal `Http3Fingerprint` surface
- browser-grade transport-parameter shaping and live QUIC probe validation are
still not implemented
### Validation and regression protection
- `[x]` unit tests cover basic emulation API behavior
- `[x]` unit tests cover H2 header ordering
- `[x]` unit tests cover H2 SETTINGS payload generation and ordering
- `[x]` lifecycle tests cover H2 startup frame sequence and request header order
- `[x]` integration tests cover request-level profile replacement semantics
- `[x]` offline golden tests for Boring TLS application planning
- `[x]` live ClientHello capture tests cover selected Boring TLS fields
- `[x]` normalized preset fixture corpus exists for current Boring TLS presets
- `[x]` canonical normalized byte-level ClientHello corpus exists for current presets
- `[x]` golden tests for pool isolation by profile hash
- `[x]` optional manual or nightly online smoke tests
The repository now includes an integration suite for online emulation smoke
checks. Its tests are marked `#[ignore]`, so they run only when requested
explicitly:
```bash
UGI_RUN_ONLINE_SMOKE=1 cargo test --test emulation_online_smoke --features emulation -- --ignored
```
Default target:
- `https://tls.peet.ws/api/all`
Override target:
- `UGI_EMULATION_SMOKE_URL=https://...`
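
How a smoke test might resolve these two variables is sketched below. The helper names are hypothetical, and treating only the exact value `1` as "enabled" is an assumption; the actual suite may accept other values.

```rust
use std::env;

pub const DEFAULT_SMOKE_URL: &str = "https://tls.peet.ws/api/all";

/// Assumed rule: the suite runs only when the flag is exactly "1".
pub fn smoke_enabled(run_flag: Option<&str>) -> bool {
    run_flag == Some("1")
}

/// Uses the override when it is present and non-blank, else the default.
pub fn smoke_target(override_url: Option<&str>) -> &str {
    match override_url {
        Some(url) if !url.trim().is_empty() => url,
        _ => DEFAULT_SMOKE_URL,
    }
}

/// Reads both variables from the process environment.
/// Returns `None` when the smoke suite should be skipped.
pub fn smoke_config_from_env() -> Option<String> {
    let run_flag = env::var("UGI_RUN_ONLINE_SMOKE").ok();
    if !smoke_enabled(run_flag.as_deref()) {
        return None;
    }
    let override_url = env::var("UGI_EMULATION_SMOKE_URL").ok();
    Some(smoke_target(override_url.as_deref()).to_string())
}
```

Keeping the decision logic in pure functions lets it be unit-tested without mutating the process environment.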
## Implementation Phases
The implementation can still be developed in ordered steps internally, but the
acceptance target should be one coherent milestone rather than a partial public
release.
### Coherent milestone target
- introduce `emulation` feature
- switch the high-fidelity backend path to `btls`
- split request vs connection profile state
- remove cookies from generic emulation
- define merge rules
- isolate pools by effective emulation/TLS/H2 fingerprint identity
- add `BoringTlsFingerprint`
- wire:
  - ALPN
  - cipher list
  - curves
  - sigalgs
  - GREASE
  - extension permutation
  - certificate compression
  - OCSP/SCT
  - ECH grease
- split HTTP/1 and HTTP/2 ordering
- implement:
  - SETTINGS order
  - pseudo-header and regular-header order
  - initial connection window
  - priorities
- add golden tests for fingerprint application
- add pool-isolation regression tests
### Explicitly deferred from the first milestone
- true browser-grade HTTP/3 / QUIC impersonation
- navigation/XHR/image request-context automation
- full browser runtime fingerprinting
## Remaining Clarifications Before Coding
The high-level product decisions are now mostly locked. The remaining items are
engineering clarifications, not strategy blockers.
### 1. Exact preset set for the first milestone
The document assumes versioned presets. We still need to choose the exact first
set, for example:
- `Chrome136`
- `Firefox128`
- `Safari18_4`
This is bounded and should be decided from the profile data we are willing to
maintain.
### 2. How far to evolve existing public types in place
Current exported types already exist:
- `Emulation`
- `BrowserProfile`
- `TlsFingerprint`
- `Http2Fingerprint`
Recommended default:
- keep `Emulation` as the top-level entry point
- reshape or replace the lower-level profile types as needed
- prefer correctness over preserving a clearly wrong abstraction
### 3. CI matrix
The CI matrix is the set of Cargo feature combinations we promise to compile
and test in automation.
Recommended default:
- default features
- `--features btls-backend`
- `--features emulation`
- `--features emulation,h3` if the h3 lane remains practical
### 4. First-pass Boring TLS field completeness
Locked product decision:
- do not ship a "TLS only" public milestone
- do not split core `btls` fingerprint controls across multiple public
iterations
Expected first complete branch scope:
- ALPN
- cipher list
- curves
- sigalgs
- GREASE
- extension permutation
- certificate compression
- OCSP/SCT
- ECH grease
## Final Recommendation
For the current `ugi` anti-bot API request goal, the design direction should be:
- keep `rustls` as the default general-purpose backend
- treat `btls-backend` as the primary high-fidelity TLS backend
- add a dedicated `emulation` feature
- keep the simple user-facing API centered on `.emulation(...)`
- automate protocol preference, merge behavior, and pool isolation
- enforce the required backend rather than silently downgrade
- defer request-context header automation and full HTTP/3 impersonation until
the TLS + HTTP/2 path is coherent
That gives ordinary users a one-line entry point while keeping the hard parts
inside the library where they belong.