# ClientConnector Refactor Design Document
## Background
The `client_chain` feature allows routing traffic through multiple proxy hops. During implementation, a bug was discovered: **direct connections with `bind_interface` don't work correctly**.
### The Bug
When a user configures:
```yaml
client_chain:
  - protocol: direct
    bind_interface: eth0
```
The connection fails because `create_transport()` tries to connect to `self.location`, which is `0.0.0.0:0` (UNSPECIFIED) for direct configs.
The root cause is that the `ClientConnector` trait conflates two different operations:
1. Creating a socket and connecting somewhere
2. Setting up a proxy protocol on the connection
For proxy connectors, these are coupled: connect to the proxy server, then set up the protocol.
For direct connectors, there is no proxy server - we connect directly to the target.
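To make the failure mode concrete, here is a minimal sketch of what the `0.0.0.0:0` placeholder looks like from the standard library's point of view, and of a guard that would reject it before dialing. The function names `direct_placeholder` and `ensure_connectable` are hypothetical and not part of the codebase; the real `NetLocation` type is replaced here by `std::net::SocketAddr`.

```rust
use std::io;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

/// The placeholder location that direct configs carry, per the bug description.
fn direct_placeholder() -> SocketAddr {
    SocketAddr::new(IpAddr::V4(Ipv4Addr::UNSPECIFIED), 0)
}

/// Hypothetical guard: refuse to dial an address that is only a placeholder.
fn ensure_connectable(addr: SocketAddr) -> io::Result<SocketAddr> {
    if addr.ip().is_unspecified() || addr.port() == 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            format!("{addr} is a placeholder, not a connect destination"),
        ));
    }
    Ok(addr)
}
```

A guard like this only turns the bug into a clearer error; the design below removes the need for the placeholder entirely.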
## Design Goals
1. Fix the `bind_interface` bug for direct connections
2. Support mixed pools (direct + proxy in same round-robin pool)
3. Keep the design clean and explicit about what each connector type does
4. Validate invalid configurations at load time where possible
## Key Insight: First Hop is Different
The key realization is that **hop 0 is fundamentally different from hops 1+**:
| Hop | What it does |
|-----|--------------|
| Hop 0 | Creates the TCP connection (uses bind_interface, connects to an address) |
| Hop 1+ | Sets up protocol on existing stream (no socket creation) |
A direct/local connector at hop 1+ is meaningless - the TCP connection already exists. You can't "directly connect" when you're already tunneled through hop 0.
## Use Cases That Must Work
### 1. Direct connection with bind_interface
```yaml
client_chain:
  - protocol: direct
    bind_interface: eth0
```
Connect directly to remote using eth0.
### 2. Proxy connection
```yaml
client_chain:
  - protocol: vless
    address: proxy.example.com:443
```
Connect to the proxy, then set up the VLESS protocol targeting remote.
### 3. Direct first hop, then proxy
```yaml
client_chain:
  - protocol: direct
    bind_interface: eth0
  - protocol: vless
    address: proxy.example.com:443
```
Connect to proxy.example.com:443 using eth0, then set up VLESS to remote.
### 4. Multi-hop proxy chain
```yaml
client_chain:
  - protocol: socks5
    address: hop1.example.com:1080
  - protocol: vless
    address: hop2.example.com:443
```
Connect to hop1, set up SOCKS5 to hop2, then set up VLESS to remote.
### 5. Pool of direct connectors with different interfaces
```yaml
- client_group: network_interfaces
  client_proxies:
    - protocol: direct
      bind_interface: eth0
    - protocol: direct
      bind_interface: eth1
client_chain:
  - network_interfaces
  - some_proxy
```
Round-robin between eth0 and eth1 when connecting to some_proxy.
### 6. Mixed pool (direct + proxy) - ShadowTLS use case
```yaml
- client_group: shadowtls_pool
  client_proxies:
    - protocol: direct
      bind_interface: eth0
    - protocol: shadowtls
      address: 1.2.3.4:443
    - protocol: direct
      bind_interface: eth1
client_chain:
  - shadowtls_pool
  - vless_proxy
```
Round-robin between:
- Direct to vless_proxy using eth0
- ShadowTLS to 1.2.3.4, then to vless_proxy
- Direct to vless_proxy using eth1
This is used for load balancing across different IPs/interfaces to reach the same destination.
## Proposed Design
### Two Methods for Different Hop Positions
```rust
#[async_trait]
pub trait ClientConnector: Send + Sync + Debug {
    /// Returns the bind interface for this connector, if any.
    fn bind_interface(&self) -> &Option<String>;

    /// Check if this connector supports UDP-over-TCP tunneling.
    fn supports_udp_over_tcp(&self) -> bool;

    /// Returns the proxy server address, if this is a proxy connector.
    /// Local/direct connectors return None.
    fn proxy_location(&self) -> Option<&NetLocation>;

    /// For hop 0: Create TCP connection and set up protocol (if any).
    ///
    /// - Local: connects to `target` using bind_interface, returns stream as-is
    /// - Proxy: connects to self.proxy_location() using bind_interface,
    ///   sets up protocol targeting `target`
    ///
    /// # Arguments
    /// * `resolver` - DNS resolver for address resolution
    /// * `target` - Where traffic should ultimately reach through this hop.
    ///   For local, this is the connect destination.
    ///   For proxy, this is the protocol target (proxy connects to its own server).
    async fn connect_as_first_hop(
        &self,
        resolver: &Arc<dyn Resolver>,
        target: &NetLocation,
    ) -> std::io::Result<TcpClientSetupResult>;

    /// For hop 1+: Set up protocol on existing stream.
    ///
    /// - Local: ERROR (Local only valid at hop 0)
    /// - Proxy: sets up protocol targeting `target`
    ///
    /// # Arguments
    /// * `stream` - Existing transport stream from previous hops
    /// * `target` - Where traffic should reach through this hop
    async fn setup_on_existing_stream(
        &self,
        stream: Box<dyn AsyncStream>,
        target: &NetLocation,
    ) -> std::io::Result<TcpClientSetupResult>;

    /// For hop 0 UDP: Create connection and set up UDP-over-TCP stream.
    async fn connect_udp_as_first_hop(
        &self,
        resolver: &Arc<dyn Resolver>,
        target: &NetLocation,
        preferred_type: UdpStreamType,
    ) -> std::io::Result<TcpClientUdpSetupResult>;

    /// For hop 1+ UDP: Set up UDP-over-TCP on existing stream.
    async fn setup_udp_on_existing_stream(
        &self,
        stream: Box<dyn AsyncStream>,
        target: &NetLocation,
        preferred_type: UdpStreamType,
    ) -> std::io::Result<TcpClientUdpSetupResult>;
}
```
### LocalConnector Implementation
```rust
/// For direct connections - socket config only, no protocol.
#[derive(Debug)]
pub struct LocalConnector {
    bind_interface: Option<String>,
    tcp_config: TcpConfig,
    // Future: quic_config for direct QUIC connections
}

#[async_trait]
impl ClientConnector for LocalConnector {
    fn bind_interface(&self) -> &Option<String> {
        &self.bind_interface
    }

    fn supports_udp_over_tcp(&self) -> bool {
        false // Direct connections don't tunnel UDP over TCP
    }

    fn proxy_location(&self) -> Option<&NetLocation> {
        None // No proxy server
    }

    async fn connect_as_first_hop(
        &self,
        resolver: &Arc<dyn Resolver>,
        target: &NetLocation,
    ) -> std::io::Result<TcpClientSetupResult> {
        // Connect directly to target using our bind_interface
        let target_addr = resolve_single_address(resolver, target).await?;
        let tcp_socket = new_tcp_socket(self.bind_interface.clone(), target_addr.is_ipv6())?;
        let stream = tcp_socket.connect(target_addr).await?;

        // Apply TCP settings (a failure to set TCP_NODELAY is deliberately ignored)
        if self.tcp_config.no_delay {
            let _ = stream.set_nodelay(true);
        }

        Ok(TcpClientSetupResult {
            client_stream: Box::new(stream),
            early_data: None,
        })
    }

    async fn setup_on_existing_stream(
        &self,
        _stream: Box<dyn AsyncStream>,
        _target: &NetLocation,
    ) -> std::io::Result<TcpClientSetupResult> {
        // Local/direct connector cannot be used as intermediate hop
        // The TCP connection already exists - "direct" makes no sense here
        Err(std::io::Error::new(
            std::io::ErrorKind::InvalidInput,
            "Direct connector cannot be used as intermediate hop (position > 0)",
        ))
    }

    async fn connect_udp_as_first_hop(
        &self,
        _resolver: &Arc<dyn Resolver>,
        _target: &NetLocation,
        _preferred_type: UdpStreamType,
    ) -> std::io::Result<TcpClientUdpSetupResult> {
        Err(std::io::Error::new(
            std::io::ErrorKind::Unsupported,
            "Direct connector does not support UDP-over-TCP",
        ))
    }

    async fn setup_udp_on_existing_stream(
        &self,
        _stream: Box<dyn AsyncStream>,
        _target: &NetLocation,
        _preferred_type: UdpStreamType,
    ) -> std::io::Result<TcpClientUdpSetupResult> {
        Err(std::io::Error::new(
            std::io::ErrorKind::Unsupported,
            "Direct connector does not support UDP-over-TCP",
        ))
    }
}
```
### TcpClientConnector Implementation (Proxy)
```rust
/// For proxy connections - has address + protocol handler.
#[derive(Debug)]
pub struct TcpClientConnector {
    bind_interface: Option<String>,
    location: NetLocation,
    transport_config: TransportConfig,
    client_handler: Box<dyn TcpClientHandler>,
}

#[async_trait]
impl ClientConnector for TcpClientConnector {
    fn bind_interface(&self) -> &Option<String> {
        &self.bind_interface
    }

    fn supports_udp_over_tcp(&self) -> bool {
        self.client_handler.supports_udp_over_tcp()
    }

    fn proxy_location(&self) -> Option<&NetLocation> {
        Some(&self.location)
    }

    async fn connect_as_first_hop(
        &self,
        resolver: &Arc<dyn Resolver>,
        target: &NetLocation,
    ) -> std::io::Result<TcpClientSetupResult> {
        // Connect to our proxy server (self.location)
        let stream = self.create_transport(resolver).await?;
        // Set up protocol targeting `target`
        self.client_handler.setup_client_stream(stream, target.clone()).await
    }

    async fn setup_on_existing_stream(
        &self,
        stream: Box<dyn AsyncStream>,
        target: &NetLocation,
    ) -> std::io::Result<TcpClientSetupResult> {
        // Set up protocol on existing stream
        self.client_handler.setup_client_stream(stream, target.clone()).await
    }

    async fn connect_udp_as_first_hop(
        &self,
        resolver: &Arc<dyn Resolver>,
        target: &NetLocation,
        preferred_type: UdpStreamType,
    ) -> std::io::Result<TcpClientUdpSetupResult> {
        let stream = self.create_transport(resolver).await?;
        self.client_handler.setup_udp_stream(stream, target.clone(), preferred_type).await
    }

    async fn setup_udp_on_existing_stream(
        &self,
        stream: Box<dyn AsyncStream>,
        target: &NetLocation,
        preferred_type: UdpStreamType,
    ) -> std::io::Result<TcpClientUdpSetupResult> {
        self.client_handler.setup_udp_stream(stream, target.clone(), preferred_type).await
    }
}
```
### Updated ClientProxyChain
```rust
pub struct ClientProxyChain {
    /// Each hop is a pool of connectors (for round-robin selection).
    /// Mixed pools (Local + Proxy) are allowed at hop 0.
    hops: Vec<Vec<Box<dyn ClientConnector>>>,
    /// Round-robin index for each hop's pool.
    next_indices: Vec<AtomicU32>,
}

impl ClientProxyChain {
    /// Find the next proxy location in the remaining connectors.
    /// Returns None if no proxy connectors remain (all are Local or empty).
    fn find_next_proxy_location<'a>(
        connectors: &[&'a dyn ClientConnector],
    ) -> Option<&'a NetLocation> {
        connectors.iter().find_map(|c| c.proxy_location())
    }

    pub async fn connect_tcp(
        &self,
        remote_location: NetLocation,
        resolver: &Arc<dyn Resolver>,
    ) -> std::io::Result<TcpClientSetupResult> {
        // An empty chain would make the indexing below panic, so reject it up front.
        if self.hops.is_empty() {
            return Err(std::io::Error::new(
                std::io::ErrorKind::InvalidInput,
                "client chain has no hops",
            ));
        }

        // Select one connector from each hop (round-robin)
        let connectors: Vec<&dyn ClientConnector> = (0..self.hops.len())
            .map(|i| self.select_from_pool(i))
            .collect();

        // Determine target for hop 0:
        // - If there are more hops, target is the first proxy location among them
        // - If no proxy locations remain, target is remote
        let hop0_target = Self::find_next_proxy_location(&connectors[1..])
            .unwrap_or(&remote_location);

        // Hop 0: create connection and set up protocol (if proxy)
        let mut result = connectors[0]
            .connect_as_first_hop(resolver, hop0_target)
            .await?;

        // Hops 1+: protocol setup only on existing stream
        for i in 1..connectors.len() {
            // Target for this hop: next proxy location, or remote if none
            let target = Self::find_next_proxy_location(&connectors[i + 1..])
                .unwrap_or(&remote_location);
            result = connectors[i]
                .setup_on_existing_stream(result.client_stream, target)
                .await?;
        }

        Ok(result)
    }
}
```
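`connect_tcp()` calls `select_from_pool()`, which is not shown in this document. A minimal sketch of round-robin selection over one hop's pool might look like the following; the `Pool` type and its `select` method are hypothetical stand-ins for the `hops` / `next_indices` pair, using only the standard library.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Stand-in for one hop's pool plus its round-robin counter.
struct Pool<T> {
    items: Vec<T>,
    next_index: AtomicU32,
}

impl<T> Pool<T> {
    fn new(items: Vec<T>) -> Self {
        assert!(!items.is_empty(), "a pool needs at least one connector");
        Pool { items, next_index: AtomicU32::new(0) }
    }

    /// Pick the next item. `fetch_add` wraps around on overflow, so the
    /// modulo below always yields a valid index.
    fn select(&self) -> &T {
        let n = self.next_index.fetch_add(1, Ordering::Relaxed) as usize;
        &self.items[n % self.items.len()]
    }
}
```

`Ordering::Relaxed` is enough here because the counter only spreads load; no other memory is synchronized through it.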
## Walkthrough: All Cases
### Case 1: Single hop - Direct only
**Config:**
```yaml
client_chain:
  - protocol: direct
    bind_interface: eth0
```
**Execution:**
```
connectors = [Local(eth0)]
hop0_target = find_next_proxy_location([]) = None → remote
connectors[0].connect_as_first_hop(resolver, remote)
→ LocalConnector: connect to `remote` using eth0
→ Return stream
```
**Result:** Direct TCP connection to remote using eth0. Correct.
---
### Case 2: Single hop - Proxy only
**Config:**
```yaml
client_chain:
  - protocol: vless
    address: proxy.example.com:443
```
**Execution:**
```
connectors = [Proxy(vless@proxy.example.com:443)]
hop0_target = find_next_proxy_location([]) = None → remote
connectors[0].connect_as_first_hop(resolver, remote)
→ TcpClientConnector: connect to proxy.example.com:443
→ Setup VLESS protocol targeting `remote`
→ Return stream
```
**Result:** Connect to proxy, VLESS handshake to remote. Correct.
---
### Case 3: Direct first hop, then proxy
**Config:**
```yaml
client_chain:
  - protocol: direct
    bind_interface: eth0
  - protocol: vless
    address: proxy.example.com:443
```
**Execution:**
```
connectors = [Local(eth0), Proxy(vless@proxy.example.com:443)]
hop0_target = find_next_proxy_location([Proxy(vless)]) = proxy.example.com:443
connectors[0].connect_as_first_hop(resolver, proxy.example.com:443)
→ LocalConnector: connect to proxy.example.com:443 using eth0
→ Return stream
hop1_target = find_next_proxy_location([]) = None → remote
connectors[1].setup_on_existing_stream(stream, remote)
→ TcpClientConnector: setup VLESS protocol targeting `remote`
→ Return stream
```
**Result:** Connect to proxy using eth0, VLESS handshake to remote. Correct.
---
### Case 4: Two-hop proxy chain
**Config:**
```yaml
client_chain:
  - protocol: socks5
    address: hop1.example.com:1080
  - protocol: vless
    address: hop2.example.com:443
```
**Execution:**
```
connectors = [Proxy(socks5@hop1), Proxy(vless@hop2)]
hop0_target = find_next_proxy_location([Proxy(vless@hop2)]) = hop2.example.com:443
connectors[0].connect_as_first_hop(resolver, hop2.example.com:443)
→ TcpClientConnector: connect to hop1.example.com:1080
→ Setup SOCKS5 protocol targeting hop2.example.com:443
→ Return stream
hop1_target = find_next_proxy_location([]) = None → remote
connectors[1].setup_on_existing_stream(stream, remote)
→ TcpClientConnector: setup VLESS protocol targeting `remote`
→ Return stream
```
**Result:** Connect to hop1, SOCKS5 to hop2, VLESS to remote. Correct.
---
### Case 5: Pool of direct connectors
**Config:**
```yaml
- client_group: network_interfaces
  client_proxies:
    - protocol: direct
      bind_interface: eth0
    - protocol: direct
      bind_interface: eth1
client_chain:
  - network_interfaces
  - protocol: vless
    address: proxy.example.com:443
```
**Execution (selection: eth0):**
```
connectors = [Local(eth0), Proxy(vless@proxy)]
hop0_target = find_next_proxy_location([Proxy(vless)]) = proxy.example.com:443
connectors[0].connect_as_first_hop(resolver, proxy.example.com:443)
→ LocalConnector: connect to proxy.example.com:443 using eth0
→ Return stream
hop1_target = remote
connectors[1].setup_on_existing_stream(stream, remote)
→ Setup VLESS to remote
```
**Execution (selection: eth1):**
```
connectors = [Local(eth1), Proxy(vless@proxy)]
hop0_target = proxy.example.com:443
connectors[0].connect_as_first_hop(resolver, proxy.example.com:443)
→ LocalConnector: connect to proxy.example.com:443 using eth1
→ Return stream
hop1_target = remote
connectors[1].setup_on_existing_stream(stream, remote)
→ Setup VLESS to remote
```
**Result:** Round-robin between eth0 and eth1 for connecting to the proxy. Correct.
---
### Case 6: Mixed pool (ShadowTLS use case)
**Config:**
```yaml
- client_group: shadowtls_pool
  client_proxies:
    - protocol: direct
      bind_interface: eth0
    - protocol: shadowtls
      address: 1.2.3.4:443
    - protocol: direct
      bind_interface: eth1
client_chain:
  - shadowtls_pool
  - protocol: vless
    address: 5.6.7.8:443
```
**Execution (selection: direct_eth0):**
```
connectors = [Local(eth0), Proxy(vless@5.6.7.8)]
hop0_target = find_next_proxy_location([Proxy(vless)]) = 5.6.7.8:443
connectors[0].connect_as_first_hop(resolver, 5.6.7.8:443)
→ LocalConnector: connect to 5.6.7.8:443 using eth0
→ Return stream
hop1_target = remote
connectors[1].setup_on_existing_stream(stream, remote)
→ Setup VLESS to remote
```
**Execution (selection: shadowtls@1.2.3.4):**
```
connectors = [Proxy(shadowtls@1.2.3.4), Proxy(vless@5.6.7.8)]
hop0_target = find_next_proxy_location([Proxy(vless)]) = 5.6.7.8:443
connectors[0].connect_as_first_hop(resolver, 5.6.7.8:443)
→ TcpClientConnector: connect to 1.2.3.4:443
→ Setup ShadowTLS protocol targeting 5.6.7.8:443
→ Return stream
hop1_target = remote
connectors[1].setup_on_existing_stream(stream, remote)
→ Setup VLESS to remote
```
**Execution (selection: direct_eth1):**
```
connectors = [Local(eth1), Proxy(vless@5.6.7.8)]
hop0_target = 5.6.7.8:443
connectors[0].connect_as_first_hop(resolver, 5.6.7.8:443)
→ LocalConnector: connect to 5.6.7.8:443 using eth1
→ Return stream
hop1_target = remote
connectors[1].setup_on_existing_stream(stream, remote)
→ Setup VLESS to remote
```
**Result:** Round-robin between direct-eth0, shadowtls, direct-eth1. All three paths correctly reach the vless proxy and then remote. Correct.
---
### Case 7: Single hop - Mixed pool
**Config:**
```yaml
- client_group: mixed
  client_proxies:
    - protocol: direct
      bind_interface: eth0
    - protocol: socks5
      address: proxy.example.com:1080
client_chain:
  - mixed
```
**Execution (selection: direct_eth0):**
```
connectors = [Local(eth0)]
hop0_target = find_next_proxy_location([]) = None → remote
connectors[0].connect_as_first_hop(resolver, remote)
→ LocalConnector: connect to remote using eth0
```
**Execution (selection: socks5):**
```
connectors = [Proxy(socks5@proxy)]
hop0_target = find_next_proxy_location([]) = None → remote
connectors[0].connect_as_first_hop(resolver, remote)
→ TcpClientConnector: connect to proxy.example.com:1080
→ Setup SOCKS5 targeting remote
```
**Result:** Round-robin between direct and proxied connections to remote. Correct.
---
### Case 8: Invalid - Direct at hop 1+
**Config:**
```yaml
client_chain:
  - protocol: socks5
    address: hop1.example.com:1080
  - protocol: direct
    bind_interface: eth0
  - protocol: vless
    address: hop2.example.com:443
```
**Execution:**
```
connectors = [Proxy(socks5), Local(eth0), Proxy(vless)]
hop0_target = find_next_proxy_location([Local(eth0), Proxy(vless)]) = hop2.example.com:443
connectors[0].connect_as_first_hop(resolver, hop2.example.com:443)
→ Connect to hop1, SOCKS5 to hop2
→ Return stream
hop1_target = find_next_proxy_location([Proxy(vless)]) = hop2.example.com:443
connectors[1].setup_on_existing_stream(stream, hop2.example.com:443)
→ LocalConnector: ERROR "Direct connector cannot be used as intermediate hop"
```
**Result:** Runtime error. Could also be caught at config validation time.
---
### Case 9: Three-hop chain with direct first
**Config:**
```yaml
client_chain:
  - protocol: direct
    bind_interface: eth0
  - protocol: socks5
    address: hop1.example.com:1080
  - protocol: vless
    address: hop2.example.com:443
```
**Execution:**
```
connectors = [Local(eth0), Proxy(socks5@hop1), Proxy(vless@hop2)]
hop0_target = find_next_proxy_location([Proxy(socks5), Proxy(vless)]) = hop1.example.com:1080
connectors[0].connect_as_first_hop(resolver, hop1.example.com:1080)
→ LocalConnector: connect to hop1.example.com:1080 using eth0
→ Return stream
hop1_target = find_next_proxy_location([Proxy(vless)]) = hop2.example.com:443
connectors[1].setup_on_existing_stream(stream, hop2.example.com:443)
→ Setup SOCKS5 targeting hop2.example.com:443
→ Return stream
hop2_target = find_next_proxy_location([]) = None → remote
connectors[2].setup_on_existing_stream(stream, remote)
→ Setup VLESS targeting remote
→ Return stream
```
**Result:** Connect to hop1 using eth0, SOCKS5 to hop2, VLESS to remote. Correct.
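The target selection traced in the cases above can be checked with a small synchronous model. This is only a sketch: the `Hop` enum and `hop_targets` function are hypothetical, addresses are plain strings instead of `NetLocation`, and no sockets are involved. It mirrors the `find_next_proxy_location(...).unwrap_or(&remote_location)` logic for every hop position.

```rust
/// Simplified stand-in for a selected connector: Local has no proxy address.
#[derive(Debug, Clone, PartialEq)]
enum Hop {
    Local,
    Proxy(String),
}

impl Hop {
    fn proxy_location(&self) -> Option<&str> {
        match self {
            Hop::Local => None,
            Hop::Proxy(addr) => Some(addr.as_str()),
        }
    }
}

/// For each hop, the target it is asked to reach: the next proxy address
/// later in the chain, or `remote` when none remains.
fn hop_targets<'a>(chain: &'a [Hop], remote: &'a str) -> Vec<&'a str> {
    (0..chain.len())
        .map(move |i| {
            chain[i + 1..]
                .iter()
                .find_map(|h| h.proxy_location())
                .unwrap_or(remote)
        })
        .collect()
}
```

For the chain in Case 9 (`Local`, `socks5@hop1`, `vless@hop2`) the model yields the targets hop1, hop2, and remote, in that order, matching the trace.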
## Migration Path
### Changes to ClientConfig
The `ClientConfig` struct remains largely the same. The change is in how it maps to connectors:
```rust
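impl ClientConfig {
    pub fn into_connector(self) -> Box<dyn ClientConnector> {
        if self.protocol.is_direct() {
            // Direct configs carry socket settings only, no proxy address or handler.
            Box::new(LocalConnector {
                bind_interface: self.bind_interface.into_option(),
                tcp_config: self.tcp_settings.unwrap_or_default(),
            })
        } else {
            // Note: `unwrap()` panics if the proxy config is invalid, so the
            // config must already have been validated before this point.
            Box::new(TcpClientConnector::try_from(self).unwrap())
        }
    }
}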
```
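Because `into_connector` unwraps the result of `try_from`, an invalid proxy config would panic rather than surface as a load-time error, which sits awkwardly with design goal 4. One possible alternative, sketched here with hypothetical stand-in types (the real `ClientConfig`, `LocalConnector`, and `TcpClientConnector` are not reproduced), is to make the conversion fallible and let the caller report the error.

```rust
use std::io;

/// Hypothetical stand-in for the two connector kinds.
#[derive(Debug)]
enum Connector {
    Local { bind_interface: Option<String> },
    Proxy { address: String },
}

/// Hypothetical, heavily reduced config.
struct ClientConfig {
    is_direct: bool,
    bind_interface: Option<String>,
    address: Option<String>,
}

/// Fallible conversion: invalid proxy configs become errors, not panics.
fn into_connector(config: ClientConfig) -> io::Result<Connector> {
    if config.is_direct {
        return Ok(Connector::Local { bind_interface: config.bind_interface });
    }
    match config.address {
        Some(address) => Ok(Connector::Proxy { address }),
        None => Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "proxy config needs an address",
        )),
    }
}
```

Whether the real conversion should return `std::io::Result` or a dedicated config error type is a separate choice; the sketch only shows the shape.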
### Changes to TcpClientConnector
1. Change `client_handler` from `Option<Box<dyn TcpClientHandler>>` to `Box<dyn TcpClientHandler>` - a proxy connector always has a handler, so the `Option` is no longer needed
2. Remove direct-handling code from `try_from()` - direct configs create `LocalConnector` instead
3. Implement new trait methods `connect_as_first_hop` and `setup_on_existing_stream`
4. Remove old `create_transport()` and `setup_client_stream()` methods (or keep as private helpers)
### Changes to ClientProxyChain
1. Update `connect_tcp()` to use new two-phase approach
2. Update `connect_udp()` similarly
3. Add `find_next_proxy_location()` helper
### Validation
Optionally add config-time validation to reject direct connectors at hop 1+:
```rust
fn validate_chain(hops: &[Vec<ClientConfig>]) -> Result<(), ConfigError> {
    // Hop 0 may contain direct connectors, so start checking at hop 1.
    for (i, pool) in hops.iter().enumerate().skip(1) {
        for config in pool {
            if config.protocol.is_direct() {
                return Err(ConfigError::InvalidChain(
                    format!("Direct connector at hop {} is invalid (only allowed at hop 0)", i)
                ));
            }
        }
    }
    Ok(())
}
```
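The check above can be exercised in isolation with stand-in types. In this sketch `Protocol`, `ClientConfig`, `ConfigError`, and the `cfg` helper are simplified, hypothetical versions of the real types, reduced to what the validation needs.

```rust
/// Hypothetical stand-ins for the real config types.
#[derive(Debug, PartialEq)]
enum Protocol {
    Direct,
    Socks5,
    Vless,
}

impl Protocol {
    fn is_direct(&self) -> bool {
        matches!(self, Protocol::Direct)
    }
}

struct ClientConfig {
    protocol: Protocol,
}

#[derive(Debug, PartialEq)]
enum ConfigError {
    InvalidChain(String),
}

/// Same rule as above: direct connectors are only valid in the hop-0 pool.
fn validate_chain(hops: &[Vec<ClientConfig>]) -> Result<(), ConfigError> {
    for (i, pool) in hops.iter().enumerate().skip(1) {
        if pool.iter().any(|c| c.protocol.is_direct()) {
            return Err(ConfigError::InvalidChain(format!(
                "Direct connector at hop {i} is invalid (only allowed at hop 0)"
            )));
        }
    }
    Ok(())
}

/// Small helper to keep test chains readable.
fn cfg(protocol: Protocol) -> ClientConfig {
    ClientConfig { protocol }
}
```

A mixed pool at hop 0 passes, while the chain from Case 8 (direct at hop 1) is rejected before any connection is attempted.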
## Summary
The refactor introduces a clean separation:
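| Connector | Role | `proxy_location()` | `connect_as_first_hop` (hop 0) | `setup_on_existing_stream` (hop 1+) |
|-----------|------|--------------------|--------------------------------|-------------------------------------|
| `LocalConnector` | Socket config only | `None` | Connect to target | ERROR |
| `TcpClientConnector` | Proxy + protocol | `Some(addr)` | Connect to proxy, setup to target | Setup to target |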
The key insight is that hop 0 creates the TCP connection while hops 1+ only do protocol setup. This maps naturally to two different methods on the trait, with `LocalConnector` only supporting the first-hop operation.
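Mixed pools work correctly because every hop-0 connector receives the same `target`: a `LocalConnector` connects to it directly, while a `TcpClientConnector` connects to its own proxy server and asks it to reach the target. `find_next_proxy_location()` supplies that target by scanning the later hops for the next proxy server address, skipping any entry that has none, and falls back to the remote when no proxy remains.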