remote/
lib.rs

//! Remote copy protocol and networking for distributed file operations
//!
//! This crate provides the networking layer and protocol definitions for remote file copying
//! in the RCP tools suite. It enables efficient distributed copying between remote hosts using
//! SSH for orchestration and QUIC for high-performance data transfer.
//!
//! # Overview
//!
//! The remote copy system uses a three-node architecture:
//!
//! ```text
//! Master (rcp)
//! ├── SSH → Source Host (rcpd)
//! │   ├── QUIC → Master (control)
//! │   └── QUIC Server (waits for Destination)
//! └── SSH → Destination Host (rcpd)
//!     ├── QUIC → Master (control)
//!     └── QUIC Client → Source (data transfer)
//! ```
//!
//! ## Connection Flow
//!
//! 1. **Initialization**: Master starts `rcpd` processes on source and destination via SSH
//! 2. **Control Connections**: Both `rcpd` processes connect back to Master via QUIC
//! 3. **Address Exchange**: Source starts QUIC server and sends its address to Master
//! 4. **Direct Connection**: Master forwards address to Destination, which connects to Source
//! 5. **Data Transfer**: Files flow directly from Source to Destination (not through Master)
//!
//! This design keeps file data off the Master's network path while still letting the Master
//! coordinate operations and monitor progress.
//!
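/*!
The address exchange in steps 3 and 4 can be sketched with plain channels standing in for the
two control connections. This is an illustrative model only: `relay_source_addr` and the
channel names are hypothetical, and the real crate exchanges these messages over QUIC streams.

```rust
use std::net::SocketAddr;
use std::sync::mpsc;
use std::thread;

/// Models steps 3-4: Source reports its address to Master, which forwards it
/// to Destination. Returns the address Destination would connect to.
fn relay_source_addr(source_addr: SocketAddr) -> SocketAddr {
    // hypothetical stand-ins for the two control connections (not QUIC)
    let (source_to_master, master_from_source) = mpsc::channel::<SocketAddr>();
    let (master_to_dest, dest_from_master) = mpsc::channel::<SocketAddr>();

    // step 3: Source has started its server and reports where it listens
    let source = thread::spawn(move || source_to_master.send(source_addr).unwrap());

    // step 4: Master forwards the address; it never handles file data
    let master = thread::spawn(move || {
        let addr = master_from_source.recv().unwrap();
        master_to_dest.send(addr).unwrap();
    });

    // Destination learns where to connect for the data transfer
    let addr = dest_from_master.recv().unwrap();
    source.join().unwrap();
    master.join().unwrap();
    addr
}
```

In this model the Master only relays the address, which mirrors how file data bypasses it.

*/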
//! # Key Components
//!
//! ## SSH Session Management
//!
//! The [`SshSession`] type describes an SSH connection to a remote host and is used to:
//! - Launch `rcpd` daemons on remote hosts (see [`start_rcpd`])
//! - Configure connection parameters (user, host, port)
//!
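/*!
A minimal sketch of how the optional user and port combine with the host into an `ssh://`
destination. The function name `ssh_destination` is hypothetical; it mirrors the matching
done by this crate's private session setup rather than exposing a public API.

```rust
/// Builds an `ssh://` destination, omitting the user and port when absent.
fn ssh_destination(user: Option<&str>, host: &str, port: Option<u16>) -> String {
    match (user, port) {
        (Some(user), Some(port)) => format!("ssh://{user}@{host}:{port}"),
        (None, Some(port)) => format!("ssh://{host}:{port}"),
        (Some(user), None) => format!("ssh://{user}@{host}"),
        (None, None) => format!("ssh://{host}"),
    }
}
```

For example, `ssh_destination(Some("user"), "example.com", None)` yields
`ssh://user@example.com`, leaving the port for SSH to choose.

*/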
//! ## QUIC Networking
//!
//! The QUIC protocol provides:
//! - Multiplexed streams over a single connection
//! - Built-in encryption and authentication
//! - Efficient data transfer with congestion control
//!
//! Key functions:
//! - [`get_server_with_port_ranges`] - Create QUIC server endpoint with optional port
//!   restrictions; also returns the SHA-256 fingerprint of its certificate
//! - [`get_client_with_port_ranges_and_pinning`] - Create secure QUIC client with certificate pinning
//! - [`get_endpoint_addr`] - Get the local address of an endpoint
//!
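/*!
Because endpoints bind to `0.0.0.0`, reporting a usable address requires finding the local IP
that the operating system would route from. One common technique, sketched below with only the
standard library, is to connect a UDP socket and read back its local address. The name
`local_ip_towards` and its `target` parameter are hypothetical; this crate's private helper uses
a fixed public target instead.

```rust
use std::net::{IpAddr, SocketAddr, UdpSocket};

/// Returns the local IP address the OS would use to reach `target`.
fn local_ip_towards(target: SocketAddr) -> std::io::Result<IpAddr> {
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    // connecting a UDP socket sends no packets; it only selects a route
    socket.connect(target)?;
    Ok(socket.local_addr()?.ip())
}
```

On a multi-homed host the result depends on the routing table, so the chosen target matters.

*/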
//! ## Port Range Configuration
//!
//! The [`port_ranges`] module allows restricting QUIC to specific port ranges, useful for
//! firewall-restricted environments:
//!
//! ```rust,no_run
//! # use remote::get_server_with_port_ranges;
//! // Bind to a port in the 8000-8999 range; the certificate fingerprint is returned as well
//! let (endpoint, cert_fingerprint) = get_server_with_port_ranges(Some("8000-8999"))?;
//! # Ok::<(), anyhow::Error>(())
//! ```
//!
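/*!
How a range string might turn into a bound socket can be sketched as follows. This is an
assumption-laden illustration, not the [`port_ranges`] implementation: `parse_range` and
`bind_in_range` are hypothetical names, only a single `start-end` range is handled, and ports
are tried in ascending order.

```rust
use std::net::{Ipv4Addr, UdpSocket};

/// Parses one "start-end" range; returns None if malformed or reversed.
fn parse_range(s: &str) -> Option<(u16, u16)> {
    let (start, end) = s.split_once('-')?;
    let start: u16 = start.trim().parse().ok()?;
    let end: u16 = end.trim().parse().ok()?;
    (start <= end).then_some((start, end))
}

/// Binds the first free UDP port in the inclusive range on all interfaces.
fn bind_in_range(start: u16, end: u16) -> std::io::Result<UdpSocket> {
    let mut last_err = None;
    for port in start..=end {
        match UdpSocket::bind((Ipv4Addr::UNSPECIFIED, port)) {
            Ok(socket) => return Ok(socket),
            // remember the failure and try the next port
            Err(e) => last_err = Some(e),
        }
    }
    Err(last_err.unwrap_or_else(|| {
        std::io::Error::new(std::io::ErrorKind::AddrInUse, "no ports in range")
    }))
}
```

Trying ports in order keeps the sketch deterministic; a real implementation may prefer a
randomized starting point to reduce collisions between concurrent processes.

*/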
//! ## Protocol Messages
//!
//! The [`protocol`] module defines the message types exchanged between nodes:
//! - `MasterHello` - Master → rcpd configuration
//! - `SourceMasterHello` - Source → Master address information
//! - `RcpdResult` - rcpd → Master operation results
//! - `TracingHello` - rcpd → Master tracing initialization
//!
//! ## Stream Communication
//!
//! The [`streams`] module provides high-level abstractions over QUIC streams:
//! - Bidirectional streams for request/response communication
//! - Unidirectional streams for tracing and logging
//! - Object serialization/deserialization using bincode
//!
//! ## Remote Tracing
//!
//! The [`tracelog`] module enables distributed logging and progress tracking:
//! - Forward tracing events from remote `rcpd` processes to Master
//! - Aggregate progress information across multiple remote operations
//! - Display unified progress for distributed operations
//!
//! # Security Model
//!
//! The remote copy system implements a defense-in-depth security model using SSH for authentication
//! and certificate pinning for QUIC connection integrity. This provides protection against
//! man-in-the-middle (MITM) attacks while maintaining ease of deployment.
//!
//! ## Authentication & Authorization
//!
//! **SSH is the security perimeter**: All remote operations begin with SSH authentication.
//! - Initial access control is handled entirely by SSH
//! - Users must be authenticated and authorized via SSH before any QUIC connections are established
//! - SSH configuration (keys, permissions, etc.) determines who can initiate remote copies
//!
//! ## Transport Encryption & Integrity
//!
//! **QUIC with TLS 1.3**: All data transfer uses the QUIC protocol, which is built on TLS 1.3
//! - Provides encryption for data confidentiality
//! - Ensures data integrity through cryptographic authentication
//! - Built-in protection against replay attacks
//!
//! ## Trust Bootstrap via Certificate Pinning
//!
//! **Two kinds of pinned QUIC connections** are used in every remote copy operation:
//!
//! ### 1. Master ← rcpd (Control Connection)
//! ```text
//! Master (rcp)                    Remote Host (rcpd)
//!    |                                   |
//!    | 1. SSH connection established     |
//!    |<--------------------------------->|
//!    | 2. Master generates self-signed   |
//!    |    cert, computes SHA-256         |
//!    |    fingerprint                    |
//!    |                                   |
//!    | 3. Launch rcpd via SSH with       |
//!    |    fingerprint as argument        |
//!    |---------------------------------->|
//!    |                                   |
//!    | 4. rcpd validates Master's cert   |
//!    |    against received fingerprint   |
//!    |<---(QUIC + cert pinning)----------|
//! ```
//!
//! - Master generates ephemeral self-signed certificate at startup
//! - Certificate fingerprint (SHA-256) is passed to rcpd via SSH command-line arguments
//! - rcpd validates Master's certificate by computing its fingerprint and comparing
//! - Connection fails if fingerprints don't match (MITM protection)
//!
//! ### 2. Source → Destination (Data Transfer Connection)
//! ```text
//! Source (rcpd)                   Destination (rcpd)
//!    |                                   |
//!    | 1. Source generates self-signed   |
//!    |    cert, computes SHA-256         |
//!    |    fingerprint                    |
//!    |                                   |
//!    | 2. Send fingerprint + address     |
//!    |    to Master via secure channel   |
//!    |---------------------------------->|
//!    |                    Master         |
//!    |                      |            |
//!    | 3. Master forwards   |            |
//!    |    to Destination    |            |
//!    |                      |----------->|
//!    |                                   |
//!    | 4. Destination validates Source's |
//!    |    cert against received          |
//!    |    fingerprint                    |
//!    |<---(QUIC + cert pinning)----------|
//! ```
//!
//! - Source generates ephemeral self-signed certificate
//! - Fingerprint is sent to Master over already-secured Master←Source connection
//! - Master forwards fingerprint to Destination over already-secured Master←Destination connection
//! - Destination validates Source's certificate against fingerprint
//! - Direct Source→Destination connection established only after successful validation
//!
//! ## SSH as Secure Out-of-Band Channel
//!
//! **Key insight**: SSH provides a secure, authenticated channel for bootstrapping QUIC trust.
//!
//! - Certificate fingerprints are transmitted through SSH (Master→rcpd command-line arguments)
//! - SSH connection is already authenticated and encrypted
//! - This creates a "chain of trust":
//!   1. User trusts SSH (proven by successful authentication)
//!   2. SSH carries the certificate fingerprint securely
//!   3. QUIC connection validates against that fingerprint
//!   4. Therefore, QUIC connection is trustworthy
//!
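/*!
The pinning check at the end of this chain can be sketched with the standard library alone.
The helper names below are hypothetical, and hex is assumed as the text encoding for the
fingerprint argument (this crate hex-encodes fingerprints in its logs). Hashing is left out:
`presented` stands for the SHA-256 digest already computed over the peer's DER certificate.

```rust
/// Hex-encodes a fingerprint so it can travel as a command-line argument.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

/// Decodes a hex string; returns None on odd length or non-hex characters.
fn from_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// Accepts the peer only if its fingerprint equals the pinned one.
fn fingerprint_matches(pinned_hex: &str, presented: &[u8]) -> bool {
    from_hex(pinned_hex).as_deref() == Some(presented)
}
```

A malformed pinned value is treated as a mismatch, so the connection fails closed.

*/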
//! ## Attack Resistance
//!
//! ### ✅ Protected Against
//!
//! - **Man-in-the-Middle (MITM)**: Certificate pinning prevents attackers from impersonating endpoints
//! - **Replay Attacks**: TLS 1.3 in QUIC provides built-in replay protection
//! - **Eavesdropping**: All data encrypted with TLS 1.3
//! - **Tampering**: Cryptographic integrity checks prevent data modification
//! - **Unauthorized Access**: SSH authentication is required before any operations
//!
//! ### ⚠️ Threat Model Assumptions
//!
//! - **SSH is secure**: The security model depends on SSH being properly configured and uncompromised
//! - **Certificate fingerprints are short-lived**: Ephemeral certificates are generated per-session
//! - **Trusted Master host**: The machine running Master (rcp) should be trusted
//!
//! ## Best Practices
//!
//! 1. **Secure SSH Configuration**: Use key-based authentication, disable password auth
//! 2. **Keep Systems Updated**: Ensure SSH, TLS libraries, and QUIC implementations are current
//! 3. **Network Segmentation**: Run remote copies on trusted network segments when possible
//! 4. **Monitor Logs**: Certificate validation failures indicate potential security issues
//!
//! # Network Troubleshooting
//!
//! Common failure scenarios and their handling:
//!
//! ## SSH Connection Fails
//! - **Cause**: Host unreachable, authentication failure
//! - **Timeout**: ~30s (SSH default)
//! - **Error**: Standard SSH error messages
//!
//! ## rcpd Cannot Connect to Master
//! - **Cause**: Firewall blocks QUIC, network routing issue
//! - **Timeout**: Configurable via `--remote-copy-conn-timeout-sec` (default: 15s)
//! - **Solution**: Check firewall rules for QUIC ports
//!
//! ## Destination Cannot Connect to Source
//! - **Cause**: Firewall blocks direct connection between hosts
//! - **Timeout**: Configurable (default: 15s)
//! - **Solution**: Use `--quic-port-ranges` to specify allowed ports, configure firewall
//!
//! For detailed troubleshooting, see the repository's `docs/network_connectivity.md`.
//!
//! # Examples
//!
//! ## Starting a Remote Copy Daemon
//!
//! ```rust,no_run
//! use remote::{SshSession, protocol::RcpdConfig, start_rcpd};
//! use std::net::SocketAddr;
//!
//! # async fn example() -> anyhow::Result<()> {
//! let session = SshSession {
//!     user: Some("user".to_string()),
//!     host: "example.com".to_string(),
//!     port: None,
//! };
//!
//! let config = RcpdConfig::default();
//! let master_addr: SocketAddr = "192.168.1.100:5000".parse()?;
//! let server_name = "master-server";
//!
//! let process = start_rcpd(&config, &session, &master_addr, server_name).await?;
//! # Ok(())
//! # }
//! ```
//!
//! ## Creating a QUIC Server with Port Ranges
//!
//! ```rust,no_run
//! use remote::{get_server_with_port_ranges, get_endpoint_addr};
//!
//! # fn example() -> anyhow::Result<()> {
//! // Create server restricted to ports 8000-8999; the second tuple element is the
//! // SHA-256 fingerprint of the server certificate
//! let (endpoint, _cert_fingerprint) = get_server_with_port_ranges(Some("8000-8999"))?;
//! let addr = get_endpoint_addr(&endpoint)?;
//! println!("Server listening on: {}", addr);
//! # Ok(())
//! # }
//! ```
//!
//! # Module Organization
//!
//! - [`port_ranges`] - Port range parsing and UDP socket binding
//! - [`protocol`] - Protocol message definitions and serialization
//! - [`streams`] - QUIC stream wrappers with typed message passing
//! - [`tracelog`] - Remote tracing and progress aggregation

use anyhow::{anyhow, Context};
use rand::Rng;
use tracing::instrument;

pub mod port_ranges;
pub mod protocol;
pub mod streams;
pub mod tracelog;

/// Parameters of an SSH connection to a remote host.
#[derive(Debug, PartialEq)]
pub struct SshSession {
    /// Remote user; omitted from the SSH destination when `None`
    pub user: Option<String>,
    /// Remote host name or address
    pub host: String,
    /// Remote SSH port; omitted from the SSH destination when `None`
    pub port: Option<u16>,
}

impl SshSession {
    /// Session parameters for `localhost` with no explicit user or port.
    pub fn local() -> Self {
        Self {
            user: None,
            host: "localhost".to_string(),
            port: None,
        }
    }
}

async fn setup_ssh_session(
    session: &SshSession,
) -> anyhow::Result<std::sync::Arc<openssh::Session>> {
    let host = session.host.as_str();
    let destination = match (session.user.as_deref(), session.port) {
        (Some(user), Some(port)) => format!("ssh://{user}@{host}:{port}"),
        (None, Some(port)) => format!("ssh://{host}:{port}"),
        (Some(user), None) => format!("ssh://{user}@{host}"),
        (None, None) => format!("ssh://{host}"),
    };
    tracing::debug!("Connecting to SSH destination: {}", destination);
    // NOTE: `KnownHosts::Accept` does not reject unknown or changed host keys, so host key
    // verification is not enforced here
    let session = std::sync::Arc::new(
        openssh::Session::connect(destination, openssh::KnownHosts::Accept)
            .await
            .context("Failed to establish SSH connection")?,
    );
    Ok(session)
}

/// Waits up to 10 seconds for the remote `rcpd` process to exit and captures its output.
///
/// Returns an error on timeout or if the process exits with a non-zero status.
#[instrument]
pub async fn wait_for_rcpd_process(
    process: openssh::Child<std::sync::Arc<openssh::Session>>,
) -> anyhow::Result<()> {
    tracing::info!("Waiting on rcpd server on: {:?}", process);
    // wait for process to exit with a timeout and capture output
    let output = tokio::time::timeout(
        std::time::Duration::from_secs(10),
        process.wait_with_output(),
    )
    .await
    .context("Timeout waiting for rcpd process to exit")?
    .context("Failed to wait for rcpd process")?;
    if !output.status.success() {
        let stdout = String::from_utf8_lossy(&output.stdout);
        let stderr = String::from_utf8_lossy(&output.stderr);
        tracing::error!(
            "rcpd command failed on remote host, status code: {:?}\nstdout:\n{}\nstderr:\n{}",
            output.status.code(),
            stdout,
            stderr
        );
        return Err(anyhow!(
            "rcpd command failed on remote host, status code: {:?}",
            output.status.code(),
        ));
    }
    // log stderr even on success if there's any output (might contain warnings)
    if !output.stderr.is_empty() {
        let stderr = String::from_utf8_lossy(&output.stderr);
        tracing::debug!("rcpd stderr output:\n{}", stderr);
    }
    Ok(())
}

/// Starts `rcpd` on the remote host described by `session` and returns the spawned process.
///
/// The remote command uses the directory of the local executable as the path to `rcpd`, so
/// the binary is expected at the same location on the remote host.
#[instrument]
pub async fn start_rcpd(
    rcpd_config: &protocol::RcpdConfig,
    session: &SshSession,
    master_addr: &std::net::SocketAddr,
    master_server_name: &str,
) -> anyhow::Result<openssh::Child<std::sync::Arc<openssh::Session>>> {
    tracing::info!("Starting rcpd server on: {:?}", session);
    let session = setup_ssh_session(session).await?;
    // Run rcpd command remotely
    let current_exe = std::env::current_exe().context("Failed to get current executable path")?;
    let bin_dir = current_exe
        .parent()
        .context("Failed to get parent directory of current executable")?;
    tracing::debug!("Running rcpd from: {:?}", bin_dir);
    let rcpd_args = rcpd_config.to_args();
    tracing::debug!("rcpd arguments: {:?}", rcpd_args);
    let mut cmd = session.arc_command(format!("{}/rcpd", bin_dir.display()));
    cmd.arg("--master-addr")
        .arg(master_addr.to_string())
        .arg("--server-name")
        .arg(master_server_name)
        .args(rcpd_args);
    // capture stdout and stderr so we can read them later
    cmd.stdout(openssh::Stdio::piped());
    cmd.stderr(openssh::Stdio::piped());
    tracing::info!("Will run remotely: {cmd:?}");
    cmd.spawn().await.context("Failed to spawn rcpd command")
}

/// Compute SHA-256 fingerprint of a DER-encoded certificate
fn compute_cert_fingerprint(cert_der: &[u8]) -> ring::digest::Digest {
    ring::digest::digest(&ring::digest::SHA256, cert_der)
}

/// Configure QUIC server with a self-signed certificate.
///
/// Returns the server config and the SHA-256 fingerprint of the certificate.
fn configure_server() -> anyhow::Result<(quinn::ServerConfig, Vec<u8>)> {
    tracing::info!("Configuring QUIC server");
    let cert = rcgen::generate_simple_self_signed(vec!["localhost".into()])?;
    let key_der = cert.serialize_private_key_der();
    let cert_der = cert.serialize_der()?;
    let fingerprint = compute_cert_fingerprint(&cert_der);
    let fingerprint_vec = fingerprint.as_ref().to_vec();
    tracing::debug!(
        "Generated certificate with fingerprint: {}",
        hex::encode(&fingerprint_vec)
    );
    let key = rustls::PrivateKey(key_der);
    let cert = rustls::Certificate(cert_der);
    let server_config = quinn::ServerConfig::with_single_cert(vec![cert], key)
        .context("Failed to create server config")?;
    Ok((server_config, fingerprint_vec))
}

/// Create a QUIC server endpoint, optionally restricted to the given port ranges.
///
/// Returns the endpoint and the SHA-256 fingerprint of its self-signed certificate.
#[instrument]
pub fn get_server_with_port_ranges(
    port_ranges: Option<&str>,
) -> anyhow::Result<(quinn::Endpoint, Vec<u8>)> {
    let (server_config, cert_fingerprint) = configure_server()?;
    let socket = if let Some(ranges_str) = port_ranges {
        let ranges = port_ranges::PortRanges::parse(ranges_str)?;
        ranges.bind_udp_socket(std::net::IpAddr::V4(std::net::Ipv4Addr::UNSPECIFIED))?
    } else {
        // default behavior: bind to any available port
        std::net::UdpSocket::bind("0.0.0.0:0")?
    };
    let endpoint = quinn::Endpoint::new(
        quinn::EndpointConfig::default(),
        Some(server_config),
        socket,
        std::sync::Arc::new(quinn::TokioRuntime),
    )
    .context("Failed to create QUIC endpoint")?;
    Ok((endpoint, cert_fingerprint))
}

/// Certificate verifier that validates against a pinned certificate fingerprint.
///
/// This prevents MITM attacks by ensuring we're connecting to the expected server.
struct PinnedCertVerifier {
    expected_fingerprint: Vec<u8>,
}

impl PinnedCertVerifier {
    fn new(expected_fingerprint: Vec<u8>) -> Self {
        Self {
            expected_fingerprint,
        }
    }
}

impl rustls::client::ServerCertVerifier for PinnedCertVerifier {
    fn verify_server_cert(
        &self,
        end_entity: &rustls::Certificate,
        _intermediates: &[rustls::Certificate],
        _server_name: &rustls::ServerName,
        _scts: &mut dyn Iterator<Item = &[u8]>,
        _ocsp_response: &[u8],
        _now: std::time::SystemTime,
    ) -> Result<rustls::client::ServerCertVerified, rustls::Error> {
        let received_fingerprint = compute_cert_fingerprint(&end_entity.0);
        if received_fingerprint.as_ref() == self.expected_fingerprint.as_slice() {
            tracing::debug!(
                "Certificate fingerprint validated successfully: {}",
                hex::encode(&self.expected_fingerprint)
            );
            Ok(rustls::client::ServerCertVerified::assertion())
        } else {
            tracing::error!(
                "Certificate fingerprint mismatch! Expected: {}, Got: {}",
                hex::encode(&self.expected_fingerprint),
                hex::encode(received_fingerprint)
            );
            Err(rustls::Error::InvalidCertificate(
                rustls::CertificateError::Other(std::sync::Arc::new(std::io::Error::new(
                    std::io::ErrorKind::InvalidData,
                    format!(
                        "Certificate fingerprint mismatch (expected {}, got {})",
                        hex::encode(&self.expected_fingerprint),
                        hex::encode(received_fingerprint)
                    ),
                ))),
            ))
        }
    }
}

fn get_local_ip() -> anyhow::Result<std::net::IpAddr> {
    // connecting a UDP socket sends no packets; it only makes the OS select the local
    // address it would use to route to the target
    let socket = std::net::UdpSocket::bind("0.0.0.0:0")?;
    socket.connect("8.8.8.8:80")?;
    Ok(socket.local_addr()?.ip())
}

/// Get the address of an endpoint as the local routable IP plus the endpoint's bound port.
#[instrument]
pub fn get_endpoint_addr(endpoint: &quinn::Endpoint) -> anyhow::Result<std::net::SocketAddr> {
    // endpoint is bound to 0.0.0.0 so we need to get the local IP address
    let local_ip = get_local_ip().context("Failed to get local IP address")?;
    let endpoint_addr = endpoint.local_addr()?;
    Ok(std::net::SocketAddr::new(local_ip, endpoint_addr.port()))
}

/// Generate a random 20-character alphanumeric server name.
#[instrument]
pub fn get_random_server_name() -> String {
    rand::thread_rng()
        .sample_iter(&rand::distributions::Alphanumeric)
        .take(20)
        .map(char::from)
        .collect()
}

#[instrument]
pub fn get_client_with_cert_pinning(cert_fingerprint: Vec<u8>) -> anyhow::Result<quinn::Endpoint> {
    get_client_with_port_ranges_and_pinning(None, cert_fingerprint)
}

#[instrument]
pub fn get_client_with_port_ranges_and_pinning(
    port_ranges: Option<&str>,
    cert_fingerprint: Vec<u8>,
) -> anyhow::Result<quinn::Endpoint> {
    tracing::info!(
        "Creating QUIC client with certificate pinning (fingerprint: {})",
        hex::encode(&cert_fingerprint)
    );
    // create a crypto backend with certificate pinning
    let crypto = rustls::ClientConfig::builder()
        .with_safe_defaults()
        .with_custom_certificate_verifier(std::sync::Arc::new(PinnedCertVerifier::new(
            cert_fingerprint,
        )))
        .with_no_client_auth();
    create_client_endpoint(port_ranges, crypto)
}

// helper function to create client endpoint with given crypto config
fn create_client_endpoint(
    port_ranges: Option<&str>,
    crypto: rustls::ClientConfig,
) -> anyhow::Result<quinn::Endpoint> {
    // create QUIC client config
    let client_config = quinn::ClientConfig::new(std::sync::Arc::new(crypto));
    let socket = if let Some(ranges_str) = port_ranges {
        let ranges = port_ranges::PortRanges::parse(ranges_str)?;
        ranges.bind_udp_socket(std::net::IpAddr::V4(std::net::Ipv4Addr::UNSPECIFIED))?
    } else {
        // default behavior: bind to any available port
        std::net::UdpSocket::bind("0.0.0.0:0")?
    };
    // create and configure endpoint
    let mut endpoint = quinn::Endpoint::new(
        quinn::EndpointConfig::default(),
        None, // No server config for client
        socket,
        std::sync::Arc::new(quinn::TokioRuntime),
    )
    .context("Failed to create QUIC endpoint")?;
    endpoint.set_default_client_config(client_config);
    Ok(endpoint)
}