
hexz_server/lib.rs

//! HTTP, NBD, and S3 gateway server implementations for exposing Hexz snapshots.
//!
//! This module provides network-facing interfaces for accessing compressed Hexz
//! snapshot data over standard protocols. It supports three distinct serving modes:
//!
//! 1. **HTTP Range Server** (`serve_http`): Exposes the disk and memory streams via
//!    HTTP 1.1 range requests with DoS protection and partial content support.
//! 2. **NBD (Network Block Device) Server** (`serve_nbd`): Allows mounting snapshots
//!    as Linux block devices using the standard NBD protocol.
//! 3. **S3 Gateway** (`serve_s3_gateway`): Planned S3-compatible API for cloud
//!    integration (currently unimplemented).
//!
//! # Architecture Overview
//!
//! All servers expose the same underlying `File` API, which provides:
//! - Block-level decompression with LRU caching
//! - Dual-stream access (disk and memory snapshots)
//! - Random access with minimal I/O overhead
//! - Thread-safe concurrent reads via `Arc<File>`
//!
//! The servers differ in protocol semantics and use cases:
//!
//! | Protocol | Use Case | Access Pattern | Authentication |
//! |----------|----------|----------------|----------------|
//! | HTTP     | Browser/API access | Range requests | None (planned) |
//! | NBD      | Linux block device mount | Block-level reads | None |
//! | S3       | Cloud integration | Object API | AWS SigV4 (planned) |
//!
//! # Design Decisions
//!
//! ## Why HTTP Range Requests?
//!
//! HTTP range requests (RFC 7233) provide a standardized way to access large files
//! in chunks without loading the entire file into memory. This aligns perfectly with
//! Hexz's block-indexed architecture, allowing clients to fetch only the data they
//! need. The implementation:
//!
//! - Returns HTTP 206 (Partial Content) for range requests
//! - Returns HTTP 416 (Range Not Satisfiable) for invalid ranges
//! - Clamps requests to `MAX_CHUNK_SIZE` (32 MiB) to prevent memory exhaustion
//! - Supports both bounded (`bytes=0-1023`) and unbounded (`bytes=1024-`) ranges
//!
//! ## Why NBD Protocol?
//!
//! The Network Block Device protocol allows mounting remote storage as a local block
//! device on Linux systems. This enables:
//! - Transparent filesystem access (mount snapshot, browse files)
//! - Use of standard Linux tools (`dd`, `fsck`, `mount`)
//! - Zero application changes (existing software works unmodified)
//!
//! Trade-offs:
//! - **Pro**: Native OS integration, no special client software required
//! - **Pro**: Kernel handles caching and buffering
//! - **Con**: No built-in encryption or authentication
//! - **Con**: TCP-based, higher latency than local disk
//!
//! ## Security Architecture
//!
//! ### Current Security Posture (localhost-only)
//!
//! All servers bind to `127.0.0.1` (loopback) by default, preventing network exposure.
//! This is appropriate for:
//! - Local development and testing
//! - Forensics workstations accessing local snapshots
//! - Scenarios where network access is provided via SSH tunnels or VPNs
//!
//! ### Attack Surface
//!
//! The current implementation has a minimal attack surface:
//! 1. **DoS via large reads**: Mitigated by `MAX_CHUNK_SIZE` clamping (32 MiB)
//! 2. **Range header parsing**: Simplified parser with strict validation
//! 3. **Connection exhaustion**: Limited by OS socket limits, no artificial cap
//! 4. **Path traversal**: N/A (no filesystem access, only fixed `/disk` and `/memory` routes)
//!
//! ### Future Security Enhancements (Planned)
//!
//! - TLS/HTTPS support for encrypted transport
//! - Token-based authentication (Bearer tokens)
//! - Rate limiting per IP address
//! - Configurable bind addresses (`0.0.0.0` for network access)
//! - Request logging and audit trails
//!
//! # Performance Characteristics
//!
//! ## HTTP Server
//!
//! - **Throughput**: ~500-2000 MB/s (limited by decompression, not network)
//! - **Latency**: ~1-5 ms per request (includes decompression)
//! - **Concurrency**: Handles 1000+ concurrent connections (Tokio async runtime)
//! - **Memory**: ~100 KB per connection + block cache overhead
//!
//! ## NBD Server
//!
//! - **Throughput**: ~500-1000 MB/s (similar to HTTP, plus NBD protocol overhead)
//! - **Latency**: ~2-10 ms per block read (includes TCP RTT + decompression)
//! - **Concurrency**: One Tokio task per client connection
//!
//! ## Bottlenecks
//!
//! For local (localhost) connections, the primary bottlenecks are:
//! 1. **Decompression CPU time** (80% of latency for LZ4, more for ZSTD)
//! 2. **Block cache misses** (requires backend I/O)
//! 3. **Memory allocation** for large reads (mitigated by clamping)
//!
//! Network bandwidth is rarely a bottleneck for localhost connections.
//!
//! # Examples
//!
//! ## Starting an HTTP Server
//!
//! ```no_run
//! use std::sync::Arc;
//! use hexz_core::File;
//! use hexz_store::local::FileBackend;
//! use hexz_core::algo::compression::lz4::Lz4Compressor;
//! use hexz_server::serve_http;
//!
//! # #[tokio::main]
//! # async fn main() -> anyhow::Result<()> {
//! let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
//! let compressor = Box::new(Lz4Compressor::new());
//! let snap = File::new(backend, compressor, None)?;
//!
//! // Start HTTP server on port 8080
//! serve_http(snap, 8080, "127.0.0.1").await?;
//! # Ok(())
//! # }
//! ```
//!
//! ## Starting an NBD Server
//!
//! ```no_run
//! use std::sync::Arc;
//! use hexz_core::File;
//! use hexz_store::local::FileBackend;
//! use hexz_core::algo::compression::lz4::Lz4Compressor;
//! use hexz_server::serve_nbd;
//!
//! # #[tokio::main]
//! # async fn main() -> anyhow::Result<()> {
//! let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
//! let compressor = Box::new(Lz4Compressor::new());
//! let snap = File::new(backend, compressor, None)?;
//!
//! // Start NBD server on port 10809
//! serve_nbd(snap, 10809, "127.0.0.1").await?;
//! # Ok(())
//! # }
//! ```
//!
//! ## Client Usage Examples
//!
//! ### HTTP Client (curl)
//!
//! ```bash
//! # Fetch the first 4KB of the primary stream
//! curl -H "Range: bytes=0-4095" http://localhost:8080/disk -o chunk.bin
//!
//! # Fetch 1MB starting at offset 1MB
//! curl -H "Range: bytes=1048576-2097151" http://localhost:8080/memory -o mem_chunk.bin
//!
//! # Fetch from offset to EOF (server will clamp to MAX_CHUNK_SIZE)
//! curl -H "Range: bytes=1048576-" http://localhost:8080/disk
//! ```
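Reading past the 32 MiB clamp requires follow-up range requests. The offset arithmetic a client would use can be sketched as a standalone function (`chunk_ranges` is a hypothetical helper, not part of this crate):

```rust
/// Split a read of `total` bytes into inclusive (start, end) byte ranges of at
/// most `chunk` bytes each, mirroring how a client pages through a clamped stream.
/// Hypothetical helper for illustration only.
fn chunk_ranges(total: u64, chunk: u64) -> Vec<(u64, u64)> {
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < total {
        // Inclusive end offset, clamped to the last byte of the stream.
        let end = (start + chunk).min(total) - 1;
        ranges.push((start, end));
        start = end + 1;
    }
    ranges
}
```

Each `(start, end)` pair maps directly onto a `Range: bytes=<start>-<end>` header.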
//!
//! ### NBD Client (Linux)
//!
//! ```bash
//! # Connect NBD client to server
//! sudo nbd-client localhost 10809 /dev/nbd0
//!
//! # Mount the block device (read-only)
//! sudo mount -o ro /dev/nbd0 /mnt/snapshot
//!
//! # Access files normally
//! ls -la /mnt/snapshot
//! cat /mnt/snapshot/important.log
//!
//! # Disconnect when done
//! sudo umount /mnt/snapshot
//! sudo nbd-client -d /dev/nbd0
//! ```
//!
//! # Protocol References
//!
//! - **HTTP Range Requests**: [RFC 7233](https://tools.ietf.org/html/rfc7233)
//! - **NBD Protocol**: [NBD Protocol Specification](https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md)
//! - **S3 API**: [AWS S3 API Reference](https://docs.aws.amazon.com/s3/index.html) (future work)

pub mod nbd;

use axum::{
    Router,
    extract::State,
    http::{HeaderMap, StatusCode, header},
    response::{IntoResponse, Response},
    routing::get,
};
use hexz_core::{File, SnapshotStream};
use std::net::SocketAddr;
use std::sync::Arc;
use tokio::net::TcpListener;

/// IPv4 address for all server listeners (localhost only).
///
/// # Security Rationale
///
/// This constant defaults to the loopback address (`127.0.0.1`) to prevent
/// accidental exposure of snapshot data to the local network or internet.
/// Snapshots may contain sensitive information (credentials, personal data,
/// proprietary code), so network exposure must be an explicit, informed decision.
///
/// ## Current Behavior
///
/// All servers (HTTP, NBD, S3) bind to `127.0.0.1`, making them accessible only
/// from the local machine. Remote access requires:
/// - SSH port forwarding: `ssh -L 8080:localhost:8080 user@server`
/// - VPN tunnel with local forwarding
/// - Reverse proxy with authentication (e.g., nginx with TLS + basic auth)
///
/// ## Future Enhancement
///
/// To enable network access, a future version will support configurable bind
/// addresses via command-line flags or configuration files:
///
/// ```bash
/// # Proposed CLI syntax (not yet implemented)
/// hexz-server --bind 0.0.0.0:8080 --auth-token mytoken123 snapshot.hxz
/// ```
///
/// Network exposure will require authentication to be enabled (enforced by the CLI).
pub const DEFAULT_BIND_ADDR: &str = "127.0.0.1";

/// Length in bytes of the HTTP `Range` header prefix `"bytes="`.
///
/// The HTTP Range header format is defined in RFC 7233 as:
///
/// ```text
/// Range: bytes=<start>-<end>
/// ```
///
/// This constant represents the length of the literal string `"bytes="` (6 bytes),
/// which is stripped during parsing. The parser supports:
///
/// - Bounded ranges: `bytes=0-1023` (fetch bytes 0 through 1023 inclusive)
/// - Unbounded ranges: `bytes=1024-` (fetch from byte 1024 to EOF)
/// - Single-byte ranges: `bytes=0-0` (fetch only byte 0)
///
/// Unsupported range types (will return HTTP 416):
/// - Suffix ranges: `bytes=-500` (last 500 bytes)
/// - Multi-part ranges: `bytes=0-100,200-300`
///
/// # Rationale for Limited Support
///
/// Suffix ranges and multi-part ranges are rarely used in practice and add
/// significant parsing complexity. If needed for browser compatibility, they
/// can be added in a future version without breaking existing clients.
const RANGE_PREFIX_LEN: usize = 6;
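The parsing rules above can be sketched as a small standalone function (a hypothetical illustration; the crate's actual parser may differ in detail):

```rust
/// Parse a Range header value into (start, optional inclusive end).
/// Returns None for the unsupported forms listed above (suffix and
/// multi-part ranges), which the server answers with HTTP 416.
fn parse_range(value: &str) -> Option<(u64, Option<u64>)> {
    // Strip the 6-byte "bytes=" prefix (RANGE_PREFIX_LEN).
    let spec = value.strip_prefix("bytes=")?;
    let (start, end) = spec.split_once('-')?;
    // An empty start means a suffix range like "bytes=-500": rejected here.
    let start: u64 = start.parse().ok()?;
    let end = if end.is_empty() {
        None // unbounded: "bytes=1024-" reads to EOF (clamped later)
    } else {
        // A trailing ",..." as in multi-part ranges fails to parse: rejected.
        Some(end.parse().ok()?)
    };
    Some((start, end))
}
```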

/// Maximum allowed read size per HTTP request to prevent DoS attacks.
///
/// # Value
///
/// 32 MiB (33,554,432 bytes)
///
/// # DoS Protection Rationale
///
/// Without a limit, a malicious client could request the entire snapshot in a single
/// HTTP request (e.g., `Range: bytes=0-`), forcing the server to:
///
/// 1. Decompress gigabytes of data
/// 2. Allocate gigabytes of heap memory
/// 3. Hold that memory while slowly transmitting over the network
///
/// With multiple concurrent requests, this could exhaust server memory and CPU,
/// causing crashes or unresponsiveness (denial of service).
///
/// # Why 32 MiB?
///
/// This value balances throughput efficiency and resource protection:
///
/// - **Large enough**: Clients can fetch substantial chunks with low overhead
///   (at 1 Gbps, 32 MiB transfers in ~256 ms)
/// - **Small enough**: Even 100 concurrent maximal requests consume <3.2 GB RAM,
///   which is manageable on modern servers
/// - **Common practice**: Many HTTP servers use similar limits (nginx default: 16 MiB,
///   AWS S3 max single GET: 5 GB but recommends <100 MB for performance)
///
/// # Clamping Behavior
///
/// When a client requests more than `MAX_CHUNK_SIZE` bytes:
///
/// 1. The server clamps the end offset: `end = min(end, start + MAX_CHUNK_SIZE - 1)`
/// 2. Returns HTTP 206 with the clamped range in the `Content-Range` header
/// 3. The client sees a short read and can issue follow-up requests
///
/// Example:
///
/// ```text
/// Client request:  Range: bytes=0-67108863   (64 MiB)
/// Server response: Content-Range: bytes 0-33554431/total  (32 MiB)
/// ```
///
/// The client must check the `Content-Range` header to detect clamping.
///
/// # Future Work
///
/// This limit could be made configurable via CLI flags for scenarios where higher
/// memory usage is acceptable (e.g., dedicated forensics servers with 128+ GB RAM).
const MAX_CHUNK_SIZE: u64 = 32 * 1024 * 1024;
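The clamping rule `end = min(end, start + MAX_CHUNK_SIZE - 1)` is simple enough to state directly (a sketch; `clamp_end` is a hypothetical helper, not part of the crate's API):

```rust
const MAX_CHUNK: u64 = 32 * 1024 * 1024; // mirrors MAX_CHUNK_SIZE

/// Clamp a requested inclusive end offset so at most MAX_CHUNK bytes are read
/// from `start`.
fn clamp_end(start: u64, requested_end: u64) -> u64 {
    requested_end.min(start + MAX_CHUNK - 1)
}
```

This reproduces the documented example: a `bytes=0-67108863` (64 MiB) request clamps to an end offset of 33554431 (32 MiB).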

/// Shared application state for the HTTP serving layer.
///
/// This struct is wrapped in `Arc` and cloned for each HTTP request handler.
/// The inner `snap` field is also `Arc`-wrapped, so cloning `AppState` is cheap
/// (just incrementing reference counts, no data copying).
///
/// # Thread Safety
///
/// `AppState` is `Send + Sync` because `File` is `Send + Sync`. The underlying
/// block cache uses `Mutex` for interior mutability, so multiple concurrent requests
/// can safely read from the same snapshot.
///
/// # Memory Overhead
///
/// Each `AppState` clone adds ~16 bytes (one `Arc` pointer). With 1000 concurrent
/// connections, this overhead is negligible (~16 KB).
struct AppState {
    /// The opened Hexz snapshot file being served via HTTP.
    ///
    /// This is the same `File` instance for all requests. It contains:
    /// - The storage backend (local file, S3, etc.)
    /// - Block cache (shared across all requests)
    /// - Decompressor instances (thread-local via pooling)
    snap: Arc<File>,
}
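The cheap-clone claim above reduces to `Arc` reference counting; a generic illustration that does not use this crate's types (`SharedState` and `demo` are hypothetical names):

```rust
use std::sync::Arc;

/// Generic stand-in for AppState: the payload is shared, not copied.
#[derive(Clone)]
struct SharedState {
    data: Arc<Vec<u8>>,
}

fn demo() -> (usize, usize) {
    let state = SharedState { data: Arc::new(vec![0u8; 1 << 20]) };
    // A per-request "clone" copies one pointer and bumps the refcount;
    // the 1 MiB payload is never duplicated.
    let per_request = state.clone();
    (Arc::strong_count(&state.data), per_request.data.len())
}
```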

/// Exposes a `File` over NBD (Network Block Device) protocol.
///
/// Starts a TCP listener on `<bind>:<port>` (typically `127.0.0.1:10809`) that
/// implements the NBD protocol, allowing Linux clients to mount the Hexz snapshot
/// as a local block device using standard tools like `nbd-client`.
///
/// This function runs indefinitely, accepting connections in a loop. Each client
/// connection is handled in a separate Tokio task, allowing concurrent clients.
///
/// # Arguments
///
/// - `snap`: The Hexz snapshot file to expose. Must be wrapped in `Arc` for sharing
///   across multiple client connections.
/// - `port`: TCP port to bind to (e.g., `10809`).
/// - `bind`: IP address for the listener (e.g., `"127.0.0.1"`).
///
/// # Returns
///
/// This function never returns under normal operation (it runs forever). It only
/// returns `Err` if:
/// - The TCP listener fails to bind (port already in use, permission denied)
/// - An unrecoverable I/O error occurs on the listener socket
///
/// Individual client errors (malformed requests, disconnects) are logged but do not
/// stop the server.
///
/// # Errors
///
/// - `std::io::Error`: If binding to the socket fails or the listener encounters
///   a fatal error.
///
/// # Examples
///
/// ```no_run
/// use std::sync::Arc;
/// use hexz_core::File;
/// use hexz_store::local::FileBackend;
/// use hexz_core::algo::compression::lz4::Lz4Compressor;
/// use hexz_server::serve_nbd;
///
/// # #[tokio::main]
/// # async fn main() -> anyhow::Result<()> {
/// let backend = Arc::new(FileBackend::new("vm_snapshot.hxz".as_ref())?);
/// let compressor = Box::new(Lz4Compressor::new());
/// let snap = File::new(backend, compressor, None)?;
///
/// // Start NBD server (runs forever)
/// serve_nbd(snap, 10809, "127.0.0.1").await?;
/// # Ok(())
/// # }
/// ```
///
/// ## Client-Side Usage (Linux)
///
/// ```bash
/// # Connect to the NBD server
/// sudo nbd-client localhost 10809 /dev/nbd0
///
/// # Mount the block device (read-only, automatically detected filesystem)
/// sudo mount -o ro /dev/nbd0 /mnt/snapshot
///
/// # Browse files normally
/// ls -la /mnt/snapshot
/// sudo cat /mnt/snapshot/var/log/syslog
///
/// # Unmount and disconnect
/// sudo umount /mnt/snapshot
/// sudo nbd-client -d /dev/nbd0
/// ```
///
/// # Security Considerations
///
/// ## No Encryption
///
/// The NBD protocol transmits data in plaintext. For localhost connections this
/// is acceptable, but for remote access consider:
///
/// - **SSH tunnel**: `ssh -L 10809:localhost:10809 user@server`
/// - **VPN**: WireGuard, OpenVPN, etc.
/// - **TLS wrapper**: `stunnel` or similar
///
/// ## No Authentication
///
/// Any process with network access to the port can connect. The default loopback
/// binding mitigates this, but if exposing to the network, use firewall rules or
/// SSH key authentication.
///
/// ## Read-Only Enforcement
///
/// The NBD server always exports snapshots as read-only (NBD flag `NBD_FLAG_READ_ONLY`).
/// Write attempts return `EPERM` (operation not permitted). However, a malicious
/// NBD client could theoretically attempt to crash the server via protocol abuse.
///
/// # Performance Notes
///
/// - **Concurrency**: Each client spawns a separate Tokio task. With 100 concurrent
///   clients, memory overhead is ~10 MB (100 KB per task).
/// - **Throughput**: Typically 500-1000 MB/s for sequential reads, limited by
///   decompression rather than NBD protocol overhead.
/// - **Latency**: ~2-10 ms per read, including TCP round-trip and decompression.
///
/// # Panics
///
/// This function does not panic under normal operation. Client errors are logged
/// and handled gracefully.
pub async fn serve_nbd(snap: Arc<File>, port: u16, bind: &str) -> anyhow::Result<()> {
    let addr: SocketAddr = format!("{}:{}", bind, port).parse()?;
    let listener = TcpListener::bind(addr).await?;

    tracing::info!("NBD server listening on {}", addr);
    println!(
        "NBD server started on {}. Use 'nbd-client localhost {} /dev/nbd0' to mount.",
        addr, port
    );

    loop {
        // Accept incoming NBD connections
        let (socket, remote_addr) = match listener.accept().await {
            Ok(conn) => conn,
            Err(e) => {
                tracing::warn!("NBD accept error (continuing): {}", e);
                continue;
            }
        };
        tracing::debug!("Accepted NBD connection from {}", remote_addr);

        let snap_clone = snap.clone();
        tokio::spawn(async move {
            if let Err(e) = nbd::handle_client(socket, snap_clone).await {
                tracing::error!("NBD client error: {}", e);
            }
        });
    }
}

/// Exposes a `File` as an S3-compatible object storage gateway.
///
/// # Implementation Status: NOT IMPLEMENTED
///
/// This function is a **placeholder** for future S3 API compatibility. It currently
/// blocks forever without serving any requests. Calling this function will NOT panic,
/// but it provides no useful functionality.
///
/// # Planned Functionality
///
/// When implemented, this gateway will provide S3-compatible HTTP endpoints for:
///
/// ## Supported Operations (Planned)
///
/// - `GET /<bucket>/<key>`: Retrieve snapshot data as an S3 object
/// - `HEAD /<bucket>/<key>`: Get object metadata (size, ETag)
/// - `GET /<bucket>/<key>?range=bytes=<start>-<end>`: Partial object retrieval
/// - `GET /<bucket>?list-type=2`: List objects (future: multi-snapshot support)
///
/// ## S3 API Compatibility Goals
///
/// - **Authentication**: AWS Signature Version 4 (SigV4) for production use
/// - **Authorization**: IAM-style policies (read-only by default)
/// - **Error responses**: Standard S3 XML error responses
/// - **Metadata**: ETag (CRC32 of snapshot header), Content-Type, Last-Modified
///
/// ## Mapping Hexz Concepts to S3
///
/// | Hexz Concept | S3 Equivalent | Mapping Strategy |
/// |--------------|---------------|------------------|
/// | Snapshot file | Bucket | One bucket per snapshot |
/// | Primary stream | Object `disk.img` | Virtual object, synthesized from snapshot |
/// | Secondary stream | Object `memory.img` | Virtual object, synthesized from snapshot |
/// | Block index | N/A | Transparent to S3 clients |
///
/// ## Example S3 API Usage (Planned)
///
/// ```bash
/// # Configure AWS CLI to point to local S3 gateway
/// export AWS_ACCESS_KEY_ID=minioadmin
/// export AWS_SECRET_ACCESS_KEY=minioadmin
/// export AWS_ENDPOINT_URL=http://localhost:9000
///
/// # List buckets (snapshots)
/// aws s3 ls
///
/// # List objects in a snapshot
/// aws s3 ls s3://my-snapshot/
///
/// # Download the primary stream
/// aws s3 cp s3://my-snapshot/disk.img disk_copy.img
///
/// # Download a range (100 MiB starting at offset 1 GiB)
/// aws s3api get-object --bucket my-snapshot --key disk.img \
///   --range bytes=1073741824-1178599423 chunk.bin
/// ```
///
/// # Configuration (Planned)
///
/// Future configuration options (not yet implemented):
///
/// - **Bind address**: CLI flag `--s3-bind 0.0.0.0:9000` (default: `127.0.0.1`)
/// - **Authentication**: `--s3-access-key` and `--s3-secret-key` for SigV4
/// - **Bucket name**: `--s3-bucket-name <name>` (default: derived from snapshot filename)
/// - **Anonymous access**: `--s3-allow-anonymous` flag (dangerous, for testing only)
///
/// # Why S3 Compatibility?
///
/// S3 is a de facto standard for object storage. Supporting the S3 API enables:
///
/// 1. **Cloud integration**: Use Hexz with existing cloud infrastructure (AWS, MinIO, etc.)
/// 2. **Tool compatibility**: Any S3-compatible tool (s3cmd, rclone, boto3) works with Hexz
/// 3. **Caching CDNs**: Front the gateway with CloudFront or similar for caching
/// 4. **Lifecycle policies**: Future support for automated snapshot expiration
///
/// # Security Considerations (Planned)
///
/// When implemented, the S3 gateway will require authentication by default:
///
/// - **SigV4 authentication**: All requests must include valid AWS Signature V4 headers
/// - **Read-only mode**: No PUT/DELETE operations to prevent accidental modification
/// - **Rate limiting**: Per-access-key request throttling to prevent abuse
/// - **TLS requirement**: Production deployments must use HTTPS (enforced by CLI flag check)
///
/// # Performance Goals (Planned)
///
/// - **Throughput**: Match HTTP server performance (~500-2000 MB/s)
/// - **Latency**: <10 ms for authenticated requests (signature verification adds ~1-2 ms)
/// - **Concurrency**: Handle 1000+ concurrent S3 GET requests
///
/// # Limitations (Planned)
///
/// The S3 gateway will NOT support:
///
/// - **Write operations**: No PUT, POST, DELETE (snapshots are read-only)
/// - **Multipart uploads**: N/A for read-only gateway
/// - **Bucket policies**: Simplified IAM-like policies only
/// - **Versioning**: Snapshots are immutable, no object versioning needed
/// - **Server-side encryption**: Use TLS for transport encryption instead
///
/// # Arguments
///
/// - `_snap`: The Hexz snapshot to expose (currently unused).
/// - `port`: TCP port to bind to on the loopback interface (e.g., `9000`).
///
/// # Returns
///
/// This function never returns (blocks indefinitely on `std::future::pending()`).
/// It does not perform any useful work in the current implementation.
///
/// # Errors
///
/// Currently, this function cannot return an error (it blocks forever). In the
/// future implementation, it will return errors for:
///
/// - Socket binding failures
/// - Configuration validation errors
/// - Unrecoverable I/O errors on the listener
///
/// # Examples
///
/// ```no_run
/// use std::sync::Arc;
/// use hexz_core::File;
/// use hexz_store::local::FileBackend;
/// use hexz_core::algo::compression::lz4::Lz4Compressor;
/// use hexz_server::serve_s3_gateway;
///
/// # #[tokio::main]
/// # async fn main() -> anyhow::Result<()> {
/// let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
/// let compressor = Box::new(Lz4Compressor::new());
/// let snap = File::new(backend, compressor, None)?;
///
/// // WARNING: This will block forever without serving requests
/// serve_s3_gateway(snap, 9000).await?;
/// # Ok(())
/// # }
/// ```
///
/// # Implementation Roadmap
///
/// 1. **Phase 1**: Basic GET/HEAD operations with no authentication (localhost-only)
/// 2. **Phase 2**: AWS SigV4 authentication and bucket listing
/// 3. **Phase 3**: Multi-snapshot support (multiple buckets)
/// 4. **Phase 4**: TLS support and network binding options
/// 5. **Phase 5**: IAM-style policies and access control
///
/// # Call for Contributions
///
/// Implementing S3 compatibility is a substantial undertaking. If you are interested
/// in contributing, see `docs/s3_gateway_design.md` (to be created) for the design
/// specification and implementation plan.
#[deprecated(note = "Not implemented. Blocks indefinitely without serving requests.")]
pub async fn serve_s3_gateway(_snap: Arc<File>, port: u16) -> anyhow::Result<()> {
    tracing::info!("Starting S3 Gateway on port {}", port);
    println!(
        "S3 Gateway started on port {} (Not fully implemented)",
        port
    );
    std::future::pending::<()>().await; // Keep alive
    unreachable!();
}

/// Exposes a `File` over HTTP with range request support.
///
/// Starts an HTTP 1.1 server on `<bind>:<port>` (typically `127.0.0.1`) that
/// exposes snapshot data via two endpoints:
///
/// - `GET /disk`: Serves the primary stream (persistent storage snapshot)
/// - `GET /memory`: Serves the secondary stream (RAM snapshot)
///
/// Both endpoints support HTTP range requests (RFC 7233) for partial content retrieval.
///
/// # Protocol Behavior
///
/// ## Full Content Request (No Range Header)
///
/// ```http
/// GET /disk HTTP/1.1
/// Host: localhost:8080
/// ```
///
/// Response:
///
/// ```http
/// HTTP/1.1 206 Partial Content
/// Content-Type: application/octet-stream
/// Content-Range: bytes 0-33554431/10737418240
/// Accept-Ranges: bytes
///
/// [First 32 MiB of data, clamped by MAX_CHUNK_SIZE]
/// ```
///
/// Note: Even without a `Range` header, the response is clamped to `MAX_CHUNK_SIZE`
/// and returns HTTP 206 (not 200) to indicate partial content.
///
/// ## Range Request (Partial Content)
///
/// ```http
/// GET /memory HTTP/1.1
/// Host: localhost:8080
/// Range: bytes=1048576-2097151
/// ```
///
/// Response (success):
///
/// ```http
/// HTTP/1.1 206 Partial Content
/// Content-Type: application/octet-stream
/// Content-Range: bytes 1048576-2097151/8589934592
/// Accept-Ranges: bytes
///
/// [1 MiB of data from offset 1048576]
/// ```
///
/// Response (invalid range):
///
/// ```http
/// HTTP/1.1 416 Range Not Satisfiable
/// Content-Range: bytes */8589934592
/// ```
///
/// ## Error Responses
///
/// - **416 Range Not Satisfiable**: Invalid range syntax or out-of-bounds request
/// - **500 Internal Server Error**: Backend I/O failure or decompression error
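The 206-versus-416 decision above can be sketched as a pure function over a parsed range and the stream length (a hypothetical helper; the real handler also builds headers and the response body):

```rust
/// Return the HTTP status for a parsed range against a `total`-byte stream:
/// 206 for satisfiable ranges, 416 for out-of-bounds or inverted ones.
fn range_status(start: u64, end: Option<u64>, total: u64) -> u16 {
    if start >= total {
        return 416; // starts at or past EOF
    }
    if let Some(end) = end {
        if end < start {
            return 416; // inverted range like "bytes=100-50"
        }
    }
    206
}
```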
///
/// # HTTP Range Request Limitations
///
/// ## Supported Range Types
///
/// - **Bounded ranges**: `bytes=<start>-<end>` (both offsets specified)
/// - **Unbounded ranges**: `bytes=<start>-` (from start to EOF, clamped to `MAX_CHUNK_SIZE`)
///
/// ## Unsupported Range Types
///
/// These return HTTP 416 (Range Not Satisfiable):
///
/// - **Suffix ranges**: `bytes=-<suffix-length>` (e.g., `bytes=-1024` for last 1KB)
/// - **Multi-part ranges**: `bytes=0-100,200-300` (multiple ranges in one request)
///
/// Rationale: These are rarely used and add significant implementation complexity.
/// Standard range requests cover 99% of real-world use cases.
///
/// # DoS Protection Mechanisms
///
/// ## Request Size Clamping
///
/// All reads are clamped to `MAX_CHUNK_SIZE` (32 MiB) to prevent memory exhaustion:
///
/// ```text
/// Client requests:  bytes=0-1073741823   (1 GiB)
/// Server clamps to: bytes=0-33554431     (32 MiB)
/// Response header:  Content-Range: bytes 0-33554431/total
/// ```
///
/// The client detects clamping by comparing the `Content-Range` header to the
/// requested range and can issue follow-up requests for remaining data.
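That client-side check amounts to parsing `Content-Range` and comparing the served end offset with the requested one (a sketch; `parse_content_range` and `was_clamped` are hypothetical helpers):

```rust
/// Parse a Content-Range value like "bytes 0-33554431/10737418240"
/// into (start, end, total).
fn parse_content_range(value: &str) -> Option<(u64, u64, u64)> {
    let rest = value.strip_prefix("bytes ")?;
    let (range, total) = rest.split_once('/')?;
    let (start, end) = range.split_once('-')?;
    Some((start.parse().ok()?, end.parse().ok()?, total.parse().ok()?))
}

/// True if the server returned fewer bytes than the client asked for.
fn was_clamped(requested_end: u64, content_range: &str) -> Option<bool> {
    let (_, served_end, _) = parse_content_range(content_range)?;
    Some(served_end < requested_end)
}
```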
///
/// ## Connection Limits
///
/// The server relies on OS-level TCP connection limits (controlled by `ulimit -n`
/// and kernel parameters). Tokio's async runtime handles thousands of concurrent
/// connections efficiently (each connection consumes ~100 KB of memory).
///
/// For production deployments, consider:
///
/// - **Reverse proxy**: nginx or Caddy with connection limits and rate limiting
/// - **Firewall rules**: Limit connections per IP address
/// - **Resource limits**: Set `ulimit -n` to a reasonable value (e.g., 4096)
///
/// # Arguments
///
/// - `snap`: The Hexz snapshot file to expose. Must be wrapped in `Arc` for sharing
///   across request handlers.
/// - `port`: TCP port to bind to (e.g., `8080`, `3000`).
/// - `bind`: IP address for the listener (e.g., `"127.0.0.1"`).
747///
748/// # Returns
749///
750/// This function runs indefinitely, serving HTTP requests until the server is shut
751/// down (e.g., via Ctrl+C signal). It only returns `Err` if:
752///
753/// - The TCP listener fails to bind (port already in use, permission denied)
754/// - The HTTP server encounters a fatal error (should be extremely rare)
755///
756/// Individual request errors (invalid ranges, read failures) are handled gracefully
757/// and return appropriate HTTP error responses without stopping the server.
758///
759/// # Errors
760///
761/// - `std::io::Error`: If binding to the socket fails.
762/// - `anyhow::Error`: If the HTTP server encounters an unrecoverable error.
763///
764/// # Examples
765///
766/// ## Server Setup
767///
768/// ```no_run
769/// use std::sync::Arc;
770/// use hexz_core::File;
771/// use hexz_store::local::FileBackend;
772/// use hexz_core::algo::compression::lz4::Lz4Compressor;
773/// use hexz_server::serve_http;
774///
775/// # #[tokio::main]
776/// # async fn main() -> anyhow::Result<()> {
777/// let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
778/// let compressor = Box::new(Lz4Compressor::new());
779/// let snap = File::new(backend, compressor, None)?;
780///
781/// // Start HTTP server on port 8080 (runs forever)
782/// serve_http(snap, 8080, "127.0.0.1").await?;
783/// # Ok(())
784/// # }
785/// ```
786///
787/// ## Client Usage (curl)
788///
789/// ```bash
790/// # Fetch first 4KB of primary stream
791/// curl -H "Range: bytes=0-4095" http://localhost:8080/disk -o chunk.bin
792///
793/// # Fetch 1MB starting at 1MB offset
794/// curl -H "Range: bytes=1048576-2097151" http://localhost:8080/memory -o mem_chunk.bin
795///
796/// # Fetch from offset to EOF (clamped to 32 MiB)
797/// curl -H "Range: bytes=1048576-" http://localhost:8080/disk -o large_chunk.bin
798///
799/// # Full GET (no range header, returns first 32 MiB)
800/// curl http://localhost:8080/disk -o first_32mb.bin
801/// ```
802///
803/// ## Client Usage (Python)
804///
805/// ```python
806/// import requests
807///
808/// # Fetch a range
809/// headers = {'Range': 'bytes=0-4095'}
810/// response = requests.get('http://localhost:8080/disk', headers=headers)
811/// assert response.status_code == 206  # Partial Content
812/// data = response.content
813/// print(f"Fetched {len(data)} bytes")
814///
815/// # Parse Content-Range header
816/// content_range = response.headers['Content-Range']
817/// # Example: "bytes 0-4095/10737418240"
818/// print(f"Content-Range: {content_range}")
819/// ```
820///
821/// # Performance Characteristics
822///
823/// ## Throughput
824///
825/// - **Local (127.0.0.1)**: 500-2000 MB/s (limited by decompression, not HTTP overhead)
826/// - **1 Gbps network**: ~120 MB/s (network-bound)
827/// - **10 Gbps network**: ~800 MB/s (network-bound for LZ4; may be decompression-bound for ZSTD)
828///
829/// ## Latency
830///
831/// - **Cache hit**: ~80μs (block already decompressed)
832/// - **Cache miss**: ~1-5 ms (includes decompression and backend I/O)
833/// - **Network RTT**: Add local RTT (~0.1 ms for localhost, ~10-50 ms for remote)
834///
835/// ## Memory Usage
836///
837/// - **Per connection**: ~100 KB (Tokio task stack + buffers)
838/// - **Per request**: ~32 MiB worst-case (if requesting `MAX_CHUNK_SIZE`)
839/// - **Block cache**: Shared across all connections (typically 100-500 MB)
840///
841/// With 1000 concurrent connections, memory overhead is ~100 MB for connections
842/// plus the shared block cache.
843///
844/// # Security Considerations
845///
846/// ## Current Security Posture
847///
848/// - **Loopback binding**: Reachable only locally when bound to `127.0.0.1` (as in the examples); other bind addresses expose the server to the network
849/// - **No authentication**: Anyone with local access can read snapshot data
850/// - **No TLS**: Plaintext HTTP (acceptable for loopback)
851/// - **DoS protection**: Request size clamping, but no rate limiting
852///
853/// ## Threat Model
854///
855/// For localhost-only deployments, the threat model assumes:
856///
857/// 1. **Trusted local environment**: All local users are trusted (or isolated via OS permissions)
858/// 2. **No remote attackers**: Firewall prevents external access
859/// 3. **Process isolation**: Snapshot data is not more sensitive than other local files
860///
861/// ## Future Security Enhancements (Planned)
862///
863/// - **TLS/HTTPS**: Certificate-based encryption for network access
864/// - **Bearer token auth**: Simple token in `Authorization` header
865/// - **Rate limiting**: Per-IP request throttling
866/// - **Audit logging**: Request logs with client IP and byte ranges
867///
868/// # Panics
869///
870/// This function does not panic under normal operation. Request handling errors
871/// are converted to HTTP error responses.
872pub async fn serve_http(snap: Arc<File>, port: u16, bind: &str) -> anyhow::Result<()> {
873    let addr: SocketAddr = format!("{}:{}", bind, port).parse()?;
874    let listener = TcpListener::bind(addr).await?;
875    tracing::info!("HTTP server listening on {}", addr);
876    serve_http_with_listener(snap, listener).await
877}
878
879/// Like [`serve_http`], but accepts a pre-bound [`TcpListener`].
880///
881/// This avoids a TOCTOU race when the caller needs to discover a free port:
882/// bind to port 0, read back the assigned port, and pass the listener in
883/// directly instead of re-binding by port number.
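///
/// # Examples
///
/// A sketch of the port-0 pattern, using only the standard library for the
/// bind step (an async runtime such as tokio can convert the socket with
/// `TcpListener::from_std` after marking it non-blocking, then pass it here):
///
/// ```
/// // Ask the OS for any free loopback port, then read back what it chose.
/// let listener = std::net::TcpListener::bind("127.0.0.1:0").unwrap();
/// let port = listener.local_addr().unwrap().port();
/// assert!(port > 0); // an ephemeral port was assigned
/// ```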
884pub async fn serve_http_with_listener(
885    snap: Arc<File>,
886    listener: TcpListener,
887) -> anyhow::Result<()> {
888    let state = Arc::new(AppState { snap });
889
890    let app = Router::new()
891        .route("/disk", get(get_disk))
892        .route("/memory", get(get_memory))
893        .with_state(state);
894
895    axum::serve(listener, app).await?;
896    Ok(())
897}
898
899/// HTTP handler for the `/disk` endpoint.
900///
901/// Serves the primary stream (persistent storage snapshot) from the Hexz file.
902/// Delegates to `handle_request` with `SnapshotStream::Primary`.
903///
904/// # Route
905///
906/// `GET /disk`
907///
908/// # Request Headers
909///
910/// - `Range` (optional): HTTP range request (e.g., `bytes=0-4095`)
911///
912/// # Response Headers
913///
914/// - `Content-Type`: Always `application/octet-stream` (raw binary data)
915/// - `Content-Range`: Byte range served (e.g., `bytes 0-4095/10737418240`)
916/// - `Accept-Ranges`: Always `bytes` (indicates range request support)
917///
918/// # Response Status Codes
919///
920/// - **206 Partial Content**: Successful range request
921/// - **416 Range Not Satisfiable**: Invalid or out-of-bounds range
922/// - **500 Internal Server Error**: Snapshot read failure
923///
924/// # Examples
925///
926/// See `serve_http` for client usage examples.
927async fn get_disk(headers: HeaderMap, State(state): State<Arc<AppState>>) -> impl IntoResponse {
928    handle_request(headers, &state.snap, SnapshotStream::Primary)
929}
930
931/// HTTP handler for the `/memory` endpoint.
932///
933/// Serves the secondary stream (RAM snapshot) from the Hexz file.
934/// Delegates to `handle_request` with `SnapshotStream::Secondary`.
935///
936/// # Route
937///
938/// `GET /memory`
939///
940/// # Request Headers
941///
942/// - `Range` (optional): HTTP range request (e.g., `bytes=0-4095`)
943///
944/// # Response Headers
945///
946/// - `Content-Type`: Always `application/octet-stream` (raw binary data)
947/// - `Content-Range`: Byte range served (e.g., `bytes 0-4095/8589934592`)
948/// - `Accept-Ranges`: Always `bytes` (indicates range request support)
949///
950/// # Response Status Codes
951///
952/// - **206 Partial Content**: Successful range request
953/// - **416 Range Not Satisfiable**: Invalid or out-of-bounds range
954/// - **500 Internal Server Error**: Snapshot read failure
955///
956/// # Examples
957///
958/// See `serve_http` for client usage examples.
959async fn get_memory(headers: HeaderMap, State(state): State<Arc<AppState>>) -> impl IntoResponse {
960    handle_request(headers, &state.snap, SnapshotStream::Secondary)
961}
962
963/// Core HTTP request handler that translates `Range` headers into snapshot reads.
964///
965/// This function implements the HTTP range request logic for both `/disk` and `/memory`
966/// endpoints. It performs the following steps:
967///
968/// 1. Parse the `Range` header (if present) or default to full stream access
969/// 2. Clamp the requested range to `MAX_CHUNK_SIZE` to prevent DoS
970/// 3. Read the data from the snapshot via `File::read_at`
971/// 4. Return HTTP 206 with `Content-Range` header, or error status codes
972///
973/// # Arguments
974///
975/// - `headers`: HTTP request headers from the client (parsed by Axum)
976/// - `snap`: The Hexz snapshot file to read from
977/// - `stream`: Which logical stream to read (`SnapshotStream::Primary` for `/disk`, `SnapshotStream::Secondary` for `/memory`)
978///
979/// # Returns
980///
981/// An Axum `Response` with one of the following status codes:
982///
983/// - **206 Partial Content**: Successful read (even for full stream requests)
984/// - **416 Range Not Satisfiable**: Invalid range syntax or out-of-bounds offset
985/// - **500 Internal Server Error**: Snapshot read failure (decompression error, I/O error)
986///
987/// # HTTP Range Request Parsing
988///
989/// The `Range` header is expected in the format `bytes=<start>-<end>` where:
990///
991/// - `<start>` is the starting byte offset (inclusive, zero-indexed)
992/// - `<end>` is the ending byte offset (inclusive), or omitted for "to EOF"
993///
994/// ## Examples of Supported Ranges
995///
996/// ```text
997/// Range: bytes=0-1023         → Read bytes 0-1023 (1024 bytes)
998/// Range: bytes=1024-2047      → Read bytes 1024-2047 (1024 bytes)
999/// Range: bytes=1048576-       → Read from 1MB to EOF (clamped to MAX_CHUNK_SIZE)
1000/// (no Range header)           → Read from start to EOF (clamped to MAX_CHUNK_SIZE)
1001/// ```
1002///
1003/// ## Examples of Unsupported/Invalid Ranges
1004///
1005/// These return HTTP 416:
1006///
1007/// ```text
1008/// Range: bytes=-1024          → Suffix range (last 1024 bytes) - not supported
1009/// Range: bytes=0-100,200-300  → Multi-part range - not supported
1010/// Range: bytes=1000-500       → Start > end - invalid
1011/// Range: bytes=999999999999-  → Start beyond EOF - out of bounds
1012/// ```
1013///
1014/// # DoS Protection: Range Clamping Algorithm
1015///
1016/// To prevent a malicious client from requesting gigabytes of data in a single
1017/// request, the handler clamps the effective range:
1018///
1019/// ```text
1020/// requested_length = end - start + 1
1021/// if requested_length > MAX_CHUNK_SIZE:
1022///     end = start + MAX_CHUNK_SIZE - 1
1023///     if end >= total_size:
1024///         end = total_size - 1
1025/// ```
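///
/// The clamping rule above, written as a checkable sketch (the 32 MiB value
/// for `MAX_CHUNK_SIZE` is illustrative here; the crate defines the real
/// constant):
///
/// ```
/// // Illustrative clamp: cap the inclusive end offset so that at most
/// // MAX_CHUNK_SIZE bytes are served, without running past EOF.
/// const MAX_CHUNK_SIZE: u64 = 32 * 1024 * 1024;
///
/// fn clamp_end(start: u64, mut end: u64, total_size: u64) -> u64 {
///     if end - start + 1 > MAX_CHUNK_SIZE {
///         end = start + MAX_CHUNK_SIZE - 1;
///         if end >= total_size {
///             end = total_size - 1;
///         }
///     }
///     end
/// }
///
/// // A 64 MiB request against a 10 GiB stream is capped at 32 MiB.
/// assert_eq!(clamp_end(0, 67_108_863, 10_737_418_240), 33_554_431);
/// // Requests at or under the cap pass through unchanged.
/// assert_eq!(clamp_end(0, 4095, 10_737_418_240), 4095);
/// ```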
1026///
1027/// The clamped range is reflected in the `Content-Range` response header:
1028///
1029/// ```text
1030/// Content-Range: bytes <actual_start>-<actual_end>/<total_size>
1031/// ```
1032///
1033/// Clients must check this header to detect clamping and issue follow-up requests
1034/// for remaining data.
1035///
1036/// ## Clamping Example
1037///
1038/// ```text
1039/// Client request:    Range: bytes=0-67108863 (64 MiB)
1040/// Total size:        10 GiB (10737418240 bytes)
1041/// Server clamps to:  0-33554431 (32 MiB due to MAX_CHUNK_SIZE)
1042/// Response header:   Content-Range: bytes 0-33554431/10737418240
1043/// ```
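///
/// A client can derive its next request from the `Content-Range` header with a
/// small helper; `next_chunk_start` below is illustrative, not part of this
/// crate's API:
///
/// ```
/// // Parse a "bytes <start>-<end>/<total>" value and return the offset of
/// // the next byte to request, or None once end == total - 1 (exhausted).
/// fn next_chunk_start(content_range: &str) -> Option<u64> {
///     let rest = content_range.strip_prefix("bytes ")?;
///     let (range, total) = rest.split_once('/')?;
///     let (_start, end) = range.split_once('-')?;
///     let end: u64 = end.parse().ok()?;
///     let total: u64 = total.parse().ok()?;
///     if end + 1 < total { Some(end + 1) } else { None }
/// }
///
/// // The clamped response above: resume the download at byte 33554432.
/// assert_eq!(next_chunk_start("bytes 0-33554431/10737418240"), Some(33554432));
/// // Final chunk of a 1000-byte stream: nothing left to fetch.
/// assert_eq!(next_chunk_start("bytes 0-999/1000"), None);
/// ```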
1044///
1045/// # Error Handling
1046///
1047/// ## Range Parsing Errors
1048///
1049/// If `parse_range` returns `Err(())`, the handler returns HTTP 416 (Range Not
1050/// Satisfiable). This occurs when:
1051///
1052/// - The `Range` header does not start with `"bytes="`
1053/// - The start/end offsets are not valid integers
1054/// - The start offset is greater than the end offset
1055/// - The end offset is beyond the stream size
1056///
1057/// ## Snapshot Read Errors
1058///
1059/// If `snap.read_at` returns `Err(_)`, the handler returns HTTP 500 (Internal
1060/// Server Error). This occurs when:
1061///
1062/// - Decompression fails (corrupted compressed data)
1063/// - Backend I/O fails (disk error, network timeout for remote backends)
1064/// - Decryption fails (incorrect key, corrupted ciphertext)
1065///
1066/// The specific error is not exposed to the client (only logged internally) to
1067/// avoid information leakage.
1068///
1069/// # Edge Cases
1070///
1071/// ## Empty Range
1072///
1073/// If the calculated range length is 0 (e.g., due to clamping at EOF), the handler
1074/// returns HTTP 416. This should be rare in practice since clients typically request
1075/// valid ranges.
1076///
1077/// ## Zero-Sized Stream
1078///
1079/// If the snapshot stream size is 0 (empty disk or memory snapshot), any range
1080/// request returns HTTP 416 because no valid offsets exist.
1081///
1082/// ## Single-Byte Range
1083///
1084/// A request like `bytes=0-0` (fetch only byte 0) is valid and returns 1 byte with
1085/// HTTP 206 and `Content-Range: bytes 0-0/<total>`.
1086///
1087/// # Performance Characteristics
1088///
1089/// - **No Range Header**: Clamps to `MAX_CHUNK_SIZE`, then performs one `read_at` call
1090/// - **Valid Range**: One `read_at` call (may hit block cache or require decompression)
1091/// - **Invalid Range**: Immediate return (no snapshot I/O)
1092///
1093/// For cache hits, latency is ~80μs. For cache misses, latency is ~1-5 ms depending
1094/// on backend speed and compression algorithm.
1095///
1096/// # Security Notes
1097///
1098/// - **No authentication**: This function does not check credentials (handled by
1099///   future middleware or reverse proxy)
1100/// - **DoS mitigation**: Request size clamping prevents memory exhaustion
1101/// - **Information leakage**: Error responses do not reveal internal details
1102///   (e.g., "decompression failed" is hidden behind HTTP 500)
1103///
1104/// # Examples
1105///
1106/// See `serve_http`, `get_disk`, and `get_memory` for usage context.
1107fn handle_request(headers: HeaderMap, snap: &Arc<File>, stream: SnapshotStream) -> Response {
1108    let total_size = snap.size(stream);
    // An empty stream has no valid byte offsets; without this guard the
    // no-Range path below would issue a 1-byte read and surface a 500
    // instead of the documented 416.
    if total_size == 0 {
        return StatusCode::RANGE_NOT_SATISFIABLE.into_response();
    }
1109
1110    let (start, mut end) = if let Some(range) = headers.get(header::RANGE) {
1111        match parse_range(range.to_str().unwrap_or(""), total_size) {
1112            Ok(r) => r,
1113            Err(_) => return StatusCode::RANGE_NOT_SATISFIABLE.into_response(),
1114        }
1115    } else {
1116        (0, total_size.saturating_sub(1))
1117    };
1118
1119    // SECURITY: DoS Protection
1120    // Clamp the requested range to avoid huge memory allocations.
1121    if end - start + 1 > MAX_CHUNK_SIZE {
1122        end = start + MAX_CHUNK_SIZE - 1;
1123        // Ensure we don't go past EOF after clamping
1124        if end >= total_size {
1125            end = total_size.saturating_sub(1);
1126        }
1127    }
1128
1129    let len = (end - start + 1) as usize;
1130    if len == 0 {
1131        // Handle empty range edge case
1132        return StatusCode::RANGE_NOT_SATISFIABLE.into_response();
1133    }
1134
1135    match snap.read_at(stream, start, len) {
1136        Ok(data) => (
1137            StatusCode::PARTIAL_CONTENT,
1138            [
1139                (header::CONTENT_TYPE, "application/octet-stream"),
1140                (
1141                    header::CONTENT_RANGE,
1142                    &format!("bytes {}-{}/{}", start, end, total_size),
1143                ),
1144                (header::ACCEPT_RANGES, "bytes"),
1145            ],
1146            data,
1147        )
1148            .into_response(),
1149        Err(_) => StatusCode::INTERNAL_SERVER_ERROR.into_response(),
1150    }
1151}
1152
1153/// Parses an HTTP `Range` header into absolute byte offsets.
1154///
1155/// Implements a subset of HTTP range request syntax (RFC 7233), supporting only
1156/// simple byte ranges without multi-part or suffix ranges.
1157///
1158/// # Supported Syntax
1159///
1160/// - **Bounded range**: `bytes=<start>-<end>` (both offsets specified)
1161///   - Example: `bytes=0-1023` → Returns `(0, 1023)`
1162/// - **Unbounded range**: `bytes=<start>-` (from start to EOF)
1163///   - Example: `bytes=1024-` → Returns `(1024, size-1)`
1164///
1165/// # Unsupported Syntax
1166///
1167/// - **Suffix range**: `bytes=-<length>` (last N bytes)
1168///   - Example: `bytes=-1024` → Returns `Err(())`
1169/// - **Multi-part range**: `bytes=0-100,200-300`
1170///   - Example: `bytes=0-100,200-300` → Returns `Err(())`
1171///
1172/// These are rejected because:
1173/// 1. They are rarely used in practice (<1% of range requests)
1174/// 2. They add significant parsing and response generation complexity
1175/// 3. The HTTP 416 error response is acceptable for clients that need them
1176///
1177/// # Arguments
1178///
1179/// - `range`: The value of the `Range` header (e.g., `"bytes=0-1023"`)
1180/// - `size`: The total size of the stream in bytes (used to validate offsets)
1181///
1182/// # Returns
1183///
1184/// - `Ok((start, end))`: Valid range with absolute byte offsets (both inclusive)
1185/// - `Err(())`: Invalid syntax or out-of-bounds range
1186///
1187/// # Error Conditions
1188///
1189/// Returns `Err(())` if:
1190///
1191/// 1. **Missing prefix**: Header does not start with `"bytes="`
1192///    - Example: `"items=0-100"` → Error
1193/// 2. **Invalid integer**: Start or end cannot be parsed as `u64`
1194///    - Example: `"bytes=abc-def"` → Error
1195/// 3. **Inverted range**: Start offset is greater than end offset
1196///    - Example: `"bytes=1000-500"` → Error
1197/// 4. **Out of bounds**: End offset is beyond the stream size
1198///    - Example: `"bytes=0-999999"` when size is 1000 → Error
1199///
1200/// # Parsing Algorithm
1201///
1202/// ```text
1203/// 1. Check for "bytes=" prefix (RANGE_PREFIX_LEN = 6)
1204/// 2. Split remaining string on '-' delimiter
1205/// 3. Parse start offset (parts[0])
1206/// 4. Parse end offset (parts[1] if present and non-empty, else size-1)
1207/// 5. Validate: start <= end && end < size
1208/// 6. Return (start, end)
1209/// ```
1210///
1211/// # Edge Cases
1212///
1213/// ## Empty String After Prefix
1214///
1215/// ```text
1216/// Range: bytes=
1217/// ```
1218///
1219/// Returns `Err(())` because there is no start offset.
1220///
1221/// ## Single Byte Range
1222///
1223/// ```text
1224/// Range: bytes=0-0
1225/// ```
1226///
1227/// Returns `Ok((0, 0))` (valid, requests exactly 1 byte).
1228///
1229/// ## Range at EOF
1230///
1231/// ```text
1232/// Range: bytes=0-999 (size = 1000)
1233/// ```
1234///
1235/// Returns `Ok((0, 999))` (valid, end is inclusive and equals `size - 1`).
1236///
1237/// ## Range Beyond EOF
1238///
1239/// ```text
1240/// Range: bytes=0-1000 (size = 1000)
1241/// ```
1242///
1243/// Returns `Err(())` because offset 1000 does not exist (valid range is 0-999).
1244///
1245/// # Examples
1246///
1247/// ```
1248/// assert_eq!(hexz_server::parse_range("bytes=0-1023", 10000), Ok((0, 1023)));
1249/// assert_eq!(hexz_server::parse_range("bytes=1024-", 10000), Ok((1024, 9999)));
1250/// assert_eq!(hexz_server::parse_range("0-1023", 10000), Err(()));         // missing "bytes=" prefix
1251/// assert_eq!(hexz_server::parse_range("bytes=0-10000", 10000), Err(()));  // out of bounds
1252/// assert_eq!(hexz_server::parse_range("bytes=1000-500", 10000), Err(())); // inverted range
1253/// ```
1254///
1255/// # Performance
1256///
1257/// - **Time complexity**: O(n) where n is the length of the range string (typically <20 chars)
1258/// - **Allocation**: One heap allocation for the `Vec<&str>` that collects the split parts
1259/// - **Typical latency**: <1 μs (negligible compared to snapshot read latency)
1260///
1261/// # Security
1262///
1263/// This function is resilient to malicious input:
1264///
1265/// - **Integer overflow**: `str::parse::<u64>` rejects values that would overflow `u64`
1266/// - **Unbounded length**: The `Range` header is bounded by HTTP header size limits
1267///   (typically 8 KB, enforced by the HTTP server)
1268/// - **No allocation attacks**: Performs only one small `Vec` allocation while splitting
1269#[allow(clippy::result_unit_err)]
1270pub fn parse_range(range: &str, size: u64) -> Result<(u64, u64), ()> {
1271    if !range.starts_with("bytes=") {
1272        return Err(());
1273    }
1274    let parts: Vec<&str> = range[RANGE_PREFIX_LEN..].split('-').collect();
1275    let start = parts[0].parse::<u64>().map_err(|_| ())?;
1276    let end = if parts.len() > 1 && !parts[1].is_empty() {
1277        parts[1].parse::<u64>().map_err(|_| ())?
1278    } else {
1279        size.saturating_sub(1)
1280    };
1281    if start > end || end >= size {
1282        return Err(());
1283    }
1284    Ok((start, end))
1285}