hexz_server/lib.rs
1//! HTTP, NBD, and S3 gateway server implementations for exposing Hexz snapshots.
2//!
3//! This module provides network-facing interfaces for accessing compressed Hexz
4//! snapshot data over standard protocols. It supports three distinct serving modes:
5//!
6//! 1. **HTTP Range Server** (`serve_http`): Exposes disk and memory streams via
7//! HTTP 1.1 range requests with DoS protection and partial content support.
8//! 2. **NBD (Network Block Device) Server** (`serve_nbd`): Allows mounting snapshots
9//! as Linux block devices using the standard NBD protocol.
10//! 3. **S3 Gateway** (`serve_s3_gateway`): Planned S3-compatible API for cloud
11//! integration (currently unimplemented).
12//!
13//! # Architecture Overview
14//!
15//! All servers expose the same underlying `File` API, which provides:
16//! - Block-level decompression with LRU caching
17//! - Dual-stream access (disk and memory snapshots)
18//! - Random access with minimal I/O overhead
19//! - Thread-safe concurrent reads via `Arc<File>`
20//!
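//! A minimal sketch of calling that shared API directly (assuming, as the handlers below
//! suggest, that `File::read_at` returns a byte buffer in a `Result` compatible with
//! `anyhow`):
//!
//! ```no_run
//! use std::sync::Arc;
//! use hexz_core::{File, SnapshotStream};
//! use hexz_core::store::local::FileBackend;
//! use hexz_core::algo::compression::lz4::Lz4Compressor;
//!
//! # fn main() -> anyhow::Result<()> {
//! let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
//! let snap = File::new(backend, Box::new(Lz4Compressor::new()), None)?;
//!
//! // Read the first 4 KiB of the disk stream; this is the same `read_at` call
//! // the HTTP and NBD handlers issue for every client request.
//! let first_4k = snap.read_at(SnapshotStream::Disk, 0, 4096)?;
//! println!("read {} bytes from the disk stream", first_4k.len());
//! # Ok(())
//! # }
//! ```
//!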
21//! The servers differ in protocol semantics and use cases:
22//!
23//! | Protocol | Use Case | Access Pattern | Authentication |
24//! |----------|----------|----------------|----------------|
25//! | HTTP | Browser/API access | Range requests | None (planned) |
26//! | NBD | Linux block device mount | Block-level reads | None |
27//! | S3 | Cloud integration | Object API | AWS SigV4 (planned) |
28//!
29//! # Design Decisions
30//!
31//! ## Why HTTP Range Requests?
32//!
33//! HTTP range requests (RFC 7233) provide a standardized way to access large files
34//! in chunks without loading the entire file into memory. This aligns perfectly with
35//! Hexz's block-indexed architecture, allowing clients to fetch only the data they
36//! need. The implementation:
37//!
38//! - Returns HTTP 206 (Partial Content) for range requests
39//! - Returns HTTP 416 (Range Not Satisfiable) for invalid ranges
40//! - Clamps requests to `MAX_CHUNK_SIZE` (32 MiB) to prevent memory exhaustion
41//! - Supports both bounded (`bytes=0-1023`) and unbounded (`bytes=1024-`) ranges
42//!
43//! ## Why NBD Protocol?
44//!
45//! The Network Block Device protocol allows mounting remote storage as a local block
46//! device on Linux systems. This enables:
47//! - Transparent filesystem access (mount snapshot, browse files)
48//! - Use of standard Linux tools (`dd`, `fsck`, `mount`)
49//! - Zero application changes (existing software works unmodified)
50//!
51//! Trade-offs:
52//! - **Pro**: Native OS integration, no special client software required
53//! - **Pro**: Kernel handles caching and buffering
54//! - **Con**: No built-in encryption or authentication
55//! - **Con**: TCP-based, higher latency than local disk
56//!
57//! ## Security Architecture
58//!
59//! ### Current Security Posture (localhost-only)
60//!
61//! All servers bind to `127.0.0.1` (loopback) by default, preventing network exposure.
62//! This is appropriate for:
63//! - Local development and testing
64//! - Forensics workstations accessing local snapshots
65//! - Scenarios where network access is provided via SSH tunnels or VPNs
66//!
67//! ### Attack Surface
68//!
69//! The current implementation has a minimal attack surface:
70//! 1. **DoS via large reads**: Mitigated by `MAX_CHUNK_SIZE` clamping (32 MiB)
71//! 2. **Range header parsing**: Simplified parser with strict validation
72//! 3. **Connection exhaustion**: Limited by OS socket limits, no artificial cap
73//! 4. **Path traversal**: N/A (no filesystem access, only fixed `/disk` and `/memory` routes)
74//!
75//! ### Future Security Enhancements (Planned)
76//!
77//! - TLS/HTTPS support for encrypted transport
78//! - Token-based authentication (Bearer tokens)
79//! - Rate limiting per IP address
80//! - Configurable bind addresses (`0.0.0.0` for network access)
81//! - Request logging and audit trails
82//!
83//! # Performance Characteristics
84//!
85//! ## HTTP Server
86//!
87//! - **Throughput**: ~500-2000 MB/s (limited by decompression, not network)
88//! - **Latency**: ~1-5 ms per request (includes decompression)
89//! - **Concurrency**: Handles 1000+ concurrent connections (Tokio async runtime)
90//! - **Memory**: ~100 KB per connection + block cache overhead
91//!
92//! ## NBD Server
93//!
94//! - **Throughput**: ~500-1000 MB/s (similar to HTTP, slightly lower due to NBD protocol overhead)
95//! - **Latency**: ~2-10 ms per block read (includes TCP RTT + decompression)
96//! - **Concurrency**: One Tokio task per client connection
97//!
98//! ## Bottlenecks
99//!
100//! For local (localhost) connections, the main bottlenecks are, in order:
101//! 1. **Decompression CPU time** (80% of latency for LZ4, more for ZSTD)
102//! 2. **Block cache misses** (requires backend I/O)
103//! 3. **Memory allocation** for large reads (mitigated by clamping)
104//!
105//! Network bandwidth is rarely a bottleneck for localhost connections.
106//!
107//! # Examples
108//!
109//! ## Starting an HTTP Server
110//!
111//! ```no_run
112//! use std::sync::Arc;
113//! use hexz_core::File;
114//! use hexz_core::store::local::FileBackend;
115//! use hexz_core::algo::compression::lz4::Lz4Compressor;
116//! use hexz_server::serve_http;
117//!
118//! # #[tokio::main]
119//! # async fn main() -> anyhow::Result<()> {
120//! let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
121//! let compressor = Box::new(Lz4Compressor::new());
122//! let snap = File::new(backend, compressor, None)?;
123//!
124//! // Start HTTP server on port 8080
125//! serve_http(snap, 8080).await?;
126//! # Ok(())
127//! # }
128//! ```
129//!
130//! ## Starting an NBD Server
131//!
132//! ```no_run
133//! use std::sync::Arc;
134//! use hexz_core::File;
135//! use hexz_core::store::local::FileBackend;
136//! use hexz_core::algo::compression::lz4::Lz4Compressor;
137//! use hexz_server::serve_nbd;
138//!
139//! # #[tokio::main]
140//! # async fn main() -> anyhow::Result<()> {
141//! let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
142//! let compressor = Box::new(Lz4Compressor::new());
143//! let snap = File::new(backend, compressor, None)?;
144//!
145//! // Start NBD server on port 10809
146//! serve_nbd(snap, 10809).await?;
147//! # Ok(())
148//! # }
149//! ```
150//!
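//! ## Serving HTTP and NBD Concurrently
//!
//! Both server functions run until shut down, so one snapshot can be exposed over both
//! protocols from the same runtime. A minimal sketch, assuming (as in the examples above)
//! that the opened snapshot is an `Arc<File>` that can be cloned cheaply:
//!
//! ```no_run
//! use std::sync::Arc;
//! use hexz_core::File;
//! use hexz_core::store::local::FileBackend;
//! use hexz_core::algo::compression::lz4::Lz4Compressor;
//! use hexz_server::{serve_http, serve_nbd};
//!
//! # #[tokio::main]
//! # async fn main() -> anyhow::Result<()> {
//! let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
//! let snap = File::new(backend, Box::new(Lz4Compressor::new()), None)?;
//!
//! // Drive both servers on the same snapshot; the first error aborts both.
//! tokio::try_join!(
//!     serve_http(snap.clone(), 8080),
//!     serve_nbd(snap, 10809),
//! )?;
//! # Ok(())
//! # }
//! ```
//!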
151//! ## Client Usage Examples
152//!
153//! ### HTTP Client (curl)
154//!
155//! ```bash
156//! # Fetch the first 4KB of the disk stream
157//! curl -H "Range: bytes=0-4095" http://localhost:8080/disk -o chunk.bin
158//!
159//! # Fetch 1MB starting at offset 1MB
160//! curl -H "Range: bytes=1048576-2097151" http://localhost:8080/memory -o mem_chunk.bin
161//!
162//! # Fetch from offset to EOF (server will clamp to MAX_CHUNK_SIZE)
163//! curl -H "Range: bytes=1048576-" http://localhost:8080/disk
164//! ```
165//!
166//! ### NBD Client (Linux)
167//!
168//! ```bash
169//! # Connect NBD client to server
170//! sudo nbd-client localhost 10809 /dev/nbd0
171//!
172//! # Mount the block device (read-only)
173//! sudo mount -o ro /dev/nbd0 /mnt/snapshot
174//!
175//! # Access files normally
176//! ls -la /mnt/snapshot
177//! cat /mnt/snapshot/important.log
178//!
179//! # Disconnect when done
180//! sudo umount /mnt/snapshot
181//! sudo nbd-client -d /dev/nbd0
182//! ```
183//!
184//! # Protocol References
185//!
186//! - **HTTP Range Requests**: [RFC 7233](https://tools.ietf.org/html/rfc7233)
187//! - **NBD Protocol**: [NBD Protocol Specification](https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md)
188//! - **S3 API**: [AWS S3 API Reference](https://docs.aws.amazon.com/s3/index.html) (future work)
189
190pub mod nbd;
191
192use axum::{
193 Router,
194 extract::State,
195 http::{HeaderMap, StatusCode, header},
196 response::{IntoResponse, Response},
197 routing::get,
198};
199use hexz_core::{File, SnapshotStream};
200use std::net::SocketAddr;
201use std::sync::Arc;
202use tokio::net::TcpListener;
203
204/// IPv4 address for all server listeners (localhost only).
205///
206/// # Security Rationale
207///
208/// This constant defaults to the loopback address (`127.0.0.1`) to prevent
209/// accidental exposure of snapshot data to the local network or internet.
210/// Snapshots may contain sensitive information (credentials, personal data,
211/// proprietary code), so network exposure must be an explicit, informed decision.
212///
213/// ## Current Behavior
214///
215/// All servers (HTTP, NBD, S3) bind to `127.0.0.1`, making them accessible only
216/// from the local machine. Remote access requires:
217/// - SSH port forwarding: `ssh -L 8080:localhost:8080 user@server`
218/// - VPN tunnel with local forwarding
219/// - Reverse proxy with authentication (e.g., nginx with TLS + basic auth)
220///
221/// ## Future Enhancement
222///
223/// To enable network access, a future version will support configurable bind
224/// addresses via command-line flags or configuration files:
225///
226/// ```bash
227/// # Proposed CLI syntax (not yet implemented)
228/// hexz-server --bind 0.0.0.0:8080 --auth-token mytoken123 snapshot.hxz
229/// ```
230///
231/// Network exposure will require authentication to be enabled (enforced by the CLI).
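///
/// # Examples
///
/// Each server composes this array with its port via `SocketAddr::from`, exactly as
/// `serve_http` and `serve_nbd` do below (the constant is redeclared here because it
/// is private to the crate):
///
/// ```
/// use std::net::SocketAddr;
///
/// const BIND_ADDR: [u8; 4] = [127, 0, 0, 1];
/// let addr = SocketAddr::from((BIND_ADDR, 8080));
/// assert_eq!(addr.to_string(), "127.0.0.1:8080");
/// ```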
232const BIND_ADDR: [u8; 4] = [127, 0, 0, 1];
233
234/// Length in bytes of the HTTP `Range` header prefix `"bytes="`.
235///
236/// The HTTP Range header format is defined in RFC 7233 as:
237///
238/// ```text
239/// Range: bytes=<start>-<end>
240/// ```
241///
242/// This constant represents the length of the literal string `"bytes="` (6 bytes),
243/// which is stripped during parsing. The parser supports:
244///
245/// - Bounded ranges: `bytes=0-1023` (fetch bytes 0 through 1023 inclusive)
246/// - Unbounded ranges: `bytes=1024-` (fetch from byte 1024 to EOF)
247/// - Single-byte ranges: `bytes=0-0` (fetch only byte 0)
248///
249/// Unsupported range types (will return HTTP 416):
250/// - Suffix ranges: `bytes=-500` (last 500 bytes)
251/// - Multi-part ranges: `bytes=0-100,200-300`
252///
253/// # Rationale for Limited Support
254///
255/// Suffix ranges and multi-part ranges are rarely used in practice and add
256/// significant parsing complexity. If needed for browser compatibility, they
257/// can be added in a future version without breaking existing clients.
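///
/// # Examples
///
/// A small illustration of the prefix stripping that `parse_range` performs (the constant
/// is redeclared here because it is private to the crate):
///
/// ```
/// const RANGE_PREFIX_LEN: usize = "bytes=".len();
/// let header = "bytes=0-1023";
///
/// assert_eq!(RANGE_PREFIX_LEN, 6);
/// assert_eq!(&header[RANGE_PREFIX_LEN..], "0-1023");
/// ```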
258const RANGE_PREFIX_LEN: usize = 6;
259
260/// Maximum allowed read size per HTTP request to prevent DoS attacks.
261///
262/// # Value
263///
264/// 32 MiB (33,554,432 bytes)
265///
266/// # DoS Protection Rationale
267///
268/// Without a limit, a malicious client could request the entire snapshot in a single
269/// HTTP request (e.g., `Range: bytes=0-`), forcing the server to:
270///
271/// 1. Decompress gigabytes of data
272/// 2. Allocate gigabytes of heap memory
273/// 3. Hold that memory while slowly transmitting over the network
274///
275/// With multiple concurrent requests, this could exhaust server memory and CPU,
276/// causing crashes or unresponsiveness (denial of service).
277///
278/// # Why 32 MiB?
279///
280/// This value balances throughput efficiency and resource protection:
281///
282/// - **Large enough**: Clients can fetch substantial chunks with low overhead
283/// (at 1 Gbps, 32 MiB transfers in ~256 ms)
284/// - **Small enough**: Even 100 concurrent maximal requests consume <3.2 GB RAM,
285/// which is manageable on modern servers
286/// - **Common practice**: Many HTTP servers and object-storage clients use comparable
287///   per-request limits or part sizes in the tens of megabytes
288///
289/// # Clamping Behavior
290///
291/// When a client requests more than `MAX_CHUNK_SIZE` bytes:
292///
293/// 1. The server clamps the end offset: `end = min(end, start + MAX_CHUNK_SIZE - 1)`
294/// 2. Returns HTTP 206 with the clamped range in the `Content-Range` header
295/// 3. The client sees a short read and can issue follow-up requests
296///
297/// Example:
298///
299/// ```text
300/// Client request: Range: bytes=0-67108863 (64 MiB)
301/// Server response: Content-Range: bytes 0-33554431/total (32 MiB)
302/// ```
303///
304/// The client must check the `Content-Range` header to detect clamping.
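///
/// The same arithmetic as used by the HTTP request handler, written out standalone
/// (the constant is redeclared because it is private to the crate):
///
/// ```
/// const MAX_CHUNK_SIZE: u64 = 32 * 1024 * 1024;
///
/// let total_size: u64 = 10_737_418_240;          // 10 GiB stream
/// let (start, mut end) = (0u64, 67_108_863u64);  // client asked for 64 MiB
/// if end - start + 1 > MAX_CHUNK_SIZE {
///     end = start + MAX_CHUNK_SIZE - 1;
///     // Never read past EOF after clamping.
///     if end >= total_size {
///         end = total_size.saturating_sub(1);
///     }
/// }
/// assert_eq!(end, 33_554_431);                   // clamped to 32 MiB (inclusive end)
/// ```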
305///
306/// # Future Work
307///
308/// This limit could be made configurable via CLI flags for scenarios where higher
309/// memory usage is acceptable (e.g., dedicated forensics servers with 128+ GB RAM).
310const MAX_CHUNK_SIZE: u64 = 32 * 1024 * 1024;
311
312/// Shared application state for the HTTP serving layer.
313///
314/// This struct is wrapped in `Arc` and cloned for each HTTP request handler.
315/// The inner `snap` field is also `Arc`-wrapped, so cloning `AppState` is cheap
316/// (just incrementing reference counts, no data copying).
317///
318/// # Thread Safety
319///
320/// `AppState` is `Send + Sync` because `File` is `Send + Sync`. The underlying
321/// block cache uses `Mutex` for interior mutability, so multiple concurrent requests
322/// can safely read from the same snapshot.
323///
324/// # Memory Overhead
325///
326/// Each `AppState` clone adds ~16 bytes (one `Arc` pointer). With 1000 concurrent
327/// connections, this overhead is negligible (~16 KB).
328struct AppState {
329 /// The opened Hexz snapshot file being served via HTTP.
330 ///
331 /// This is the same `File` instance for all requests. It contains:
332 /// - The storage backend (local file, S3, etc.)
333 /// - Block cache (shared across all requests)
334 /// - Decompressor instances (thread-local via pooling)
335 snap: Arc<File>,
336}
337
338/// Exposes a `File` over NBD (Network Block Device) protocol.
339///
340/// Starts a TCP listener on `127.0.0.1:<port>` that implements the NBD protocol,
341/// allowing Linux clients to mount the Hexz snapshot as a local block device
342/// using standard tools like `nbd-client`.
343///
344/// This function runs indefinitely, accepting connections in a loop. Each client
345/// connection is handled in a separate Tokio task, allowing concurrent clients.
346///
347/// # Arguments
348///
349/// - `snap`: The Hexz snapshot file to expose. Must be wrapped in `Arc` for sharing
350/// across multiple client connections.
351/// - `port`: TCP port to bind to on the loopback interface (e.g., `10809`).
352///
353/// # Returns
354///
355/// This function never returns under normal operation (it runs forever). It only
356/// returns `Err` if:
357/// - The TCP listener fails to bind (port already in use, permission denied)
358/// - An unrecoverable I/O error occurs on the listener socket
359///
360/// Individual client errors (malformed requests, disconnects) are logged but do not
361/// stop the server.
362///
363/// # Errors
364///
365/// - `std::io::Error`: If binding to the socket fails or the listener encounters
366/// a fatal error.
367///
368/// # Examples
369///
370/// ```no_run
371/// use std::sync::Arc;
372/// use hexz_core::File;
373/// use hexz_core::store::local::FileBackend;
374/// use hexz_core::algo::compression::lz4::Lz4Compressor;
375/// use hexz_server::serve_nbd;
376///
377/// # #[tokio::main]
378/// # async fn main() -> anyhow::Result<()> {
379/// let backend = Arc::new(FileBackend::new("vm_snapshot.hxz".as_ref())?);
380/// let compressor = Box::new(Lz4Compressor::new());
381/// let snap = File::new(backend, compressor, None)?;
382///
383/// // Start NBD server (runs forever)
384/// serve_nbd(snap, 10809).await?;
385/// # Ok(())
386/// # }
387/// ```
388///
389/// ## Client-Side Usage (Linux)
390///
391/// ```bash
392/// # Connect to the NBD server
393/// sudo nbd-client localhost 10809 /dev/nbd0
394///
395/// # Mount the block device (read-only, automatically detected filesystem)
396/// sudo mount -o ro /dev/nbd0 /mnt/snapshot
397///
398/// # Browse files normally
399/// ls -la /mnt/snapshot
400/// sudo cat /mnt/snapshot/var/log/syslog
401///
402/// # Unmount and disconnect
403/// sudo umount /mnt/snapshot
404/// sudo nbd-client -d /dev/nbd0
405/// ```
406///
407/// # Security Considerations
408///
409/// ## No Encryption
410///
411/// The NBD protocol transmits data in plaintext. For localhost connections this
412/// is acceptable, but for remote access consider:
413///
414/// - **SSH tunnel**: `ssh -L 10809:localhost:10809 user@server`
415/// - **VPN**: WireGuard, OpenVPN, etc.
416/// - **TLS wrapper**: `stunnel` or similar
417///
418/// ## No Authentication
419///
420/// Any process with network access to the port can connect. The default loopback
421/// binding mitigates this, but if exposing to the network, use firewall rules or
422/// SSH key authentication.
423///
424/// ## Read-Only Enforcement
425///
426/// The NBD server always exports snapshots as read-only (NBD flag `NBD_FLAG_READ_ONLY`).
427/// Write attempts return `EPERM` (operation not permitted). However, a malicious
428/// NBD client could theoretically attempt to crash the server via protocol abuse.
429///
430/// # Performance Notes
431///
432/// - **Concurrency**: Each client spawns a separate Tokio task. With 100 concurrent
433/// clients, memory overhead is ~10 MB (100 KB per task).
434/// - **Throughput**: Typically 500-1000 MB/s for sequential reads, limited by
435/// decompression rather than NBD protocol overhead.
436/// - **Latency**: ~2-10 ms per read, including TCP round-trip and decompression.
437///
438/// # Panics
439///
440/// This function does not panic under normal operation. Client errors are logged
441/// and handled gracefully.
442pub async fn serve_nbd(snap: Arc<File>, port: u16) -> anyhow::Result<()> {
443 let addr = SocketAddr::from((BIND_ADDR, port));
444 let listener = TcpListener::bind(addr).await?;
445
446 tracing::info!("NBD server listening on {}", addr);
447 println!(
448 "NBD server started on {}. Use 'nbd-client localhost {} /dev/nbd0' to mount.",
449 addr, port
450 );
451
452 loop {
453 // Accept incoming NBD connections
454 let (socket, remote_addr) = match listener.accept().await {
455 Ok(conn) => conn,
456 Err(e) => {
457 tracing::warn!("NBD accept error (continuing): {}", e);
458 continue;
459 }
460 };
461 tracing::debug!("Accepted NBD connection from {}", remote_addr);
462
463 let snap_clone = snap.clone();
464 tokio::spawn(async move {
465 if let Err(e) = nbd::handle_client(socket, snap_clone).await {
466 tracing::error!("NBD client error: {}", e);
467 }
468 });
469 }
470}
471
472/// Exposes a `File` as an S3-compatible object storage gateway.
473///
474/// # Implementation Status: NOT IMPLEMENTED
475///
476/// This function is a **placeholder** for future S3 API compatibility. It currently
477/// blocks forever without serving any requests. Calling this function will NOT panic,
478/// but it provides no useful functionality.
479///
480/// # Planned Functionality
481///
482/// When implemented, this gateway will provide S3-compatible HTTP endpoints for:
483///
484/// ## Supported Operations (Planned)
485///
486/// - `GET /<bucket>/<key>`: Retrieve snapshot data as an S3 object
487/// - `HEAD /<bucket>/<key>`: Get object metadata (size, ETag)
488/// - `GET /<bucket>/<key>?range=bytes=<start>-<end>`: Partial object retrieval
489/// - `GET /<bucket>?list-type=2`: List objects (future: multi-snapshot support)
490///
491/// ## S3 API Compatibility Goals
492///
493/// - **Authentication**: AWS Signature Version 4 (SigV4) for production use
494/// - **Authorization**: IAM-style policies (read-only by default)
495/// - **Error responses**: Standard S3 XML error responses
496/// - **Metadata**: ETag (CRC32 of snapshot header), Content-Type, Last-Modified
497///
498/// ## Mapping Hexz Concepts to S3
499///
500/// | Hexz Concept | S3 Equivalent | Mapping Strategy |
501/// |----------------|---------------|------------------|
502/// | Snapshot file | Bucket | One bucket per snapshot |
503/// | Disk stream | Object `disk.img` | Virtual object, synthesized from snapshot |
504/// | Memory stream | Object `memory.img` | Virtual object, synthesized from snapshot |
505/// | Block index | N/A | Transparent to S3 clients |
506///
507/// ## Example S3 API Usage (Planned)
508///
509/// ```bash
510/// # Configure AWS CLI to point to local S3 gateway
511/// export AWS_ACCESS_KEY_ID=minioadmin
512/// export AWS_SECRET_ACCESS_KEY=minioadmin
513/// export AWS_ENDPOINT_URL=http://localhost:9000
514///
515/// # List buckets (snapshots)
516/// aws s3 ls
517///
518/// # List objects in a snapshot
519/// aws s3 ls s3://my-snapshot/
520///
521/// # Download the disk stream
522/// aws s3 cp s3://my-snapshot/disk.img disk_copy.img
523///
524/// # Download a range (100 MB starting at offset 1 GB)
525/// aws s3api get-object --bucket my-snapshot --key disk.img \
526/// --range bytes=1073741824-1178599423 chunk.bin
527/// ```
528///
529/// # Configuration (Planned)
530///
531/// Future configuration options (not yet implemented):
532///
533/// - **Bind address**: CLI flag `--s3-bind 0.0.0.0:9000` (default: `127.0.0.1`)
534/// - **Authentication**: `--s3-access-key` and `--s3-secret-key` for SigV4
535/// - **Bucket name**: `--s3-bucket-name <name>` (default: derived from snapshot filename)
536/// - **Anonymous access**: `--s3-allow-anonymous` flag (dangerous, for testing only)
537///
538/// # Why S3 Compatibility?
539///
540/// S3 is a de facto standard for object storage. Supporting the S3 API enables:
541///
542/// 1. **Cloud integration**: Use Hexz with existing cloud infrastructure (AWS, MinIO, etc.)
543/// 2. **Tool compatibility**: Any S3-compatible tool (s3cmd, rclone, boto3) works with Hexz
544/// 3. **Caching CDNs**: Front the gateway with CloudFront or similar for caching
545/// 4. **Lifecycle policies**: Future support for automated snapshot expiration
546///
547/// # Security Considerations (Planned)
548///
549/// When implemented, the S3 gateway will require authentication by default:
550///
551/// - **SigV4 authentication**: All requests must include valid AWS Signature V4 headers
552/// - **Read-only mode**: No PUT/DELETE operations to prevent accidental modification
553/// - **Rate limiting**: Per-access-key request throttling to prevent abuse
554/// - **TLS requirement**: Production deployments must use HTTPS (enforced by CLI flag check)
555///
556/// # Performance Goals (Planned)
557///
558/// - **Throughput**: Match HTTP server performance (~500-2000 MB/s)
559/// - **Latency**: <10 ms for authenticated requests (signature verification adds ~1-2 ms)
560/// - **Concurrency**: Handle 1000+ concurrent S3 GET requests
561///
562/// # Limitations (Planned)
563///
564/// The S3 gateway will NOT support:
565///
566/// - **Write operations**: No PUT, POST, DELETE (snapshots are read-only)
567/// - **Multipart uploads**: N/A for read-only gateway
568/// - **Bucket policies**: Simplified IAM-like policies only
569/// - **Versioning**: Snapshots are immutable, no object versioning needed
570/// - **Server-side encryption**: Use TLS for transport encryption instead
571///
572/// # Arguments
573///
574/// - `_snap`: The Hexz snapshot to expose (currently unused).
575/// - `port`: TCP port to bind to on the loopback interface (e.g., `9000`).
576///
577/// # Returns
578///
579/// This function never returns (blocks indefinitely on `std::future::pending()`).
580/// It does not perform any useful work in the current implementation.
581///
582/// # Errors
583///
584/// Currently, this function cannot return an error (it blocks forever). In the
585/// future implementation, it will return errors for:
586///
587/// - Socket binding failures
588/// - Configuration validation errors
589/// - Unrecoverable I/O errors on the listener
590///
591/// # Examples
592///
593/// ```no_run
594/// use std::sync::Arc;
595/// use hexz_core::File;
596/// use hexz_core::store::local::FileBackend;
597/// use hexz_core::algo::compression::lz4::Lz4Compressor;
598/// use hexz_server::serve_s3_gateway;
599///
600/// # #[tokio::main]
601/// # async fn main() -> anyhow::Result<()> {
602/// let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
603/// let compressor = Box::new(Lz4Compressor::new());
604/// let snap = File::new(backend, compressor, None)?;
605///
606/// // WARNING: This will block forever without serving requests
607/// serve_s3_gateway(snap, 9000).await?;
608/// # Ok(())
609/// # }
610/// ```
611///
612/// # Implementation Roadmap
613///
614/// 1. **Phase 1**: Basic GET/HEAD operations with no authentication (localhost-only)
615/// 2. **Phase 2**: AWS SigV4 authentication and bucket listing
616/// 3. **Phase 3**: Multi-snapshot support (multiple buckets)
617/// 4. **Phase 4**: TLS support and network binding options
618/// 5. **Phase 5**: IAM-style policies and access control
619///
620/// # Call for Contributions
621///
622/// Implementing S3 compatibility is a substantial undertaking. If you are interested
623/// in contributing, see `docs/s3_gateway_design.md` (to be created) for the design
624/// specification and implementation plan.
625#[deprecated(note = "Not implemented. Blocks indefinitely without serving requests.")]
626pub async fn serve_s3_gateway(_snap: Arc<File>, port: u16) -> anyhow::Result<()> {
627 tracing::info!("Starting S3 Gateway on port {}", port);
628 println!(
629 "S3 Gateway started on port {} (Not fully implemented)",
630 port
631 );
632 std::future::pending::<()>().await; // Keep alive
633 unreachable!();
634}
635
636/// Exposes a `File` over HTTP with range request support.
637///
638/// Starts an HTTP 1.1 server on `127.0.0.1:<port>` that exposes snapshot data via
639/// two endpoints:
640///
641/// - `GET /disk`: Serves the disk stream (persistent storage snapshot)
642/// - `GET /memory`: Serves the memory stream (RAM snapshot)
643///
644/// Both endpoints support HTTP range requests (RFC 7233) for partial content retrieval.
645///
646/// # Protocol Behavior
647///
648/// ## Full Content Request (No Range Header)
649///
650/// ```http
651/// GET /disk HTTP/1.1
652/// Host: localhost:8080
653/// ```
654///
655/// Response:
656///
657/// ```http
658/// HTTP/1.1 206 Partial Content
659/// Content-Type: application/octet-stream
660/// Content-Range: bytes 0-33554431/10737418240
661/// Accept-Ranges: bytes
662///
663/// [First 32 MiB of data, clamped by MAX_CHUNK_SIZE]
664/// ```
665///
666/// Note: Even without a `Range` header, the response is clamped to `MAX_CHUNK_SIZE`
667/// and returns HTTP 206 (not 200) to indicate partial content.
668///
669/// ## Range Request (Partial Content)
670///
671/// ```http
672/// GET /memory HTTP/1.1
673/// Host: localhost:8080
674/// Range: bytes=1048576-2097151
675/// ```
676///
677/// Response (success):
678///
679/// ```http
680/// HTTP/1.1 206 Partial Content
681/// Content-Type: application/octet-stream
682/// Content-Range: bytes 1048576-2097151/8589934592
683/// Accept-Ranges: bytes
684///
685/// [1 MiB of data from offset 1048576]
686/// ```
687///
688/// Response (invalid range):
689///
690/// ```http
691/// HTTP/1.1 416 Range Not Satisfiable
692/// Content-Range: bytes */8589934592
693/// ```
694///
695/// ## Error Responses
696///
697/// - **416 Range Not Satisfiable**: Invalid range syntax or out-of-bounds request
698/// - **500 Internal Server Error**: Backend I/O failure or decompression error
699///
700/// # HTTP Range Request Limitations
701///
702/// ## Supported Range Types
703///
704/// - **Bounded ranges**: `bytes=<start>-<end>` (both offsets specified)
705/// - **Unbounded ranges**: `bytes=<start>-` (from start to EOF, clamped to `MAX_CHUNK_SIZE`)
706///
707/// ## Unsupported Range Types
708///
709/// These return HTTP 416 (Range Not Satisfiable):
710///
711/// - **Suffix ranges**: `bytes=-<suffix-length>` (e.g., `bytes=-1024` for last 1KB)
712/// - **Multi-part ranges**: `bytes=0-100,200-300` (multiple ranges in one request)
713///
714/// Rationale: These are rarely used and add significant implementation complexity.
715/// Standard bounded and unbounded ranges cover the overwhelming majority of real-world use cases.
716///
717/// # DoS Protection Mechanisms
718///
719/// ## Request Size Clamping
720///
721/// All reads are clamped to `MAX_CHUNK_SIZE` (32 MiB) to prevent memory exhaustion:
722///
723/// ```text
724/// Client requests: bytes=0-1073741823 (1 GB)
725/// Server clamps to: bytes=0-33554431 (32 MiB)
726/// Response header: Content-Range: bytes 0-33554431/total
727/// ```
728///
729/// The client detects clamping by comparing the `Content-Range` header to the
730/// requested range and can issue follow-up requests for remaining data.
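///
/// A client-side sketch (outside this crate) of that follow-up loop, assuming the
/// `reqwest` crate is available:
///
/// ```ignore
/// use reqwest::header::{CONTENT_RANGE, RANGE};
///
/// async fn fetch_stream(url: &str) -> anyhow::Result<Vec<u8>> {
///     let client = reqwest::Client::new();
///     let (mut out, mut offset) = (Vec::new(), 0u64);
///     loop {
///         // Unbounded range: the server clamps the reply to MAX_CHUNK_SIZE.
///         let resp = client
///             .get(url)
///             .header(RANGE, format!("bytes={}-", offset))
///             .send()
///             .await?;
///         if resp.status().as_u16() == 416 {
///             break; // offset is at or past EOF: nothing left to fetch
///         }
///         // Content-Range looks like "bytes 0-33554431/10737418240"; keep the total.
///         let total: u64 = resp
///             .headers()
///             .get(CONTENT_RANGE)
///             .and_then(|v| v.to_str().ok())
///             .and_then(|v| v.rsplit('/').next())
///             .and_then(|v| v.parse().ok())
///             .ok_or_else(|| anyhow::anyhow!("missing Content-Range header"))?;
///         let chunk = resp.bytes().await?;
///         offset += chunk.len() as u64;
///         out.extend_from_slice(&chunk);
///         if offset >= total {
///             break;
///         }
///     }
///     Ok(out)
/// }
/// ```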
731///
732/// ## Connection Limits
733///
734/// The server relies on OS-level TCP connection limits (controlled by `ulimit -n`
735/// and kernel parameters). Tokio's async runtime handles thousands of concurrent
736/// connections efficiently (each connection consumes ~100 KB of memory).
737///
738/// For production deployments, consider:
739///
740/// - **Reverse proxy**: nginx or Caddy with connection limits and rate limiting
741/// - **Firewall rules**: Limit connections per IP address
742/// - **Resource limits**: Set `ulimit -n` to a reasonable value (e.g., 4096)
743///
744/// # Arguments
745///
746/// - `snap`: The Hexz snapshot file to expose. Must be wrapped in `Arc` for sharing
747/// across request handlers.
748/// - `port`: TCP port to bind to on the loopback interface (e.g., `8080`, `3000`).
749///
750/// # Returns
751///
752/// This function runs indefinitely, serving HTTP requests until the server is shut
753/// down (e.g., via Ctrl+C signal). It only returns `Err` if:
754///
755/// - The TCP listener fails to bind (port already in use, permission denied)
756/// - The HTTP server encounters a fatal error (should be extremely rare)
757///
758/// Individual request errors (invalid ranges, read failures) are handled gracefully
759/// and return appropriate HTTP error responses without stopping the server.
760///
761/// # Errors
762///
763/// - `std::io::Error`: If binding to the socket fails.
764/// - `anyhow::Error`: If the HTTP server encounters an unrecoverable error.
765///
766/// # Examples
767///
768/// ## Server Setup
769///
770/// ```no_run
771/// use std::sync::Arc;
772/// use hexz_core::File;
773/// use hexz_core::store::local::FileBackend;
774/// use hexz_core::algo::compression::lz4::Lz4Compressor;
775/// use hexz_server::serve_http;
776///
777/// # #[tokio::main]
778/// # async fn main() -> anyhow::Result<()> {
779/// let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
780/// let compressor = Box::new(Lz4Compressor::new());
781/// let snap = File::new(backend, compressor, None)?;
782///
783/// // Start HTTP server on port 8080 (runs forever)
784/// serve_http(snap, 8080).await?;
785/// # Ok(())
786/// # }
787/// ```
788///
789/// ## Client Usage (curl)
790///
791/// ```bash
792/// # Fetch first 4KB of disk stream
793/// curl -H "Range: bytes=0-4095" http://localhost:8080/disk -o chunk.bin
794///
795/// # Fetch 1MB starting at 1MB offset
796/// curl -H "Range: bytes=1048576-2097151" http://localhost:8080/memory -o mem_chunk.bin
797///
798/// # Fetch from offset to EOF (clamped to 32 MiB)
799/// curl -H "Range: bytes=1048576-" http://localhost:8080/disk -o large_chunk.bin
800///
801/// # Full GET (no range header, returns first 32 MiB)
802/// curl http://localhost:8080/disk -o first_32mb.bin
803/// ```
804///
805/// ## Client Usage (Python)
806///
807/// ```python
808/// import requests
809///
810/// # Fetch a range
811/// headers = {'Range': 'bytes=0-4095'}
812/// response = requests.get('http://localhost:8080/disk', headers=headers)
813/// assert response.status_code == 206 # Partial Content
814/// data = response.content
815/// print(f"Fetched {len(data)} bytes")
816///
817/// # Parse Content-Range header
818/// content_range = response.headers['Content-Range']
819/// # Example: "bytes 0-4095/10737418240"
820/// print(f"Content-Range: {content_range}")
821/// ```
822///
823/// # Performance Characteristics
824///
825/// ## Throughput
826///
827/// - **Local (127.0.0.1)**: 500-2000 MB/s (limited by decompression, not HTTP overhead)
828/// - **1 Gbps network**: ~120 MB/s (network-bound)
829/// - **10 Gbps network**: ~800 MB/s (typically network-bound for LZ4, may be decompression-bound for ZSTD)
830///
831/// ## Latency
832///
833/// - **Cache hit**: ~80μs (block already decompressed)
834/// - **Cache miss**: ~1-5 ms (includes decompression and backend I/O)
835/// - **Network RTT**: Add the connection's round-trip time (~0.1 ms for localhost, ~10-50 ms for remote links)
836///
837/// ## Memory Usage
838///
839/// - **Per connection**: ~100 KB (Tokio task stack + buffers)
840/// - **Per request**: ~32 MB worst-case (if requesting `MAX_CHUNK_SIZE`)
841/// - **Block cache**: Shared across all connections (typically 100-500 MB)
842///
843/// With 1000 concurrent connections, memory overhead is ~100 MB for connections
844/// plus the shared block cache.
845///
846/// # Security Considerations
847///
848/// ## Current Security Posture
849///
850/// - **Localhost-only**: Binds to `127.0.0.1`, not accessible from network
851/// - **No authentication**: Anyone with local access can read snapshot data
852/// - **No TLS**: Plaintext HTTP (acceptable for loopback)
853/// - **DoS protection**: Request size clamping, but no rate limiting
854///
855/// ## Threat Model
856///
857/// For localhost-only deployments, the threat model assumes:
858///
859/// 1. **Trusted local environment**: All local users are trusted (or isolated via OS permissions)
860/// 2. **No remote attackers**: Firewall prevents external access
861/// 3. **Process isolation**: Snapshot data is not more sensitive than other local files
862///
863/// ## Future Security Enhancements (Planned)
864///
865/// - **TLS/HTTPS**: Certificate-based encryption for network access
866/// - **Bearer token auth**: Simple token in `Authorization` header
867/// - **Rate limiting**: Per-IP request throttling
868/// - **Audit logging**: Request logs with client IP and byte ranges
869///
870/// # Panics
871///
872/// This function does not panic under normal operation. Request handling errors
873/// are converted to HTTP error responses.
874pub async fn serve_http(snap: Arc<File>, port: u16) -> anyhow::Result<()> {
875 let state = Arc::new(AppState { snap });
876
877 let app = Router::new()
878 .route("/disk", get(get_disk))
879 .route("/memory", get(get_memory))
880 .with_state(state);
881
882 let addr = SocketAddr::from((BIND_ADDR, port));
883 let listener = TcpListener::bind(addr).await?;
884 tracing::info!("HTTP server listening on {}", addr);
885 axum::serve(listener, app).await?;
886 Ok(())
887}
888
889/// HTTP handler for the `/disk` endpoint.
890///
891/// Serves the disk stream (persistent storage snapshot) from the Hexz file.
892/// Delegates to `handle_request` with `SnapshotStream::Disk`.
893///
894/// # Route
895///
896/// `GET /disk`
897///
898/// # Request Headers
899///
900/// - `Range` (optional): HTTP range request (e.g., `bytes=0-4095`)
901///
902/// # Response Headers
903///
904/// - `Content-Type`: Always `application/octet-stream` (raw binary data)
905/// - `Content-Range`: Byte range served (e.g., `bytes 0-4095/10737418240`)
906/// - `Accept-Ranges`: Always `bytes` (indicates range request support)
907///
908/// # Response Status Codes
909///
910/// - **206 Partial Content**: Successful range request
911/// - **416 Range Not Satisfiable**: Invalid or out-of-bounds range
912/// - **500 Internal Server Error**: Snapshot read failure
913///
914/// # Examples
915///
916/// See `serve_http` for client usage examples.
917async fn get_disk(headers: HeaderMap, State(state): State<Arc<AppState>>) -> impl IntoResponse {
918 handle_request(headers, &state.snap, SnapshotStream::Disk)
919}
920
921/// HTTP handler for the `/memory` endpoint.
922///
923/// Serves the memory stream (RAM snapshot) from the Hexz file.
924/// Delegates to `handle_request` with `SnapshotStream::Memory`.
925///
926/// # Route
927///
928/// `GET /memory`
929///
930/// # Request Headers
931///
932/// - `Range` (optional): HTTP range request (e.g., `bytes=0-4095`)
933///
934/// # Response Headers
935///
936/// - `Content-Type`: Always `application/octet-stream` (raw binary data)
937/// - `Content-Range`: Byte range served (e.g., `bytes 0-4095/8589934592`)
938/// - `Accept-Ranges`: Always `bytes` (indicates range request support)
939///
940/// # Response Status Codes
941///
942/// - **206 Partial Content**: Successful range request
943/// - **416 Range Not Satisfiable**: Invalid or out-of-bounds range
944/// - **500 Internal Server Error**: Snapshot read failure
945///
946/// # Examples
947///
948/// See `serve_http` for client usage examples.
949async fn get_memory(headers: HeaderMap, State(state): State<Arc<AppState>>) -> impl IntoResponse {
950 handle_request(headers, &state.snap, SnapshotStream::Memory)
951}
952
953/// Core HTTP request handler that translates `Range` headers into snapshot reads.
954///
955/// This function implements the HTTP range request logic for both `/disk` and `/memory`
956/// endpoints. It performs the following steps:
957///
958/// 1. Parse the `Range` header (if present) or default to full stream access
959/// 2. Clamp the requested range to `MAX_CHUNK_SIZE` to prevent DoS
960/// 3. Read the data from the snapshot via `File::read_at`
961/// 4. Return HTTP 206 with `Content-Range` header, or error status codes
962///
963/// # Arguments
964///
965/// - `headers`: HTTP request headers from the client (parsed by Axum)
966/// - `snap`: The Hexz snapshot file to read from
967/// - `stream`: Which logical stream to read (`Disk` or `Memory`)
968///
969/// # Returns
970///
971/// An Axum `Response` with one of the following status codes:
972///
973/// - **206 Partial Content**: Successful read (even for full stream requests)
974/// - **416 Range Not Satisfiable**: Invalid range syntax or out-of-bounds offset
975/// - **500 Internal Server Error**: Snapshot read failure (decompression error, I/O error)
976///
977/// # HTTP Range Request Parsing
978///
979/// The `Range` header is expected in the format `bytes=<start>-<end>` where:
980///
981/// - `<start>` is the starting byte offset (inclusive, zero-indexed)
982/// - `<end>` is the ending byte offset (inclusive), or omitted for "to EOF"
983///
984/// ## Examples of Supported Ranges
985///
986/// ```text
987/// Range: bytes=0-1023 → Read bytes 0-1023 (1024 bytes)
988/// Range: bytes=1024-2047 → Read bytes 1024-2047 (1024 bytes)
989/// Range: bytes=1048576- → Read from 1MB to EOF (clamped to MAX_CHUNK_SIZE)
990/// (no Range header) → Read from start to EOF (clamped to MAX_CHUNK_SIZE)
991/// ```
992///
993/// ## Examples of Unsupported/Invalid Ranges
994///
995/// These return HTTP 416:
996///
997/// ```text
998/// Range: bytes=-1024 → Suffix range (last 1024 bytes) - not supported
999/// Range: bytes=0-100,200-300 → Multi-part range - not supported
1000/// Range: bytes=1000-500 → Start > end - invalid
1001/// Range: bytes=999999999999- → Start beyond EOF - out of bounds
1002/// ```
1003///
1004/// # DoS Protection: Range Clamping Algorithm
1005///
1006/// To prevent a malicious client from requesting gigabytes of data in a single
1007/// request, the handler clamps the effective range:
1008///
1009/// ```text
1010/// requested_length = end - start + 1
1011/// if requested_length > MAX_CHUNK_SIZE:
1012/// end = start + MAX_CHUNK_SIZE - 1
1013/// if end >= total_size:
1014/// end = total_size - 1
1015/// ```
1016///
1017/// The clamped range is reflected in the `Content-Range` response header:
1018///
1019/// ```text
1020/// Content-Range: bytes <actual_start>-<actual_end>/<total_size>
1021/// ```
1022///
1023/// Clients must check this header to detect clamping and issue follow-up requests
1024/// for remaining data.
1025///
1026/// ## Clamping Example
1027///
1028/// ```text
1029/// Client request: Range: bytes=0-67108863 (64 MiB)
1030/// Total size: 10 GB
1031/// Server clamps to: 0-33554431 (32 MiB due to MAX_CHUNK_SIZE)
1032/// Response header: Content-Range: bytes 0-33554431/10737418240
1033/// ```
1034///
1035/// # Error Handling
1036///
1037/// ## Range Parsing Errors
1038///
1039/// If `parse_range` returns `Err(())`, the handler returns HTTP 416 (Range Not
1040/// Satisfiable). This occurs when:
1041///
1042/// - The `Range` header does not start with `"bytes="`
1043/// - The start/end offsets are not valid integers
1044/// - The start offset is greater than the end offset
1045/// - The end offset is beyond the stream size
1046///
1047/// ## Snapshot Read Errors
1048///
1049/// If `snap.read_at` returns `Err(_)`, the handler returns HTTP 500 (Internal
1050/// Server Error). This occurs when:
1051///
1052/// - Decompression fails (corrupted compressed data)
1053/// - Backend I/O fails (disk error, network timeout for remote backends)
1054/// - Encryption decryption fails (incorrect key, corrupted ciphertext)
1055///
1056/// The specific error is not exposed to the client (only logged internally) to
1057/// avoid information leakage.
1058///
1059/// # Edge Cases
1060///
1061/// ## Empty Range
1062///
1063/// If the calculated range length is 0 (e.g., due to clamping at EOF), the handler
1064/// returns HTTP 416. This should be rare in practice since clients typically request
1065/// valid ranges.
1066///
1067/// ## Zero-Sized Stream
1068///
1069/// If the snapshot stream size is 0 (empty disk or memory snapshot), any range
1070/// request returns HTTP 416 because no valid offsets exist.
1071///
1072/// ## Single-Byte Range
1073///
1074/// A request like `bytes=0-0` (fetch only byte 0) is valid and returns 1 byte with
1075/// HTTP 206 and `Content-Range: bytes 0-0/<total>`.
1076///
1077/// # Performance Characteristics
1078///
1079/// - **No Range Header**: Clamps to `MAX_CHUNK_SIZE`, then performs one `read_at` call
1080/// - **Valid Range**: One `read_at` call (may hit block cache or require decompression)
1081/// - **Invalid Range**: Immediate return (no snapshot I/O)
1082///
1083/// For cache hits, latency is ~80μs. For cache misses, latency is ~1-5 ms depending
1084/// on backend speed and compression algorithm.
1085///
1086/// # Security Notes
1087///
1088/// - **No authentication**: This function does not check credentials (handled by
1089/// future middleware or reverse proxy)
1090/// - **DoS mitigation**: Request size clamping prevents memory exhaustion
1091/// - **Information leakage**: Error responses do not reveal internal details
1092/// (e.g., "decompression failed" is hidden behind HTTP 500)
1093///
1094/// # Examples
1095///
1096/// See `serve_http`, `get_disk`, and `get_memory` for usage context.
1097fn handle_request(headers: HeaderMap, snap: &Arc<File>, stream: SnapshotStream) -> Response {
1098 let total_size = snap.size(stream);
1099
1100 let (start, mut end) = if let Some(range) = headers.get(header::RANGE) {
1101 match parse_range(range.to_str().unwrap_or(""), total_size) {
1102 Ok(r) => r,
1103 Err(_) => return StatusCode::RANGE_NOT_SATISFIABLE.into_response(),
1104 }
1105 } else {
1106 (0, total_size.saturating_sub(1))
1107 };
1108
1109 // SECURITY: DoS Protection
1110 // Clamp the requested range to avoid huge memory allocations.
1111 if end - start + 1 > MAX_CHUNK_SIZE {
1112 end = start + MAX_CHUNK_SIZE - 1;
1113 // Ensure we don't go past EOF after clamping
1114 if end >= total_size {
1115 end = total_size.saturating_sub(1);
1116 }
1117 }
1118
1119 let len = (end - start + 1) as usize;
1120 if len == 0 {
1121 // Handle empty range edge case
1122 return StatusCode::RANGE_NOT_SATISFIABLE.into_response();
1123 }
1124
1125 match snap.read_at(stream, start, len) {
1126 Ok(data) => (
1127 StatusCode::PARTIAL_CONTENT,
1128 [
1129 (header::CONTENT_TYPE, "application/octet-stream"),
1130 (
1131 header::CONTENT_RANGE,
1132 &format!("bytes {}-{}/{}", start, end, total_size),
1133 ),
1134 (header::ACCEPT_RANGES, "bytes"),
1135 ],
1136 data,
1137 )
1138 .into_response(),
1139 Err(_) => StatusCode::INTERNAL_SERVER_ERROR.into_response(),
1140 }
1141}
1142
1143/// Parses an HTTP `Range` header into absolute byte offsets.
1144///
1145/// Implements a subset of HTTP range request syntax (RFC 7233), supporting only
1146/// simple byte ranges without multi-part or suffix ranges.
1147///
1148/// # Supported Syntax
1149///
1150/// - **Bounded range**: `bytes=<start>-<end>` (both offsets specified)
1151/// - Example: `bytes=0-1023` → Returns `(0, 1023)`
1152/// - **Unbounded range**: `bytes=<start>-` (from start to EOF)
1153/// - Example: `bytes=1024-` → Returns `(1024, size-1)`
1154///
1155/// # Unsupported Syntax
1156///
1157/// - **Suffix range**: `bytes=-<length>` (last N bytes)
1158/// - Example: `bytes=-1024` → Returns `Err(())`
1159/// - **Multi-part range**: `bytes=0-100,200-300`
1160/// - Example: `bytes=0-100,200-300` → Returns `Err(())`
1161///
1162/// These are rejected because:
1163/// 1. They are rarely used in practice
1164/// 2. They add significant parsing and response generation complexity
1165/// 3. The HTTP 416 error response is acceptable for clients that need them
1166///
1167/// # Arguments
1168///
1169/// - `range`: The value of the `Range` header (e.g., `"bytes=0-1023"`)
1170/// - `size`: The total size of the stream in bytes (used to validate offsets)
1171///
1172/// # Returns
1173///
1174/// - `Ok((start, end))`: Valid range with absolute byte offsets (both inclusive)
1175/// - `Err(())`: Invalid syntax or out-of-bounds range
1176///
1177/// # Error Conditions
1178///
1179/// Returns `Err(())` if:
1180///
1181/// 1. **Missing prefix**: Header does not start with `"bytes="`
1182/// - Example: `"items=0-100"` → Error
1183/// 2. **Invalid integer**: Start or end cannot be parsed as `u64`
1184/// - Example: `"bytes=abc-def"` → Error
1185/// 3. **Inverted range**: Start offset is greater than end offset
1186/// - Example: `"bytes=1000-500"` → Error
1187/// 4. **Out of bounds**: End offset is beyond the stream size
1188/// - Example: `"bytes=0-999999"` when size is 1000 → Error
1189///
1190/// # Parsing Algorithm
1191///
1192/// ```text
1193/// 1. Check for "bytes=" prefix (RANGE_PREFIX_LEN = 6)
1194/// 2. Split remaining string on '-' delimiter
1195/// 3. Parse start offset (parts[0])
1196/// 4. Parse end offset (parts[1] if present and non-empty, else size-1)
1197/// 5. Validate: start <= end && end < size
1198/// 6. Return (start, end)
1199/// ```
1200///
1201/// # Edge Cases
1202///
1203/// ## Empty String After Prefix
1204///
1205/// ```text
1206/// Range: bytes=
1207/// ```
1208///
1209/// Returns `Err(())` because there is no start offset.
1210///
1211/// ## Single Byte Range
1212///
1213/// ```text
1214/// Range: bytes=0-0
1215/// ```
1216///
1217/// Returns `Ok((0, 0))` (valid, requests exactly 1 byte).
1218///
1219/// ## Range at EOF
1220///
1221/// ```text
1222/// Range: bytes=0-999 (size = 1000)
1223/// ```
1224///
1225/// Returns `Ok((0, 999))` (valid, end is inclusive and equals `size - 1`).
1226///
1227/// ## Range Beyond EOF
1228///
1229/// ```text
1230/// Range: bytes=0-1000 (size = 1000)
1231/// ```
1232///
1233/// Returns `Err(())` because offset 1000 does not exist (valid range is 0-999).
1234///
1235/// # Examples
1236///
1237/// ```text
1238/// parse_range("bytes=0-1023", 10000) -> Ok((0, 1023))
1239/// parse_range("bytes=1024-", 10000) -> Ok((1024, 9999))
1240/// parse_range("0-1023", 10000) -> Err(()) // missing "bytes=" prefix
1241/// parse_range("bytes=0-10000", 10000) -> Err(()) // out of bounds
1242/// parse_range("bytes=1000-500", 10000)-> Err(()) // inverted range
1243/// ```
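///
/// Because `parse_range` is public, the same cases can be exercised as a doctest:
///
/// ```
/// use hexz_server::parse_range;
///
/// assert_eq!(parse_range("bytes=0-1023", 10_000), Ok((0, 1023)));
/// assert_eq!(parse_range("bytes=1024-", 10_000), Ok((1024, 9_999)));
/// assert!(parse_range("0-1023", 10_000).is_err());        // missing "bytes=" prefix
/// assert!(parse_range("bytes=0-10000", 10_000).is_err()); // out of bounds
/// assert!(parse_range("bytes=1000-500", 10_000).is_err()); // inverted range
/// ```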
1244///
1245/// # Performance
1246///
1247/// - **Time complexity**: O(n) where n is the length of the range string (typically <20 chars)
1248/// - **Allocation**: One small heap allocation when the split parts are collected into a `Vec`
1249/// - **Typical latency**: <1 μs (negligible compared to snapshot read latency)
1250///
1251/// # Security
1252///
1253/// This function is resilient to malicious input:
1254///
1255/// - **Integer overflow**: `str::parse::<u64>()` rejects values larger than `u64::MAX`
1256/// - **Unbounded length**: The `Range` header is bounded by HTTP header size limits
1257/// (typically 8 KB, enforced by the HTTP server)
1258/// - **No allocation attacks**: Uses only one small allocation for splitting
1259#[allow(clippy::result_unit_err)]
1260pub fn parse_range(range: &str, size: u64) -> Result<(u64, u64), ()> {
1261 if !range.starts_with("bytes=") {
1262 return Err(());
1263 }
1264 let parts: Vec<&str> = range[RANGE_PREFIX_LEN..].split('-').collect();
1265 let start = parts[0].parse::<u64>().map_err(|_| ())?;
1266 let end = if parts.len() > 1 && !parts[1].is_empty() {
1267 parts[1].parse::<u64>().map_err(|_| ())?
1268 } else {
1269 size.saturating_sub(1)
1270 };
1271 if start > end || end >= size {
1272 return Err(());
1273 }
1274 Ok((start, end))
1275}