pub async fn serve_http(snap: Arc<File>, port: u16) -> Result<()>
Exposes a File over HTTP with range request support.
Starts an HTTP 1.1 server on 127.0.0.1:<port> that exposes snapshot data via
two endpoints:
- GET /disk: Serves the disk stream (persistent storage snapshot)
- GET /memory: Serves the memory stream (RAM snapshot)
Both endpoints support HTTP range requests (RFC 7233) for partial content retrieval.
§Protocol Behavior
§Full Content Request (No Range Header)
GET /disk HTTP/1.1
Host: localhost:8080
Response:
HTTP/1.1 206 Partial Content
Content-Type: application/octet-stream
Content-Range: bytes 0-33554431/10737418240
Accept-Ranges: bytes
[First 32 MiB of data, clamped by MAX_CHUNK_SIZE]
Note: Even without a Range header, the response is clamped to MAX_CHUNK_SIZE
and returns HTTP 206 (not 200) to indicate partial content.
§Range Request (Partial Content)
GET /memory HTTP/1.1
Host: localhost:8080
Range: bytes=1048576-2097151
Response (success):
HTTP/1.1 206 Partial Content
Content-Type: application/octet-stream
Content-Range: bytes 1048576-2097151/8589934592
Accept-Ranges: bytes
[1 MiB of data from offset 1048576]
Response (invalid range):
HTTP/1.1 416 Range Not Satisfiable
Content-Range: bytes */8589934592
§Error Responses
- 416 Range Not Satisfiable: Invalid range syntax or out-of-bounds request
- 500 Internal Server Error: Backend I/O failure or decompression error
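The mapping from failure category to status code can be sketched as follows. This is a minimal illustration, not the server's actual code: `RequestError`, `status_code`, and `unsatisfied_content_range` are hypothetical names introduced here, and only the status codes and the `bytes */<total>` header form come from the text above.

```rust
/// Hypothetical error categories (illustrative names, not hexz types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestError {
    /// The Range header could not be parsed or uses an unsupported form.
    InvalidRange,
    /// The range starts at or beyond the end of the stream.
    OutOfBounds,
    /// The backend read or the decompression step failed.
    Backend,
}

/// Map an error category to the HTTP status code listed above.
pub fn status_code(err: RequestError) -> u16 {
    match err {
        RequestError::InvalidRange | RequestError::OutOfBounds => 416,
        RequestError::Backend => 500,
    }
}

/// Build the Content-Range value sent with a 416 response: `bytes */<total>`.
pub fn unsatisfied_content_range(total: u64) -> String {
    format!("bytes */{}", total)
}
```

Reporting the total size in the 416 response lets a client correct its offsets without a separate size query.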
§HTTP Range Request Limitations
§Supported Range Types
- Bounded ranges: bytes=<start>-<end> (both offsets specified)
- Unbounded ranges: bytes=<start>- (from start to EOF, clamped to MAX_CHUNK_SIZE)
§Unsupported Range Types
These return HTTP 416 (Range Not Satisfiable):
- Suffix ranges: bytes=-<suffix-length> (e.g., bytes=-1024 for the last 1 KiB)
- Multi-part ranges: bytes=0-100,200-300 (multiple ranges in one request)
Rationale: These forms are rarely used and add significant implementation complexity. Bounded and unbounded ranges cover the vast majority of real-world use cases.
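A parser that accepts only the two supported forms might look like the sketch below. The names `RangeSpec`, `parse_offset`, and `parse_range` are hypothetical, and rejecting a range whose end precedes its start is an assumption; the actual implementation may differ. A `None` result corresponds to the HTTP 416 case.

```rust
/// A parsed Range header in one of the two supported forms (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum RangeSpec {
    /// bytes=<start>-<end>, with an inclusive end offset.
    Bounded { start: u64, end: u64 },
    /// bytes=<start>-, from start to EOF (subject to clamping).
    From { start: u64 },
}

/// Parse a non-empty run of ASCII digits; rejects signs and whitespace.
fn parse_offset(s: &str) -> Option<u64> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    s.parse().ok()
}

/// Return None for anything the server would answer with 416.
pub fn parse_range(header: &str) -> Option<RangeSpec> {
    let spec = header.trim().strip_prefix("bytes=")?;
    // Multi-part ranges such as bytes=0-100,200-300 are unsupported.
    if spec.contains(',') {
        return None;
    }
    let (start, end) = spec.split_once('-')?;
    // An empty start means a suffix range (bytes=-1024), which is unsupported.
    let start = parse_offset(start)?;
    if end.is_empty() {
        return Some(RangeSpec::From { start });
    }
    let end = parse_offset(end)?;
    // Assumption: a range that ends before it starts is treated as invalid.
    if end < start {
        return None;
    }
    Some(RangeSpec::Bounded { start, end })
}
```

For example, `parse_range("bytes=1048576-")` yields `RangeSpec::From { start: 1048576 }`, while `parse_range("bytes=-1024")` yields `None`.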
§DoS Protection Mechanisms
§Request Size Clamping
All reads are clamped to MAX_CHUNK_SIZE (32 MiB) to prevent memory exhaustion:
Client requests: bytes=0-1073741823 (1 GiB)
Server clamps to: bytes=0-33554431 (32 MiB)
Response header: Content-Range: bytes 0-33554431/total
The client detects clamping by comparing the Content-Range header to the
requested range and can issue follow-up requests for remaining data.
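The clamping arithmetic and the client-side check can be sketched as below. This is an illustration under stated assumptions rather than the server's code: the function names are hypothetical, and only the 32 MiB value of `MAX_CHUNK_SIZE` and the `Content-Range` format are taken from the text.

```rust
/// 32 MiB, the value the text gives for MAX_CHUNK_SIZE.
pub const MAX_CHUNK_SIZE: u64 = 32 * 1024 * 1024;

/// Server side: clamp an inclusive range to the stream and to MAX_CHUNK_SIZE.
/// Returns None when the range cannot be satisfied (the 416 case).
pub fn clamp_range(start: u64, end: Option<u64>, total: u64) -> Option<(u64, u64)> {
    if start >= total {
        return None;
    }
    if let Some(e) = end {
        if e < start {
            return None;
        }
    }
    let last = total - 1;
    let requested_end = end.unwrap_or(last).min(last);
    let max_end = start.saturating_add(MAX_CHUNK_SIZE - 1);
    Some((start, requested_end.min(max_end)))
}

/// Server side: format the Content-Range value for a 206 response.
pub fn content_range(start: u64, end: u64, total: u64) -> String {
    format!("bytes {}-{}/{}", start, end, total)
}

/// Client side: parse "bytes <start>-<end>/<total>" into its three numbers.
pub fn parse_content_range(value: &str) -> Option<(u64, u64, u64)> {
    let rest = value.trim().strip_prefix("bytes ")?;
    let (range, total) = rest.split_once('/')?;
    let (start, end) = range.split_once('-')?;
    Some((start.parse().ok()?, end.parse().ok()?, total.parse().ok()?))
}

/// Client side: offset of the next follow-up request, or None if the
/// requested range was served in full.
pub fn next_offset(requested_end: u64, served_end: u64, total: u64) -> Option<u64> {
    let wanted_end = requested_end.min(total.saturating_sub(1));
    if served_end < wanted_end {
        Some(served_end + 1)
    } else {
        None
    }
}
```

A client that asked for `bytes=0-1073741823` and received `bytes 0-33554431/10737418240` would continue with a request starting at offset 33554432.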
§Connection Limits
The server relies on OS-level TCP connection limits (controlled by ulimit -n
and kernel parameters). Tokio’s async runtime handles thousands of concurrent
connections efficiently (each connection consumes ~100 KB of memory).
For production deployments, consider:
- Reverse proxy: nginx or Caddy with connection limits and rate limiting
- Firewall rules: Limit connections per IP address
- Resource limits: Set ulimit -n to a reasonable value (e.g., 4096)
§Arguments
- snap: The Hexz snapshot file to expose. Must be wrapped in Arc for sharing across request handlers.
- port: TCP port to bind to on the loopback interface (e.g., 8080, 3000).
§Returns
This function runs indefinitely, serving HTTP requests until the server is shut
down (e.g., via Ctrl+C signal). It only returns Err if:
- The TCP listener fails to bind (port already in use, permission denied)
- The HTTP server encounters a fatal error (should be extremely rare)
Individual request errors (invalid ranges, read failures) are handled gracefully and return appropriate HTTP error responses without stopping the server.
§Errors
- std::io::Error: If binding to the socket fails.
- anyhow::Error: If the HTTP server encounters an unrecoverable error.
§Examples
§Server Setup
use std::sync::Arc;
use hexz_core::File;
use hexz_core::store::local::FileBackend;
use hexz_core::algo::compression::lz4::Lz4Compressor;
use hexz_server::serve_http;
let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
let compressor = Box::new(Lz4Compressor::new());
let snap = File::new(backend, compressor, None)?;
// Start HTTP server on port 8080 (runs forever)
serve_http(snap, 8080).await?;
§Client Usage (curl)
# Fetch the first 4 KiB of the disk stream
curl -H "Range: bytes=0-4095" http://localhost:8080/disk -o chunk.bin
# Fetch 1 MiB starting at the 1 MiB offset
curl -H "Range: bytes=1048576-2097151" http://localhost:8080/memory -o mem_chunk.bin
# Fetch from offset to EOF (clamped to 32 MiB)
curl -H "Range: bytes=1048576-" http://localhost:8080/disk -o large_chunk.bin
# Full GET (no range header, returns first 32 MiB)
curl http://localhost:8080/disk -o first_32mb.bin
§Client Usage (Python)
import requests
# Fetch a range
headers = {'Range': 'bytes=0-4095'}
response = requests.get('http://localhost:8080/disk', headers=headers)
assert response.status_code == 206 # Partial Content
data = response.content
print(f"Fetched {len(data)} bytes")
# Parse Content-Range header
content_range = response.headers['Content-Range']
# Example: "bytes 0-4095/10737418240"
print(f"Content-Range: {content_range}")
§Performance Characteristics
§Throughput
- Local (127.0.0.1): 500-2000 MB/s (limited by decompression, not HTTP overhead)
- 1 Gbps network: ~120 MB/s (network-bound)
- 10 Gbps network: ~800 MB/s (may be decompression-bound for ZSTD, network-bound for LZ4)
§Latency
- Cache hit: ~80 μs (block already decompressed)
- Cache miss: ~1-5 ms (includes decompression and backend I/O)
- Network RTT: Add the round-trip time to the server (~0.1 ms for localhost, ~10-50 ms for remote)
§Memory Usage
- Per connection: ~100 KB (Tokio task stack + buffers)
- Per request: ~32 MiB worst-case (if requesting MAX_CHUNK_SIZE)
- Block cache: Shared across all connections (typically 100-500 MB)
With 1000 concurrent connections, memory overhead is ~100 MB for connections plus the shared block cache.
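The estimate above counts only idle connection overhead. As a rough worked example, the sketch below also computes the per-request worst case, assuming every in-flight request buffers a full 32 MiB response at the same time. The constant and function names are hypothetical, and the figures are the approximate ones quoted in this section.

```rust
/// ~100 KB per connection, the approximate figure quoted above.
const PER_CONNECTION_BYTES: u64 = 100 * 1000;
/// 32 MiB, the per-request worst case quoted above (assumed fully buffered).
const REQUEST_BUFFER_MAX_BYTES: u64 = 32 * 1024 * 1024;

/// Baseline memory held by open connections, excluding the shared block cache.
pub fn connection_overhead_bytes(connections: u64) -> u64 {
    connections * PER_CONNECTION_BYTES
}

/// Upper bound on response buffers if `in_flight` requests each read a
/// maximum-size chunk simultaneously.
pub fn worst_case_request_bytes(in_flight: u64) -> u64 {
    in_flight * REQUEST_BUFFER_MAX_BYTES
}
```

With 1000 connections the baseline is 100,000,000 bytes (~100 MB), but 1000 simultaneous maximum-size requests would need about 33.5 GB (31.25 GiB) of buffers under this assumption, which is one reason to cap concurrency in front of the server.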
§Security Considerations
§Current Security Posture
- Localhost-only: Binds to 127.0.0.1, not accessible from the network
- No authentication: Anyone with local access can read snapshot data
- No TLS: Plaintext HTTP (acceptable for loopback)
- DoS protection: Request size clamping, but no rate limiting
§Threat Model
For localhost-only deployments, the threat model assumes:
- Trusted local environment: All local users are trusted (or isolated via OS permissions)
- No remote attackers: Firewall prevents external access
- Process isolation: Snapshot data is not more sensitive than other local files
§Future Security Enhancements (Planned)
- TLS/HTTPS: Certificate-based encryption for network access
- Bearer token auth: Simple token in the Authorization header
- Rate limiting: Per-IP request throttling
- Audit logging: Request logs with client IP and byte ranges
§Panics
This function does not panic under normal operation. Request handling errors are converted to HTTP error responses.