Module s3


S3 backend for the ObjectStore trait.

S3Store wraps aws-sdk-s3. The SDK owns SigV4, retries, connection pooling, and timeout policy; this module owns the URL → SDK config translation, the error-code classifier ([classify]), and the hand-rolled multipart download orchestrator that the SDK does not provide.

§Key composition

S3Store does not auto-prepend the RemoteUrl prefix. Trait keys are matched as raw byte prefixes, per the contract on ObjectStore::list; the URL prefix is a repository concern, composed by callers that build keys like <prefix>/refs/.../<sha>.bundle.

§Conditional writes

put_if_absent uses If-None-Match: "*". S3 returns either 412 (PreconditionFailed) when the key already exists or 409 (ConditionalRequestConflict) when two PUTs race. Both collapse to Ok(false).
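The collapse described above can be sketched as a small mapping function. This is an illustration only: `map_conditional_put` and `SketchError` are hypothetical names, not this module's API, and matching the 409 on its error code rather than on the status alone is an assumption.

```rust
/// Hypothetical stand-in for the module's real error type.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    Other { status: u16, code: String },
}

/// Map the HTTP status and S3 error code of a conditional PUT
/// (`If-None-Match: "*"`) onto the `put_if_absent` result:
/// `Ok(true)` if the object was written, `Ok(false)` if it already existed.
pub fn map_conditional_put(status: u16, code: Option<&str>) -> Result<bool, SketchError> {
    match (status, code) {
        // Any 2xx: the key was absent and the write landed.
        (200..=299, _) => Ok(true),
        // 412 PreconditionFailed: the key already exists.
        (412, _) => Ok(false),
        // 409 ConditionalRequestConflict: two PUTs raced on the same key.
        // ASSUMPTION: other 409 codes are treated as real errors.
        (409, Some("ConditionalRequestConflict")) => Ok(false),
        (status, code) => Err(SketchError::Other {
            status,
            code: code.unwrap_or("").to_string(),
        }),
    }
}
```

Under this sketch a caller sees `Ok(false)` for both "lost the race" and "already there", which is what an idempotent writer wants.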

§Size limits

AWS caps a single PutObject body at [SINGLE_PUT_LIMIT_BYTES] (5 GiB) and a multipart upload at [S3_MAX_PARTS] (10 000) parts; the per-object ceiling is 5 TiB. The helper auto-promotes uploads above [super::multipart::MULTIPART_PUT_THRESHOLD] onto the multipart path, so callers do not have to reason about the 5 GiB single-PUT cutoff. The upload path is not resumable across process death — see the README “Known limitations” section.
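The arithmetic behind these limits can be sketched as follows. The 5 GiB, 10 000-part, and 5 TiB figures come from the text above; `MAX_OBJECT_BYTES`, `ASSUMED_MULTIPART_PUT_THRESHOLD`, `ASSUMED_PART_SIZE`, `UploadPlan`, and `plan_upload` are hypothetical names with illustrative values, and the real threshold lives in super::multipart.

```rust
/// 5 GiB: AWS's cap on a single PutObject body.
pub const SINGLE_PUT_LIMIT_BYTES: u64 = 5 * 1024 * 1024 * 1024;
/// AWS's cap on the number of parts in one multipart upload.
pub const S3_MAX_PARTS: u64 = 10_000;
/// 5 TiB: AWS's per-object ceiling (hypothetical constant name).
pub const MAX_OBJECT_BYTES: u64 = 5 * 1024 * 1024 * 1024 * 1024;
/// ASSUMPTION: illustrative value, not the module's real threshold.
pub const ASSUMED_MULTIPART_PUT_THRESHOLD: u64 = 64 * 1024 * 1024;
/// ASSUMPTION: illustrative preferred part size.
pub const ASSUMED_PART_SIZE: u64 = 16 * 1024 * 1024;

// The promotion threshold must sit at or below the single-PUT cutoff.
const _: () = assert!(ASSUMED_MULTIPART_PUT_THRESHOLD <= SINGLE_PUT_LIMIT_BYTES);

#[derive(Debug, PartialEq)]
pub enum UploadPlan {
    SinglePut,
    Multipart { part_size: u64, parts: u64 },
    TooLarge,
}

fn ceil_div(n: u64, d: u64) -> u64 {
    (n + d - 1) / d
}

/// Pick an upload path for an object of `len` bytes.
pub fn plan_upload(len: u64) -> UploadPlan {
    if len > MAX_OBJECT_BYTES {
        return UploadPlan::TooLarge;
    }
    if len <= ASSUMED_MULTIPART_PUT_THRESHOLD {
        return UploadPlan::SinglePut;
    }
    // Grow the part size when the preferred size would need more than
    // S3_MAX_PARTS parts to cover the object.
    let part_size = ceil_div(len, S3_MAX_PARTS).max(ASSUMED_PART_SIZE);
    UploadPlan::Multipart { part_size, parts: ceil_div(len, part_size) }
}
```

With these assumed values a 100 MiB object is planned as seven 16 MiB parts, while a 5 TiB object needs parts of roughly 524 MiB to stay within 10 000 parts.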

§Atomic get_to_file

Both the small-object and multipart download paths write to a sibling tempfile::NamedTempFile and rename on success so a partial failure cannot leave a corrupt destination for the unbundle step.
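The write-then-rename pattern can be sketched with the standard library alone. This is a simplified stand-in, not the module's code: the real paths use tempfile::NamedTempFile, whereas `write_atomically` and the fixed `.partial` suffix are hypothetical choices made to keep the example dependency-free.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `body` to a sibling temporary file, then rename it over `dest`.
/// A failure before the rename leaves `dest` untouched.
pub fn write_atomically(dest: &Path, body: &[u8]) -> io::Result<()> {
    let file_name = dest.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "destination has no file name")
    })?;

    // Sibling path: same directory, so the rename stays on one filesystem.
    // ASSUMPTION: a fixed suffix; NamedTempFile would pick a random name.
    let mut tmp_name = file_name.to_os_string();
    tmp_name.push(".partial");
    let tmp = dest.with_file_name(tmp_name);

    let result = (|| -> io::Result<()> {
        let mut file = File::create(&tmp)?;
        file.write_all(body)?;
        // Flush to disk before the rename makes the file visible.
        file.sync_all()?;
        fs::rename(&tmp, dest)
    })();

    // Best-effort cleanup so a failed download leaves no stray temp file.
    if result.is_err() {
        let _ = fs::remove_file(&tmp);
    }
    result
}
```

Keeping the temporary file in the destination's own directory matters: a rename across filesystems is not atomic and can fail outright.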

Every GET carries If-Match: <etag>, with the ETag taken from the preceding HeadObject call. If the object is overwritten between the head and the body download, S3 returns 412 and get_to_file retries once with a fresh head/ETag. If the retry also fails with 412, it propagates as ObjectStoreError::PreconditionFailed.
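The head-then-conditional-GET retry can be sketched as a loop over two closures. The names `get_with_etag_retry` and `FetchError` are hypothetical, and the closures stand in for the SDK's HeadObject and GetObject calls; the real implementation is async and streams to a file.

```rust
/// Hypothetical stand-in for ObjectStoreError.
#[derive(Debug, PartialEq)]
pub enum FetchError {
    PreconditionFailed,
    Other(String),
}

/// `head` returns the current ETag; `get` performs a GET with
/// `If-Match: <etag>` and yields `PreconditionFailed` on a 412.
pub fn get_with_etag_retry<H, G>(mut head: H, mut get: G) -> Result<Vec<u8>, FetchError>
where
    H: FnMut() -> Result<String, FetchError>,
    G: FnMut(&str) -> Result<Vec<u8>, FetchError>,
{
    // One initial attempt plus exactly one retry.
    const MAX_ATTEMPTS: usize = 2;
    let mut attempt = 1;
    loop {
        // Always re-read the ETag so the retry targets the new object version.
        let etag = head()?;
        match get(&etag) {
            Err(FetchError::PreconditionFailed) if attempt < MAX_ATTEMPTS => attempt += 1,
            // Success, a non-412 error, or a second 412: return as-is.
            other => return other,
        }
    }
}
```

Bounding the loop at one retry keeps a key that is being overwritten continuously from stalling the caller indefinitely.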

§HTTP transport tuning

aws-sdk-s3’s default HTTP client keeps idle pooled connections indefinitely, so a pooled connection to a rotated VIP would wedge an in-flight request until the OS-level TCP retransmit timeout fires (~15 minutes on Linux). S3Store::from_remote_url installs a custom HTTP client built via aws_smithy_http_client::Builder with [POOL_IDLE_TIMEOUT] bounded to 30 s, so a rotation costs at most one short-circuited request rather than minutes of wedged transfer. Tracking issue: #26.

Pool-idle alone does not bound a hot pooled connection (one that was used within the last 30 s but has since become stuck), and the 412 retry in ObjectStore::get_to_file is a retry of a deliberate server response, so forcing a fresh connection there does not help. Instead, the SDK’s aws_config::timeout::TimeoutConfig is given a [READ_TIMEOUT] so a stuck request fails fast and the SDK’s internal retry layer can pick a fresh connection. connect_timeout is left at the SDK default (3.1 s, already aggressive). Tracking issue: #26.

Note: smithy’s read_timeout applies to the HTTP connector future, which resolves at “response-headers received.” That bounds:

  • Uploads in full — the connector future cannot resolve until the request body is sent and the response status arrives, so a stuck upload trips at [READ_TIMEOUT]. put_body therefore overrides the timeout per-operation so large bundle uploads are not cut off at 30 s.
  • Downloads only up to time-to-first-byte. Once response headers arrive the future resolves; subsequent body-chunk reads are not bounded by read_timeout. A peer that wedges mid-body on a GET (e.g. a stuck multipart range) is still subject to the pool-idle / TCP-keepalive layers, but not to READ_TIMEOUT. Lesson #2 in docs/development/lessons_learned.md covers this.

TCP keepalive (the second knob suggested in #27) is not wired on the S3 path: aws-smithy-http-client 1.1.12’s public Builder / ConnectorBuilder API exposes pool_idle_timeout but does not expose tcp_keepalive. The dominant DNS-rotation failure in #26 is pool reuse of a dead VIP, which pool_idle_timeout already fixes; the gap relative to the Azure backend (which uses reqwest and gets keepalive for free) is documented in CHANGELOG.md.

§Multipart-upload lifetime safety

S3 retains uncompleted multipart uploads indefinitely without an explicit lifecycle rule, so a future dropped between CreateMultipartUpload and CompleteMultipartUpload orphans the upload-id and bills the caller for the staged parts (issues #169, #171). S3Store::start_multipart_upload therefore hands back a [MultipartUploadGuard] that owns the upload-id and best-effort issues AbortMultipartUpload on Drop; finish_multipart_upload is the only call site that may disarm the guard.

Future contributors must not introduce an early ?-return between obtaining the upload-id and constructing the [MultipartUploadGuard] inside start_multipart_upload, nor between start_multipart_upload and the matching finish_multipart_upload: a bare upload-id outside the guard reintroduces the leak the guard exists to prevent. (The ok_or_else for a missing upload-id field on the SDK response is benign — there is no upload-id to abort.)
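The guard pattern described above can be sketched synchronously. Everything here is hypothetical naming (`Aborter`, `UploadGuardSketch`, `RecordingAborter`); the real [MultipartUploadGuard] has to issue an async AbortMultipartUpload from Drop, and how it does so is not shown in this page.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for the client call that aborts an upload.
pub trait Aborter {
    fn abort(&self, upload_id: &str);
}

/// Owns the upload-id from the moment it exists; aborts on drop unless disarmed.
pub struct UploadGuardSketch<A: Aborter> {
    upload_id: String,
    armed: bool,
    aborter: A,
}

impl<A: Aborter> UploadGuardSketch<A> {
    pub fn new(upload_id: String, aborter: A) -> Self {
        Self { upload_id, armed: true, aborter }
    }

    pub fn upload_id(&self) -> &str {
        &self.upload_id
    }

    /// Called only after the upload completed successfully.
    pub fn disarm(mut self) {
        self.armed = false;
        // `self` drops here; Drop sees `armed == false` and does nothing.
    }
}

impl<A: Aborter> Drop for UploadGuardSketch<A> {
    fn drop(&mut self) {
        if self.armed {
            // Best effort: an early return or panic lands here.
            self.aborter.abort(&self.upload_id);
        }
    }
}

/// Test double that records which upload-ids were aborted.
pub struct RecordingAborter(pub Rc<RefCell<Vec<String>>>);

impl Aborter for RecordingAborter {
    fn abort(&self, upload_id: &str) {
        self.0.borrow_mut().push(upload_id.to_string());
    }
}
```

Because `disarm` consumes the guard, the type system rules out using the upload-id after completion, and any `?`-return while the guard is alive triggers the abort.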

Azure has no equivalent need — uncommitted blocks auto-expire after seven days (azure.rs).

§Stdout discipline

Per .claude/rules/protocol-stdout.md, this module never writes to stdout. Diagnostics go through tracing (which the helper binaries configure to write to stderr).

Structs§

S3Store
Production ObjectStore backed by aws-sdk-s3.