Module azure

Azure Blob Storage backend for the ObjectStore trait.

AzureStore wraps azure_storage_blob. Like the S3 backend, this module owns the URL → SDK config translation, the error-code classifier ([classify]), and the credential resolution plumbing. Unlike S3, the SDK already does parallel range downloads inside BlobClient::download(), so there is no hand-rolled multipart orchestrator (asymmetric with S3 by design).

§Authentication

The official azure_storage_blob 0.12 crate currently exposes only Arc<dyn TokenCredential> (Entra ID) on its constructors. Azurite does not implement Entra ID without an --oauth basic HTTPS setup, and many production accounts still authenticate with shared keys. To bridge both, we install our own [auth::SharedKeySigningPolicy] as a per-try azure_core::http::policies::Policy and pass None for the SDK’s credential parameter. The SDK then forwards every request through our policy, which signs the request using the Azure Storage shared-key v2 scheme. Tracking issue: Azure/azure-sdk-for-rust#2975.

Resolution order for ?credential=<NAME> in the URL:

  1. AZSTORE_<NAME>_KEY — base64 account key → shared-key signing.
  2. AZSTORE_<NAME>_CONNECTION_STRING — connection string with AccountName= / AccountKey= → shared-key signing.
  3. AZSTORE_<NAME>_SAS — SAS query string appended verbatim to every outgoing request URL.

When the URL carries no ?credential= query parameter, we fall back to azure_identity::DeveloperToolsCredential (env, workload identity, managed identity, Azure CLI, …).
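The resolution order above can be sketched with the standard library alone. This is a minimal illustration, not the module's actual code: the ResolvedCredential enum, the resolve_credential function, and the injected lookup closure are hypothetical names, and falling through to the SAS variable when a connection string lacks AccountKey= is an assumption.

```rust
/// How a resolved credential would be applied to outgoing requests.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
pub enum ResolvedCredential {
    /// Base64 account key, used for shared-key signing.
    SharedKey(String),
    /// SAS query string, appended to every request URL.
    Sas(String),
    /// None of the three variables was set for this name.
    NotFound,
}

/// Extract the `AccountKey=` value from a `;`-separated connection string.
fn account_key_from_connection_string(conn: &str) -> Option<String> {
    conn.split(';')
        .filter_map(|part| part.trim().strip_prefix("AccountKey="))
        .map(str::to_owned)
        .next()
}

/// Probe the three variables in the documented order.
/// `lookup` is injected so the order can be exercised without touching the
/// process environment; pass `|k| std::env::var(k).ok()` for real use.
pub fn resolve_credential<F>(name: &str, lookup: F) -> ResolvedCredential
where
    F: Fn(&str) -> Option<String>,
{
    if let Some(key) = lookup(&format!("AZSTORE_{name}_KEY")) {
        return ResolvedCredential::SharedKey(key);
    }
    if let Some(conn) = lookup(&format!("AZSTORE_{name}_CONNECTION_STRING")) {
        // Assumption: a connection string without AccountKey= falls through.
        if let Some(key) = account_key_from_connection_string(&conn) {
            return ResolvedCredential::SharedKey(key);
        }
    }
    if let Some(sas) = lookup(&format!("AZSTORE_{name}_SAS")) {
        return ResolvedCredential::Sas(sas);
    }
    ResolvedCredential::NotFound
}
```

Because the key variable is probed first, a name that has both a key and a SAS configured resolves to shared-key signing.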

§Conditional writes

put_if_absent sends If-None-Match: * (via the SDK’s BlockBlobClientUploadOptions::with_if_not_exists convenience). When the blob already exists, Azure returns 409 (BlobAlreadyExists) or 412 (ConditionNotMet); both collapse to Ok(false).
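The collapse of both contention responses into Ok(false) could look roughly like the following. This is a sketch only: put_if_absent_outcome is a hypothetical helper, the String error type is a placeholder, and the module's real [classify] function may inspect responses differently.

```rust
/// Map a conditional Put Blob response to the put_if_absent result.
/// Ok(true)  = we created the blob (Put Blob answers 201 Created),
/// Ok(false) = someone else already created it,
/// Err(_)    = anything else (placeholder error type for this sketch).
pub fn put_if_absent_outcome(status: u16, error_code: Option<&str>) -> Result<bool, String> {
    match (status, error_code) {
        (201, _) => Ok(true),
        // Both contention signals mean "the blob is already there".
        (409, Some("BlobAlreadyExists")) | (412, Some("ConditionNotMet")) => Ok(false),
        (s, code) => Err(format!(
            "unexpected response {s} ({})",
            code.unwrap_or("no error code")
        )),
    }
}
```

Matching on the error code as well as the status keeps unrelated 409 or 412 responses from being silently treated as a lost race.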

§Atomic get_to_file

Identical to the S3 path: head → tempfile → download(if_match) → persist. The SDK’s download() aggregates parallel range fetches internally, so no per-chunk semaphore here. A single retry with a fresh ETag covers the head-then-GET race (412 mid-download).
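The head, download-if-match, persist sequence with its single retry can be sketched as below. The BlobSource trait is a hypothetical stand-in for the SDK client, Ok(false) models an HTTP 412, and a sibling ".partial" path plus std::fs::rename is swapped in for the NamedTempFile persist step so the example needs only the standard library.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical minimal view of a blob client, for illustration only.
pub trait BlobSource {
    /// Returns the blob's current (etag, size in bytes).
    fn head(&self) -> io::Result<(String, u64)>;
    /// Writes the body to `dest` only if the blob still has `etag`.
    /// Ok(false) models HTTP 412 (the blob changed after the head call).
    fn download_if_match(&self, etag: &str, dest: &Path) -> io::Result<bool>;
}

/// Download to a temporary sibling, then rename into place.
pub fn get_to_file<S: BlobSource>(src: &S, dest: &Path) -> io::Result<()> {
    let tmp = dest.with_extension("partial");
    // At most two attempts: the first try plus one retry with a fresh ETag.
    for _attempt in 0..2 {
        let (etag, size) = src.head()?;
        if size == 0 {
            // Known-empty blob: skip the download call entirely.
            fs::write(&tmp, b"")?;
        } else if !src.download_if_match(&etag, &tmp)? {
            // Blob changed between head and GET; discard and retry.
            let _ = fs::remove_file(&tmp);
            continue;
        }
        // Renaming within one directory replaces `dest` in a single step.
        return fs::rename(&tmp, dest);
    }
    Err(io::Error::new(
        io::ErrorKind::Other,
        "blob changed during both download attempts",
    ))
}
```

Readers of dest never observe a partially written file, because the body only appears under the final name after the download has completed.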

§copy(src, dst)

azure_storage_blob 0.12 does not expose a BlobClient::copy_from_url method. It offers only BlockBlobClient::upload_blob_from_url, which requires a SAS-tokened source URL or an x-ms-copy-source-authorization header, and neither integrates cleanly with our credential model. We implement copy as a stream-through-tempfile round trip: get_to_file writes src to a NamedTempFile, then put_path uploads it to dst.

Both legs already stream. get_to_file consumes the SDK’s chunked download into the file without buffering the body, and put_path switches to our explicit stage_block + commit_block_list orchestrator (see AzureStore::multipart_put_path) once the body crosses [super::multipart::MULTIPART_PUT_THRESHOLD]. Peak in-flight bytes are bounded by [super::multipart::MULTIPART_PUT_MAX_CONCURRENCY] × [super::multipart::MULTIPART_PUT_PART_SIZE] regardless of blob size, which matters for manage doctor’s duplicate-bundle quarantine path (crate::manage::doctor::Doctor::evict_losing_bundle), since that path can copy multi-GiB bundles.

Zero-byte lock files still round-trip fast: get_to_file short-circuits the GET on size == 0 and put_path issues a single zero-byte Put Blob. The body is preserved; user metadata is not propagated, matching the S3 backend’s CopyObject path, which similarly carries only body bytes.

This is asymmetric with the S3 backend, which uses CopyObject for a true server-side copy. Azure’s equivalents (Copy Blob, Put Blob From URL) require a SAS-signed source URL or an x-ms-copy-source-authorization header that the 0.12 SDK does not ergonomically expose. The download-and-reupload path is the safe, correct fallback until the SDK closes that gap.
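To make the memory bound on the upload leg concrete, the arithmetic can be written out as follows. The function names are hypothetical, and the values used in the usage note (4 concurrent parts of 8 MiB) are illustrative placeholders, not the real MULTIPART_PUT_MAX_CONCURRENCY or MULTIPART_PUT_PART_SIZE.

```rust
/// One mebibyte in bytes.
pub const MIB: u64 = 1024 * 1024;

/// Upper bound on bytes held by staged-but-unfinished parts at any instant.
/// It depends only on the two tuning knobs, never on the blob size.
pub fn peak_in_flight_bytes(max_concurrency: u64, part_size: u64) -> u64 {
    max_concurrency * part_size
}

/// Number of blocks staged for a blob of `blob_size` bytes (rounded up).
pub fn part_count(blob_size: u64, part_size: u64) -> u64 {
    (blob_size + part_size - 1) / part_size
}
```

With those placeholder values, a 3 GiB bundle is staged as 384 parts, yet peak_in_flight_bytes(4, 8 * MIB) stays at 32 MiB throughout the copy.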

§A note on Range and zero-byte blobs

A Range request against a zero-byte blob returns HTTP 416. We never issue Range requests directly — BlobClient::download() owns that — but the zero-size short-circuit in get_to_file also avoids any download SDK call against a known-empty blob, which sidesteps the issue entirely.

§Size limits

Azure caps a block blob at 50 000 committed blocks (~4.75 TiB at the SDK’s default block size) and a single Put Blob body at 5000 MiB; above [super::multipart::MULTIPART_PUT_THRESHOLD] the helper switches to explicit stage_block + commit_block_list, so callers do not have to reason about the single-call cutoff. The upload path is not resumable across process death — see the README “Known limitations” section.
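The two limits can be checked with a few lines of arithmetic. In this sketch the constant and function names are hypothetical, and the 100 MiB block size in the usage note is an assumption inferred from the ~4.75 TiB figure rather than a value stated by the SDK documentation.

```rust
/// One mebibyte in bytes.
pub const MIB: u64 = 1024 * 1024;
/// Azure's cap on committed blocks per block blob.
pub const MAX_COMMITTED_BLOCKS: u64 = 50_000;
/// Azure's cap on a single Put Blob body.
pub const MAX_PUT_BLOB_BYTES: u64 = 5_000 * MIB;

/// Largest blob that can be committed when every block is `block_size` bytes.
pub fn max_blob_bytes(block_size: u64) -> u64 {
    MAX_COMMITTED_BLOCKS * block_size
}

/// Whether a body of `len` bytes is small enough for one Put Blob call.
pub fn fits_single_put(len: u64) -> bool {
    len <= MAX_PUT_BLOB_BYTES
}

/// Convert a byte count to TiB for display.
pub fn as_tib(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0 * 1024.0 * 1024.0)
}
```

Under the 100 MiB assumption, max_blob_bytes(100 * MIB) is 5 000 000 MiB, which as_tib reports as roughly 4.77 TiB, in line with the approximate figure quoted above.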

§HTTP transport tuning

azure_core 0.35’s default transport keeps idle pooled connections forever and never sets TCP keepalive, so a pooled connection to a rotated VIP would hang an in-flight request until the OS-level TCP retransmit timeout fires (~15 minutes on Linux). AzureStore builds a custom reqwest::Client, wraps it in Transport, and sets it on ClientOptions::transport. The client applies four bounds:

  • [POOL_IDLE_TIMEOUT] (30 s) — drops idle pooled connections before a typical DNS rotation makes them stale.
  • [TCP_KEEPALIVE] (30 s) — detects a dead-but-not-closed TCP session in seconds rather than the 2-hour Linux default; covers hot pooled connections that pool-idle alone cannot.
  • [CONNECT_TIMEOUT] (10 s) — bounds a fresh-connect attempt to a dead VIP rather than waiting on the OS connect timeout.
  • [READ_TIMEOUT] (30 s) — per-read timeout that resets after a successful read, so a stuck transfer fails fast without limiting total body size.

Together these cap a DNS-rotation hang at tens of seconds rather than minutes. The custom transport leaves ClientOptions::per_try_policies (where the shared-key signing lives) untouched — the SDK pipeline runs per-try policies independently of the transport. Tracking issue: #26.
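For reference, the four bounds map onto reqwest's ClientBuilder roughly as follows. This is a sketch under assumptions: it requires the reqwest crate (a 0.12 release recent enough to provide read_timeout), build_http_client is a hypothetical function name, and the step that hands the client to the SDK's Transport is deliberately left out.

```rust
use std::time::Duration;

// Values taken from the list above; the names mirror the module's constants.
const POOL_IDLE_TIMEOUT: Duration = Duration::from_secs(30);
const TCP_KEEPALIVE: Duration = Duration::from_secs(30);
const CONNECT_TIMEOUT: Duration = Duration::from_secs(10);
const READ_TIMEOUT: Duration = Duration::from_secs(30);

/// Build an HTTP client with the four bounds applied.
pub fn build_http_client() -> Result<reqwest::Client, reqwest::Error> {
    reqwest::Client::builder()
        // Drop idle pooled connections before they can go stale.
        .pool_idle_timeout(POOL_IDLE_TIMEOUT)
        // Probe otherwise-silent connections so dead peers are noticed.
        .tcp_keepalive(TCP_KEEPALIVE)
        // Bound the time spent establishing a new connection.
        .connect_timeout(CONNECT_TIMEOUT)
        // Per-read timeout; it resets after each successful read.
        .read_timeout(READ_TIMEOUT)
        .build()
}
```

No overall request timeout is set here, which is what lets large bodies download for as long as bytes keep arriving.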

§Stdout discipline

Per .claude/rules/protocol-stdout.md, this module never writes to stdout. Diagnostics go through tracing (which the helper binaries configure to write to stderr).

Modules§

auth
Credential resolution and the shared-key / SAS signing policies for the Azure Blob backend.

Structs§

AzureStore
Production ObjectStore backed by azure_storage_blob.