Crate polyc_llm_vertex

A concrete LlmProvider backed by the GCP AI platform REST API, authenticated via Workload Identity Federation / Application Default Credentials (handled by gcp_auth: in-cluster metadata, WIF, or a local gcloud login, transparently).

Uses the SSE-streaming streamGenerateContent endpoint (?alt=sse) and adapts each partial response into the streaming Chunk vocabulary the trait expects: text becomes Chunk::TextDelta per token batch, function calls become tool-call chunks, and usage plus finish reason become Chunk::Usage and Chunk::Stop. Chunks are yielded as bytes arrive, and the harness pushes them down to the client without buffering the whole response, so user-visible latency starts at first-token time. Because the provider sits behind the LlmProvider trait boundary, swapping in a different backend stays a single-file change.
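To illustrate the "yielded as bytes arrive" behaviour, here is a minimal sketch of incremental SSE framing using only the standard library. It is not taken from this crate: `SseDecoder` and `feed` are hypothetical names, and the step that turns each JSON payload into a Chunk is left out because it depends on types not shown on this page.

```rust
/// Hypothetical incremental SSE decoder (illustration only, not this crate's code).
/// It buffers raw bytes until a full line is available, so payloads split across
/// network reads are reassembled before being handed on.
#[derive(Default)]
pub struct SseDecoder {
    /// Bytes of a line that has not been terminated by '\n' yet.
    buf: Vec<u8>,
    /// `data:` lines collected for the event currently being assembled.
    data: Vec<String>,
}

impl SseDecoder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Feed bytes as they arrive; returns the payload of every event that
    /// was completed by this read (possibly none).
    pub fn feed(&mut self, bytes: &[u8]) -> Vec<String> {
        let mut events = Vec::new();
        self.buf.extend_from_slice(bytes);

        // Consume every complete line currently in the buffer.
        while let Some(pos) = self.buf.iter().position(|&b| b == b'\n') {
            let mut line: Vec<u8> = self.buf.drain(..=pos).collect();
            line.pop(); // drop the trailing '\n'
            if line.last() == Some(&b'\r') {
                line.pop(); // tolerate CRLF line endings
            }
            let line = String::from_utf8_lossy(&line).into_owned();

            if line.is_empty() {
                // A blank line terminates the current event.
                if !self.data.is_empty() {
                    events.push(self.data.join("\n"));
                    self.data.clear();
                }
            } else if let Some(rest) = line.strip_prefix("data:") {
                // A single leading space after the colon is not part of the payload.
                let payload = rest.strip_prefix(' ').unwrap_or(rest);
                self.data.push(payload.to_string());
            }
            // Other SSE fields (event:, id:, comments) are ignored in this sketch.
        }
        events
    }
}
```

A caller would pass each network read to `feed` and forward every returned payload immediately, so nothing waits for the end of the response. Parsing the payload JSON and mapping it onto the Chunk variants would follow as a separate step.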

Structs

VertexConfig
Which model/project/region to call.
VertexProvider
LlmProvider over the GCP AI platform REST API.
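As a rough sketch of how a model/project/region triple could map onto a streaming request URL: the actual fields of VertexConfig are not shown on this page, so `EndpointParams` and `stream_url` below are hypothetical names. The URL follows the publicly documented Vertex AI REST layout, and the `google` publisher segment is an assumption.

```rust
/// Hypothetical stand-in for the model/project/region settings described above.
/// The real `VertexConfig` fields are not shown in these docs.
pub struct EndpointParams<'a> {
    pub project: &'a str,
    pub region: &'a str,
    pub model: &'a str,
}

/// Builds the SSE streaming endpoint URL for a Google-published model.
/// Assumption: the model lives under the `google` publisher.
pub fn stream_url(p: &EndpointParams<'_>) -> String {
    format!(
        "https://{region}-aiplatform.googleapis.com/v1/projects/{project}/locations/{region}/publishers/google/models/{model}:streamGenerateContent?alt=sse",
        region = p.region,
        project = p.project,
        model = p.model,
    )
}
```

The region appears twice, once in the hostname and once in the resource path, which is why keeping the three values together in one config struct is convenient.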

Enums

VertexError
Errors from the provider.