A concrete LlmProvider backed by the GCP AI platform REST API,
authenticated via Workload Identity Federation / Application Default
Credentials (handled by gcp_auth: in-cluster metadata, WIF, or a local
gcloud login, transparently).
Uses the SSE-streaming streamGenerateContent endpoint
(?alt=sse) and adapts each partial response into the streaming
Chunk vocabulary the trait expects (text → Chunk::TextDelta per
token batch, function calls → tool-call chunks, usage + finish reason
→ Chunk::Usage/Chunk::Stop). Chunks are yielded as bytes arrive
— the harness pushes them down to the client without buffering the
whole response, so user-visible latency starts at first-token time.
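The "yielded as bytes arrive" behaviour depends on framing the SSE stream incrementally. The following is a minimal, standard-library-only sketch of that framing step, not this module's actual implementation: `SseDecoder` and `feed` are hypothetical names, and the JSON decoding and mapping of each payload onto `Chunk` values is deliberately left out. It assumes `\n` or `\r\n` line endings and ignores SSE fields other than `data:`.

```rust
/// Hypothetical incremental SSE decoder (illustrative only, not part of this crate).
/// Feed it raw network bytes; it returns every `data:` payload completed so far.
#[derive(Default)]
pub struct SseDecoder {
    /// Bytes received that are not yet terminated by a newline.
    buf: Vec<u8>,
    /// `data:` lines of the event currently being assembled.
    data: Vec<String>,
}

impl SseDecoder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Append `bytes` and return the payloads of all events that are now complete.
    /// A partial line or partial event stays buffered until a later call.
    pub fn feed(&mut self, bytes: &[u8]) -> Vec<String> {
        self.buf.extend_from_slice(bytes);
        let mut events = Vec::new();

        // Consume one complete line at a time; leave any trailing partial line in `buf`.
        while let Some(pos) = self.buf.iter().position(|&b| b == b'\n') {
            let mut line: Vec<u8> = self.buf.drain(..=pos).collect();
            line.pop(); // drop the '\n'
            if line.last() == Some(&b'\r') {
                line.pop(); // tolerate "\r\n" line endings
            }

            if line.is_empty() {
                // A blank line terminates the current event.
                if !self.data.is_empty() {
                    events.push(self.data.join("\n"));
                    self.data.clear();
                }
                continue;
            }

            // Decode only complete lines, so a UTF-8 sequence split across
            // two network reads is never decoded half-way.
            let line = String::from_utf8_lossy(&line).into_owned();
            if let Some(rest) = line.strip_prefix("data:") {
                // The SSE format allows one optional space after the colon.
                let rest = rest.strip_prefix(' ').unwrap_or(rest);
                self.data.push(rest.to_string());
            }
            // Other fields (`event:`, `id:`, `retry:`) and comments are ignored here.
        }
        events
    }
}
```

Each string returned by `feed` would then be parsed as one partial JSON response and translated into chunks. Because `feed` returns as soon as an event's terminating blank line arrives, a caller can forward the first text delta without waiting for the rest of the stream.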
The trait boundary keeps swapping in a different provider a single-file change.
Structs§
- VertexConfig - Which model/project/region to call.
- VertexProvider - LlmProvider over the GCP AI platform REST API.
Enums§
- VertexError - Errors from the provider.