Headwind
A Kubernetes operator for automating workload updates based on container image changes, written in Rust.
Headwind monitors container registries and automatically updates your Kubernetes workloads when new images are available, with intelligent semantic versioning policies and approval workflows.
Features
- Dual Update Triggers: Event-driven webhooks or registry polling for maximum flexibility
- Semver Policy Engine: Intelligent update decisions based on semantic versioning (patch, minor, major, glob, force, all)
- Web UI Dashboard: Modern web interface with:
  - Real-time filtering, sorting, and pagination
  - Multi-mode authentication (none, simple header, Kubernetes token, proxy/ingress)
  - Audit logging for all approval/rejection actions
  - Auto-refresh every 30 seconds
  - Responsive design for desktop and mobile
- Observability Dashboard: Built-in metrics visualization with:
  - Multi-backend support (Prometheus, VictoriaMetrics, InfluxDB)
  - Auto-discovery of available backends
  - Real-time metrics cards and time-series data
  - Hot-reload configuration management
- Approval Workflow: Full HTTP API for approval requests with integration possibilities (Slack, webhooks, etc.)
- Rollback Support: Manual rollback to previous versions with update history tracking and automatic rollback on failures
- Notifications: Slack, Microsoft Teams, and generic webhook notifications with dashboard links for all deployment events
- Full Observability: Prometheus metrics (35+ metrics), distributed tracing, and structured logging
- Resource Support:
  - Kubernetes Deployments ✅
  - Kubernetes StatefulSets ✅
  - Kubernetes DaemonSets ✅
  - Flux HelmReleases ✅
- Lightweight: Single binary, no database required
- Secure: Runs as non-root, read-only filesystem, minimal permissions
Quick Start
Prerequisites
- Kubernetes cluster (1.25+)
- kubectl configured
Installation
Option 1: Pre-built Container Images (Recommended)
Pull the latest release from GitHub Container Registry or Google Artifact Registry:
# From GitHub Container Registry (ghcr.io)
# Or from Google Artifact Registry
# Apply Kubernetes manifests
# Optional: Apply HelmRepository CRD if you want Helm chart auto-discovery
# (Skip if you already have Flux CD installed)
# Update deployment to use the pulled image
Image Locations:
- GitHub Container Registry: ghcr.io/headwind-sh/headwind:VERSION
- Google Artifact Registry: us-docker.pkg.dev/secret-node-477601-s8/headwind/headwind:VERSION
Available Tags:
- latest - Latest stable release
- X.Y.Z - Specific version (e.g., 0.1.0)
- X.Y - Latest patch version (e.g., 0.1)
- X - Latest minor version (e.g., 0)
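The tag scheme above can be illustrated with a short Python sketch that lists the tags a single stable release would be reachable under. The helper name `floating_tags` is hypothetical and not part of Headwind; the sketch assumes a plain X.Y.Z version with no prerelease suffix.

```python
def floating_tags(version: str) -> list[str]:
    """Return the tags that point at a stable release such as "0.1.0"."""
    major, minor, patch = version.split(".")
    return [
        "latest",                      # latest stable release
        f"{major}.{minor}.{patch}",    # exact version
        f"{major}.{minor}",            # latest patch within this minor
        major,                         # latest minor within this major
    ]
```

Pinning to X.Y.Z gives reproducible deployments, while X.Y or X follow new releases within that range.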
Image Details:
- Base: Chainguard wolfi-base (enterprise security)
- Size: ~58MB (73% smaller than Ubuntu-based images)
- Architecture: Multi-arch (amd64, arm64)
- Security: Non-root user, no shell, minimal CVEs
Option 2: Install from crates.io
If you have Rust installed, you can install Headwind as a binary:
# Install from crates.io
# Run directly (requires KUBECONFIG)
Option 3: Pre-built Binaries
Download pre-built binaries from GitHub Releases:
Platforms:
- Linux: headwind-linux-amd64, headwind-linux-arm64
- macOS: headwind-darwin-amd64 (Intel), headwind-darwin-arm64 (Apple Silicon)
- Windows: headwind-windows-amd64.exe, headwind-windows-arm64.exe
# Example: Download and install Linux binary
# Run
Option 4: Build from Source
# Clone the repository
# Build the Docker image
# Load into your cluster (for kind/minikube/Docker Desktop)
# Apply all Kubernetes manifests
Configuration
Add annotations to your Deployments to enable Headwind:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    # Update policy: none, patch, minor, major, glob, force, all
    headwind.sh/policy: "minor"
    # Require approval before updating (default: true)
    headwind.sh/require-approval: "true"
    # Minimum time between updates in seconds (default: 300)
    headwind.sh/min-update-interval: "300"
    # Specific images to track (comma-separated, empty = all)
    headwind.sh/images: "nginx, redis"
    # Event source: webhook, polling, both, none (default: webhook)
    headwind.sh/event-source: "webhook"
    # Per-resource polling interval in seconds (overrides global HEADWIND_POLLING_INTERVAL)
    # Only applies when event-source is "polling" or "both"
    headwind.sh/polling-interval: "600"
    # Automatic rollback on deployment failures (default: false)
    headwind.sh/auto-rollback: "true"
    # Rollback timeout in seconds (default: 300)
    headwind.sh/rollback-timeout: "300"
    # Health check retries before rollback (default: 3)
    headwind.sh/health-check-retries: "3"
spec:
  # ... rest of deployment spec
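To make the defaults in the annotations above concrete, here is a minimal Python sketch that turns an annotation map into a typed configuration. It is an illustration only, not Headwind's (Rust) implementation: `HeadwindConfig` and `parse_annotations` are hypothetical names, and the default policy of "none" is taken from the Update Policies section.

```python
from dataclasses import dataclass, field

PREFIX = "headwind.sh/"


@dataclass
class HeadwindConfig:
    policy: str = "none"
    require_approval: bool = True
    min_update_interval: int = 300      # seconds
    images: list = field(default_factory=list)  # empty list = track all images
    event_source: str = "webhook"
    auto_rollback: bool = False
    rollback_timeout: int = 300         # seconds
    health_check_retries: int = 3


def parse_annotations(annotations: dict) -> HeadwindConfig:
    """Apply the documented defaults to whatever annotations are present."""
    def get(key: str, default: str) -> str:
        return annotations.get(PREFIX + key, default)

    # "nginx, redis" -> ["nginx", "redis"]; blank entries are dropped.
    images = [name.strip() for name in get("images", "").split(",") if name.strip()]
    return HeadwindConfig(
        policy=get("policy", "none"),
        require_approval=get("require-approval", "true").lower() == "true",
        min_update_interval=int(get("min-update-interval", "300")),
        images=images,
        event_source=get("event-source", "webhook"),
        auto_rollback=get("auto-rollback", "false").lower() == "true",
        rollback_timeout=int(get("rollback-timeout", "300")),
        health_check_retries=int(get("health-check-retries", "3")),
    )
```

Annotation values are always strings in Kubernetes, which is why numbers and booleans are quoted in the manifest and converted here.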
Flux HelmRelease Support
Headwind can monitor Flux HelmRelease resources and automatically discover new Helm chart versions from Helm repositories, updating based on semantic versioning policies.
Prerequisites
Headwind requires the HelmRepository CRD to query Helm repositories for available chart versions:
If you have Flux CD installed: The CRD already exists - no action needed!
If you DON'T have Flux CD: Apply the HelmRepository CRD:
Setup
Headwind supports both traditional HTTP Helm repositories and modern OCI registries (like ECR, GCR, ACR, Harbor, JFrog Artifactory, GitHub Container Registry, etc.).
- Create a HelmRepository resource pointing to your Helm repository:
HTTP Helm Repository:
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: my-repo
  namespace: default
spec:
  url: https://charts.example.com # Traditional HTTP Helm repository
  interval: 5m
  type: default
  # Optional: for private repositories
  secretRef:
    name: helm-repo-credentials # Secret with username/password keys
OCI Registry (ECR, GCR, ACR, Harbor, JFrog, GHCR, etc.):
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: my-oci-repo
  namespace: default
spec:
  url: oci://registry.example.com/helm-charts # OCI registry URL
  interval: 5m
  type: oci
  # Optional: for private registries
  secretRef:
    name: oci-registry-credentials # Secret with username/password keys
Note: Headwind automatically detects whether to use HTTP or OCI based on the URL scheme (https:// vs oci://).
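The scheme-based detection described in the note can be sketched in a few lines of Python. This is an illustration of the idea rather than Headwind's code; `repository_kind` is a hypothetical helper, and treating plain http:// the same as https:// is an assumption.

```python
from urllib.parse import urlparse


def repository_kind(url: str) -> str:
    """Classify a HelmRepository URL as "oci" or "http" by its scheme."""
    scheme = urlparse(url).scheme.lower()
    if scheme == "oci":
        return "oci"
    if scheme in ("http", "https"):
        return "http"
    raise ValueError(f"unsupported Helm repository URL scheme: {scheme!r}")
```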
Known Limitations
OCI Registry Support: Due to a limitation in the underlying oci-distribution Rust crate (v0.11), OCI Helm repositories may incorrectly query Docker Hub when the chart name matches a common Docker image name (e.g., busybox, nginx, redis, postgres). This results in discovering Docker container image tags instead of Helm chart versions.
Workaround: Use traditional HTTP Helm repositories (fully supported) or ensure your OCI Helm chart names don't conflict with popular Docker Hub image names. This limitation is expected to be resolved in future crate updates.
Status: HTTP Helm repositories work perfectly and are the recommended approach until this OCI limitation is addressed.
- Create a HelmRelease with Headwind annotations:
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: my-app
  namespace: default
  annotations:
    # Update policy: none, patch, minor, major, glob, force, all
    headwind.sh/policy: "minor"
    # Require approval before updating (default: true)
    headwind.sh/require-approval: "true"
    # Minimum time between updates in seconds (default: 300)
    headwind.sh/min-update-interval: "300"
    # Event source: webhook, polling, both, none (default: webhook)
    headwind.sh/event-source: "webhook"
    # Per-resource polling interval in seconds (overrides global HEADWIND_POLLING_INTERVAL)
    # Only applies when event-source is "polling" or "both"
    headwind.sh/polling-interval: "600"
spec:
  interval: 5m
  chart:
    spec:
      chart: my-app
      version: "1.2.3" # Headwind monitors this version
      sourceRef:
        kind: HelmRepository
        name: my-repo
        namespace: default
  values:
    # ... your values
How it works:
- Headwind watches all HelmRelease resources with headwind.sh/policy annotation
- Automatically queries the referenced HelmRepository for available chart versions
- Uses the PolicyEngine to find the best matching version based on your policy
- Compares discovered versions with status.lastAttemptedRevision or spec.chart.spec.version
- Either:
  - Creates an UpdateRequest CRD if require-approval: "true" (default)
  - Applies the update directly if require-approval: "false" (respects min-update-interval)
- Sends notifications (Slack, Teams, webhooks) about the update
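The branch between creating an UpdateRequest and applying directly can be sketched as follows. This is a simplified Python illustration, not Headwind's controller logic; `next_action` and its return values are hypothetical names, and timestamps are assumed to be Unix seconds.

```python
import time


def next_action(require_approval: bool, last_update_ts, min_update_interval: int, now=None) -> str:
    """Decide what to do once a newer chart version has matched the policy."""
    now = time.time() if now is None else now
    if require_approval:
        # An approver decides later, so only a request is created.
        return "create-update-request"
    if last_update_ts is not None and now - last_update_ts < min_update_interval:
        # Direct updates are throttled by min-update-interval.
        return "skip-interval"
    return "apply"
```

For example, with `min_update_interval=300` a resource updated 100 seconds ago is skipped, while one updated 400 seconds ago is updated immediately.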
Configuration:
Automatic version discovery is enabled by default. To disable:
# deploy/k8s/deployment.yaml
env:
  - name: HEADWIND_HELM_AUTO_DISCOVERY
    value: "false"
Private Helm Repositories:
For private repositories requiring authentication, create a Secret:
apiVersion: v1
kind: Secret
metadata:
  name: helm-repo-credentials
  namespace: default
type: Opaque
stringData:
  username: myusername
  password: mypassword
Metrics:
Helm-specific metrics are available at /metrics:
- headwind_helm_releases_watched - Number of HelmReleases being monitored
- headwind_helm_chart_versions_checked_total - Version checks performed
- headwind_helm_updates_found_total - Updates discovered
- headwind_helm_updates_approved_total - Updates approved by policy
- headwind_helm_updates_rejected_total - Updates rejected by policy
- headwind_helm_updates_applied_total - Updates successfully applied to HelmReleases
- headwind_helm_repository_queries_total - Repository index queries performed
- headwind_helm_repository_errors_total - Repository query errors
- headwind_helm_repository_query_duration_seconds - Repository query duration
Update Policies
- none: Never update automatically (default)
- patch: Only update patch versions (1.2.3 → 1.2.4)
- minor: Update minor versions (1.2.3 → 1.3.0)
- major: Update major versions (1.2.3 → 2.0.0)
- all: Update to any new version
- glob: Match glob pattern (specify with headwind.sh/pattern)
- force: Force update regardless of version
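The policies above can be read as a predicate over the current and candidate versions. The Python sketch below is one plausible interpretation, not Headwind's PolicyEngine: it assumes wider policies include narrower ones (minor also accepts patch bumps), that downgrades are never accepted, that "all" and "major" behave the same for semver tags, and that prerelease and build metadata are ignored. The names `parse` and `allows` are hypothetical.

```python
import fnmatch
import re

_SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)")


def parse(tag: str):
    """Parse "1.2.3" or "v1.2.3" into a (major, minor, patch) tuple, else None."""
    match = _SEMVER.match(tag)
    return tuple(int(part) for part in match.groups()) if match else None


def allows(policy: str, current: str, candidate: str, pattern=None) -> bool:
    if policy == "none":
        return False
    if policy == "force":
        return True
    if policy == "glob":
        return pattern is not None and fnmatch.fnmatchcase(candidate, pattern)
    cur, new = parse(current), parse(candidate)
    if cur is None or new is None or new <= cur:
        return False                    # unparseable, same, or older version
    if policy in ("all", "major"):
        return True                     # any newer version
    if policy == "minor":
        return new[0] == cur[0]         # same major
    if policy == "patch":
        return new[:2] == cur[:2]       # same major and minor
    return False
```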
Update Triggers
Headwind supports two methods for detecting new images:
1. Webhooks (Recommended)
Event-driven updates are faster and more efficient. Configure your registry to send webhooks to Headwind.
Docker Hub:
Webhook URL: http://<headwind-webhook-service>/webhook/dockerhub
Generic Registry (Harbor, GitLab, GCR, etc.):
Webhook URL: http://<headwind-webhook-service>/webhook/registry
For external access, use an Ingress or LoadBalancer service.
2. Registry Polling (Fallback)
If webhooks aren't available, enable registry polling:
# deploy/k8s/deployment.yaml
env:
  - name: HEADWIND_POLLING_ENABLED
    value: "true"
  - name: HEADWIND_POLLING_INTERVAL
    value: "300" # Poll every 5 minutes
When to use polling:
- Registry doesn't support webhooks
- Headwind is not publicly accessible
- Testing or development environments
Note: Polling is less efficient and has a delay. Use webhooks when possible.
3. Per-Resource Event Source Configuration
By default, all resources use webhooks as their event source (headwind.sh/event-source: "webhook"). You can override this on a per-resource basis:
Event Source Options:
- webhook (default) - Only respond to webhook events, skip registry polling
- polling - Only use registry polling, ignore webhook events
- both - Respond to both webhooks and polling (redundant but ensures coverage)
- none - Disable all update triggers for this resource
Use Cases:
Webhook-only resources (default):
metadata:
  annotations:
    headwind.sh/policy: "minor"
    headwind.sh/event-source: "webhook" # Can be omitted (default)
Best for registries with webhook support. Updates are immediate when new images are pushed.
Polling-only resources:
metadata:
  annotations:
    headwind.sh/policy: "minor"
    headwind.sh/event-source: "polling"
    headwind.sh/polling-interval: "600" # Optional: poll every 10 minutes
Best for:
- Registries without webhook support
- Resources that should be checked less frequently
- Development/staging environments
Both webhooks and polling:
metadata:
  annotations:
    headwind.sh/policy: "minor"
    headwind.sh/event-source: "both"
Provides redundancy - updates will be detected via webhooks (fast) or polling (fallback).
Per-resource polling intervals:
When using event-source: "polling" or event-source: "both", you can override the global HEADWIND_POLLING_INTERVAL for specific resources:
metadata:
  annotations:
    headwind.sh/policy: "minor"
    headwind.sh/event-source: "polling"
    headwind.sh/polling-interval: "60" # Poll this resource every 60 seconds
This allows you to poll critical resources more frequently while checking less critical resources less often, reducing registry API load.
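How the event-source and polling-interval annotations interact can be sketched in Python. This is an illustration under stated assumptions, not Headwind's implementation: `handles` and `effective_polling_interval` are hypothetical helpers, and `global_interval` stands in for the value of HEADWIND_POLLING_INTERVAL.

```python
VALID_SOURCES = {"webhook", "polling", "both", "none"}


def handles(event_source: str, trigger: str) -> bool:
    """Return True if a resource should react to `trigger` ("webhook" or "polling")."""
    if event_source not in VALID_SOURCES:
        raise ValueError(f"unknown event source: {event_source!r}")
    if event_source == "both":
        return trigger in ("webhook", "polling")
    return event_source == trigger      # "none" never matches a trigger


def effective_polling_interval(annotations: dict, global_interval: int):
    """Per-resource interval in seconds, or None if the resource is not polled."""
    source = annotations.get("headwind.sh/event-source", "webhook")
    if source not in ("polling", "both"):
        return None                     # the override only applies when polling is active
    return int(annotations.get("headwind.sh/polling-interval", global_interval))
```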
Example: Mixed event sources in a namespace:
# Production API - webhook-only (fastest)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-production
  annotations:
    headwind.sh/policy: "patch"
    headwind.sh/event-source: "webhook"
---
# Staging API - polling every 5 minutes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-staging
  annotations:
    headwind.sh/policy: "all"
    headwind.sh/event-source: "polling"
    headwind.sh/polling-interval: "300"
    headwind.sh/require-approval: "false"
---
# Background job - polling every 30 minutes (low priority)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: background-job
  annotations:
    headwind.sh/policy: "minor"
    headwind.sh/event-source: "polling"
    headwind.sh/polling-interval: "1800"
Working with UpdateRequests
Headwind creates UpdateRequest custom resources when it detects a new image version that matches a Deployment's policy. These CRDs track the approval workflow.
Viewing UpdateRequests
# List all UpdateRequests
# Get details of a specific UpdateRequest
# Watch for new UpdateRequests in real-time
UpdateRequest Status
Each UpdateRequest has a phase indicating its current state:
- Pending: Waiting for approval
- Completed: Approved and successfully applied
- Rejected: Rejected by approver
- Failed: Approval granted but update failed
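The phases form a small state machine. The Python sketch below assumes that Pending is the only non-terminal phase; the event names (`approve_ok`, `approve_failed`, `reject`) are hypothetical labels for illustration and do not come from Headwind.

```python
TRANSITIONS = {
    "Pending": {
        "approve_ok": "Completed",      # approved and successfully applied
        "approve_failed": "Failed",     # approved, but the update failed
        "reject": "Rejected",           # rejected by an approver
    },
    # Completed, Rejected, and Failed are treated as terminal here.
}


def next_phase(phase: str, event: str) -> str:
    """Return the phase that follows `event`, or raise if the move is not allowed."""
    try:
        return TRANSITIONS[phase][event]
    except KeyError:
        raise ValueError(f"no transition from {phase!r} on {event!r}") from None
```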
Example UpdateRequest
apiVersion: headwind.sh/v1alpha1
kind: UpdateRequest
metadata:
  name: nginx-update-1-26-0
  namespace: default
spec:
  targetRef:
    kind: Deployment
    name: nginx-example
    namespace: default
  containerName: nginx
  currentImage: nginx:1.25.0
  newImage: nginx:1.26.0
  policy: minor
status:
  phase: Pending
  createdAt: "2025-11-06T01:00:00Z"
  lastUpdated: "2025-11-06T01:00:00Z"
Web UI Dashboard
Headwind provides a modern web-based dashboard for viewing and managing update requests.
Accessing the Web UI
The Web UI is available on port 8082 by default:
# Port forward to access locally
# Open in browser
Features
- Dashboard View: List all pending and completed UpdateRequests across all namespaces
- Filtering & Search:
- Real-time search by resource name or image
- Filter by namespace
- Filter by resource kind (Deployment, StatefulSet, DaemonSet, HelmRelease)
- Filter by policy type
- Sorting: Sort by date (newest/oldest first), namespace, or resource name
- Pagination: View updates in pages of 20 items
- One-Click Actions:
- Approve updates with confirmation
- Reject updates with reason (modal dialog)
- View detailed information for each update
- Real-time Notifications: Toast notifications for success/error
- Responsive Design: Works on desktop and mobile
Screenshots
The Web UI provides:
- Stats Cards: Quick overview of pending and completed updates
- Pending Updates Table: Actionable list with approve/reject buttons
- Completed Updates: Collapsible history of processed updates
- Detail View: Full information about each UpdateRequest
Access at http://localhost:8082 when port-forwarded, or expose via Service/Ingress for remote access.
Authentication
The Web UI supports four authentication modes configured via the HEADWIND_UI_AUTH_MODE environment variable:
1. None (Default)
No authentication required. All actions are logged as "web-ui-user".
env:
  - name: HEADWIND_UI_AUTH_MODE
    value: "none"
2. Simple Header Authentication
Reads username from X-User HTTP header. Suitable for use behind an authenticating reverse proxy.
env:
  - name: HEADWIND_UI_AUTH_MODE
    value: "simple"
Example usage:
3. Kubernetes Token Authentication
Validates bearer tokens using Kubernetes TokenReview API and extracts the authenticated username.
env:
  - name: HEADWIND_UI_AUTH_MODE
    value: "token"
Requirements:
- RBAC permission for authentication.k8s.io/tokenreviews (already included in deploy/k8s/rbac.yaml)
Example usage:
# Get service account token
TOKEN=
# Access Web UI with token
4. Proxy/Ingress Authentication
Reads username from a configurable HTTP header set by an ingress controller or authentication proxy (e.g., oauth2-proxy, Authelia).
env:
  - name: HEADWIND_UI_AUTH_MODE
    value: "proxy"
  - name: HEADWIND_UI_PROXY_HEADER # Optional, defaults to X-Forwarded-User
    value: "X-Auth-Request-User"
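The header-based modes differ only in where the username comes from. The Python sketch below illustrates that lookup for the none, simple, and proxy modes; it is not Headwind's code, `resolve_username` is a hypothetical helper, and token mode is left out because it requires a call to the Kubernetes TokenReview API.

```python
def resolve_username(mode: str, headers: dict, proxy_header: str = "X-Forwarded-User"):
    """Return the username to record for a request, or None if it is missing."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if mode == "none":
        return "web-ui-user"
    if mode == "simple":
        return lowered.get("x-user")
    if mode == "proxy":
        return lowered.get(proxy_header.lower())
    if mode == "token":
        raise NotImplementedError("token mode needs a Kubernetes TokenReview call")
    raise ValueError(f"unknown auth mode: {mode!r}")
```

In the simple and proxy modes the header is trusted as-is, so the Web UI should only be reachable through the proxy that sets it.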
Audit Logging
All approval and rejection actions are logged with structured audit information:
Audit logs use the dedicated log target headwind::audit and can be filtered with:
Auto-Refresh
The dashboard automatically refreshes every 30 seconds to show the latest UpdateRequests. This can be disabled by clicking the "Auto-refresh" toggle in the UI.
Configuration Management
The Web UI supports hot-reload configuration via ConfigMap. Changes to the ConfigMap are detected automatically without requiring pod restarts.
apiVersion: v1
kind: ConfigMap
metadata:
  name: headwind-ui-config
  namespace: headwind-system
data:
  config.yaml: |
    refresh_interval: 30
    max_items_per_page: 20
Mount the ConfigMap in the deployment:
volumeMounts:
  - name: ui-config
    mountPath: /etc/headwind/ui
volumes:
  - name: ui-config
    configMap:
      name: headwind-ui-config
Observability Dashboard
The Web UI includes a comprehensive observability dashboard at /observability with real-time metrics visualization.
Features
- Multi-Backend Support: Automatically detects and connects to Prometheus, VictoriaMetrics, or InfluxDB v2
- Auto-Discovery: Automatically finds available metrics backends in your cluster
- Fallback Mode: Falls back to parsing /metrics endpoint if no backend is available
- Real-Time Data: Auto-refreshes every 30 seconds
- Interactive Time-Series Charts: Chart.js-powered visualizations showing 24-hour trends (Prometheus/VictoriaMetrics/InfluxDB only)
  - Updates Over Time (approved, applied, failed)
  - Resources Watched (deployments, statefulsets, daemonsets, helmreleases)
- Key Metrics Cards:
  - Updates: Pending, Approved, Applied, Failed
  - Resources Watched: Deployments, StatefulSets, DaemonSets, HelmReleases
Configuration
Configure metrics backend via ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: headwind-config
  namespace: headwind-system
data:
  config.yaml: |
    observability:
      metricsBackend: "auto" # auto | prometheus | victoriametrics | influxdb | live
      prometheus:
        enabled: true
        url: "http://prometheus-server.monitoring.svc.cluster.local:80"
      victoriametrics:
        enabled: false
        url: "http://victoria-metrics.monitoring.svc.cluster.local:8428"
      influxdb:
        enabled: false
        url: "http://influxdb.monitoring.svc.cluster.local:8086"
        org: "headwind" # InfluxDB v2 organization
        bucket: "metrics" # InfluxDB v2 bucket
        token: "your-api-token" # InfluxDB v2 API token
Backend Options:
- auto - Automatically detects available backend (default)
- prometheus - Use Prometheus for metrics storage and queries
- victoriametrics - Use VictoriaMetrics (Prometheus-compatible API)
- influxdb - Use InfluxDB v2 for time-series data
- live - Parse metrics directly from /metrics endpoint (no external backend)
Auto-Discovery Priority: Prometheus → VictoriaMetrics → InfluxDB → Live
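The priority order can be expressed as a first-match search. The Python sketch below is an illustration only: `select_backend` is a hypothetical helper, and `reachable` stands for whichever backends a health probe found, which is an assumption about how discovery works.

```python
PRIORITY = ["prometheus", "victoriametrics", "influxdb"]


def select_backend(configured: str, reachable: set) -> str:
    """Pick the metrics backend for the dashboard."""
    if configured != "auto":
        return configured               # an explicit choice always wins
    for name in PRIORITY:
        if name in reachable:
            return name                 # first reachable backend in priority order
    return "live"                       # fall back to parsing /metrics directly
```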
API Endpoints
# Get current metrics from configured backend
# Get 24-hour time-series data for specific metric
Prometheus Integration Example
Deploy Prometheus to scrape Headwind metrics:
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    scrape_configs:
      - job_name: 'headwind'
        static_configs:
          - targets: ['headwind-metrics.headwind-system.svc.cluster.local:9090']
        scrape_interval: 15s
The observability dashboard will automatically detect Prometheus and display metrics from it.
API Endpoints
Approval API (Port 8081)
# List all pending updates across all namespaces
# Get specific update by namespace and name
# Approve update (automatically executes the update)
# Reject update with reason
# Example: Approve an update
Note: Approving an update immediately executes the deployment update and updates the UpdateRequest CRD status.
Rollback API
Headwind automatically tracks update history for all deployments and provides manual rollback capabilities.
Using kubectl Plugin (Recommended)
# Install the kubectl plugin
# Rollback a deployment
# View update history
# List all pending updates
# Approve/reject updates
See KUBECTL_PLUGIN.md for complete plugin documentation.
Using curl directly
# Get update history for a deployment
# Rollback to the previous image
# Example: Rollback nginx deployment
# Get history
Automatic Rollback
When enabled, Headwind automatically monitors deployment health after updates and rolls back if failures are detected:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    # Enable automatic rollback (default: false)
    headwind.sh/auto-rollback: "true"
    # How long to monitor deployment health (default: 300s)
    headwind.sh/rollback-timeout: "300"
    # Number of failed health checks before rollback (default: 3)
    headwind.sh/health-check-retries: "3"
Automatic rollback triggers on:
- CrashLoopBackOff: Pods repeatedly crashing
- ImagePullBackOff: Unable to pull new image
- High restart count: Container restarts > 5 times
- Readiness failures: Pods not becoming ready
- Deployment deadline exceeded: ProgressDeadlineExceeded condition
When a failure is detected, Headwind automatically:
- Logs the failure reason
- Reverts to the previous working image
- Creates a rollback entry in the update history
- Continues monitoring the rolled-back deployment
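The trigger conditions and the retry threshold can be combined as in the Python sketch below. It is a simplified model, not Headwind's health checker: `PodObservation`, `is_unhealthy`, and `should_roll_back` are hypothetical names, and counting consecutive (rather than total) failed checks is an assumption.

```python
from dataclasses import dataclass

ROLLBACK_REASONS = {"CrashLoopBackOff", "ImagePullBackOff"}
MAX_RESTARTS = 5


@dataclass
class PodObservation:
    waiting_reason: str = ""    # container waiting reason, if any
    restart_count: int = 0
    ready: bool = True


def is_unhealthy(obs: PodObservation, progress_deadline_exceeded: bool = False) -> bool:
    """Evaluate one health check against the documented triggers."""
    return (
        progress_deadline_exceeded
        or obs.waiting_reason in ROLLBACK_REASONS
        or obs.restart_count > MAX_RESTARTS
        or not obs.ready
    )


def should_roll_back(checks: list, retries: int = 3) -> bool:
    """Roll back once `retries` consecutive checks were unhealthy (True)."""
    streak = 0
    for unhealthy in checks:
        streak = streak + 1 if unhealthy else 0
        if streak >= retries:
            return True
    return False
```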
Update History
All updates are tracked in deployment annotations:
# View update history in deployment annotations
# Example output:
[
{
}
{
}
Headwind keeps the last 10 updates per container.
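A bounded per-container history like this is easy to model with a fixed-length queue. The Python sketch below only illustrates the retention rule; `UpdateHistory` is a hypothetical class, and Headwind itself stores the history in deployment annotations rather than in memory.

```python
from collections import defaultdict, deque

MAX_ENTRIES = 10


class UpdateHistory:
    def __init__(self):
        # One bounded queue per container; the oldest entry drops off automatically.
        self._by_container = defaultdict(lambda: deque(maxlen=MAX_ENTRIES))

    def record(self, container: str, image: str) -> None:
        self._by_container[container].append(image)

    def entries(self, container: str) -> list:
        return list(self._by_container[container])

    def previous_image(self, container: str):
        """Image to roll back to: the entry before the most recent one."""
        items = self._by_container[container]
        return items[-2] if len(items) >= 2 else None
```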
Notifications
Headwind can send notifications about deployment updates to Slack, Microsoft Teams, or generic webhooks.
Configuration
Configure notifications using environment variables in deploy/k8s/deployment.yaml:
env:
  # Slack Configuration
  - name: SLACK_ENABLED
    value: "true"
  - name: SLACK_WEBHOOK_URL
    value: "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
  - name: SLACK_CHANNEL # Optional: override webhook default
    value: "#deployments"
  - name: SLACK_USERNAME # Optional: customize bot name
    value: "Headwind Bot"
  - name: SLACK_ICON_EMOJI # Optional: customize bot icon
    value: ":rocket:"
  # Microsoft Teams Configuration
  - name: TEAMS_ENABLED
    value: "true"
  - name: TEAMS_WEBHOOK_URL
    value: "https://outlook.office.com/webhook/YOUR-WEBHOOK-URL"
  # Generic Webhook Configuration
  - name: WEBHOOK_ENABLED
    value: "true"
  - name: WEBHOOK_URL
    value: "https://your-webhook-endpoint.com/notifications"
  - name: WEBHOOK_SECRET # Optional: HMAC signature verification
    value: "your-secret-key"
  - name: WEBHOOK_TIMEOUT # Optional: timeout in seconds (default: 10)
    value: "10"
  - name: WEBHOOK_MAX_RETRIES # Optional: max retries (default: 3)
    value: "3"
  # Dashboard Integration
  - name: HEADWIND_UI_URL # Optional: adds "View in Dashboard" links to notifications
    value: "https://headwind.example.com" # or http://localhost:8082 for local
Notification Events
Headwind sends notifications for the following events:
- UpdateRequestCreated: New UpdateRequest CRD created (requires approval)
- UpdateApproved: Update approved by user
- UpdateRejected: Update rejected by user
- UpdateCompleted: Update successfully applied
- UpdateFailed: Update failed to apply
- RollbackTriggered: Automatic rollback triggered due to health check failure
- RollbackCompleted: Rollback completed successfully
- RollbackFailed: Rollback failed
Slack Integration
Slack notifications use Block Kit for rich formatting with:
- Color-coded messages by event type
- Deployment details (namespace, name, images)
- Interactive "View in Dashboard" button (when HEADWIND_UI_URL is set)
- Interactive "Approve" button (when approval API is available)
- Timestamp with relative time formatting
Microsoft Teams Integration
Teams notifications use Adaptive Cards with:
- Color themes matching event severity
- Structured fact display
- "View in Dashboard" action button (when HEADWIND_UI_URL is set)
- "Approve" action button (when approval API is available)
- Kubernetes logo branding
Generic Webhook Format
Generic webhooks receive JSON payloads with HMAC SHA256 signature verification:
Signature is sent in the X-Headwind-Signature header as sha256=<hex>.
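To verify (a Python sketch; it assumes the signature is computed over the raw request body with WEBHOOK_SECRET as the key):
import hashlib
import hmac

def verify_signature(payload: bytes, signature: str, secret: str) -> bool:
    # Recompute the HMAC-SHA256 of the body and compare in constant time.
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(f"sha256={expected}", signature)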
Notification Metrics
Monitor notification delivery with Prometheus metrics:
- headwind_notifications_sent_total - Total notifications sent successfully
- headwind_notifications_failed_total - Total notification failures
- headwind_notifications_slack_sent_total - Notifications sent to Slack
- headwind_notifications_teams_sent_total - Notifications sent to Teams
- headwind_notifications_webhook_sent_total - Notifications sent via webhook
Metrics (Port 9090)
Prometheus metrics available at:
http://headwind-metrics:9090/metrics
Available metrics:
- headwind_webhook_events_total - Total webhook events received
- headwind_webhook_events_processed - Successfully processed events
- headwind_polling_cycles_total - Total polling cycles completed
- headwind_polling_images_checked_total - Images checked during polling
- headwind_polling_new_tags_found_total - New tags discovered via polling
- headwind_polling_helm_charts_checked_total - Helm charts checked during polling
- headwind_polling_helm_new_versions_found_total - Helm chart versions discovered via polling
- headwind_polling_errors_total - Polling errors encountered
- headwind_updates_pending - Updates awaiting approval
- headwind_updates_approved_total - Total approved updates
- headwind_updates_rejected_total - Total rejected updates
- headwind_updates_applied_total - Successfully applied updates
- headwind_updates_failed_total - Failed update attempts
- headwind_updates_skipped_interval_total - Updates skipped due to minimum interval not elapsed
- headwind_reconcile_duration_seconds - Controller reconciliation time
- headwind_deployments_watched - Number of watched Deployments
- headwind_helm_releases_watched - Number of watched HelmReleases
- headwind_helm_chart_versions_checked_total - Helm chart version checks performed
- headwind_helm_updates_found_total - Helm chart updates discovered
- headwind_helm_updates_approved_total - Helm chart updates approved by policy
- headwind_helm_updates_rejected_total - Helm chart updates rejected by policy
- headwind_helm_updates_applied_total - Helm chart updates successfully applied
- headwind_rollbacks_total - Total rollback operations performed
- headwind_rollbacks_manual_total - Manual rollback operations
- headwind_rollbacks_automatic_total - Automatic rollback operations
- headwind_rollbacks_failed_total - Failed rollback operations
- headwind_deployment_health_checks_total - Deployment health checks performed
- headwind_deployment_health_failures_total - Deployment health check failures detected
- headwind_notifications_sent_total - Total notifications sent successfully
- headwind_notifications_failed_total - Total notification failures
- headwind_notifications_slack_sent_total - Notifications sent to Slack
- headwind_notifications_teams_sent_total - Notifications sent to Teams
- headwind_notifications_webhook_sent_total - Notifications sent via webhook
Architecture
┌─────────────────┐
│  Registry       │
│  (Docker Hub,   │
│   Harbor, etc)  │
└────┬────────┬───┘
     │        │
     │Webhook │Polling
     │        │(optional)
     ▼        ▼
┌──────────────────┐
│   Headwind       │
│  ┌────────────┐  │
│  │  Webhook   │  │◄─── Port 8080
│  │  Server    │  │
│  └──────┬─────┘  │
│         │        │
│  ┌──────▼─────┐  │
│  │  Registry  │  │
│  │  Poller    │  │
│  └──────┬─────┘  │
│         │        │
│  ┌──────▼─────┐  │
│  │  Policy    │  │
│  │  Engine    │  │
│  └──────┬─────┘  │
│         │        │
│  ┌──────▼─────┐  │
│  │  Approval  │  │◄─── Port 8081 (API)
│  │  System    │  │
│  └──────┬─────┘  │
│         │        │
│  ┌──────▼─────┐  │
│  │ Controller │  │
│  └──────┬─────┘  │
│         │        │
│  ┌──────▼─────┐  │
│  │  Metrics   │  │◄─── Port 9090
│  └────────────┘  │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│   Kubernetes     │
│   API Server     │
└──────────────────┘
Development
Build
# or
Test
# Run all tests (unit + integration)
# or
# Run only unit tests
# Run only integration tests
# Run specific integration test file
# Run with output
Test Structure
The project includes both unit and integration tests:
Unit Tests (30 tests) - Located within source modules (src/)
- Test individual functions and components in isolation
- Run with cargo test --lib
Integration Tests (40 tests) - Located in tests/ directory
- Test end-to-end functionality and module interaction
- tests/policy_integration_test.rs - Policy engine tests (12 tests)
  - Semantic versioning policies (patch, minor, major)
  - Special policies (all, none, force, glob)
  - Version prefix handling (v1.0.0)
  - Prerelease and build metadata
  - Real-world scenarios (Kubernetes versions, Docker tags)
- tests/webhook_integration_test.rs - Webhook parsing tests (10 tests)
  - Docker Hub webhook format
  - OCI registry webhook format
  - Multiple events in single webhook
  - Edge cases (missing tags, special characters)
- tests/rollback_integration_test.rs - Rollback functionality tests (18 tests)
  - Update history tracking and serialization
  - Automatic rollback configuration
  - Health status monitoring
  - History entry management (max entries, multiple containers)
  - camelCase JSON serialization
Test Helpers - Located in tests/common/mod.rs
- Reusable test fixtures and helper functions
- create_test_deployment() - Create Kubernetes Deployment fixtures
- headwind_annotations() - Generate Headwind annotation sets
- create_dockerhub_webhook_payload() - Docker Hub webhook JSON
- create_registry_webhook_payload() - OCI registry webhook JSON
Running Specific Test Categories
# Policy engine tests
# Webhook tests
# Test a specific policy type
# Test version handling
Development Tools
Install all development tools:
This installs:
- cargo-audit - Security vulnerability scanning
- cargo-deny - Dependency license and security checking
- cargo-udeps - Unused dependency detection
- cargo-tarpaulin - Code coverage
- cargo-watch - Auto-rebuild on file changes
- pre-commit - Git hooks for code quality
Pre-commit Hooks
The project uses pre-commit hooks to ensure code quality:
# Install hooks
# Run manually
# Hooks automatically run on git commit:
# - cargo fmt (formatting)
# - cargo clippy (linting)
# - cargo check (compilation)
# - YAML validation
# - Secret detection
# - Trailing whitespace removal
Run Locally
# or
RUST_LOG=headwind=debug
Requires KUBECONFIG to be set and pointing to a valid Kubernetes cluster.
Current Status
Headwind is currently in beta stage (v0.2.0-alpha). Core functionality is complete and tested:
✅ Completed Features
- ✅ Webhook events connected to controller and create UpdateRequests
- ✅ Approved updates are automatically applied to Deployments
- ✅ Registry polling with digest-based and version discovery
- ✅ Full approval workflow with UpdateRequest CRDs
- ✅ Policy engine works and is well-tested
- ✅ All servers operational (webhook:8080, API:8081, metrics:9090)
- ✅ Kubernetes controller watches and updates Deployments
- ✅ Flux HelmRelease support with version monitoring
- ✅ Minimum update interval respected
- ✅ Deduplication to avoid update request spam
- ✅ Private registry authentication (Docker Hub, ECR, GCR, ACR, Harbor, GHCR, GitLab)
- ✅ Manual rollback functionality with update history tracking
- ✅ Automatic rollback on deployment failures
- ✅ Notification integrations (Slack, Teams, webhooks)
🚧 In Progress
- 🚧 Comprehensive integration tests (70 tests passing, manual testing successful)
- 🚧 CI/CD pipeline enhancements
📋 Planned Features
- StatefulSet and DaemonSet support
- Full Helm repository querying for automatic version discovery
- Web UI for approvals
Production readiness: Core workflow is functional. Suitable for testing environments. For production use, we recommend waiting for comprehensive integration tests and private registry support.
Troubleshooting
Headwind Not Starting
# Check logs
# Common issues:
# 1. RBAC permissions - verify ServiceAccount has correct permissions
# 2. Cluster connectivity - ensure pod can reach Kubernetes API
# 3. Image pull - verify image is accessible
Webhooks Not Received
# Test webhook endpoint
# Check webhook metrics
Updates Not Applying
Check the status in the approval API:
Viewing Metrics
Or configure Prometheus to scrape:
- job_name: 'headwind'
  kubernetes_sd_configs:
    - role: pod
      namespaces:
        names:
          - headwind-system
  relabel_configs:
    - source_labels:
      action: keep
      regex: true
    - source_labels:
      action: replace
      target_label: __address__
      regex: (.+):(.+)
      replacement: $1:9090
Security Considerations
Running in Production
- Use RBAC least-privilege
  - Headwind only needs permissions on resources it manages
  - Review and customize deploy/k8s/rbac.yaml
- Secure webhook endpoints
  - Use Ingress with TLS
  - Implement webhook signature verification
  - Use network policies to restrict access
- Protect approval API
  - Add authentication (OAuth2/OIDC)
  - Use TLS for all connections
  - Audit all approval actions
- Container security
  - Headwind runs as non-root (UID 1000)
  - Read-only root filesystem
  - No privilege escalation
  - Minimal base image (Debian slim)
Network Policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: headwind-network-policy
  namespace: headwind-system
spec:
  podSelector:
    matchLabels:
      app: headwind
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
      ports:
        - protocol: TCP
          port: 8080 # Webhooks
        - protocol: TCP
          port: 8081 # API
        - protocol: TCP
          port: 9090 # Metrics
  egress:
    - to:
        - namespaceSelector:
      ports:
        - protocol: TCP
          port: 443 # Kubernetes API
Roadmap
v0.2.0 - Core Functionality ✅ COMPLETE (except testing)
- Project structure and foundation
- Connect webhook events to controller (PR #21)
- Implement update application (PR #20, PR #22)
- Respect minimum update interval (PR #21)
- UpdateRequest CRD implementation (PR #19)
- Registry polling implementation (in progress - feat/registry-polling branch)
- Add comprehensive integration tests
- CI/CD pipeline enhancements
v0.3.0 - Extended Support (Medium Priority)
- Private registry authentication (completed)
- Manual rollback functionality (completed)
- Automatic rollback on deployment failures (completed)
- Rollback metrics (completed)
- kubectl plugin for rollback and approvals (completed)
- Notification system (Slack, Teams, generic webhooks) (completed)
- Flux HelmRelease support with automatic version discovery (completed)
  - Automatic chart version discovery from HTTP and OCI registries
  - UpdateRequest creation for chart updates
  - Approval workflow integration
  - Chart version patching on approval
  - Full metrics and notification support
- StatefulSet/DaemonSet support (completed)
  - StatefulSet and DaemonSet controllers
  - Same annotation-based configuration as Deployments
  - Approval workflow integration
  - Full metrics support
- Multi-architecture Docker images (arm64, amd64)
v0.4.0 - Enhanced UX (Low Priority)
- Web dashboard for approvals
- Custom Resource Definition for policy config
- Slack/Teams interactive approvals
- Advanced scheduling (maintenance windows, etc.)
Future Ideas
- Multi-cluster support
- Canary deployment integration
- Custom update strategies (blue/green, rolling window)
- A/B testing support
- Rate limiting per namespace
- Policy simulation/dry-run mode
FAQ
Q: How is this different from Argo CD or Flux?
A: Argo CD and Flux are GitOps tools that sync from Git. Headwind updates workloads when new container images are pushed to registries, regardless of Git state. They're complementary - you can use both.
Q: Can I use this with Flux/Argo?
A: Yes! Headwind can update the image tags, and Flux/Argo will see the change and sync. Or let Flux handle chart updates and Headwind handle image updates.
Q: Does this work with private registries?
A: Yes! Headwind reads credentials from your Kubernetes imagePullSecrets. Supported registries:
- Docker Hub (including Personal Access Tokens)
- AWS ECR
- Google GCR/Artifact Registry
- Azure ACR
- Harbor, GHCR, GitLab, and other registries
Simply configure your ServiceAccount's imagePullSecrets as usual, and Headwind will use them automatically.
Q: What about rollbacks?
A: Headwind includes both manual and automatic rollback support:
- Manual rollback: Use the API to roll back to a previous version (POST /api/v1/rollback/{namespace}/{deployment}/{container})
- Automatic rollback: Enable headwind.sh/auto-rollback: "true" to automatically detect and roll back failed updates
- Update history: View the last 10 updates per container in deployment annotations
- You can also use kubectl rollout undo for immediate rollbacks
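To make the rollback behavior above concrete, here is a minimal Rust sketch of a bounded per-container history and of how the manual rollback path is assembled. The UpdateHistory type, its methods, and the rollback_path helper are hypothetical names chosen for illustration; only the 10-entry limit and the endpoint shape come from this README, and Headwind's real implementation (which stores history in annotations) may differ.

```rust
use std::collections::VecDeque;

/// Entries kept per container, matching the "last 10 updates" stated above.
const MAX_HISTORY: usize = 10;

/// Hypothetical in-memory history of images deployed to one container.
/// Entries are ordered oldest first, newest last.
#[derive(Debug, Default)]
pub struct UpdateHistory {
    entries: VecDeque<String>,
}

impl UpdateHistory {
    /// Records a newly deployed image, evicting the oldest entry when full.
    pub fn record(&mut self, image: &str) {
        if self.entries.len() == MAX_HISTORY {
            self.entries.pop_front();
        }
        self.entries.push_back(image.to_string());
    }

    /// Number of entries currently stored.
    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// True when no update has been recorded yet.
    pub fn is_empty(&self) -> bool {
        self.entries.is_empty()
    }

    /// The image a rollback would restore: the entry just before the newest
    /// one. Returns None when there is no earlier version to return to.
    pub fn rollback_target(&self) -> Option<&str> {
        let n = self.entries.len();
        if n < 2 {
            None
        } else {
            self.entries.get(n - 2).map(String::as_str)
        }
    }
}

/// Builds the path of the manual rollback endpoint quoted in the FAQ.
/// The caller is assumed to send it as a POST to the Headwind API server.
pub fn rollback_path(namespace: &str, deployment: &str, container: &str) -> String {
    format!("/api/v1/rollback/{namespace}/{deployment}/{container}")
}
```

Capping the queue keeps the stored history small, which matters if it is serialized into an annotation, since Kubernetes limits total annotation size per object.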
Q: Can I test updates in staging first?
A: Yes! Use different policies per namespace:
# staging namespace - auto-update all
headwind.sh/policy: "all"
headwind.sh/require-approval: "false"
# production namespace - require approval
headwind.sh/policy: "minor"
headwind.sh/require-approval: "true"
Q: What if I want to pin a specific version?
A: Set headwind.sh/policy: "none" to prevent any updates, or remove the Headwind annotations entirely.
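The policy values used in the answers above can be read as a simple decision rule. The Rust sketch below shows one plausible interpretation for the none, patch, minor, major, and all policies. The Policy enum, parse_version, and should_update are hypothetical names, and the matching rules are assumptions for illustration rather than Headwind's actual policy engine (the glob and force policies are left out).

```rust
/// Update policies mentioned in this README (glob and force omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Policy {
    None,
    Patch,
    Minor,
    Major,
    All,
}

/// A plain MAJOR.MINOR.PATCH version. Field order matters: the derived
/// ordering compares major first, then minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Parses tags such as "1.2.3" or "v1.2.3". Pre-release or build suffixes
/// are not handled in this sketch and yield None.
pub fn parse_version(tag: &str) -> Option<Version> {
    let tag = tag.strip_prefix('v').unwrap_or(tag);
    let mut parts = tag.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some(Version { major, minor, patch })
}

/// Decides whether `candidate` is an allowed update from `current`.
/// Assumed semantics: "none" never updates, "all" accepts any different tag,
/// and the semver policies accept only strictly newer versions within the
/// allowed range.
pub fn should_update(policy: Policy, current: &str, candidate: &str) -> bool {
    match policy {
        Policy::None => false,
        Policy::All => current != candidate,
        _ => {
            let (Some(cur), Some(new)) = (parse_version(current), parse_version(candidate))
            else {
                return false; // unparsable tags are never semver updates
            };
            if new <= cur {
                return false; // never downgrade or re-apply the same version
            }
            match policy {
                Policy::Patch => new.major == cur.major && new.minor == cur.minor,
                Policy::Minor => new.major == cur.major,
                Policy::Major => true,
                Policy::None | Policy::All => unreachable!(),
            }
        }
    }
}
```

Under these assumptions, a production workload on "minor" would accept 1.2.3 to 1.3.0 but hold back 2.0.0, while a staging workload on "all" would take any new tag.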
Performance
Expected performance characteristics:
- Webhook processing: <10ms per event
- Reconciliation loop: <100ms per Deployment
- Memory usage: ~50-100MB typical
- CPU usage: <0.1 core typical, <0.5 core under load
Tested with:
- 1000 Deployments with Headwind annotations
- 100 webhooks/minute
- Single replica of Headwind
For larger scale, consider:
- Running multiple replicas
- Using leader election
- Filtering namespaces with label selectors
Contributing
We welcome contributions! Please see:
- CONTRIBUTING.md - Contribution guidelines
- CLAUDE.md - Architecture and development context
- Issues - Open issues and feature requests
- Pull Requests - Current PRs
Quick Start for Contributors
# Fork and clone
# Build and test
# Run locally (requires k8s cluster)
# Create a branch
# Make changes, commit, and push
# Open a pull request
License
MIT License - see LICENSE file for details.