docs.rs failed to build mockforge-k8s-operator-0.3.5
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
Visit the last successful build:
mockforge-k8s-operator-0.3.71
MockForge Kubernetes Operator
Kubernetes operator for managing chaos engineering orchestrations as Custom Resource Definitions (CRDs).
This crate provides a Kubernetes-native way to define, schedule, and execute chaos engineering experiments across your cluster. It integrates with the MockForge chaos engineering framework to provide GitOps-style chaos testing with full Kubernetes integration.
Features
- Kubernetes CRDs: Define chaos orchestrations as Kubernetes resources
- GitOps Integration: Version control your chaos experiments
- Scheduled Executions: Cron-based automatic chaos testing
- Multi-Service Targeting: Apply chaos to specific Kubernetes services
- Lifecycle Hooks: Pre/post chaos execution hooks
- Assertions & Validation: Built-in success criteria validation
- Metrics & Monitoring: Prometheus metrics for operator health
- Admission Webhooks: Validate chaos orchestrations before execution
- RBAC Integration: Kubernetes-native access control
Installation
Using Helm
# Add MockForge Helm repository
# Install the operator
# Install with custom values
Manual Installation
# Apply CRDs
# Deploy operator
# Verify installation
Custom Resource Definition
ChaosOrchestration
Define chaos experiments as Kubernetes resources:
apiVersion: mockforge.io/v1
kind: ChaosOrchestration
metadata:
name: api-failure-test
namespace: chaos-testing
spec:
name: "API Failure Resilience Test"
description: "Test API resilience against service failures"
# Optional: Schedule automatic execution
schedule: "0 2 * * *" # Daily at 2 AM
# Global variables
variables:
api_service: "api-server"
test_duration: 300
# Target services
targetServices:
- name: api-server
namespace: production
selector:
app: api-server
- name: database
namespace: production
selector:
app: postgres
# Orchestration steps
steps:
- name: "baseline-measurement"
scenario: "baseline"
durationSeconds: 60
- name: "inject-failures"
scenario: "service_failure"
durationSeconds: 180
parameters:
failureRate: 0.1
services:
- name: "network-chaos"
scenario: "network_partition"
durationSeconds: 120
parameters:
partitionType: "partial"
affectedServices:
- name: "recovery-test"
scenario: "recovery_validation"
durationSeconds: 60
# Lifecycle hooks
hooks:
- name: "pre-chaos"
type: "pre"
actions:
- type: "http"
url: "https://slack.com/api/chat.postMessage"
method: "POST"
body: '{"text": "Starting chaos experiment: {{.Name}}"}'
- name: "post-chaos"
type: "post"
actions:
- type: "metric"
name: "chaos_experiment_completed"
value: 1
# Success criteria
assertions:
- name: "api-availability"
type: "metric"
metric: "http_request_duration_seconds"
operator: "lt"
value: 5.0
description: "API response time should stay under 5 seconds"
- name: "error-rate"
type: "metric"
metric: "http_requests_total"
labels: 'code=~"5.."'
operator: "lt"
value: 0.05
description: "Error rate should stay below 5%"
status:
phase: "Pending"
progress: "0/4 steps completed"
startTime: "2024-01-01T00:00:00Z"
conditions:
- type: "Ready"
status: "True"
lastTransitionTime: "2024-01-01T00:00:00Z"
Usage Examples
Basic Chaos Orchestration
apiVersion: mockforge.io/v1
kind: ChaosOrchestration
metadata:
name: simple-failure-test
spec:
name: "Simple Service Failure Test"
steps:
- name: "inject-pod-failure"
scenario: "pod_kill"
durationSeconds: 30
parameters:
targetPods: "app=api-server"
killCount: 2
Network Chaos
apiVersion: mockforge.io/v1
kind: ChaosOrchestration
metadata:
name: network-chaos
spec:
name: "Network Partition Test"
steps:
- name: "network-partition"
scenario: "network_chaos"
durationSeconds: 120
parameters:
action: "partition"
sourceSelector: "app=api-server"
targetSelector: "app=database"
direction: "both"
Multi-Scenario Orchestration
apiVersion: mockforge.io/v1
kind: ChaosOrchestration
metadata:
name: comprehensive-chaos
spec:
name: "Comprehensive Chaos Test"
targetServices:
- name: web-frontend
selector: "app=web"
- name: api-backend
selector: "app=api"
- name: database
selector: "app=db"
steps:
- name: "cpu-stress"
scenario: "resource_stress"
durationSeconds: 60
parameters:
resource: "cpu"
stressLevel: 80
targets:
- name: "memory-pressure"
scenario: "resource_stress"
durationSeconds: 60
parameters:
resource: "memory"
stressLevel: 90
targets:
- name: "network-latency"
scenario: "network_chaos"
durationSeconds: 120
parameters:
action: "delay"
delayMs: 500
jitterMs: 100
sourceSelector: "app=api"
targetSelector: "app=db"
- name: "pod-failures"
scenario: "pod_chaos"
durationSeconds: 180
parameters:
action: "kill"
killCount: 1
intervalSeconds: 30
targets:
Available Scenarios
Pod Chaos
- pod_kill: Randomly kill pods matching selectors
- pod_restart: Restart pods without killing them
- pod_scale: Scale deployments up/down
Network Chaos
- network_delay: Add latency to network traffic
- network_loss: Drop packets randomly
- network_partition: Create network partitions
- network_corruption: Corrupt network packets
Resource Chaos
- cpu_stress: Consume CPU resources
- memory_stress: Consume memory resources
- disk_stress: Fill disk space
- io_stress: Stress I/O operations
Application Chaos
- http_errors: Inject HTTP error responses
- service_failure: Make services unresponsive
- dependency_failure: Break service dependencies
Monitoring & Observability
Prometheus Metrics
The operator exposes comprehensive metrics:
# Operator health
mockforge_operator_up 1
# Orchestration status
mockforge_orchestration_total{phase="Running"} 3
mockforge_orchestration_total{phase="Completed"} 12
mockforge_orchestration_total{phase="Failed"} 1
# Step execution times
mockforge_step_duration_seconds{name="inject-failures", orchestration="api-test"} 180
# Assertion results
mockforge_assertion_passed_total{assertion="api-availability"} 15
mockforge_assertion_failed_total{assertion="api-availability"} 2
Viewing Orchestration Status
# List all orchestrations
# Get detailed status
# View logs
# Check metrics
RBAC Configuration
ClusterRole for Operator
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: mockforge-operator
rules:
- apiGroups:
resources:
verbs:
- apiGroups:
resources:
verbs:
- apiGroups:
resources:
verbs:
- apiGroups:
resources:
verbs:
Admission Webhooks
Enable validation webhooks:
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
name: mockforge-operator-webhook
webhooks:
- name: chaosorchestration.mockforge.io
rules:
- operations:
apiGroups:
apiVersions:
resources:
clientConfig:
service:
name: mockforge-operator-webhook
namespace: mockforge-system
caBundle: <CA_BUNDLE>
admissionReviewVersions:
sideEffects: None
Development
Building the Operator
# Build the operator binary
# Build Docker image
Running Locally
# Use kubeconfig for local development
# Run operator locally (requires cluster access)
Testing
# Run unit tests
# Run integration tests (requires Kubernetes cluster)
Configuration
Operator Configuration
# config/config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: mockforge-operator-config
namespace: mockforge-system
data:
config.yaml: |
# Operator settings
leaderElection:
enabled: true
leaseDuration: 15s
renewDeadline: 10s
retryPeriod: 2s
# Chaos settings
chaos:
defaultDuration: 300s
maxConcurrentOrchestrations: 5
enableReporting: true
# Metrics settings
metrics:
enabled: true
port: 9090
path: /metrics
# Webhook settings
webhook:
enabled: true
port: 9443
certDir: /tmp/k8s-webhook-server/serving-certs
Troubleshooting
Common Issues
Operator not starting:
# Check operator logs
# Verify RBAC permissions
CRDs not installed:
# Install CRDs
# Verify CRDs
|
Orchestration stuck:
# Check orchestration status
# Check operator events
Webhook validation failures:
# Check webhook configuration
# Check webhook logs
API Reference
ChaosOrchestration Spec
| Field | Type | Description |
|---|---|---|
name |
string |
Human-readable name for the orchestration |
description |
string? |
Optional description |
schedule |
string? |
Cron schedule for automatic execution |
steps |
OrchestrationStep[] |
Steps to execute |
variables |
object |
Global variables available to all steps |
hooks |
OrchestrationHook[] |
Lifecycle hooks |
assertions |
OrchestrationAssertion[] |
Success criteria |
targetServices |
TargetService[] |
Kubernetes services to target |
OrchestrationStep
| Field | Type | Description |
|---|---|---|
name |
string |
Step identifier |
scenario |
string |
Chaos scenario to execute |
durationSeconds |
number? |
How long to run the scenario |
delayBeforeSeconds |
number |
Delay before starting (default: 0) |
continueOnFailure |
boolean |
Continue if step fails (default: false) |
parameters |
object |
Scenario-specific parameters |
Contributing
See the main MockForge repository for contribution guidelines.
License
Licensed under MIT OR Apache-2.0