# Deployment Architecture Summary
## 🏗️ Complete Infrastructure
```
┌─────────────────────────────────────────────────────────────────┐
│                        Internet / Users                         │
└────────────────────────────┬────────────────────────────────────┘
                             │
                    ┌────────▼────────┐
                    │  LoadBalancer   │ ← External access
                    │  (AWS NLB/GCP)  │ ← Health checks
                    └────────┬────────┘ ← SSL termination
                             │
           ┌─────────────────┼─────────────────┐
           │                 │                 │
      ┌────▼────┐       ┌────▼────┐       ┌────▼────┐
      │  Pod 1  │       │  Pod 2  │       │  Pod 3  │ ← HPA manages count
      │ Gateway │       │ Gateway │       │ Gateway │ ← VPA adjusts resources
      └────┬────┘       └────┬────┘       └────┬────┘
           │                 │                 │
           └────────┬────────┴────────┬────────┘
                    │                 │
              ┌─────▼─────┐     ┌─────▼─────┐
              │   Redis   │     │  Backend  │
              │   Cache   │     │ Services  │
              └───────────┘     └───────────┘
```
## 📦 Docker Images
### 1. Main Gateway Image
```dockerfile
FROM rust:1.75-slim AS builder
# Build greeter and federation binaries
FROM debian:bookworm-slim
# Runtime with minimal dependencies
```
**Features**:
- Multi-stage build (minimal size)
- Non-root user (security)
- Health checks
- Supports both greeter and federation modes
### 2. Federation Image
```dockerfile
FROM rust:1.75-slim AS builder
# Build federation with all subgraphs
FROM debian:bookworm-slim
# Runs user, product, review subgraphs
```
**Ports**:
- 8891: User subgraph
- 8892: Product subgraph
- 8893: Review subgraph
- 50051-50053: gRPC ports
- 9090: Metrics
## ☸️ Kubernetes Resources
### Core Resources
```
Deployment
├── ReplicaSet (managed by HPA)
├── Pods (3-50 replicas)
│   ├── Container: gateway
│   ├── Liveness probe: /health
│   └── Readiness probe: /health
└── PodDisruptionBudget (min 2 available)
```
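
As a rough sketch, the probes and disruption budget in the tree above could be written as follows. Only the `/health` path and the minimum of 2 available pods come from this summary; the `app: gateway` label, the `gateway-pdb` name, the named port `http`, and the probe timings are assumptions for illustration.

```yaml
# Container fragment (belongs under spec.template.spec.containers in the Deployment).
- name: gateway
  livenessProbe:
    httpGet:
      path: /health
      port: http            # assumed named containerPort
    periodSeconds: 10       # assumed timing
    failureThreshold: 3     # assumed
  readinessProbe:
    httpGet:
      path: /health
      port: http
    periodSeconds: 5        # assumed timing
---
# Keeps at least 2 gateway pods running during voluntary disruptions (node drains, upgrades).
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: gateway-pdb         # hypothetical name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: gateway          # assumed pod label
```

A PodDisruptionBudget only limits voluntary evictions; it does not protect against node failures, which is one reason to keep the minimum replica count above the budget.
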
### Services
```
Service (ClusterIP)
└── Session affinity: ClientIP
LoadBalancer (optional)
├── External IP
├── Health checks
└── Traffic policy: Local/Cluster
```
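
The two Service shapes above might look like the manifests below. This is a sketch under assumptions: the Service names, the `app: gateway` selector, port `80`, and the named target port `http` are illustrative and not taken from the chart.

```yaml
# Internal Service with client-IP session affinity.
apiVersion: v1
kind: Service
metadata:
  name: gateway                   # hypothetical name
spec:
  type: ClusterIP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800       # Kubernetes default (3 hours)
  selector:
    app: gateway                  # assumed pod label
  ports:
    - name: http
      port: 80                    # assumed
      targetPort: http            # assumed named containerPort
---
# Optional external LoadBalancer Service.
apiVersion: v1
kind: Service
metadata:
  name: gateway-lb                # hypothetical name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local    # or Cluster
  selector:
    app: gateway
  ports:
    - name: http
      port: 80
      targetPort: http
```

With `externalTrafficPolicy: Local`, traffic is only delivered to pods on the node that received it, which preserves the client source IP but can spread load unevenly. `Cluster` balances across all nodes at the cost of an extra hop and a rewritten source IP.
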
### Autoscaling
```
HorizontalPodAutoscaler
├── Min replicas: 3
├── Max replicas: 10
├── Metrics: CPU 70%, Memory 80%
└── Behavior: gradual scale-up/down
VerticalPodAutoscaler (optional)
├── Update mode: Off/Auto
├── Min resources: 100m CPU, 128Mi RAM
├── Max resources: 2000m CPU, 2Gi RAM
└── Recommendations: continuous
```
### Networking
```
Ingress (NGINX)
├── TLS: cert-manager
├── Load balancing: round_robin
├── Rate limiting: 1000 RPS
└── CORS: enabled
NetworkPolicy (optional)
├── Ingress: from ingress-nginx
└── Egress: DNS + backend services
```
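
A minimal sketch of the networking resources above, assuming the ingress-nginx controller and cert-manager are installed. The host `gateway.example.com`, the issuer `letsencrypt-prod`, the Secret and Service names, and the `app: gateway` / `app: backend` labels are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gateway                                         # hypothetical name
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod    # hypothetical issuer
    nginx.ingress.kubernetes.io/load-balance: round_robin
    nginx.ingress.kubernetes.io/limit-rps: "1000"       # applied per client IP
    nginx.ingress.kubernetes.io/enable-cors: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts: [gateway.example.com]
      secretName: gateway-tls                           # created by cert-manager
  rules:
    - host: gateway.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: gateway                           # assumed Service name
                port:
                  name: http
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: gateway
spec:
  podSelector:
    matchLabels:
      app: gateway                                      # assumed pod label
  policyTypes: [Ingress, Egress]
  ingress:
    - from:                                             # only the ingress controller namespace
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
  egress:
    - ports:                                            # DNS lookups
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    - to:                                               # backend services
        - podSelector:
            matchLabels:
              app: backend                              # hypothetical label
```

Note that the ingress-nginx `limit-rps` annotation counts requests per client IP, so it is not a global 1000 RPS ceiling for the whole gateway.
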
## Scaling Strategies
### Horizontal Scaling (HPA)
| Metric | Action | Result |
|--------|--------|--------|
| CPU > 70% | Scale up | Add pods (max 50) |
| Memory > 80% | Scale up | Add pods |
| CPU < 40% | Scale down | Remove pods (min 3) |

**Behavior**:
- Scale up: Fast (4 pods/30s)
- Scale down: Gradual (2 pods/60s)
- Stabilization: 5min
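
The targets and behavior listed above could map onto an `autoscaling/v2` HorizontalPodAutoscaler roughly like this. It is a sketch, not the chart's actual template: the object and Deployment name `gateway` are assumed, the 3 to 50 replica range follows the table above, and applying the 5 minute stabilization window to scale-down only is an assumption.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gateway                        # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gateway                      # assumed Deployment name
  minReplicas: 3
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0    # react immediately
      policies:
        - type: Pods
          value: 4                     # at most 4 pods added per 30s
          periodSeconds: 30
    scaleDown:
      stabilizationWindowSeconds: 300  # 5min (assumed to apply to scale-down)
      policies:
        - type: Pods
          value: 2                     # at most 2 pods removed per 60s
          periodSeconds: 60
```

The HPA API takes a single target per metric, so the 40% scale-down row is not a separate setting here. Scale-down happens when utilization falls far enough below the 70% target that fewer replicas would still meet it.
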
### Vertical Scaling (VPA)
| Mode | Behavior | Use Case |
|------|----------|----------|
| Off | Recommendations only | Safe with HPA |
| Initial | Set on creation | Initial sizing |
| Auto | Continuous updates | Full automation |

**Controls**:
- CPU: 100m - 2000m
- Memory: 128Mi - 2Gi
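
For illustration, a VerticalPodAutoscaler in recommendation-only mode with the bounds above might look like the following. It assumes the VPA components are installed in the cluster; the object name, the target Deployment `gateway`, and the container name `gateway` are assumptions.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: gateway              # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gateway            # assumed Deployment name
  updatePolicy:
    updateMode: "Off"        # quoted so YAML does not read it as boolean false
  resourcePolicy:
    containerPolicies:
      - containerName: gateway
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 2000m
          memory: 2Gi
```

In `Off` mode the recommendations appear in the object's status (for example via `kubectl describe vpa gateway`), and nothing is changed on running pods, which avoids VPA and HPA both reacting to the same CPU and memory signals.
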
### Load Balancing
| Strategy | Configuration | Benefit |
|----------|---------------|---------|
| Round Robin | Ingress annotation | Even distribution |
| Least Connections | Ingress annotation | Optimal utilization |
| IP Hash | Service affinity | Sticky sessions |

## Federation Architecture
```
┌─────────────────────────────────────────┐
│        Apollo Router (Port 4000)        │
│           ┌─────────────────┐           │
│           │  Query Planner  │           │
│           └────────┬────────┘           │
└────────────────────┼────────────────────┘
                     │
       ┌─────────────┼─────────────┐
       │             │             │
 ┌─────▼─────┐ ┌─────▼─────┐ ┌─────▼─────┐
 │   User    │ │  Product  │ │  Review   │
 │ Subgraph  │ │ Subgraph  │ │ Subgraph  │
 │ (3 pods)  │ │ (3 pods)  │ │ (3 pods)  │
 │ Port 8891 │ │ Port 8892 │ │ Port 8893 │
 └─────┬─────┘ └─────┬─────┘ └─────┬─────┘
       │             │             │
       └─────────────┼─────────────┘
                     │
              ┌──────▼──────┐
              │   Backend   │
              │  Services   │
              └─────────────┘
```
**Each Subgraph**:
- Independent scaling (HPA)
- Separate resource limits
- Entity resolution with DataLoader
- Metrics on port 9090
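
One way the router could learn about the three subgraphs is a supergraph configuration file in the format used by Apollo's `rover supergraph compose`. This is a sketch under assumptions: the Service hostnames and the `/graphql` path are hypothetical, and only the ports come from this summary.

```yaml
# supergraph.yaml (hypothetical file name)
federation_version: 2
subgraphs:
  user:
    routing_url: http://user-subgraph:8891/graphql         # assumed host and path
    schema:
      subgraph_url: http://user-subgraph:8891/graphql
  product:
    routing_url: http://product-subgraph:8892/graphql      # assumed host and path
    schema:
      subgraph_url: http://product-subgraph:8892/graphql
  review:
    routing_url: http://review-subgraph:8893/graphql       # assumed host and path
    schema:
      subgraph_url: http://review-subgraph:8893/graphql
```

Because each subgraph is addressed through its own Service, the router does not need to know how many pods sit behind it, which is what lets each subgraph scale independently.
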
## Monitoring Stack
```
┌──────────────┐
│  Prometheus  │ ← Scrapes metrics (port 9090)
└──────┬───────┘
       │
┌──────▼───────┐
│   Grafana    │ ← Visualizes metrics
└──────────────┘
       │
       ├─ Request rate
       ├─ Error rate
       ├─ Latency (p50, p95, p99)
       ├─ Pod count (HPA)
       └─ Resource usage (VPA)
```
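
A Prometheus scrape job for the metrics port could be sketched as below. It assumes Prometheus runs in the cluster with permission to list pods, that gateway pods carry the label `app: gateway`, and that metrics are served at the default `/metrics` path; the job name and namespace are illustrative.

```yaml
scrape_configs:
  - job_name: gateway                      # hypothetical job name
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names: [production]              # assumed namespace
    relabel_configs:
      # Keep only pods labelled app=gateway (assumed label).
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: gateway
        action: keep
      # Scrape each pod on the metrics port 9090.
      - source_labels: [__meta_kubernetes_pod_ip]
        regex: (.+)
        target_label: __address__
        replacement: $1:9090
      # Expose the pod name as a label for per-pod dashboards.
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```

Scraping pods directly rather than through the Service gives per-pod series, which is what the request-rate and latency panels need when the HPA changes the pod count.
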
## Security Layers
```
1. Network
   └─ NetworkPolicy: restrict traffic
2. Container
   ├─ Non-root user (UID 1000)
   ├─ Read-only filesystem
   └─ Dropped capabilities
3. Pod
   └─ Security context enforced
4. Service
   ├─ TLS termination
   └─ Source IP restrictions
5. Application
   ├─ Rate limiting
   ├─ CORS policies
   └─ Query whitelisting
```
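
The container and pod layers above could translate into a pod template fragment like this one. It is a sketch: UID 1000, the read-only filesystem, and dropped capabilities come from the list, while the group ID, the seccomp profile, and the writable `/tmp` volume are assumptions.

```yaml
# Pod template fragment (belongs under spec.template.spec in the Deployment).
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 1000              # assumed
  seccompProfile:
    type: RuntimeDefault        # assumed
containers:
  - name: gateway
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
    volumeMounts:
      - name: tmp               # writable scratch space, since the root filesystem is read-only
        mountPath: /tmp
volumes:
  - name: tmp
    emptyDir: {}
```

A read-only root filesystem breaks any process that writes temporary files, so a small `emptyDir` mount is a common companion; drop it if the gateway never writes to disk.
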
## Resource Planning
### Development
```yaml
replicas: 1
cpu: 250m
memory: 256Mi
HPA: disabled
VPA: Off (recommendations)
```
### Staging
```yaml
replicas: 2
cpu: 500m
memory: 512Mi
HPA: 2-5 replicas
VPA: Initial
```
### Production
```yaml
replicas: 5
cpu: 1000m
memory: 1Gi
HPA: 5-50 replicas
VPA: Off (with HPA)
LoadBalancer: enabled
PDB: min 3 available
```
## 🎯 Deployment Commands
```bash
# Development
docker-compose -f docker-compose.federation.yml up

# Staging
helm install gateway ./helm/grpc-graphql-gateway \
  --namespace staging \
  -f helm/values-staging.yaml

# Production
helm install gateway ./helm/grpc-graphql-gateway \
  --namespace production \
  -f helm/values-autoscaling-complete.yaml
```
## Testing
```bash
# Load test
k6 run --vus 100 --duration 5m loadtest.js
# Watch scaling
watch 'kubectl get pods,hpa,vpa -n production'
# Check load distribution
kubectl get pods -o wide -l app=gateway -n production
# View metrics
curl http://<lb-ip>/metrics
```
## References
- Dockerfiles: `/Dockerfile`, `/Dockerfile.federation`
- Helm Chart: `/helm/grpc-graphql-gateway/`
- Docker Compose: `/docker-compose.federation.yml`
- Docs: `/docs/src/production/`
- Quick Start: `/DEPLOYMENT.md`