DevOps Project Example: From Code Push to Production with GitOps, FluxCD, and Kubernetes
Most DevOps tutorials show you a pipeline diagram. This one shows you a real pipeline, built on a real application, running on real Kubernetes clusters — with every tool, every workflow, and every design decision explained.
This post walks through the complete CI/CD system behind Slotmachine, a real-time multiplayer tournament app: from the moment a developer pushes code to GitHub, through a six-job CI pipeline of security and quality checks, all the way to automated deployment on both Nutanix on-premise clusters and AWS EKS. No hand-waving. No "and then magic happens."
The full source code is available in two repositories:
- Application repo: github.com/pkhamdee/slotmachine
- Deployment repo: github.com/pkhamdee/slotmachine-deployment
The Big Picture
Here is the complete flow before we drill into any single piece:
Developer
│
│ git push
▼
┌─────────────────────────────────────────────────────────────────┐
│ GitHub Actions — 6-Job CI Pipeline (slotmachine repo) │
│ │
│ [1] CodeQL SAST → [2] Build+Test → [3] Locust Load Test │
│ → [4] OWASP ZAP DAST → [5] Docker Build + Trivy Scan + Push │
│ → [6] Update Deployment Repo (Kustomize image tag update) │
└─────────────────────────────────────────────────────────────────┘
│ │
│ push images │ push to branch: development
▼ ▼
Docker Hub slotmachine-deployment repo
(pkhamdee/slotmachine) ├── branch: development ──► K8s DEV (Nutanix)
client:{sha} ├── branch: qa ──► K8s QA (Nutanix)
server:{sha} └── branch: main ──► K8s PROD (AWS EKS)
▲ ▲
PR + DevOps PR + QA Sign Off
Approval + SDM Approval
All three clusters watched by FluxCD (CD Agent):
FluxCD polls git branch → detects change → kustomize apply → K8s reconciled
NKP Management Cluster (Nutanix On-Premise):
Manages DEV + QA clusters with SSO/IdP, Governance, Platform Services
Two repositories. Three environments. One fully automated path from commit to production — with human approval gates exactly where they belong.
Part 1: The Philosophy - Why Separate Code and Deployment?
The single most important design decision in this architecture is something many teams overlook: the application code and the deployment configuration live in different Git repositories.
The Problem with Mixing Them
When you put your Kubernetes YAML files alongside your application code:
my-app/
├── src/ ← application code
├── tests/
└── k8s/ ← Kubernetes manifests ← ⚠️ problematic
├── deployment.yaml
└── service.yaml
You create tight coupling that causes real operational pain:
- A hotfix to a CSS file triggers a full deployment pipeline
- You can't audit "what's running in production" without reading application commit history
- Rolling back a bad deployment means reverting application code, not just config
- Multiple environments (dev/qa/prod) live in branches or subdirectories — a mess
The GitOps Solution: Two Repos, Two Concerns
slotmachine/ ← Source of truth for the APPLICATION
.github/workflows/ci.yml
client/src/ ← React frontend
server/src/ ← Node.js backend
tests/locustfile.py
slotmachine-deployment/ ← Source of truth for WHAT RUNS IN THE CLUSTER
base/ ← Shared Kubernetes manifests
overlays/development/ ← Dev environment configuration
overlays/production/ ← Prod environment configuration
Browse the code: slotmachine app repo · slotmachine-deployment repo
The rule: the application repo never contains Kubernetes manifests. The deployment repo never contains application source code. The only connection is an image tag: a Git commit SHA that the CI pipeline writes into the deployment repo automatically.
This is the GitOps model: Git is the single source of truth for the desired state of the cluster. No kubectl apply from a CI runner. The cluster pulls its desired state from Git, rather than having it pushed from a pipeline.
Benefits You Feel Immediately
| Problem | Without Separation | With GitOps Separation |
|---|---|---|
| "What's running in prod?" | Dig through CI logs | Read overlays/production/kustomization.yaml |
| Rolling back prod | Revert app code + redeploy | Revert deployment repo PR |
| Audit trail | Mixed with code changes | Clean, deployment-only commit history |
| Environment drift | Configuration copy-paste | Kustomize base + overlays |
| Access control | Devs touch prod infra | Devs only push to app repo; infra team owns deployment repo |
Part 2: The Application - Slotmachine
Before the pipeline, let's understand what we're deploying.
Slotmachine is a real-time multiplayer slot tournament application. Players compete in timed sessions, spinning reels to accumulate the highest balance. A live scoreboard updates every second. An admin controls session flow.
Tech Stack
Frontend: React 18 + Vite (SPA) served by Nginx
Backend: Node.js + Express + Socket.io
Database: MongoDB 7 + Mongoose
Cache: Redis 7 (Socket.io multi-pod fan-out via ioredis pub/sub)
Container: Docker (client image: nginx:alpine, server image: node:20-alpine)
Architecture Inside Kubernetes
Internet
│
▼
AWS NLB (Production) / LoadBalancer Service (Dev/QA)
│
▼
┌─────────────────────────────────────────────┐
│ client Pod × 4 (prod) / × 1 (dev) │
│ nginx:alpine │
│ ├── serves / → React SPA (static files) │
│ ├── proxies /api/* → server:3001 │
│ └── proxies /socket.io/* → server:3001 │
└─────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ server Pod × 6 (prod) / × 1 (dev) │
│ node:20-alpine │
│ ├── Express REST API (port 3001) │
│ ├── Socket.io server (real-time events) │
│ └── Mongoose (MongoDB) + ioredis (Redis) │
└─────────────────────────────────────────────┘
│ │
▼ ▼
MongoDB Redis
StatefulSet Deployment
(20Gi PVC prod) (pub/sub adapter)
The multi-pod server architecture relies on Redis as a Socket.io adapter: when one server pod broadcasts a scoreboard update, Redis fans the event out to all other server pods, which forward it to their connected clients. Without Redis, clients connected to different pods would see inconsistent scoreboards.
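To make the fan-out concrete, here is a minimal in-memory simulation. It is not the Socket.io Redis adapter itself: `Broker` stands in for Redis pub/sub, and `ServerPod`, the `scoreboard` channel name, and the message shape are all hypothetical names chosen for illustration.

```python
class Broker:
    """In-memory stand-in for Redis pub/sub (illustration only)."""

    def __init__(self):
        self.subscribers = {}  # channel -> list of callbacks

    def subscribe(self, channel, callback):
        self.subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel, message, origin):
        for callback in self.subscribers.get(channel, []):
            callback(message, origin)


class ServerPod:
    """One server replica with its own set of connected clients."""

    def __init__(self, name, broker=None):
        self.name = name
        self.broker = broker
        self.clients = {}  # client id -> list of messages received
        if broker is not None:
            broker.subscribe("scoreboard", self._on_remote)

    def connect(self, client_id):
        self.clients[client_id] = []

    def _deliver(self, message):
        for inbox in self.clients.values():
            inbox.append(message)

    def _on_remote(self, message, origin):
        # Our own broadcasts were already delivered locally, so skip them.
        if origin != self.name:
            self._deliver(message)

    def broadcast(self, message):
        self._deliver(message)
        if self.broker is not None:
            self.broker.publish("scoreboard", message, origin=self.name)
```

With a shared `Broker`, a broadcast from one pod reaches clients connected to the other. Construct the pods without a broker and only the local clients see the update, which is the inconsistent-scoreboard case described above.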
Repository Structure
slotmachine/
├── .github/workflows/ci.yml ← 6-job CI pipeline
├── .zap/rules.tsv ← OWASP ZAP alert filter rules
├── docker-compose.yml ← Local development stack
├── tests/locustfile.py ← Load test scenarios
├── client/
│ ├── Dockerfile
│ ├── nginx.conf ← SPA routing + API proxy config
│ ├── vite.config.js
│ └── src/
│ ├── components/ ← 12 React components
│ ├── hooks/ ← useGame, useSession
│ ├── api/gameApi.js
│ └── __tests__/ ← 97 Vitest tests
└── server/
├── Dockerfile
└── src/
├── config/gameConfig.js
├── controllers/
├── middleware/adminAuth.js
├── models/ ← 5 MongoDB schemas
├── routes/
└── services/
├── SessionManager.js ← State machine + timers
└── slotEngine.js ← Spin logic + payout table
Part 3: The CI Pipeline - 6 Jobs, Every Gate Explained
The CI pipeline is defined in .github/workflows/ci.yml. Every push and pull request runs the first four jobs. The final two steps, the container push and the GitOps update, are gated to the main branch.
push / PR
│
├─── [1] code-scan (CodeQL SAST)
├─── [2] build-and-test (npm + Vitest + node:test)
├─── [3] performance-test (Locust load test)
├─── [4] dast-zap (OWASP ZAP DAST)
│
└── all 4 pass?
│
├─── [5] container-build-scan-push (Docker + Trivy + DockerHub)
└─── [6] update-gitops (Kustomize image tag → deployment repo)
(main branch only for jobs 5 and 6)
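The gating in the diagram can be modeled as a small dependency graph. The sketch below mirrors the `needs:` relationships from the workflow and assumes the GitHub Actions default that a job is skipped when any job it needs has failed or been skipped. The helper name `runnable_jobs` is hypothetical.

```python
from graphlib import TopologicalSorter

# Job name -> jobs it waits for, mirroring the `needs:` keys in ci.yml.
NEEDS = {
    "code-scan": [],
    "build-and-test": [],
    "performance-test": [],
    "dast-zap": [],
    "container-build-scan-push": [
        "code-scan", "build-and-test", "performance-test", "dast-zap",
    ],
    "update-gitops": ["container-build-scan-push"],
}


def runnable_jobs(failed):
    """Return the jobs that actually start, given the set of jobs that fail.

    A job is skipped when any job it needs has failed or was skipped.
    """
    blocked = set()  # jobs that failed or were skipped
    started = []
    for job in TopologicalSorter(NEEDS).static_order():
        if any(dep in blocked for dep in NEEDS[job]):
            blocked.add(job)
            continue
        started.append(job)
        if job in failed:
            blocked.add(job)
    return started
```

If `dast-zap` fails, `runnable_jobs({"dast-zap"})` contains only the four parallel jobs: no image is built or pushed, and the deployment repo is never touched.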
Job 1: CodeQL SAST - Finding Vulnerabilities Before They Ship
SAST (Static Application Security Testing) analyzes source code without running it.
# .github/workflows/ci.yml (simplified)
code-scan:
  runs-on: ubuntu-latest
  permissions:
    contents: read            # checkout needs this once permissions are restricted
    security-events: write    # lets CodeQL upload its results
  steps:
    - uses: actions/checkout@v4
    - uses: github/codeql-action/init@v3
      with:
        languages: javascript
        queries: security-extended   # Includes OWASP Top 10 patterns
    - uses: github/codeql-action/autobuild@v3
    - uses: github/codeql-action/analyze@v3
CodeQL builds a semantic model of the code: it understands data flow, not just text patterns. It can detect:
- SQL/NoSQL injection where user input flows to a query
- XSS where untrusted data reaches the DOM
- Prototype pollution in JavaScript
- Path traversal vulnerabilities
Results appear directly in the GitHub Security tab. A Critical finding blocks the PR.
Job 2: Build and Test - 118 Tests, Zero Compromises
build-and-test:
  runs-on: ubuntu-latest
  services:
    mongodb:
      image: mongo:7
      ports: ['27017:27017']
    redis:
      image: redis:7-alpine
      ports: ['6379:6379']
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with: { node-version: '20' }
    - run: npm ci
    - run: npm run test --workspace=client   # 97 Vitest tests
    - run: npm run test --workspace=server   # 21 node:test tests
    - run: npm run build --workspace=client  # Vite production build
The server tests run against real MongoDB and Redis service containers, with no mocks or stubs at the database layer. Together with the client suite, that makes 118 tests. The build artifact is produced and verified before any container is built.
Job 3: Performance Test - Catching Regressions Before Users Do
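performance-test:
  runs-on: ubuntu-latest
  services:
    mongodb:
      image: mongo:7
      ports: ['27017:27017']   # published so the server on the runner can reach it
    redis:
      image: redis:7-alpine
      ports: ['6379:6379']
  steps:
    - uses: actions/checkout@v4   # the job needs the server code and the locustfile
    - uses: actions/setup-node@v4
      with: { node-version: '20' }
    - run: npm ci
    - run: pip install locust
    - run: |
        node server/server.js &
        sleep 5
        locust -f tests/locustfile.py \
          --headless -u 2 -r 1 -t 10s \
          --host http://localhost:3001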
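The locustfile.py simulates real user behavior: create a session, spin the reels, check the scoreboard. With 2 users for 10 seconds this is a smoke-level load test. Locust exits non-zero when requests fail, so running it on every PR catches endpoints that break under concurrent use before they reach any environment. Catching latency regressions, such as a slow database query or a blocking event loop operation, additionally requires explicit response-time thresholds.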
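One way to turn the load test into a latency gate is to have Locust write its statistics with `--csv` and check the aggregated row afterwards. The sketch below is an assumption rather than part of the Slotmachine repository: the function name and both thresholds are hypothetical, and the column names (`Request Count`, `Failure Count`, `95%`) are the ones recent Locust versions write to the `_stats.csv` file.

```python
import csv
import io


def check_locust_stats(stats_csv, max_p95_ms=500.0, max_failure_ratio=0.01):
    """Return a list of threshold violations from Locust's aggregated row."""
    rows = csv.DictReader(io.StringIO(stats_csv))
    total = next(row for row in rows if row["Name"] == "Aggregated")
    requests = int(total["Request Count"])
    failures = int(total["Failure Count"])
    if requests == 0:
        return ["no requests were recorded"]

    problems = []
    ratio = failures / requests
    if ratio > max_failure_ratio:
        problems.append(
            f"failure ratio {ratio:.1%} exceeds {max_failure_ratio:.1%}"
        )
    p95_ms = float(total["95%"])
    if p95_ms > max_p95_ms:
        problems.append(f"p95 {p95_ms:.0f} ms exceeds {max_p95_ms:.0f} ms")
    return problems
```

In CI, a small wrapper could read the stats file, print the returned problems, and exit non-zero when the list is not empty.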
Job 4: DAST - Attacking the Running Application
DAST (Dynamic Application Security Testing) probes the running application from the outside, like a real attacker would. OWASP ZAP is the industry-standard open-source tool for this.
dast-zap:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Start application stack
      run: docker compose up -d && sleep 15
    # On PRs: baseline scan (passive, fast, ~2 min)
    - name: ZAP Baseline Scan
      if: github.event_name == 'pull_request'
      uses: zaproxy/action-baseline@v0.13.0
      with:
        target: 'http://localhost:8080'
        rules_file_name: '.zap/rules.tsv'
    # On main push: full active scan (active attacks, ~15 min)
    - name: ZAP Full Scan
      if: github.ref == 'refs/heads/main'
      uses: zaproxy/action-full-scan@v0.11.0
      with:
        target: 'http://localhost:8080'
        rules_file_name: '.zap/rules.tsv'
ZAP tests for:
- SQL/NoSQL injection (sending ' OR 1=1 -- style payloads)
- XSS (injecting <script>alert(1)</script> into every form field)
- CSRF, clickjacking, missing security headers
- Authentication bypass attempts
The .zap/rules.tsv file suppresses known false positives so the signal stays clean.
Job 5: Container Build, Scan, and Push
container-build-scan-push:
  needs: [code-scan, build-and-test, performance-test, dast-zap]
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4   # build contexts ./client and ./server come from the repo
    - name: Build client image
      run: |
        docker build -t pkhamdee/slotmachine:client-${{ github.sha }} \
          -t pkhamdee/slotmachine:client \
          ./client
    - name: Build server image
      run: |
        docker build -t pkhamdee/slotmachine:server-${{ github.sha }} \
          -t pkhamdee/slotmachine:server \
          ./server
    - name: Trivy scan (client image)
      uses: aquasecurity/trivy-action@master
      with:
        image-ref: pkhamdee/slotmachine:client-${{ github.sha }}
        severity: HIGH,CRITICAL
        exit-code: 1   # Fail the pipeline on HIGH or CRITICAL CVEs
    - name: Trivy scan (server image)
      uses: aquasecurity/trivy-action@master
      with:
        image-ref: pkhamdee/slotmachine:server-${{ github.sha }}
        severity: HIGH,CRITICAL
        exit-code: 1
    - name: Push to Docker Hub   # Only on main branch
      if: github.ref == 'refs/heads/main'
      # Assumes an earlier Docker Hub login step, omitted from this simplified listing
      run: |
        docker push pkhamdee/slotmachine:client-${{ github.sha }}
        docker push pkhamdee/slotmachine:client
        docker push pkhamdee/slotmachine:server-${{ github.sha }}
        docker push pkhamdee/slotmachine:server
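Two image tags are pushed for every release:
- :client-<sha> is the immutable, commit-specific tag used by Kustomize in the deployment manifests. In the workflow, github.sha is the full 40-character commit SHA; it is abbreviated to abc1234 in this post.
- :client is the mutable, latest-style tag kept for convenience.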
Trivy scans the container's OS packages and language dependencies against CVE databases. A HIGH or CRITICAL finding fails the build — the image never reaches Docker Hub.
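The severity gate boils down to a filter over the scan report. As a rough sketch, assuming the JSON layout that `trivy image --format json` produces (a top-level `Results` list whose entries may carry a `Vulnerabilities` list with `VulnerabilityID` and `Severity` fields), the decision looks like this. The function names are hypothetical, and this is not how trivy-action is implemented internally.

```python
def blocking_findings(report, severities=("HIGH", "CRITICAL")):
    """Collect (target, vulnerability id, severity) for blocking findings."""
    found = []
    for result in report.get("Results") or []:
        # Targets without findings may omit the key or set it to null.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in severities:
                found.append(
                    (result.get("Target"),
                     vuln.get("VulnerabilityID"),
                     vuln["Severity"])
                )
    return found


def gate_exit_code(report):
    """Mirror `exit-code: 1`: non-zero when any blocking finding exists."""
    return 1 if blocking_findings(report) else 0
```

A report containing only LOW and MEDIUM findings yields exit code 0, so the image proceeds to the push step. A single HIGH or CRITICAL entry yields 1 and stops the job.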
Job 6: Update the Deployment Repository
This is where CI hands off to CD. The last job:
1. Checks out slotmachine-deployment
2. Updates the image tags to the new SHA using Kustomize
3. Commits and pushes to the development branch
update-gitops:
  needs: [container-build-scan-push]
  if: github.ref == 'refs/heads/main'
  runs-on: ubuntu-latest
  steps:
    - name: Checkout deployment repo
      uses: actions/checkout@v4
      with:
        repository: pkhamdee/slotmachine-deployment
        token: ${{ secrets.DEPLOYMENT_REPO_TOKEN }}
        ref: development
    - name: Update image tags
      run: |
        cd overlays/development
        kustomize edit set image \
          slotmachine-client=pkhamdee/slotmachine:client-${{ github.sha }}
        kustomize edit set image \
          slotmachine-server=pkhamdee/slotmachine:server-${{ github.sha }}
    - name: Commit and push
      run: |
        git config user.name "github-actions[bot]"
        git config user.email "github-actions[bot]@users.noreply.github.com"
        git add .
        git commit -m "ci: update images to ${{ github.sha }}"
        git push origin development
After this push, FluxCD takes over. CI's job is done.
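For readers who have not used `kustomize edit set image`, the effect on the kustomization file is small: it rewrites (or adds) one entry in the `images` list. Here is a sketch of that effect on a plain dictionary. The helper `set_image` is hypothetical and only models the outcome, not the Kustomize implementation. It assumes the reference always ends in `:tag`.

```python
def set_image(kustomization, name, image_ref):
    """Point the image called `name` at `image_ref` (repository:tag)."""
    new_name, _, new_tag = image_ref.rpartition(":")
    images = kustomization.setdefault("images", [])
    for entry in images:
        if entry["name"] == name:
            entry.update(newName=new_name, newTag=new_tag)
            break
    else:
        # No existing entry for this image name, so append one.
        images.append({"name": name, "newName": new_name, "newTag": new_tag})
    return kustomization
```

Because only `newTag` changes from one release to the next, each CI commit in the deployment repo is a two-line diff, which is what keeps the deployment history readable.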
Part 4: The Deployment Repository - Kustomize Base and Overlays
The slotmachine-deployment repository contains all Kubernetes manifests. It uses Kustomize — the Kubernetes-native configuration tool that lets you write base manifests once and patch them per environment.
Repository Layout
slotmachine-deployment/
├── base/ ← Single source of truth for Kubernetes objects
│ ├── kustomization.yaml
│ ├── namespace.yaml
│ ├── client/
│ │ ├── deployment.yaml
│ │ ├── service.yaml
│ │ └── kustomization.yaml
│ ├── server/
│ │ ├── deployment.yaml
│ │ ├── service.yaml
│ │ ├── configmap.yaml
│ │ └── kustomization.yaml
│ ├── mongodb/
│ │ ├── statefulset.yaml ← StatefulSet for stable Pod identity
│ │ ├── service.yaml
│ │ └── kustomization.yaml
│ └── redis/
│ ├── deployment.yaml
│ └── kustomization.yaml
└── overlays/
├── development/ ← Dev-specific patches
│ ├── kustomization.yaml
│ └── namespace.yaml
└── production/ ← Prod-specific patches
├── kustomization.yaml
└── namespace.yaml
The Base - Write Once, Run Everywhere
The base defines the structure with sensible defaults. Here's the server deployment base:
# base/server/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: server
spec:
  replicas: 1   # Overridden by overlays
  selector:
    matchLabels:
      app: slotmachine-server
  template:
    metadata:
      labels:
        app: slotmachine-server   # must match spec.selector.matchLabels
    spec:
      initContainers:
        - name: wait-for-mongodb   # Don't start until DB is ready
          image: busybox
          command: ['sh', '-c',
                    'until nc -z mongodb 27017; do sleep 2; done']
        - name: wait-for-redis
          image: busybox
          command: ['sh', '-c',
                    'until nc -z redis 6379; do sleep 2; done']
      containers:
        - name: server
          image: slotmachine-server:latest   # Replaced by Kustomize
          ports:
            - containerPort: 3001
          envFrom:
            - configMapRef:
                name: server-config
            - secretRef:
                name: server-secret
          resources:
            requests:
              cpu: "200m"
              memory: "256Mi"
          readinessProbe:
            tcpSocket:
              port: 3001
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: slotmachine-server
The initContainers block is a critical detail: without it, the server Pod starts before MongoDB is ready, crashes, and Kubernetes restarts it in a loop. Init containers enforce dependency ordering cleanly.
The Overlays - Environment-Specific Patches
The production overlay patches the base without duplicating it:
# overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
namespace: production
images:
  - name: slotmachine-client
    newName: pkhamdee/slotmachine
    newTag: client-abc1234def5   # ← auto-updated by CI
  - name: slotmachine-server
    newName: pkhamdee/slotmachine
    newTag: server-abc1234def5   # ← auto-updated by CI
patches:
  - target: { kind: Deployment, name: client }
    patch: |
      - op: replace
        path: /spec/replicas
        value: 4
  - target: { kind: Deployment, name: server }
    patch: |
      - op: replace
        path: /spec/replicas
        value: 6
  - target: { kind: StatefulSet, name: mongodb }
    patch: |
      - op: replace
        path: /spec/volumeClaimTemplates/0/spec/resources/requests/storage
        value: 20Gi
  - target: { kind: Service, name: client }
    patch: |
      - op: add
        path: /metadata/annotations
        value:
          service.beta.kubernetes.io/aws-load-balancer-type: "external"
          service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
The development overlay is identical in structure but keeps replicas at 1, storage at 5Gi, and uses no NLB annotations.
What Kustomize enables:
Base manifest
│
┌───────────┴───────────┐
│ │
dev overlay prod overlay
replicas: 1 replicas: 4/6
storage: 5Gi storage: 20Gi
namespace: development namespace: production
no NLB annotations NLB + internet-facing
Zero copy-paste. A change to the base manifest (like adding a new environment variable) automatically flows to both environments.
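The patches in an overlay are JSON Patch operations applied to the rendered base. The following sketch shows the idea on plain dictionaries. It is a deliberately small subset written for illustration (only `replace`, plus `add` on object members, and no `~0`/`~1` path escapes), not the Kustomize implementation, and `apply_patch` is a hypothetical name.

```python
import copy


def apply_patch(manifest, operations):
    """Apply 'replace'/'add' JSON Patch operations to a copy of manifest."""
    result = copy.deepcopy(manifest)  # the base itself is never modified
    for op in operations:
        *parents, leaf = op["path"].strip("/").split("/")
        node = result
        for key in parents:
            node = node[int(key)] if isinstance(node, list) else node[key]
        if isinstance(node, list):
            leaf = int(leaf)
        if op["op"] == "replace":
            node[leaf]  # raises KeyError/IndexError if the target is missing
            node[leaf] = op["value"]
        elif op["op"] == "add" and isinstance(node, dict):
            node[leaf] = op["value"]
        else:
            raise ValueError(f"unsupported operation: {op['op']}")
    return result
```

Applying `[{"op": "replace", "path": "/spec/replicas", "value": 6}]` to a base with `replicas: 1` returns a production variant with 6 replicas while the base dictionary keeps its default. That is the property that lets one base serve every environment.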
Part 5: FluxCD - The CD Agent That Closes the Loop
FluxCD runs inside each Kubernetes cluster as a set of controllers. It watches a Git repository branch and continuously reconciles the cluster state with whatever is committed there.
How FluxCD Works
FluxCD Poll Loop (every 1 minute here, set by spec.interval on the GitRepository):
  1. Connect to Git → fetch branch HEAD SHA
  2. Compare with last-applied SHA
  3. If different:
     a. git clone the new state
     b. kustomize build overlays/<env>/
     c. kubectl apply the rendered manifests
     d. Update last-applied SHA
  4. If same: nothing new to fetch. The Kustomization still re-applies on its own interval (5m in the configuration below), which is how drift gets corrected.
This is pull-based CD — the cluster reaches out to Git, rather than having a pipeline push into the cluster. The advantages:
Push-based CD (traditional):
CI runner → kubectl apply → cluster
Problems:
- CI runner needs cluster credentials
- Cluster access from the internet
- Hard to audit who changed what
Pull-based CD (GitOps with FluxCD):
FluxCD (inside cluster) → polls Git → applies locally
Advantages:
- No external access to cluster needed
- Cluster credentials never leave the cluster
- Full audit trail in Git history
- Self-healing: manual kubectl edits are reverted
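The poll loop described above can be sketched in a few lines. This is a simulation under simplifying assumptions, not FluxCD's controller code: `GitSource` and `Cluster` are hypothetical stand-ins, a revision is just a string, and "applying" means replacing an in-memory dictionary of objects.

```python
class GitSource:
    """Stand-in for a watched Git branch: an append-only list of commits."""

    def __init__(self):
        self.commits = []  # list of (revision, manifests)

    def push(self, revision, manifests):
        self.commits.append((revision, manifests))

    def head(self):
        return self.commits[-1]


class Cluster:
    """Stand-in for a cluster: applied objects plus the last applied revision."""

    def __init__(self):
        self.state = {}
        self.last_applied = None


def reconcile(source, cluster):
    """Run one poll iteration; return True when a new revision was applied."""
    revision, manifests = source.head()
    if revision == cluster.last_applied:
        return False  # nothing new in Git
    # Replacing the whole state also drops objects that were removed from
    # Git, which is the behavior `prune: true` asks for.
    cluster.state = dict(manifests)
    cluster.last_applied = revision
    return True
```

Note that nothing here needs credentials for the cluster to live outside it: the loop runs next to the state it changes and only needs read access to Git.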
FluxCD Configuration for Each Environment
# flux-system/sources/slotmachine-deployment.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: slotmachine-deployment
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/pkhamdee/slotmachine-deployment
  ref:
    branch: development   # ← development cluster watches 'development' branch
                          #   QA cluster watches 'qa' branch
                          #   Production cluster watches 'main' branch
  secretRef:
    name: flux-git-credentials
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: slotmachine
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: GitRepository
    name: slotmachine-deployment
  path: ./overlays/development   # ← path matches the cluster's environment
  prune: true                    # Delete resources removed from Git
  wait: true                     # Wait for all applied resources to become ready
  healthChecks:                  # Ignored by Flux while wait: true is set
    - apiVersion: apps/v1
      kind: Deployment
      name: client
      namespace: development
    - apiVersion: apps/v1
      kind: Deployment
      name: server
      namespace: development
The prune: true setting is important: if a manifest is deleted from Git, FluxCD deletes the corresponding Kubernetes resource. The cluster state is always what Git says it should be — nothing more, nothing less.
Self-Healing in Practice
Scenario: An engineer manually scales the server Deployment to 10 replicas
with: kubectl scale deployment/server --replicas=10
FluxCD behavior (within 5 minutes):
1. Detects drift: cluster has 10 replicas, Git says 6
2. Re-applies the manifest from Git
3. Cluster returns to 6 replicas
4. Logs: "Kustomization/slotmachine: detected drift, reconciling"
Result: Manual changes to production are impossible to accidentally
leave in place. Everything flows through Git.
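The scenario can be expressed as a comparison between what Git declares and what the cluster reports. The sketch below is an illustration with hypothetical helper names. It checks only the fields that Git declares, so values the cluster adds on its own (such as status) are not treated as drift.

```python
def find_drift(desired, live, path=""):
    """Map each Git-declared field that differs to (desired, live)."""
    drift = {}
    for key, want in desired.items():
        here = f"{path}/{key}"
        have = live.get(key) if isinstance(live, dict) else None
        if isinstance(want, dict):
            drift.update(find_drift(want, have, here))
        elif want != have:
            drift[here] = (want, have)
    return drift


def reapply(desired, live):
    """Write the Git-declared fields back onto the live object, in place."""
    for key, want in desired.items():
        if isinstance(want, dict):
            if not isinstance(live.get(key), dict):
                live[key] = {}
            reapply(want, live[key])
        else:
            live[key] = want
    return live
```

With Git declaring `{"spec": {"replicas": 6}}` and the live object showing 10 replicas, `find_drift` reports `/spec/replicas` with the pair `(6, 10)`. After `reapply` the drift is empty again and the untouched status fields are still in place.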
Part 6: Multi-Environment Promotion - The Human Gates
The Git branch structure maps directly to environments:
Git branches in slotmachine-deployment:
development ──────────────────────────► K8s DEV (Nutanix On-Premise)
│
│ Pull Request
│ Reviewed by: DevOps/Platform Engineer
▼
qa ─────────────────────────────────► K8s QA (Nutanix On-Premise)
│
│ Pull Request
│ Reviewed by: QA Sign-Off process + Software Delivery Manager
▼
main ────────────────────────────────► K8s PROD (AWS EKS)
How a Release Flows Through the Pipeline
Step 1 — Developer pushes to main in the app repo:
git push origin main
# → GitHub Actions CI starts (6 jobs, ~20 min)
# → On success: image tags written to development branch of deployment repo
# → FluxCD on DEV cluster detects change, deploys automatically
Step 2 — DevOps/Platform Engineer promotes to QA:
# In slotmachine-deployment repo:
# Create PR: development → qa
gh pr create \
--base qa \
--head development \
--title "Release: app commit abc1234" \
--body "Promoting $(git log development -1 --format='%s') to QA"
# DevOps Engineer reviews and approves
# PR merged → FluxCD on QA cluster detects change, deploys
Step 3 — QA Sign-Off and Production Release:
QA team runs acceptance tests against QA cluster
↓
QA Sign-Off: test results recorded, sign-off documented
↓
Software Delivery Manager approves
↓
PR merged: qa → main
↓
FluxCD on Production cluster detects main branch change
↓
kustomize build overlays/production/
↓
kubectl apply → rolling update on AWS EKS
(4 client pods × rolling update = zero downtime)
(6 server pods × rolling update = zero downtime)
Why PRs for Promotion (Not Another Pipeline)
Some teams automate promotion with scripts that merge branches. This architecture uses PRs deliberately:
- Visibility: Every promotion is a visible event in GitHub — searchable, commentable, linked to issues
- Required reviews: Branch protection rules enforce that a human approves before merging
- Rollback: Rolling back production is a one-command git revert + merge, not an incident procedure
- Audit: Every promotion has an author, timestamp, reviewer, and message, which satisfies compliance requirements
Part 7: Infrastructure - Nutanix On-Premise and AWS EKS
The three Kubernetes clusters run on two different infrastructure platforms, managed by a single control plane.
NKP Management Cluster (Nutanix Kubernetes Platform)
The NKP Management Cluster runs on Nutanix on-premise hardware and manages both the DEV and QA clusters:
NKP Management Cluster (Nutanix On-Premise)
│
├── SSO / IdP Integration
│ → Single sign-on for cluster access
│ → RBAC mapped to Active Directory groups
│
├── Governance
│ → Policy enforcement across managed clusters
│ → Resource quotas, network policies, OPA Gatekeeper
│
├── Platform Services
│ → Centralized monitoring (Prometheus + Grafana)
│ → Centralized logging (Loki / Elasticsearch)
│ → Certificate management (cert-manager)
│
└── Application Services
→ Shared ingress controllers
→ Shared storage classes
→ Backup policies
Manages:
├── Kubernetes DEV (Nutanix On-Premise)
└── Kubernetes QA (Nutanix On-Premise)
NKP provides a Cluster API-based management layer — the DEV and QA clusters are defined declaratively and the management cluster ensures they match specification. Fleet-wide policy enforcement means no cluster can be misconfigured without the management cluster detecting and alerting.
Production on AWS EKS
Production runs on AWS EKS in ap-southeast-7:
# Production cluster specifications
Nodes: 4× m6i.xlarge (4 vCPU, 16 GB RAM)
Zones: ap-southeast-7a, ap-southeast-7b, ap-southeast-7c
CNI: Cilium (SNAT mode)
LB: AWS Network Load Balancer (cross-zone enabled)
Scheduling: topologySpreadConstraints (maxSkew: 1)
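The topologySpreadConstraints in the base manifest ask the scheduler to spread server pods evenly across nodes. The topologyKey is kubernetes.io/hostname, so the constraint covers nodes only; spreading across availability zones would need a second constraint keyed on topology.kubernetes.io/zone. Because whenUnsatisfiable is ScheduleAnyway, this is a preference rather than a guarantee. When it is honored, 6 server pods across 4 nodes land as 2/2/1/1, so a single node failure takes down at most 2 server pods.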
# From base/server/deployment.yaml
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app: slotmachine-server
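The arithmetic behind the "at most 2 pods" figure is easy to check. The sketch below models an idealized scheduler that always honors the constraint by placing each pod on the least-loaded node. Real scheduling with ScheduleAnyway is only a preference, so treat this as the best case. Both function names are hypothetical.

```python
def spread(pods, nodes):
    """Place pods one at a time on the least-loaded node; return node counts."""
    counts = [0] * nodes
    for _ in range(pods):
        counts[counts.index(min(counts))] += 1
    return counts


def worst_case_loss(pods, nodes):
    """Pods lost if the fullest node fails."""
    return max(spread(pods, nodes))
```

For the production numbers, `spread(6, 4)` gives `[2, 2, 1, 1]`, a skew of 1, and `worst_case_loss(6, 4)` is 2. The 4 client pods land one per node, so a single node failure costs one client pod.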
Why On-Premise for Dev/QA, Cloud for Production?
Cost optimization:
DEV/QA: On-premise Nutanix hardware (already owned)
→ near-zero variable cost for non-production workloads
→ spin up/down as needed
PROD: AWS EKS
→ global availability
→ AWS NLB for internet-facing traffic
→ m6i.xlarge (cost-effective for the workload size)
→ auto-scaling when traffic spikes
Part 8: Security - Defense in Depth Across Every Stage
The pipeline builds in security at every phase rather than bolting it on at the end:
Stage │ Tool │ What It Catches
──────────────┼────────────────┼──────────────────────────────────────
Source code │ CodeQL │ SAST: injection, XSS, data flow vulns
Running app │ OWASP ZAP │ DAST: active attack simulation
Container OS │ Trivy │ CVEs in OS packages + npm dependencies
Runtime │ K8s policies │ Privilege escalation, host path mounts
Network │ Cilium │ Network policy enforcement between pods
Secrets │ K8s Secrets │ ADMIN_PASSWORD never in plaintext YAML
Headers │ Express config │ CSP, X-Frame-Options, Permissions-Policy
The Security Headers in Code
// server/src/server.js: security headers applied at Express level
app.use((req, res, next) => {
  res.removeHeader('X-Powered-By');                 // Don't reveal Express
  res.setHeader('X-Frame-Options', 'DENY');         // Clickjacking
  res.setHeader('Content-Security-Policy',
    "default-src 'self'; script-src 'self'");       // XSS mitigation
  res.setHeader('Permissions-Policy',
    'camera=(), microphone=(), geolocation=()');    // Feature restriction
  next();
});
Container Hardening via Multi-Stage Builds
# client/Dockerfile — two-stage: build then serve
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build # Produces /app/dist
FROM nginx:alpine # Final image has NO Node.js, NO source code
COPY --from=builder /app/dist /usr/share/nginx/html
COPY nginx.conf /etc/nginx/conf.d/default.conf
# Result: ~25MB image with zero Node.js attack surface
The final client image contains only Nginx and the built static files — no Node.js runtime, no npm, no source code.
Part 9: Local Development - From Zero to Running in One Command
# docker-compose.yml
services:
  mongodb:
    image: mongo:7
    ports: ["27017:27017"]
    volumes:
      - mongo_data:/data/db
  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]
  server:
    build: ./server
    ports: ["3001:3001"]
    environment:
      MONGO_URI: mongodb://mongodb:27017/slotmachine
      REDIS_URL: redis://redis:6379
      ADMIN_PASSWORD: localdev
    depends_on: [mongodb, redis]
  client:
    build: ./client
    ports: ["8080:80"]
    depends_on: [server]
volumes:
  mongo_data:
# Start the full stack locally
docker compose up --build
# Run the full test suite
npm ci && npm test --workspaces
# Run load test against local stack
locust -f tests/locustfile.py --headless -u 5 -r 1 -t 30s \
--host http://localhost:8080
The local environment mirrors the production topology closely: the same Nginx config, the same Redis adapter, the same MongoDB schema, with one instance of each service instead of multiple replicas. Most of what works locally carries over to the cluster unchanged.
Summary
This pipeline puts the core DevOps practices (automated testing, security scanning, GitOps delivery, and reviewed promotion) into a real, working system.
The separation of concerns between slotmachine (application code) and slotmachine-deployment (Kubernetes manifests) is the architectural foundation everything else builds on. It enables independent evolution of the app and its infrastructure, clean audit trails, and role-based access — developers push to the app repo; the platform team owns the deployment repo.
The six-job CI pipeline enforces quality and security before a single byte reaches a container registry. CodeQL finds vulnerabilities in source code. Vitest and node:test verify correctness with 118 tests. Locust load-tests the running server. OWASP ZAP attacks it from the outside. Trivy scans the container image for CVEs. Only after these five checks pass does the image push to Docker Hub, and the sixth job then writes the new image tag to the deployment repo.
FluxCD closes the loop by running inside each cluster and continuously reconciling cluster state against the Git branch it watches. No pipeline needs credentials to reach the cluster. No manual kubectl apply is needed. The cluster is self-healing — any drift from the desired state in Git is automatically corrected within minutes.
The three-branch promotion model (development → qa → main) maps directly to three Kubernetes clusters (Dev on Nutanix, QA on Nutanix, Production on AWS EKS), with human approval gates exactly where they provide value — not automated away, but not blocking automated delivery either.
The NKP management cluster provides fleet governance across the on-premise clusters: SSO, centralized policy enforcement, observability, and platform services — all without each cluster needing to reinvent its own security and monitoring stack.
The result is a system where a developer can push a feature, see it live in Dev once CI finishes, have it promoted to QA with a PR, and shipped to production on AWS after sign-off, with a complete audit trail, no manual deployment steps beyond the PR approvals, and security baked in at every layer.
The full source code is on GitHub: slotmachine app · slotmachine-deployment. Questions or discussion? Connect on LinkedIn, X or reach out via email.