Building a Production-Grade RKE2 Kubernetes Cluster: A Practical Guide


What You’ll Learn: How to build a real Kubernetes cluster that can run production workloads, not just hello-world demos.

Prerequisites: You’ll need 7-8 machines (VMs or physical servers). This guide assumes you can provision these yourself—whether that’s creating VMs in Proxmox/ESXi, spinning up cloud instances, or using physical hardware. See specs below.

Why RKE2: It’s Kubernetes without the complexity overhead of managed services. You control everything. You understand everything. When something breaks at 2am, you know exactly where to look.

Time Investment: 2-3 days if you follow along, a week if you include chaos testing. Several weeks of trial and error if you figure it out yourself from scratch.


Table of Contents

Setup & Prerequisites

Building the Cluster

Deploying Applications

Advanced Topics

Wrap-Up


Let’s build something real.

Prerequisites: What You Need Before Starting

Hardware Requirements:

Absolute Minimum (will work, but tight):

  • Control Plane Nodes (3x): 2 vCPU, 4GB RAM, 40GB disk
  • Worker Nodes (3x): 4 vCPU, 8GB RAM, 80GB disk
  • HAProxy Load Balancer (1x): 1 vCPU, 1GB RAM, 20GB disk
  • Total: 7 machines, 19 vCPU, 37GB RAM, 380GB disk

Recommended (what I actually run):

  • Control Plane Nodes (3x): 4 vCPU, 8GB RAM, 80GB disk
  • Worker Nodes (4x): 6-8 vCPU, 16-24GB RAM, 100-200GB disk
  • HAProxy Load Balancer (1x): 2 vCPU, 2GB RAM, 20GB disk
  • Total: 8 machines, 38-46 vCPU, 90-122GB RAM, 660-1,060GB disk

Why these specs? Control planes idle at 2-3GB RAM with Rancher installed. Workers consume 8-12GB with monitoring stack running. The recommended specs give you room to grow—you WILL add more workloads over time.

CRITICAL: Storage Performance Matters

Use SSDs or NVMe. Not HDDs.

etcd (the control plane database) is EXTREMELY sensitive to disk latency. On HDDs:

  • Control plane nodes can take hours to stabilize (I’ve left clusters overnight)
  • API server flickers constantly (nodes show Ready, then NotReady, then Ready again)
  • Cluster feels sluggish and unreliable
  • You’ll think something is broken (it’s just slow—patience required)

On SSDs/NVMe:

  • Control plane nodes stabilize in 2-3 minutes
  • Reliable, predictable performance
  • Cluster feels responsive

I learned this the hard way. Spent hours debugging a “broken” cluster that was just running on spinning rust. Moved to SSDs, problems vanished.

Give your cluster time to stabilize: Even on SSDs, after first boot, wait 5-10 minutes before panicking. etcd needs to form quorum, CNI needs to initialize, pods need to schedule. Patience pays off.

Minimum Non-HA Setup (for learning only):

  • Control Plane (1x): 4 vCPU, 8GB RAM, 80GB SSD
  • Worker Nodes (2x): 6 vCPU, 16GB RAM, 100GB SSD
  • Total: 3 machines, 16 vCPU, 40GB RAM, 280GB disk

What you lose:

  • No HAProxy (single control plane = single point of failure)
  • Control plane dies → entire cluster dies
  • No etcd redundancy (quorum requires 3 nodes)
  • Can’t do chaos testing (killing CP1 takes down everything)
  • Not production-ready, learning-only setup

This guide assumes you want HA. If you’re just learning Kubernetes basics, start with k3s on 1-2 nodes instead.

Software:

  • OS: SUSE Leap 15.6 (officially supported) or Ubuntu 22.04/24.04 LTS
  • SSH access to all nodes
  • Static IP addresses configured
  • Root or sudo access

Note on SUSE Leap versions: Leap 16.0 was released with a significantly improved installation process—much more modern than 15.6’s outdated installer. However, RKE2 doesn’t officially support Leap 16 yet. I tested both in production. Leap 16 worked initially but hit compatibility issues with some RKE2 components during upgrades. Stick with 15.6 for production clusters. The ancient installer is annoying, but cluster stability matters more than installation UX.

Note on Ubuntu: 24.04 LTS works fine and is what I run in production for docker-compose deployments. For RKE2, both 22.04 and 24.04 are solid choices.

Network Setup:

192.168.1.121-123  → Control plane nodes
192.168.1.124-127  → Worker nodes  
192.168.1.128      → HAProxy

Critical: Reserve these IPs in your DHCP server. Kubernetes hates when IPs change.

Setting Up Your Local Machine for Remote Management

Before touching any nodes, set up your local machine so you can manage everything remotely. SSHing into nodes every time is tedious and error-prone.

SSH Key-Based Authentication

Generate SSH keys and copy them to all nodes:

# On your local machine (Mac/Linux)
# Generate key if you don't have one
ssh-keygen -t ed25519 -C "your-email@example.com"

# Copy key to all nodes (repeat for each IP)
ssh-copy-id root@192.168.1.121
ssh-copy-id root@192.168.1.122
ssh-copy-id root@192.168.1.123
ssh-copy-id root@192.168.1.124
ssh-copy-id root@192.168.1.125
ssh-copy-id root@192.168.1.126
ssh-copy-id root@192.168.1.127
ssh-copy-id root@192.168.1.128  # HAProxy

# Test passwordless SSH
ssh root@192.168.1.121 'hostname'

Why this matters: You’ll run commands on these nodes hundreds of times. Typing passwords is hell.
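Optional but handy: a few entries in ~/.ssh/config give the nodes short names, so ssh rke2-cp-01 works instead of remembering IPs. A minimal sketch—extend the same pattern for the remaining nodes:

# Add short aliases for the cluster nodes
cat >> ~/.ssh/config << 'EOF'
Host rke2-cp-01
    HostName 192.168.1.121
    User root
Host rke2-worker-01
    HostName 192.168.1.124
    User root
EOF

# Now this works:
ssh rke2-cp-01 'hostname'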

Configure kubectl on Your Local Machine

After your first control plane is up (you’ll do this in Part 3), copy the kubeconfig to your local machine:

# On your local machine
mkdir -p ~/.kube

# Copy kubeconfig from first control plane
scp root@192.168.1.121:/etc/rancher/rke2/rke2.yaml ~/.kube/rke2-config

# Edit the config to use HAProxy IP instead of localhost
sed -i.bak 's/127.0.0.1/192.168.1.128/g' ~/.kube/rke2-config   # -i.bak works on both GNU and macOS sed

# Set KUBECONFIG environment variable (add to ~/.bashrc or ~/.zshrc)
export KUBECONFIG=~/.kube/rke2-config

# Test it
kubectl get nodes

Now you control your entire cluster from your laptop. No more SSHing into control plane nodes.

Handy kubectl Commands (Bookmark This)

You’ll use these constantly:

# Node management
kubectl get nodes                           # List all nodes
kubectl get nodes -o wide                   # Show IPs and versions
kubectl describe node rke2-cp-01            # Detailed node info
kubectl top nodes                           # Resource usage per node

# Pod management
kubectl get pods -A                         # All pods in all namespaces
kubectl get pods -n zantu                   # Pods in specific namespace
kubectl get pods -o wide                    # Show which node pods run on
kubectl describe pod <pod-name> -n zantu    # Debug pod issues
kubectl logs -f <pod-name> -n zantu         # Follow pod logs
kubectl logs <pod-name> -n zantu --previous # Logs from crashed container

# Namespace management
kubectl get namespaces                      # List all namespaces
kubectl create namespace myapp              # Create namespace
kubectl delete namespace myapp              # Delete namespace (careful!)

# Service and ingress
kubectl get svc -A                          # All services
kubectl get ingress -A                      # All ingresses
kubectl get endpoints -n zantu              # See service endpoints

# Deployments and scaling
kubectl get deployments -n zantu            # List deployments
kubectl scale deployment zantu --replicas=5 -n zantu  # Scale up/down
kubectl rollout restart deployment zantu -n zantu     # Restart deployment
kubectl rollout status deployment zantu -n zantu      # Check rollout status

# Debugging
kubectl exec -it <pod-name> -n zantu -- /bin/sh      # Shell into pod
kubectl port-forward -n zantu svc/zantu 8080:80      # Test service locally
kubectl get events -n zantu --sort-by='.lastTimestamp'  # Recent events
kubectl get secrets -n zantu                         # List secrets in namespace
kubectl describe deployment zantu -n zantu | grep -A5 "Image Pull Secrets"  # Check if secret is attached
kubectl get serviceaccount default -n zantu -o yaml  # Check default SA for imagePullSecrets

# Cluster info
kubectl cluster-info                        # Cluster endpoints
kubectl get componentstatuses               # Control plane health (deprecated, but still handy)
kubectl api-resources                       # All available resources

Debugging ImagePullBackOff errors specifically:

# Check if ghcr-secret exists in namespace
kubectl get secret ghcr-secret -n zantu

# If missing, you'll see: Error from server (NotFound): secrets "ghcr-secret" not found

# Verify deployment references the secret
kubectl get deployment zantu -n zantu -o yaml | grep -A3 imagePullSecrets

# Should show:
#   imagePullSecrets:
#   - name: ghcr-secret

# Check pod events for pull errors
kubectl describe pod <pod-name> -n zantu | grep -A10 Events

You can also see this in Rancher UI:

  1. Go to your cluster → Workloads → Deployments
  2. Click on your deployment
  3. Scroll to “Image Pull Secrets” section
  4. If it shows “None” but you’re pulling from private registry → that’s your problem

Pro tip: Create aliases in your shell:

# Add to ~/.bashrc or ~/.zshrc
alias k='kubectl'
alias kgp='kubectl get pods'
alias kgn='kubectl get nodes'
alias kga='kubectl get all -A'
alias kd='kubectl describe'
alias kl='kubectl logs -f'

Part 1: Preparing Your Nodes

System Preparation (All Nodes)

SSH into each node and run these commands. Yes, all of them. I learned the hard way what happens when you skip steps.

# Update system
zypper refresh && zypper update -y  # SUSE
# OR
apt update && apt upgrade -y        # Ubuntu

# Install required packages
zypper install -y curl wget git vim tmux  # SUSE
# OR
apt install -y curl wget git vim tmux     # Ubuntu

# Disable firewall (we'll configure it properly later)
systemctl disable --now firewalld  # SUSE
# OR
ufw disable                        # Ubuntu

# Disable swap (Kubernetes requirement - non-negotiable)
swapoff -a
sed -i '/ swap / s/^/#/' /etc/fstab

# Load required kernel modules
cat <<EOF > /etc/modules-load.d/k8s.conf
br_netfilter
overlay
EOF

modprobe br_netfilter
modprobe overlay

# Verify modules loaded
lsmod | grep br_netfilter
lsmod | grep overlay

# Configure sysctl parameters
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF

sysctl --system

Why this matters: I skipped the kernel modules once. Spent 4 hours debugging pod networking. Learn from my mistakes.

Set Hostnames (Important for Troubleshooting)

On each node:

# Control plane nodes
hostnamectl set-hostname rke2-cp-01.yourdomain.lan  # .121
hostnamectl set-hostname rke2-cp-02.yourdomain.lan  # .122  
hostnamectl set-hostname rke2-cp-03.yourdomain.lan  # .123

# Worker nodes
hostnamectl set-hostname rke2-worker-01.yourdomain.lan  # .124
hostnamectl set-hostname rke2-worker-02.yourdomain.lan  # .125
hostnamectl set-hostname rke2-worker-03.yourdomain.lan  # .126
hostnamectl set-hostname rke2-worker-04.yourdomain.lan  # .127
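If you don't run internal DNS for these names, it helps to make them resolvable anyway. A minimal sketch that appends the mappings to /etc/hosts on each node (and optionally your workstation); adjust the domain to match yours:

# Make node names resolvable without internal DNS (run on every node)
cat >> /etc/hosts << 'EOF'
192.168.1.121 rke2-cp-01.yourdomain.lan rke2-cp-01
192.168.1.122 rke2-cp-02.yourdomain.lan rke2-cp-02
192.168.1.123 rke2-cp-03.yourdomain.lan rke2-cp-03
192.168.1.124 rke2-worker-01.yourdomain.lan rke2-worker-01
192.168.1.125 rke2-worker-02.yourdomain.lan rke2-worker-02
192.168.1.126 rke2-worker-03.yourdomain.lan rke2-worker-03
192.168.1.127 rke2-worker-04.yourdomain.lan rke2-worker-04
192.168.1.128 haproxy.yourdomain.lan haproxy
EOF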

Part 2: Installing HAProxy Load Balancer

The load balancer sits in front of your control plane nodes. This is how you achieve true HA—one control plane dies, HAProxy routes to the others.

Install HAProxy

On your HAProxy node (192.168.1.128):

# Install HAProxy
zypper install -y haproxy  # SUSE
# OR
apt install -y haproxy     # Ubuntu

# Backup default config
cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.backup

# Create new config
cat > /etc/haproxy/haproxy.cfg << 'EOF'
global
    log /dev/log local0
    log /dev/log local1 notice
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log     global
    mode    tcp
    option  tcplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000

# Stats interface (optional but useful)
listen stats
    bind *:9000
    mode http
    stats enable
    stats uri /stats
    stats refresh 10s
    stats admin if TRUE

# RKE2 API Server (6443)
frontend rke2-api
    bind *:6443
    mode tcp
    option tcplog
    default_backend rke2-api-backend

backend rke2-api-backend
    mode tcp
    balance roundrobin
    option tcp-check
    server rke2-cp-01 192.168.1.121:6443 check
    server rke2-cp-02 192.168.1.122:6443 check
    server rke2-cp-03 192.168.1.123:6443 check

# RKE2 Registration (9345)
frontend rke2-registration
    bind *:9345
    mode tcp
    option tcplog
    default_backend rke2-registration-backend

backend rke2-registration-backend
    mode tcp
    balance roundrobin
    option tcp-check
    server rke2-cp-01 192.168.1.121:9345 check
    server rke2-cp-02 192.168.1.122:9345 check
    server rke2-cp-03 192.168.1.123:9345 check
EOF

# Enable and start HAProxy
systemctl enable haproxy
systemctl start haproxy
systemctl status haproxy

Test HAProxy: Open http://192.168.1.128:9000/stats in your browser. You should see the stats page. All backends will be down (red) until we install RKE2.
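If you prefer the terminal to the browser, a quick sanity check that HAProxy is actually listening on all three ports:

# Confirm HAProxy is listening on the API, registration, and stats ports
ss -tlnp | grep -E ':(6443|9345|9000)'

# The stats page is also reachable from the CLI
curl -s http://192.168.1.128:9000/stats | head -n 20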

Part 3: Installing RKE2 – First Control Plane Node

This is where it gets real. The first control plane node initializes the cluster.

Install RKE2 on First Control Plane (192.168.1.121)

# Download and install RKE2
curl -sfL https://get.rke2.io | sh -

# Create RKE2 config directory
mkdir -p /etc/rancher/rke2

# Create configuration file
cat > /etc/rancher/rke2/config.yaml << 'EOF'
# Cluster configuration
token: YOUR-SECRET-TOKEN-CHANGE-THIS
tls-san:
  - 192.168.1.128  # HAProxy IP
  - 192.168.1.121  # This node's IP
  - rke2-cp-01.yourdomain.lan

# Network configuration
cluster-cidr: 10.42.0.0/16
service-cidr: 10.43.0.0/16
cluster-dns: 10.43.0.10

# Disable components we don't need
disable:
  - rke2-ingress-nginx  # We'll use Traefik or custom ingress

# Enable components
cni:
  - calico  # Or canal, cilium - pick one CNI

# Kubelet configuration
kubelet-arg:
  - "max-pods=110"
EOF

# Start RKE2
systemctl enable rke2-server.service
systemctl start rke2-server.service

# Wait for it to start (this takes 2-3 minutes)
journalctl -u rke2-server -f

Watch for: “Wrote kubeconfig” in the logs. That means it’s ready.

Verify First Node

# Set up kubectl access
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
export PATH=$PATH:/var/lib/rancher/rke2/bin

# Check node status
kubectl get nodes

# Should see:
# NAME          STATUS   ROLES                       AGE   VERSION
# rke2-cp-01    Ready    control-plane,etcd,master   2m    v1.28.x+rke2

If node shows NotReady: Wait 2 more minutes. CNI takes time to initialize.

Quick reference: See the “Handy kubectl Commands” section earlier for debugging commands. Most useful right now:

kubectl describe node rke2-cp-01  # Why is node NotReady?
kubectl get pods -n kube-system   # Are core components running?

Get the Join Token

You’ll need this for other nodes:

cat /var/lib/rancher/rke2/server/node-token

Save this token. You’ll use it for all other nodes.

Part 4: Adding Additional Control Plane Nodes

Now we add redundancy. This is what makes your cluster production-grade.

On Second Control Plane (192.168.1.122)

# Install RKE2
curl -sfL https://get.rke2.io | sh -

# Create config (NOTE: we point to HAProxy, not first node directly)
mkdir -p /etc/rancher/rke2
cat > /etc/rancher/rke2/config.yaml << 'EOF'
server: https://192.168.1.128:9345  # HAProxy IP!
token: YOUR-TOKEN-FROM-FIRST-NODE
tls-san:
  - 192.168.1.128
  - 192.168.1.122
  - rke2-cp-02.yourdomain.lan
EOF

# Start service
systemctl enable rke2-server.service
systemctl start rke2-server.service

# Monitor logs
journalctl -u rke2-server -f

On Third Control Plane (192.168.1.123)

Same process, just change IPs and hostname in config.yaml:

server: https://192.168.1.128:9345
token: YOUR-TOKEN-FROM-FIRST-NODE
tls-san:
  - 192.168.1.128
  - 192.168.1.123
  - rke2-cp-03.yourdomain.lan

Verify All Control Planes

From your local machine (remember, we set up kubectl earlier):

kubectl get nodes

You should see all 3 control plane nodes in Ready state.

Part 5: Adding Worker Nodes

Workers are where your actual applications run. This is where you need more resources.

On Each Worker Node

# Install RKE2 agent (not server!)
curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" sh -

# Create config
mkdir -p /etc/rancher/rke2
cat > /etc/rancher/rke2/config.yaml << 'EOF'
server: https://192.168.1.128:9345
token: YOUR-TOKEN-FROM-FIRST-NODE
node-label:
  - "node-role.kubernetes.io/worker=true"
EOF

# Start agent
systemctl enable rke2-agent.service
systemctl start rke2-agent.service

# Monitor
journalctl -u rke2-agent -f

Repeat for all worker nodes (.124, .125, .126, .127).
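If you'd rather not SSH into each worker by hand, a loop like this from your local machine performs the same steps on every worker. It's a sketch: the token is a placeholder and the IP list assumes the layout above.

# Install and start the RKE2 agent on all workers in one pass
TOKEN='YOUR-TOKEN-FROM-FIRST-NODE'

for ip in 192.168.1.124 192.168.1.125 192.168.1.126 192.168.1.127; do
  echo "=== Configuring worker $ip ==="
  ssh root@$ip bash -s << EOF
curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" sh -
mkdir -p /etc/rancher/rke2
cat > /etc/rancher/rke2/config.yaml << 'INNER'
server: https://192.168.1.128:9345
token: $TOKEN
node-label:
  - "node-role.kubernetes.io/worker=true"
INNER
systemctl enable --now rke2-agent.service
EOF
done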

Verify Complete Cluster

kubectl get nodes -o wide

You should see:

  • 3 control-plane nodes
  • 4 worker nodes
  • All showing Ready status

Useful commands for this stage:

kubectl get nodes -o wide              # See IPs and Kubernetes versions
kubectl top nodes                      # Check resource usage
kubectl get pods -A -o wide            # See all pods and which nodes they're on

Part 6: Installing Rancher Management Platform

Rancher gives you a UI to manage everything. It’s optional but incredibly useful.

Install Cert-Manager First (Required)

# Add Helm (if not installed)
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

# Install cert-manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml

# Wait for it to be ready
kubectl get pods -n cert-manager

Install Rancher

# Add Rancher Helm repo
helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
helm repo update

# Create namespace
kubectl create namespace cattle-system

# Install Rancher
helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.yourdomain.lan \
  --set replicas=3 \
  --set bootstrapPassword=CHANGE-THIS-PASSWORD

# Wait for deployment
kubectl -n cattle-system rollout status deploy/rancher

Access Rancher UI

  1. Add DNS entry: rancher.yourdomain.lan → 192.168.1.128 (HAProxy)
  2. Or add to /etc/hosts: 192.168.1.128 rancher.yourdomain.lan
  3. Open browser: https://rancher.yourdomain.lan
  4. Login with bootstrap password
  5. Set new password when prompted

You now have a Rancher-managed RKE2 cluster!

Part 7: Installing MetalLB for LoadBalancer Services

On bare metal (no cloud provider), Kubernetes LoadBalancer services stay in “Pending” state forever. MetalLB fixes this by providing load balancer functionality using your own IP pool.

Why MetalLB Matters

Without it, your only options for exposing services externally are:

  • NodePort (ugly, non-standard ports)
  • Ingress (adds complexity for simple services)

With MetalLB, you get real LoadBalancer IPs just like cloud providers give you.

Install MetalLB

# Install MetalLB via manifest
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.0/config/manifests/metallb-native.yaml

# Wait for MetalLB pods to be ready
kubectl wait --namespace metallb-system \
  --for=condition=ready pod \
  --selector=app=metallb \
  --timeout=90s

Configure IP Address Pool

You need to give MetalLB a pool of IPs it can assign. These should be:

  • On the same network as your cluster
  • Not used by DHCP
  • Reserved/excluded from your router’s DHCP range

Example: Reserve 192.168.1.200-192.168.1.220

# metallb-config.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.1.200-192.168.1.220
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default
  namespace: metallb-system
spec:
  ipAddressPools:
  - default-pool

Apply it:

kubectl apply -f metallb-config.yaml

Test MetalLB

Create a test service:

kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=LoadBalancer

Check the external IP:

kubectl get svc nginx

You should see an IP from your pool (e.g., 192.168.1.200). Open it in your browser—nginx welcome page should appear.

MetalLB is now providing LoadBalancer IPs on your bare metal cluster.

Clean up test:

kubectl delete svc nginx
kubectl delete deployment nginx

Part 8: Deploying Your First Real Application

Let’s deploy something real—not hello-world. I’ll show you how to deploy a containerized application from GitHub Container Registry.

Full disclosure: The application I’m using as an example is Zantu, the multi-tenant SaaS platform I’m building. It’s my actual production application, not a toy demo.

Why does this matter? Because I’m showing you a real production setup with:

  • High-availability database configuration
  • Pod anti-affinity rules for resilience
  • Actual resource limits based on real usage
  • Monitoring and health checks that caught real bugs

Your application will be different, but the patterns are universal. Adapt the specifics to your needs.

⚠️ CRITICAL GOTCHA: If you’re pulling images from ghcr.io (or any private registry), you MUST create an imagePullSecret in each namespace. Without it, pods will show “ImagePullBackOff” errors and you’ll waste hours debugging. I know. I did it.

Example: Deploying a SaaS Application (Zantu)

Prerequisites:

  • Docker image built and pushed to GitHub Container Registry (ghcr.io)
  • Application requires PostgreSQL database

Create Namespace and Registry Secret

CRITICAL: Before deploying anything from GitHub Container Registry, you need credentials. I spent 3 hours debugging “ImagePullBackOff” errors before realizing this. Don’t be me.

# Create namespace
kubectl create namespace zantu

# Create secret for GitHub Container Registry
# You need a GitHub Personal Access Token (PAT) with read:packages permission
kubectl create secret docker-registry ghcr-secret \
  --docker-server=ghcr.io \
  --docker-username=YOUR-GITHUB-USERNAME \
  --docker-password=YOUR-GITHUB-PAT \
  --docker-email=your-email@example.com \
  -n zantu

How to get a GitHub PAT:

  1. GitHub → Settings → Developer Settings → Personal Access Tokens → Tokens (classic)
  2. Generate new token
  3. Select scope: read:packages
  4. Copy the token (you won’t see it again!)

CRITICAL: Secrets Don’t Cross Namespaces

This secret only exists in the zantu namespace. If you:

  • Create a new namespace → Need to recreate the secret
  • Delete and recreate a namespace → Need to recreate the secret
  • Deploy to multiple namespaces → Need the secret in EACH one

Example – What happens if you forget:

# You deploy to a new namespace without the secret
kubectl create namespace myapp
kubectl apply -f deployment.yaml -n myapp

# Your pods will fail with ImagePullBackOff
kubectl get pods -n myapp
# NAME                     READY   STATUS             RESTARTS   AGE
# myapp-7d5f8c4b9-abcde    0/1     ImagePullBackOff   0          2m

# Check the error
kubectl describe pod myapp-7d5f8c4b9-abcde -n myapp
# Events:
# Failed to pull image "ghcr.io/user/app:latest": 
# Error: pull access denied, authentication required

The fix – recreate the secret in the new namespace:

kubectl create secret docker-registry ghcr-secret \
  --docker-server=ghcr.io \
  --docker-username=YOUR-GITHUB-USERNAME \
  --docker-password=YOUR-GITHUB-PAT \
  --docker-email=your-email@example.com \
  -n myapp  # Note: different namespace!

# Now restart the deployment to pull the image
kubectl rollout restart deployment myapp -n myapp

Pro tip – Create a script for this:

Save this as create-ghcr-secret.sh:

#!/bin/bash
# Usage: ./create-ghcr-secret.sh namespace-name

NAMESPACE=$1
GITHUB_USERNAME="your-username"
GITHUB_PAT="ghp_your_token_here"  # Or use: read -sp "GitHub PAT: " GITHUB_PAT
GITHUB_EMAIL="your-email@example.com"

if [ -z "$NAMESPACE" ]; then
    echo "Usage: $0 <namespace>"
    exit 1
fi

echo "Creating ghcr-secret in namespace: $NAMESPACE"

kubectl create secret docker-registry ghcr-secret \
  --docker-server=ghcr.io \
  --docker-username=$GITHUB_USERNAME \
  --docker-password=$GITHUB_PAT \
  --docker-email=$GITHUB_EMAIL \
  -n $NAMESPACE \
  --dry-run=client -o yaml | kubectl apply -f -

echo "Done! Secret created in $NAMESPACE"

Make it executable:

chmod +x create-ghcr-secret.sh

Now whenever you create a new namespace that needs GitHub Container Registry access:

./create-ghcr-secret.sh production
./create-ghcr-secret.sh staging
./create-ghcr-secret.sh development

Why this is so annoying:

Kubernetes doesn’t share secrets across namespaces by design (security isolation). But this means if you delete a namespace to “clean up,” you lose the secret too. When you recreate the namespace, you must recreate the secret or your deployments will fail.

I’ve wasted hours on this. Now you won’t.
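An alternative, if the secret already exists in one namespace, is to copy it instead of recreating it. A sketch assuming jq is installed (myapp is a hypothetical target namespace):

# Copy ghcr-secret from the zantu namespace into another namespace
kubectl get secret ghcr-secret -n zantu -o json \
  | jq 'del(.metadata.namespace, .metadata.uid, .metadata.resourceVersion, .metadata.creationTimestamp)' \
  | kubectl apply -n myapp -f -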

Deploy PostgreSQL (Production HA Setup)

For production, you want high-availability PostgreSQL with streaming replication. This gives you:

  • 1 primary (handles writes)
  • 2+ replicas (handle reads, can be promoted to primary if primary fails)

Why not just replicas: 3 on a basic StatefulSet? That creates 3 independent databases, not 1 database with replication. Big difference.

Option 1: Use a Postgres Operator (Recommended)

The easiest production-ready approach is Zalando’s Postgres Operator:

# Add Zalando Helm repo
helm repo add postgres-operator-charts https://opensource.zalando.com/postgres-operator/charts/postgres-operator
helm repo update

# Install the operator
helm install postgres-operator postgres-operator-charts/postgres-operator \
  --namespace postgres-operator --create-namespace

# Install the UI (optional but useful)
helm install postgres-operator-ui postgres-operator-charts/postgres-operator-ui \
  --namespace postgres-operator

Then create a PostgreSQL cluster:

# postgres-cluster.yaml
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: zantu-postgres
  namespace: zantu
spec:
  teamId: "zantu"
  volume:
    size: 10Gi
  numberOfInstances: 3  # 1 primary + 2 replicas
  users:
    zantu:  # application user
    - superuser
    - createdb
  databases:
    zantu: zantu  # database owned by user
  postgresql:
    version: "15"
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 1000m
      memory: 2Gi
  # Pod anti-affinity - spread across workers
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            application: spilo
            cluster-name: zantu-postgres
        topologyKey: kubernetes.io/hostname

Apply it:

kubectl apply -f postgres-cluster.yaml

What this gives you:

  • Automatic failover (if primary dies, replica promotes)
  • Connection pooling (via pgBouncer)
  • Backup/restore capabilities
  • Read replicas for scaling reads

Get the connection string:

# Primary (read/write)
kubectl get secret zantu.zantu-postgres.credentials.postgresql.acid.zalan.do \
  -n zantu -o jsonpath='{.data.password}' | base64 -d

# Connection string:
# postgresql://zantu:<password>@zantu-postgres:5432/zantu

Option 2: Manual StatefulSet with Replication (If You Want Control)

If you want to understand what the operator is doing under the hood:

# postgres-ha.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-config
  namespace: zantu
data:
  POSTGRES_DB: "zantu"
  POSTGRES_USER: "zantu"
  PGDATA: "/var/lib/postgresql/data/pgdata"
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: zantu
spec:
  serviceName: postgres
  replicas: 3  # 1 primary + 2 replicas
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      # Pod anti-affinity - spread across workers
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - postgres
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: postgres
        image: postgres:15-alpine
        ports:
        - containerPort: 5432
          name: postgres
        envFrom:
        - configMapRef:
            name: postgres-config
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: password
        - name: POSTGRES_REPLICATION_MODE
          value: "master"  # Override this for replicas in init script
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: 1000m
            memory: 2Gi
        volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
        livenessProbe:
          exec:
            command:
            - pg_isready
            - -U
            - zantu
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - pg_isready
            - -U
            - zantu
          initialDelaySeconds: 5
          periodSeconds: 5
  volumeClaimTemplates:
  - metadata:
      name: postgres-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: zantu
spec:
  selector:
    app: postgres
  ports:
  - port: 5432
  clusterIP: None  # Headless service
---
apiVersion: v1
kind: Service
metadata:
  name: postgres-primary
  namespace: zantu
spec:
  selector:
    app: postgres
    role: primary  # Only route to primary
  ports:
  - port: 5432
---
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secret
  namespace: zantu
type: Opaque
stringData:
  password: "CHANGE-THIS-PASSWORD"

Apply it:

kubectl apply -f postgres-ha.yaml

Understanding Pod Anti-Affinity:

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - postgres
      topologyKey: "kubernetes.io/hostname"

What this does:

  • requiredDuringScheduling → Kubernetes MUST place pods on different nodes
  • matchExpressions: app=postgres → Don’t put postgres pods together
  • topologyKey: kubernetes.io/hostname → “Different nodes” means different hostnames

Result: If you have 3 postgres replicas and 4 workers:

  • postgres-0 → worker-01
  • postgres-1 → worker-02
  • postgres-2 → worker-03

If worker-01 dies, postgres-0 dies, but postgres-1 and postgres-2 keep running on worker-02 and worker-03.

For my Zantu deployment, I use the Zalando operator. It handles replication, failover, backups automatically. The manual approach teaches you the concepts, but the operator is what you run in production.

Connection string for your app:

DATABASE_URL=postgresql://zantu:password@zantu-postgres:5432/zantu  # Operator
DATABASE_URL=postgresql://zantu:password@postgres-primary:5432/zantu  # Manual

Deploy Application

Now deploy Zantu with proper HA configuration:

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: zantu
  namespace: zantu
spec:
  replicas: 3
  selector:
    matchLabels:
      app: zantu
  template:
    metadata:
      labels:
        app: zantu
    spec:
      # Pod anti-affinity - spread across workers
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - zantu
              topologyKey: kubernetes.io/hostname
      imagePullSecrets:
      - name: ghcr-secret
      containers:
      - name: zantu
        image: ghcr.io/your-username/zantu:latest
        ports:
        - containerPort: 5000
        env:
        - name: DATABASE_URL
          value: "postgresql://zantu:CHANGE-THIS@zantu-postgres:5432/zantu"  # Or postgres-primary
        - name: NODE_ENV
          value: "production"
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
        livenessProbe:
          httpGet:
            path: /healthz
            port: 5000
          initialDelaySeconds: 10
          periodSeconds: 30
        readinessProbe:
          httpGet:
            path: /readyz
            port: 5000
          initialDelaySeconds: 5
          periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: zantu
  namespace: zantu
spec:
  selector:
    app: zantu
  ports:
  - port: 80
    targetPort: 5000
  type: ClusterIP

Understanding the anti-affinity rule:

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:  # "Try hard, but not required"
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - zantu
        topologyKey: kubernetes.io/hostname

What this means:

  • preferredDuringScheduling → Try to spread pods, but if you can’t (e.g., only 2 workers available but 3 replicas), place them anyway
  • weight: 100 → High priority for spreading (0-100 scale)
  • Result: Kubernetes tries to put each zantu pod on a different worker

Why “preferred” vs “required”?

  • Required: Strict. If you have 3 replicas but only 2 workers, 3rd pod stays pending forever
  • Preferred: Flexible. Spread when possible, but allow bunching if necessary

For application pods, use “preferred.” For stateful services like databases, use “required.”

Test the anti-affinity:

kubectl apply -f deployment.yaml

# Check pod distribution
kubectl get pods -n zantu -o wide

# You should see pods spread across different worker nodes:
# NAME                     READY   STATUS    RESTARTS   AGE   NODE
# zantu-7d5f8c4b9-abcde    1/1     Running   0          2m    rke2-worker-01
# zantu-7d5f8c4b9-fghij    1/1     Running   0          2m    rke2-worker-02
# zantu-7d5f8c4b9-klmno    1/1     Running   0          2m    rke2-worker-03

Now kill a worker and watch:

You have two options:

Option 1: Graceful drain (boring but proper)

# From your local machine
kubectl drain rke2-worker-01 --ignore-daemonsets --delete-emptydir-data

# Then shut down the VM
# In Proxmox: Right-click → Shutdown

This gives pods time to migrate. Takes 30-60 seconds. Very civilized.
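One extra step the graceful path needs: a drained node stays cordoned (unschedulable) after it comes back online until you tell Kubernetes otherwise.

# After the drained node is powered back on and Ready, re-enable scheduling
kubectl uncordon rke2-worker-01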

Option 2: Brutal kill (what actually happens in real failures)

# In Proxmox: Right-click → Stop (not shutdown, STOP)
# Or just pull the power cord from physical server

This is what we’re doing. We like to be brutal about it and just kill the VM, hoping for the best.

Because in production, servers don’t gracefully drain themselves before the power supply fails or the kernel panics. They just DIE.

Watch what happens:

# Shut down worker-01
# In Proxmox: Stop VM

# Watch pod rescheduling
kubectl get pods -n zantu -o wide -w

# The pod from worker-01 will reschedule to worker-04
# Your application stays up because pods on worker-02 and worker-03 keep running

This is production HA in action.


Verify Deployment

# Check pods
kubectl get pods -n zantu

# Check logs
kubectl logs -f deployment/zantu -n zantu

# Port-forward to test locally
kubectl port-forward -n zantu service/zantu 8080:80

Open http://localhost:8080 – your app should be running!

Automating Deployments with a Script

SSHing into nodes every time you update your app is tedious. Here’s a deployment script you can run from your local machine.

Create deploy.sh on your local machine:

#!/bin/bash
# Zantu Deployment Script - Run from your local machine
# Usage: ./deploy.sh [tag]
# Example: ./deploy.sh v1.2.3

set -e

# Configuration
IMAGE_NAME="ghcr.io/your-username/zantu"
TAG="${1:-latest}"
NAMESPACE="zantu"
DEPLOYMENT="zantu"

echo "=========================================="
echo "  Deploying Zantu ${TAG}"
echo "=========================================="

# Check if kubectl is configured
if ! kubectl cluster-info &> /dev/null; then
    echo "ERROR: kubectl is not configured or cluster is unreachable"
    exit 1
fi

echo ""
echo "1. Checking current deployment status..."
kubectl get deployment ${DEPLOYMENT} -n ${NAMESPACE}

echo ""
echo "2. Updating image to ${IMAGE_NAME}:${TAG}..."
kubectl set image deployment/${DEPLOYMENT} \
    ${DEPLOYMENT}=${IMAGE_NAME}:${TAG} \
    -n ${NAMESPACE}

echo ""
echo "3. Watching rollout status..."
kubectl rollout status deployment/${DEPLOYMENT} -n ${NAMESPACE}

echo ""
echo "4. Verifying pods are running..."
kubectl get pods -n ${NAMESPACE} -l app=${DEPLOYMENT}

echo ""
echo "=========================================="
echo "  Deployment Complete!"
echo "=========================================="
echo ""
echo "Check logs: kubectl logs -f deployment/${DEPLOYMENT} -n ${NAMESPACE}"
echo "Rollback:   kubectl rollout undo deployment/${DEPLOYMENT} -n ${NAMESPACE}"
echo ""

Make it executable:

chmod +x deploy.sh

Usage:

# Deploy latest
./deploy.sh

# Deploy specific version
./deploy.sh v1.2.3

# Rollback if something breaks
kubectl rollout undo deployment/zantu -n zantu

What this does:

  • Updates the deployment’s container image
  • Triggers rolling update (zero downtime)
  • Waits for new pods to be healthy
  • Shows you the status

Pro tip: Add this to your CI/CD pipeline (GitHub Actions, GitLab CI) to auto-deploy on git push.

GitHub Actions example:

# .github/workflows/deploy.yml
name: Deploy to Kubernetes

on:
  push:
    branches: [ main ]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Build and push Docker image
        run: |
          echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
          docker build -t ghcr.io/${{ github.repository }}:${{ github.sha }} .
          docker push ghcr.io/${{ github.repository }}:${{ github.sha }}
      
      - name: Deploy to cluster
        run: |
          echo "${{ secrets.KUBECONFIG }}" > kubeconfig
          export KUBECONFIG=./kubeconfig
          kubectl set image deployment/zantu zantu=ghcr.io/${{ github.repository }}:${{ github.sha }} -n zantu
          kubectl rollout status deployment/zantu -n zantu

Now your deployments are one git push away.

Part 9: Real-World Learnings (The Expensive Lessons)

etcd on NVMe: Worth It?

Short answer: Yes, if you have it. But understand the tradeoffs.

I moved etcd (the control plane database) to NVMe expecting massive performance gains. What I learned:

Consumer NVMe (Samsung 980 Pro, WD Black, etc):

  • ✅ Perfect for testing and homelab
  • ✅ Massive improvement over SATA SSDs for responsiveness
  • ⚠️ Write amplification under constant etcd workload
  • ⚠️ Degrades over time with small random writes
  • 💰 Cost: $50-150 per drive

Enterprise NVMe (Intel P5800X, Micron 9300, Samsung PM series):

  • ✅ Handles etcd write patterns without degradation
  • ✅ Power-loss protection (critical for etcd consistency)
  • ✅ 3-5x endurance rating vs consumer drives
  • 💰 Cost: $400-600 per 480GB drive

Performance gain either way: 20-30% reduction in API response times under load.

My recommendation:

  • Homelab/learning: Consumer NVMe is totally fine
  • Production with real users: Enterprise NVMe or good SATA SSDs
  • Mission-critical production: Enterprise NVMe with proper backups

Cost analysis for production:

  • Enterprise 480GB NVMe: ~$400-600
  • 3x drives for HA: ~$1,500
  • Alternative: Quality SATA SSDs work great and cost 1/3 as much

Verdict: Consumer NVMe for testing/learning, enterprise for production. Or just use good SATA SSDs and skip the cost entirely—they’re fine for most use cases.

Chaos Engineering: What Actually Breaks

I deliberately broke my cluster in various ways. Here’s what I learned:

Test 1: Kill a control plane node

  • Expected: Cluster continues normally
  • Reality: 40-second hiccup while etcd re-elects leader
  • Fix: Reduce etcd heartbeat interval (not recommended for most)

Test 2: Kill a worker node

  • Expected: Pods reschedule to other workers
  • Reality: 5-minute delay before Kubernetes notices node is down
  • Fix: Configure node-monitor-period and node-monitor-grace-period (see the config sketch after these tests)

Test 3: Fill a worker’s disk

  • Expected: Graceful degradation
  • Reality: Node goes into EvictionHard state, kills ALL pods
  • Fix: Monitor disk usage, set proper eviction thresholds

Test 4: Network partition (split-brain scenario)

  • Expected: Cluster handles it gracefully
  • Reality: Got two separate clusters until network restored
  • Fix: Ensure odd number of control planes (3, not 2 or 4)
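For the fixes in Tests 2 and 3, the knobs live in the RKE2 config file. A sketch with aggressive example values, not recommendations: most of Test 2's 5-minute delay actually comes from the default 300-second pod eviction tolerations, so those are included alongside the node monitor setting. Test these before trusting them in production.

# /etc/rancher/rke2/config.yaml on control plane nodes (merge with existing config, then restart rke2-server)
kube-controller-manager-arg:
  - "node-monitor-grace-period=20s"              # mark silent nodes NotReady sooner (default 40s)
kube-apiserver-arg:
  - "default-not-ready-toleration-seconds=60"    # reschedule pods off dead nodes sooner (default 300s)
  - "default-unreachable-toleration-seconds=60"

# For Test 3, add explicit disk-pressure eviction thresholds to the worker nodes' config.yaml
kubelet-arg:
  - "eviction-hard=nodefs.available<10%,imagefs.available<15%"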

Resource Limits: The Hidden Gotcha

What the documentation says: “Set resource requests and limits on all pods”

What actually happens:

  • Set limits too low → pods get OOMKilled constantly
  • Set limits too high → waste resources, can’t schedule pods
  • Set requests without limits → one pod can starve others

My approach after 100+ restarts:

# For typical web apps:
resources:
  requests:
    cpu: 100m      # Actual average usage
    memory: 256Mi  # 2x average usage
  limits:
    cpu: 500m      # 5x average, allows bursts
    memory: 512Mi  # 2x requests, prevents runaway

Monitor actual usage for a week, then adjust. Theoretical planning fails here.
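To see what "actual average usage" looks like, lean on metrics-server (RKE2 ships one by default, and the monitoring stack in Part 10 adds history):

# Per-container CPU/memory right now -- sample this regularly before settling on limits
kubectl top pods -n zantu --containers

# Sort everything by memory to spot the hungry pods
kubectl top pods -A --sort-by=memory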

Chaos Monkey Exercise: Kill and Replace Control Plane 1

Why do this? Because someday CP1 WILL die. Better to learn how to handle it when you’re awake and caffeinated.

The scenario: Your first control plane node dies catastrophically. Maybe motherboard failure, maybe you accidentally formatted it while half-asleep. It happens.

Step 1: Verify cluster health before chaos

kubectl get nodes
kubectl get pods -A

Everything should be healthy. Take a screenshot—you’ll want proof later.

Step 2: Kill CP1 (192.168.1.121)

Shut it down hard:

# On CP1 node
poweroff

Or in Proxmox: Right-click VM → Stop (not shutdown, STOP).

Step 3: Watch what happens

From CP2 or CP3:

# Watch nodes
kubectl get nodes -w

# Watch etcd leader election
kubectl get pods -n kube-system | grep etcd

What you’ll observe:

  • CP1 goes NotReady after ~40 seconds
  • etcd re-elects leader (2-5 second hiccup in API)
  • Cluster continues operating normally
  • Pods keep running on workers
  • Rancher UI might hiccup briefly but recovers

This is HA working correctly.

Step 4: Replace CP1 with fresh node

Now the fun part—rebuilding a control plane node from scratch while cluster is live.

Option A: Rebuild same node

  1. Reinstall OS on CP1 (192.168.1.121)
  2. Follow “Part 4: Adding Additional Control Plane Nodes” steps
  3. Point to HAProxy: server: https://192.168.1.128:9345
  4. Use same token from surviving control planes
  5. Start rke2-server

The new CP1 will join the existing cluster and sync etcd from CP2/CP3.

Option B: Replace with different node

Maybe CP1 hardware is dead. Build a new VM/server:

  1. Assign NEW IP (e.g., 192.168.1.129)
  2. Install OS, hostname: rke2-cp-04
  3. Follow control plane join procedure
  4. After it’s healthy, update HAProxy config to remove old .121, add new .129
# On HAProxy node, edit config
vi /etc/haproxy/haproxy.cfg

# In both backend sections, replace:
# server rke2-cp-01 192.168.1.121:6443 check
# server rke2-cp-01 192.168.1.121:9345 check

# With:
# server rke2-cp-04 192.168.1.129:6443 check
# server rke2-cp-04 192.168.1.129:9345 check

# Reload HAProxy (no downtime!)
systemctl reload haproxy

Step 5: Clean up old node from cluster

Once new control plane is healthy:

# Delete old node from Kubernetes
kubectl delete node rke2-cp-01

# Verify cluster health
kubectl get nodes
kubectl get componentstatuses

Step 6: Verify etcd cluster

# On any surviving control plane
export ETCDCTL_API=3
etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt \
  --cert=/var/lib/rancher/rke2/server/tls/etcd/server-client.crt \
  --key=/var/lib/rancher/rke2/server/tls/etcd/server-client.key \
  member list

# Should show 3 healthy members (2 old + 1 new)

What you learned:

✅ Control plane nodes are replaceable (they’re cattle, not pets)

✅ etcd quorum (2/3) keeps cluster alive

✅ HAProxy makes the replacement transparent to clients

✅ You can rebuild infrastructure without taking down applications

Do this exercise quarterly. Muscle memory matters when production breaks.

Part 10: Maintenance & Operations

Backing Up etcd (Critical!)

You will lose data if you don’t do this.

# On a control plane node
ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt \
  --cert=/var/lib/rancher/rke2/server/tls/etcd/server-client.crt \
  --key=/var/lib/rancher/rke2/server/tls/etcd/server-client.key

# Copy snapshot to safe location
# Automate this with a cronjob!
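Or skip the cron job: RKE2 can schedule etcd snapshots itself. A sketch of the relevant config keys (schedule, retention, and path here are examples):

# /etc/rancher/rke2/config.yaml on control plane nodes -- built-in snapshot schedule
etcd-snapshot-schedule-cron: "0 */6 * * *"   # every 6 hours
etcd-snapshot-retention: 10                  # keep the last 10 snapshots
etcd-snapshot-dir: /var/lib/rancher/rke2/server/db/snapshots

Either way, copy the snapshots off the node. A snapshot that only lives on the machine it protects isn't a backup.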

Upgrading RKE2

Never upgrade all nodes at once. Never.

# Upgrade first control plane
systemctl stop rke2-server
curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION=v1.28.x+rke2r1 sh -
systemctl start rke2-server

# Wait 5 minutes, verify cluster health
kubectl get nodes

# Repeat for other control planes one at a time
# Then upgrade workers one at a time
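For the workers, it's the same idea with a drain in front. A sketch for one worker (repeat per node, waiting for it to come back Ready in between):

# Move workloads off the node first
kubectl drain rke2-worker-01 --ignore-daemonsets --delete-emptydir-data

# On the worker itself: stop, upgrade, restart the agent
ssh root@192.168.1.124 '
  systemctl stop rke2-agent
  curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION=v1.28.x+rke2r1 INSTALL_RKE2_TYPE=agent sh -
  systemctl start rke2-agent
'

# Let it take workloads again
kubectl uncordon rke2-worker-01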

Monitoring Stack (Prometheus + Grafana)

Don’t skip this. You need metrics to understand what’s actually happening in your cluster.

Easy way (via Rancher UI):

  1. Open Rancher → Your Cluster
  2. Go to “Apps” → “Charts”
  3. Search for “Monitoring”
  4. Click “Install”
  5. Set Grafana admin password
  6. Click “Install”

Done. Rancher configures everything correctly. Access Grafana through Rancher’s UI.

Manual way (via Helm):

If you want to understand what Rancher is doing under the hood:

# Add Prometheus Community Helm repo
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Create monitoring namespace
kubectl create namespace monitoring

# Install the full stack (Prometheus + Grafana + AlertManager + exporters)
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false \
  --set grafana.adminPassword=CHANGE-THIS-PASSWORD

# Wait for pods to be ready (takes 2-3 minutes)
kubectl get pods -n monitoring -w

Access Grafana:

# Port-forward Grafana to your local machine
kubectl port-forward -n monitoring svc/kube-prometheus-stack-grafana 3000:80

Open http://localhost:3000

  • Username: admin
  • Password: Whatever you set in --set grafana.adminPassword

Pre-built dashboards: Grafana comes with dashboards already configured. Check:

  • Kubernetes / Compute Resources / Cluster
  • Kubernetes / Compute Resources / Namespace (Pods)
  • Node Exporter / Nodes

What you’re monitoring:

  • Cluster resource usage (CPU, memory, disk)
  • Pod health and restarts
  • Node health
  • etcd performance
  • API server latency

Set up alerts:

The stack includes AlertManager but you need to configure it:

# Edit AlertManager config
kubectl -n monitoring edit secret alertmanager-kube-prometheus-stack-alertmanager

Add your notification channels (Slack, email, PagerDuty). This is critical for production—you need to know when things break before your users do.
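The config itself is stored base64-encoded in that secret (typically under an alertmanager.yaml key). For reference, a minimal Slack receiver looks roughly like this; the webhook URL and channel are placeholders:

# alertmanager.yaml -- minimal example with a single Slack receiver
global:
  resolve_timeout: 5m

route:
  receiver: slack-notifications
  group_by: ['alertname', 'namespace']

receivers:
  - name: slack-notifications
    slack_configs:
      - api_url: https://hooks.slack.com/services/XXX/YYY/ZZZ   # placeholder webhook
        channel: '#alerts'
        send_resolved: true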

What You’ve Built

You now have:

✅ High-availability Kubernetes cluster (3 control planes)

✅ Load-balanced control plane (HAProxy)

✅ 4 worker nodes ready for workloads

✅ Rancher management UI

✅ MetalLB for LoadBalancer services on bare metal

✅ Real application deployed from GitHub

✅ Understanding of failure modes through chaos testing

✅ Maintenance procedures that actually work

The Real Lessons

Building this cluster taught me more about Kubernetes than any certification ever could.

Not because I followed a perfect plan. Because I:

  1. Built something I actually needed (my hosting platform)
  2. Shipped fast, iterated based on reality
  3. Broke things deliberately to understand failure modes
  4. Documented the expensive lessons so you don’t repeat them

Total cost: ~$0 if you have hardware, ~$100-200/month if using cloud VMs for learning

Time investment: 2-3 days following this guide for basic deployment. I spent a week doing 12-hour days including all the chaos testing, breaking things deliberately, and documenting every mistake. You get to skip my debugging sessions.

ROI: You now understand infrastructure at a level most developers never reach.


Questions Worth Asking

“Why not just use k3s?”

Valid option. RKE2 is production-hardened for enterprise. k3s is lighter. Pick based on your requirements.

“Why HAProxy instead of MetalLB for control plane?”

HAProxy gives you a single, stable endpoint for the control plane. MetalLB is for application services. Different jobs.

“Isn’t this overengineered?”

Depends on your goal. If you want to run containers, yes. If you want to understand infrastructure deeply enough to troubleshoot production failures, no.

Compare this to running the same stack with docker-compose:

  • Setup time: 5 minutes vs 2 days
  • What you learn: How to run containers vs How infrastructure actually works
  • What breaks at 2am: Everything, and you’re helpless vs Specific component, and you know exactly how to fix it

I wrote a companion article showing the docker-compose version—same Zantu application, 10% of the complexity, 10% of the understanding: “Deploying Zantu with Docker Compose: When 5 Minutes Beats 2 Days”.

“Why bare metal instead of managed Kubernetes?”

Control and learning. Managed services hide complexity—that’s a feature until it’s a bug and you’re helpless at 2am.


Next Steps

  1. Harden security: Configure NetworkPolicies, PodSecurityPolicies
  2. Add storage: Install Longhorn for persistent volumes
  3. Setup CI/CD: Deploy ArgoCD or Flux for GitOps
  4. Monitor costs: Set up resource quotas and limits
  5. Document everything: Future you will thank present you

Questions? Problems? The official RKE2 docs are actually good: https://docs.rke2.io

But remember: Documentation tells you what to do. Breaking things in production teaches you why.

Build it. Break it. Learn from it.

That’s the system.


For the thought leadership angle on why hands-on learning beats theory, see the companion article: “The Pottery Class Paradox: Why Rapid Iteration + Reflection Beats Both Quantity and Quality”.

