I am a big fan of OpenShift and its open-source upstream, OKD. Kubernetes is a bunch of building blocks and APIs that can be messy to assemble and configure, and Red Hat has done a great job packaging Kubernetes up.
Installing on most IaaS platforms is great… except vSphere, where UPI (User Provisioned Infrastructure) is quite complex and involved, and IPI (Installer Provisioned Infrastructure) is almost perfect save for one major issue!
When you install via IPI on vSphere, OKD/OpenShift boots VMs with DHCP: first a bootstrap node, which grabs the pair of virtual IPs you configure, then the three master nodes. The virtual IPs point at a pair of keepalived-managed load balancers that track nodes as they boot. Unfortunately, keepalived only checks whether the process is running, not whether the load balancer is actually passing traffic, so it’s common to see installations fail with timeouts when the load balancer gets stuck, perhaps while moving from the bootstrap node to another node. Its traffic just stalls.
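A quick way to see the difference between “the keepalived process is up” and “the endpoint is actually answering” is to poke the VIP directly from another machine on the network. The kube-apiserver serves a /readyz endpoint on 6443, and the machine config server hands out ignition configs on 22623; the address below is the API VIP from my lab, so substitute your own. Even an error response proves the endpoint is alive, while a hang is exactly the stall described above.

# Does the API VIP actually answer? (172.29.0.3 is my lab's API VIP)
curl -k https://172.29.0.3:6443/readyz

# Is the machine config server reachable on the same VIP?
curl -k https://172.29.0.3:22623/config/master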
The solution I used on my home lab was to set up an HAProxy load balancer on a tiny Debian box and point my DNS at it. The HAProxy backends start out containing just the two VIP addresses; as the master nodes boot and come online I add them directly, so if the keepalived balancer fails, everything continues to boot happily. When the workers appear I add them too, as a backup for when keepalived fails. Here is the configuration I used:
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
    ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

# HAProxy status page
listen stats
    bind 172.29.0.1:9090
    balance
    mode http
    stats enable
    stats auth admin:admin
    stats uri /haproxy?stats

# OKD
#
# Machine config server (ignition) traffic on 22623
frontend https-bootstrap
    mode tcp
    bind 172.29.0.1:22623
    default_backend bootstrap

# Kubernetes API on 6443
frontend https-masters
    mode tcp
    bind 172.29.0.1:6443
    default_backend masters

# Ingress / application routes on 80 and 443
frontend http-apps
    mode http
    bind 172.29.0.2:80
    default_backend http-pool

frontend https-apps
    mode tcp
    bind 172.29.0.2:443
    default_backend https-pool

backend bootstrap
    mode tcp
    balance roundrobin
    server vip1 172.29.0.3:22623 check    # Will move between nodes
    server master1 172.16.1.55:22623 check
    server master2 172.16.1.56:22623 check
    server master3 172.16.1.57:22623 check
    server boot1 172.16.1.58:22623 check  # Bootstrap node

backend masters
    mode tcp
    balance roundrobin
    server vip1 172.29.0.3:6443 check     # Will move between nodes
    server master1 172.16.1.55:6443 check
    server master2 172.16.1.56:6443 check
    server master3 172.16.1.57:6443 check
    server boot1 172.16.1.58:6443 check   # Bootstrap node

backend http-pool
    mode http
    balance leastconn
    server vip2 172.29.0.4:80 check       # Will move between nodes
    server worker1a 172.16.1.157:80 check
    server worker2a 172.16.1.156:80 check
    server worker3a 172.16.1.61:80 check
    server boot1 172.16.1.58:80 check     # Bootstrap node

backend https-pool
    mode tcp
    balance leastconn
    server vip2 172.29.0.4:443 check      # Will move between nodes
    server worker1a 172.16.1.157:443 check
    server worker2a 172.16.1.156:443 check
    server worker3a 172.16.1.61:443 check
    server boot1 172.16.1.58:443 check    # Bootstrap node
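For completeness, “point my DNS to that load balancer” means the two records the installer expects, aimed at the HAProxy bind addresses above instead of at the cluster VIPs. The cluster name and domain here are placeholders for whatever is in your install-config:

api.okd.example.com.      IN A    172.29.0.1
*.apps.okd.example.com.   IN A    172.29.0.2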
This isn’t a long-term solution, because the D in DHCP stands for dynamic and the node addresses will change over time, but for home labs, or places where you don’t control the network, this is how you can get going.
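When an address does change, or a new node shows up, the fix is just to edit the server lines in the relevant backend and reload. The stats page defined above (http://172.29.0.1:9090/haproxy?stats, admin/admin) shows at a glance which servers are passing their health checks.

# sanity-check the edited config, then pick it up with a reload
sudo haproxy -c -f /etc/haproxy/haproxy.cfg
sudo systemctl reload haproxy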