I am a big fan of OpenShift and its open-source upstream, OKD. Kubernetes is a bunch of building blocks and APIs that can be messy to assemble and configure, and Red Hat has done a great job packaging it all up.
Installing on most IaaS platforms is great, except on vSphere, where UPI (User Provisioned Infrastructure) is quite complex and involved, and IPI (Installer Provisioned Infrastructure) is almost perfect save for one major issue!
When you install via IPI on vSphere, OKD/OpenShift boots VMs with DHCP: first a bootstrap node, which claims the pair of virtual IPs (VIPs) you configure in the installer, then the three master nodes. Those VIPs are handled by a pair of keepalived-managed load balancers that follow the nodes as they boot. Unfortunately, keepalived only checks whether its process is running, not whether the load balancer behind it is actually passing traffic, so it's common to see installations fail with timeouts when the keepalived load balancer gets stuck, for example while the VIP is moving from the bootstrap node to another node. Its traffic just stalls.
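For context, those two VIPs are whatever you put in the installer's install-config.yaml. A minimal sketch of the relevant section, trimmed to the parts that matter here, with a hypothetical cluster name and base domain and using the same VIP addresses that appear in the HAProxy config further down (depending on your OKD/OpenShift version the fields are apiVIP/ingressVIP or the newer list form apiVIPs/ingressVIPs):

    apiVersion: v1
    baseDomain: lab.example.com        # hypothetical base domain
    metadata:
      name: okd                        # hypothetical cluster name
    platform:
      vsphere:
        # ...vCenter connection details omitted...
        apiVIP: 172.29.0.3             # VIP keepalived claims for the API (6443/22623)
        ingressVIP: 172.29.0.4         # VIP keepalived claims for Ingress (*.apps, 80/443)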
The solution I used in my home lab was to set up an HAProxy load balancer on a tiny Debian box and point my DNS at it. The load balancer starts with just the two VIP addresses as backends, and I add the master nodes as they boot and come online, so if the keepalived balancer fails, everything continues to boot happily. When the workers appear, I add them too as a backup in case keepalived fails.
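The DNS side is just the cluster records pointed at the HAProxy frontends instead of at the VIPs themselves. A sketch for a hypothetical cluster okd.lab.example.com, using the frontend addresses from the config below:

    ; API traffic goes to the HAProxy frontend listening on 6443/22623
    api.okd.lab.example.com.      IN A 172.29.0.1
    ; application (Ingress) traffic goes to the HAProxy frontend listening on 80/443
    *.apps.okd.lab.example.com.   IN A 172.29.0.2

With DNS in place, here is the full haproxy.cfg from the Debian box: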
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
    ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

listen stats
    bind 172.29.0.1:9090
    balance
    mode http
    stats enable
    stats auth admin:admin
    stats uri /haproxy?stats

# OKD #

frontend https-bootstrap
    mode tcp
    bind 172.29.0.1:22623
    default_backend bootstrap

frontend https-masters
    mode tcp
    bind 172.29.0.1:6443
    default_backend masters

frontend http-apps
    mode http
    bind 172.29.0.2:80
    default_backend http-pool

frontend https-apps
    mode tcp
    bind 172.29.0.2:443
    default_backend https-pool

backend bootstrap
    mode tcp
    balance roundrobin
    server vip1 172.29.0.3:22623 check      # Will move between nodes
    server master1 172.16.1.55:22623 check
    server master2 172.16.1.56:22623 check
    server master3 172.16.1.57:22623 check
    server boot1 172.16.1.58:22623 check    # Bootstrap node

backend masters
    mode tcp
    balance roundrobin
    server vip1 172.29.0.3:6443 check       # Will move between nodes
    server master1 172.16.1.55:6443 check
    server master2 172.16.1.56:6443 check
    server master3 172.16.1.57:6443 check
    server boot1 172.16.1.58:6443 check     # Bootstrap node

backend http-pool
    mode http
    balance leastconn
    server vip2 172.29.0.4:80 check         # Will move between nodes
    server worker1a 172.16.1.157:80 check
    server worker2a 172.16.1.156:80 check
    server worker3a 172.16.1.61:80 check
    server boot1 172.16.1.58:80 check       # Bootstrap node

backend https-pool
    mode tcp
    balance leastconn
    server vip2 172.29.0.4:443 check        # Will move between nodes
    server worker1a 172.16.1.157:443 check
    server worker2a 172.16.1.156:443 check
    server worker3a 172.16.1.61:443 check
    server boot1 172.16.1.58:443 check      # Bootstrap node
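The stats listener in that config (http://172.29.0.1:9090/haproxy?stats, admin/admin) is handy for watching the backends flip from down to up as each node boots. Once the bootstrap phase completes and the bootstrap VM is deleted, you can comment out the boot1 server lines and reload HAProxy so the health checks stop probing a machine that no longer exists.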
This isn't a long-term solution, because the D in DHCP stands for dynamic and the node addresses will change over time, but for home labs, or places where you don't control the network, this is how you can get going.