The following is a cloud-agnostic guide to installing a 3-node RKE cluster, installing the Rancher UI, and using them to run KubeCF on top for a quick, cheap development Cloud Foundry environment. Depending on the IaaS you are deploying on top of, you may need to modify some of the configurations where applicable (e.g. cloud_provider). Examples of these modifications for vSphere are included.
Machine Preparation
The first step in creating our 3-node RKE cluster is prepping the machines themselves. These machines can be bare-metal, on-prem virtual, or cloud instances; it doesn't really matter as long as they are capable of running a distribution of Linux with a supported container runtime (i.e. Docker). For the sake of this blog, we will be creating 3 Ubuntu virtual machines on vSphere, each with 2 CPU, 4GB RAM, and 100GB disk.
Once you have the VMs up and running with Ubuntu Server installed, it’s time to install Docker and the Rancher toolset.
Docker Installation
The following commands add the relevant apt repository and GPG key, and then install Docker:
$ sudo apt update
## Install GPG
$ sudo apt install apt-transport-https ca-certificates curl software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
## Add docker repo and install
$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable"
$ sudo apt update && sudo apt install docker-ce
Presuming all went smoothly, you should be able to check the status and see that Docker is now running:
$ sudo service docker status
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2020-02-13 11:14:33 EST; 1 months 16 days ago
Docs: https://docs.docker.com
Main PID: 1166 (dockerd)
Tasks: 42
Memory: 315.6M
CPU: 4d 8h 32min 36.342s
CGroup: /system.slice/docker.service
└─1166 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
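RKE drives Docker on each node over SSH, so the SSH user needs to be able to talk to the Docker daemon. Assuming the SSH user is named rke (as it is later in this guide), adding it to the docker group on each node takes care of that; log out and back in for the change to take effect:
$ sudo usermod -aG docker rke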
SSH Setup
To create and orchestrate the cluster, RKE uses SSH for access to each of the machines. In this case, we are going to create a new SSH key with ssh-keygen and add it to all of the machines with ssh-copy-id. For ease of deployment, avoid adding a passphrase.
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/rke/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/rke/.ssh/id_rsa.
Your public key has been saved in /home/rke/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:2qtjgSJg4kj/VCT2x9lbLytYhFLJTHbz4bX8bVVIy1A [email protected]
The key's randomart image is:
+---[RSA 2048]----+
| +o.o.+Eo.|
| o ..=. +o=.o|
| . + o + ooo.|
|oo + = o . +|
|B . .. S . o . +|
|o......o o . o |
| . .o ... o o |
| .o o . . |
| ..o. . |
+----[SHA256]-----+
The following can then be performed for each of the new machines; it will copy the SSH key you just generated to the other 2 nodes.
$ ssh-copy-id -i ~/.ssh/id_rsa.pub rke@<ip-addr>
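Before handing things over to RKE, it's worth confirming that key-based SSH works and that the remote user can reach Docker; a quick check against each node (placeholder IP as above):
$ ssh -i ~/.ssh/id_rsa rke@<ip-addr> docker version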
Rancher & K8s Tools Installation
Now that we have Docker installed and SSH configured, we need to install the tools used to create and manage the cluster. For this, all we need are rke, helm, and kubectl.
All three need to be downloaded, made executable, and added to a place in your PATH:
## Install rke cli
$ wget https://github.com/rancher/rke/releases/download/v1.0.6/rke_linux-amd64
$ chmod +x rke_linux-amd64
$ sudo mv rke_linux-amd64 /usr/local/bin/rke
## Install helm cli
$ wget https://get.helm.sh/helm-v3.1.2-linux-amd64.tar.gz
$ tar -xvf helm-v3.1.2-linux-amd64.tar.gz linux-amd64/helm --strip 1
$ sudo mv helm /usr/local/bin/helm
## Install kubectl cli
$ wget https://storage.googleapis.com/kubernetes-release/release/v1.18.0/bin/linux/amd64/kubectl
$ chmod +x kubectl
$ sudo mv kubectl /usr/local/bin/kubectl
A quick note here about versions:
- The installation commands and instructions in this guide use Helm v3. If you reuse a jumpbox or laptop with an older version of Helm, the instructions here will not work and will error out in awkward ways. Verify your versions of helm, rke, and kubectl (a quick check is shown below).
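A quick way to confirm what is on your PATH (flags valid for the versions used in this guide):
$ rke --version
$ helm version --short
$ kubectl version --client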
Creating the RKE Cluster
At this point, we are ready to configure and provision the new K8s cluster. While there are a lot of potential options to fiddle with, rke will walk you through them and set up sane defaults to get going quickly. For our use case, we will be enabling all three roles (Control Plane, Worker, etcd) on each of our nodes.
The rke config command will start a wizard that walks you through a series of questions with the goal of generating a cluster.yml file. If you answer any of the questions incorrectly, you can still manually edit the cluster.yml file before deploying the cluster. An example of the wizard is below:
$ rke config --name cluster.yml
[+] Cluster Level SSH Private Key Path [~/.ssh/id_rsa]:
[+] Number of Hosts [1]: 3
[+] SSH Address of host (1) [none]: 10.128.54.1
[+] SSH Port of host (1) [22]:
[+] SSH Private Key Path of host (10.128.54.1) [none]:
[-] You have entered empty SSH key path, trying fetch from SSH key parameter
[+] SSH Private Key of host (10.128.54.1) [none]:
[-] You have entered empty SSH key, defaulting to cluster level SSH key: ~/.ssh/id_rsa
[+] SSH User of host (10.128.54.1) [ubuntu]: rke
[+] Is host (10.128.54.1) a Control Plane host (y/n)? [y]: y
[+] Is host (10.128.54.1) a Worker host (y/n)? [n]: y
[+] Is host (10.128.54.1) an etcd host (y/n)? [n]: y
[+] Override Hostname of host (10.128.54.1) [none]: rke1
[+] Internal IP of host (10.128.54.1) [none]:
[+] Docker socket path on host (10.128.54.1) [/var/run/docker.sock]:
[+] SSH Address of host (2) [none]:
...
[+] Network Plugin Type (flannel, calico, weave, canal) [canal]:
[+] Authentication Strategy [x509]:
[+] Authorization Mode (rbac, none) [rbac]:
[+] Kubernetes Docker image [rancher/hyperkube:v1.17.2-rancher1]:
[+] Cluster domain [cluster.local]:
[+] Service Cluster IP Range [10.43.0.0/16]:
[+] Enable PodSecurityPolicy [n]:
[+] Cluster Network CIDR [10.42.0.0/16]:
[+] Cluster DNS Service IP [10.43.0.10]:
[+] Add addon manifest URLs or YAML files [no]:
By running the interactive command above and answering the questions regarding our machines, network config, and K8s options, rke has generated a lengthy cluster.yml file that is the main source of truth for the deployment.
You may want to modify the cluster.yml file before deployment to add cloud_provider options for your underlying IaaS, or to change any of the answers you gave in the previous step. An example cloud_provider config for vSphere is shown below – we will do a deeper dive on the vSphere cloud provider in another post if you run into issues or have questions there. For other IaaSs, please refer to the Rancher documentation here.
cloud_provider:
  name: vsphere
  vsphereCloudProvider:
    global:
      insecure-flag: false
    virtual_center:
      vsphere.lab.example.com:
        user: "vsphere-user"
        password: "vsphere-password"
        port: 443
        datacenters: /Lab-Datacenter
    workspace:
      server: vsphere.lab.example.com
      folder: /Lab-Datacenter/vm/k8s-demo-lab/vms
      default-datastore: /Lab-Datacenter/datastore/Datastore-1
      datacenter: /Lab-Datacenter
In addition to adding the cloud_provider section above for your specific IaaS, you should also add the section below under the services key so that it looks like the following. This allows the cluster to sign certificate requests, which the KubeCF deployment requires for our dev environment.
services:
  kube-controller:
    extra_args:
      cluster-signing-cert-file: /etc/kubernetes/ssl/kube-ca.pem
      cluster-signing-key-file: /etc/kubernetes/ssl/kube-ca-key.pem
Now that we have our cluster.yml prepared and ready to deploy, it can be rolled out using rke up:
$ rke up --config cluster.yml
INFO[0000] Running RKE version: v1.0.4
INFO[0000] Initiating Kubernetes cluster
INFO[0000] [certificates] Generating admin certificates and kubeconfig
INFO[0000] Successfully Deployed state file at [./cluster.rkestate]
INFO[0000] Building Kubernetes cluster
INFO[0000] [dialer] Setup tunnel for host [10.128.54.0]
INFO[0002] [network] No hosts added existing cluster, skipping port check
INFO[0002] [certificates] Deploying kubernetes certificates to Cluster nodes
INFO[0002] Checking if container [cert-deployer] is running on host [10.128.54.0], try #1
INFO[0003] Image [rancher/rke-tools:v0.1.52] exists on host [10.128.54.0]
INFO[0010] Starting container [cert-deployer] on host [10.128.54.0], try #1
INFO[0025] Checking if container [cert-deployer] is running on host [10.128.54.0], try #1
INFO[0031] Checking if container [cert-deployer] is running on host [10.128.54.0], try #1
INFO[0031] Removing container [cert-deployer] on host [10.128.54.0], try #1
INFO[0031] [reconcile] Rebuilding and updating local kube config
INFO[0031] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
INFO[0031] [reconcile] host [10.128.54.0] is active master on the cluster
INFO[0031] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0031] [reconcile] Reconciling cluster state
INFO[0031] [reconcile] Check etcd hosts to be deleted
INFO[0031] [reconcile] Check etcd hosts to be added
INFO[0031] [reconcile] Rebuilding and updating local kube config
INFO[0031] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
INFO[0031] [reconcile] host [10.128.54.0] is active master on the cluster
INFO[0031] [reconcile] Reconciled cluster state successfully
INFO[0031] Pre-pulling kubernetes images
...
INFO[0038] [addons] Saving ConfigMap for addon rke-ingress-controller to Kubernetes
INFO[0038] [addons] Successfully saved ConfigMap for addon rke-ingress-controller to Kubernetes
INFO[0038] [addons] Executing deploy job rke-ingress-controller
INFO[0038] [ingress] ingress controller nginx deployed successfully
INFO[0038] [addons] Setting up user addons
INFO[0038] [addons] no user addons defined
INFO[0038] Finished building Kubernetes cluster successfully
At this point we should have a cluster up and running, and a few new files will have been generated – cluster.rkestate and kube_config_cluster.yml. In order to perform future updates against the cluster you need to preserve the cluster.rkestate file, otherwise rke won't be able to properly interact with the cluster.
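A simple way to do that is to copy both generated files somewhere that gets backed up (the destination directory here is just an example):
$ mkdir -p ~/rke-backups
$ cp cluster.rkestate kube_config_cluster.yml ~/rke-backups/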
We can run some basic commands to ensure that the new cluster is up and running and then move on to installing the Rancher UI:
$ export KUBECONFIG=$(pwd)/kube_config_cluster.yml
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
rancher Ready controlplane,etcd,worker 21d v1.17.2
$ kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
canal-p7jgr 2/2 Running 2 21d
coredns-7c5566588d-hrhtr 1/1 Running 1 21d
coredns-autoscaler-65bfc8d47d-fz285 1/1 Running 1 21d
metrics-server-6b55c64f86-mq99l 1/1 Running 1 21d
rke-coredns-addon-deploy-job-7vgcd 0/1 Completed 0 21d
rke-ingress-controller-deploy-job-97tln 0/1 Completed 0 21d
rke-metrics-addon-deploy-job-lk4qk 0/1 Completed 0 21d
rke-network-plugin-deploy-job-vlhvq 0/1 Completed 0 21d
Assuming everything looks similar you should be ready to proceed.
Installing the Rancher UI
One prerequisite to installing the Rancher UI is cert-manager, presuming you are not bringing your own certs or using Let's Encrypt. Thankfully, the installation is just one command:
$ kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.14.1/cert-manager.yaml
And to check that it is working, make sure all the pods come up ok:
$ kubectl get pods --namespace cert-manager
NAME READY STATUS RESTARTS AGE
cert-manager-64b6c865d9-kss6c 1/1 Running 0 21d
cert-manager-cainjector-bfcf448b8-q98q6 1/1 Running 0 21d
cert-manager-webhook-7f5bf9cbdf-d66k8 1/1 Running 0 21d
Now we can install Rancher via helm. The command below assumes the rancher-latest chart repository has been added and the cattle-system namespace exists; if not, they can be set up first (a quick sketch, with the repository URL from the Rancher docs):
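$ helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
$ helm repo update
$ kubectl create namespace cattle-system
With those in place, install Rancher: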
$ helm install rancher rancher-latest/rancher \
--namespace cattle-system \
--set hostname=rancher.lab.example.com
And wait for the deployment to roll out:
$ kubectl -n cattle-system rollout status deploy/rancher
Waiting for deployment "rancher" rollout to finish: 0 of 3 updated replicas are available...
deployment "rancher" successfully rolled out
Presuming all went according to plan (and you configured DNS accordingly to point at your nodes) – the Rancher UI should now be available at the domain you configured.
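If you just want to test quickly without real DNS, pointing the Rancher hostname at one of the node IPs in your hosts file is enough (hostname and IP here are the example values used earlier):
$ echo "10.128.54.1 rancher.lab.example.com" | sudo tee -a /etc/hosts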
After setting up the admin account, you should be able to sign in and view your new cluster.
Installing KubeCF
KubeCF is currently deployed in two parts – the cf-operator and the kubecf deployment, which leverages the cf-operator to turn a traditional manifest into K8s-native spec.
Another quick note about versions:
- The versions of the cf-operator and KubeCF are very important. As of this writing there is not a matrix of operator compatibility, but the examples provided below have been tested to work. The release notes for KubeCF reference the version of the cf-operator that particular version was tested with. For example, the KubeCF release we are deploying can be found here and lists cf-operator v3.3.0 under its dependencies.
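If you later want to confirm which chart versions actually got installed, helm can list the releases per namespace (namespace names match the ones used below):
$ helm list -n cf-operator
$ helm list -n kubecf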
So let's start by deploying the cf-operator via helm:
$ kubectl create namespace cf-operator
$ helm install cf-operator \
--namespace cf-operator \
--set "global.operator.watchNamespace=kubecf" \
https://s3.amazonaws.com/cf-operators/release/helm-charts/cf-operator-3.3.0%2B0.gf32b521e.tgz
After deploying, there should be two pods created in the cf-operator namespace. We should check to make sure they are both up and ready (STATUS=Running) before deploying KubeCF:
$ kubectl -n cf-operator get pods
NAME READY STATUS RESTARTS AGE
cf-operator-69848766f6-lw82r 1/1 Running 0 29s
cf-operator-quarks-job-5bb6fc7bd6-qlg8l 1/1 Running 0 29s
Now it's time to deploy KubeCF. For this environment we are going to deploy with the defaults, with the exception of using Eirini for application workloads. For more information regarding the different deployment options and features of KubeCF, check out our previous blog here.
$ helm install kubecf \
--namespace kubecf \
--set system_domain=system.kubecf.example.com \
--set features.eirini.enabled=true \
https://github.com/cloudfoundry-incubator/kubecf/releases/download/v1.0.1/kubecf-v1.0.1.tgz
After running the helm deploy, it'll take a few minutes to start spinning up CF pods in the kubecf namespace. We can then watch the pods come up and wait for them all to reach a ready status – on average this takes between 20 and 45 minutes, depending on the options you selected and the specs of the cluster you are deploying to. You may see some of the pods fail and restart a few times as the cluster comes up, since they are waiting for different dependencies to become available.
$ watch kubectl get po -n kubecf
NAME READY STATUS RESTARTS AGE
ig-kubecf-51d0cf09745042ad-l7xnb 0/20 Init:4/37 0 3m11s
kubecf-database-0 2/2 Running 0 3m24s
While this is deploying, check out the IP address associated with the kubecf-router-public load balancer and add a wildcard DNS record for the system_domain you specified above, as well as any additional application domains:
$ kubectl get svc -n kubecf | grep -i load
kubecf-cc-uploader ClusterIP 10.43.196.54 <none> 9090/TCP,9091/TCP
kubecf-router-public LoadBalancer 10.43.212.247 10.128.54.241 80:32019/TCP,443:32255/TCP
kubecf-ssh-proxy-public LoadBalancer 10.43.174.207 10.128.54.240 2222:31768/TCP
kubecf-tcp-router-public LoadBalancer 10.43.167.176 10.128.54.242 80:30897/TCP,20000:30896/TCP,20001
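Using the example output above, the wildcard record would look roughly like this (illustrative values only):
*.system.kubecf.example.com -> 10.128.54.241 (kubecf-router-public, HTTP/S traffic)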
The final deployed state should look like the following:
$ kubectl get po -n kubecf
NAME READY STATUS RESTARTS AGE
kubecf-adapter-0 4/4 Running 0 24m
kubecf-api-0 15/15 Running 1 24m
kubecf-bits-0 6/6 Running 0 23m
kubecf-bosh-dns-59cd464989-bh2dp 1/1 Running 0 24m
kubecf-bosh-dns-59cd464989-mgw7z 1/1 Running 0 24m
kubecf-cc-worker-0 4/4 Running 0 23m
kubecf-credhub-0 5/6 Running 0 24m
kubecf-database-0 2/2 Running 0 36m
kubecf-diego-api-0 6/6 Running 2 24m
kubecf-doppler-0 9/9 Running 0 24m
kubecf-eirini-0 9/9 Running 0 23m
kubecf-log-api-0 7/7 Running 0 23m
kubecf-nats-0 4/4 Running 0 24m
kubecf-router-0 5/5 Running 0 23m
kubecf-routing-api-0 4/4 Running 0 23m
kubecf-scheduler-0 8/8 Running 0 23m
kubecf-singleton-blobstore-0 6/6 Running 0 24m
kubecf-tcp-router-0 5/5 Running 0 24m
kubecf-uaa-0 7/7 Running 6 24m
Logging Into CF
Assuming you have the CF CLI already installed (see this if not), you can target and authenticate to the Cloud Foundry deployment as seen below, remembering to update the system_domain to the one you deployed with:
$ cf api --skip-ssl-validation "https://api.<system_domain>"
$ admin_pass=$(kubectl get secret \
--namespace kubecf kubecf.var-cf-admin-password \
-o jsonpath='{.data.password}' \
| base64 --decode)
$ cf auth admin "${admin_pass}"
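As a quick sanity check that the login worked, list the orgs; a freshly deployed foundation will include at least the system org:
$ cf orgs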
Pushing a Test Application
Now that our new foundation is up and running, it's time to test it by adding a space and pushing an application. Let's start by creating the system space within the system org.
$ cf target -o system
$ cf create-space system
$ cf target -s system
The app we will be deploying is called cf-env, a simple application used for debugging and testing; it displays its running environment and HTTP request headers.
To deploy it, clone the repo and push it to the new foundation:
$ git clone git@github.com:cloudfoundry-community/cf-env.git
$ cd cf-env
$ cf push -n test
The first deployment usually takes a couple of minutes to stage and start running, but once the app comes up you should be able to visit http://test.<system_domain> and see the app report its environment and request headers.
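You can also hit the route from the command line (hostname as pushed above with -n test):
$ curl http://test.<system_domain>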
Running Smoke Tests
KubeCF uses cf-deployment under the hood as the blueprint for deploying Cloud Foundry. Inside of cf-deployment you can run "smoke tests", a non-destructive validation that your Cloud Foundry deployment is in a happy state.
To run the smoke tests at any time, use a simple kubectl patch command to trigger them:
$ kubectl patch qjob kubecf-smoke-tests --namespace kubecf --type merge --patch '{ "spec": { "trigger": { "strategy": "now" } } }'
In v4 of the cf-operator, replace kubecf-smoke-tests with smoke-tests.
This will create a new job and pod, each prefixed with kubecf-smoke-tests-*. A few containers will spin up in the pod; if you tail the logs on the smoke-tests-smoke-tests container you will see the test output:
$ kubectl logs kubecf-smoke-tests-4078f266ae3dff68-rdhz4 -c smoke-tests-smoke-tests -n kubecf -f
Running smoke tests...
Running binaries smoke/isolation_segments/isolation_segments.test
smoke/logging/logging.test
smoke/runtime/runtime.test
[1585940920] CF-Isolation-Segment-Smoke-Tests - 4 specs - 4 nodes SSSS SUCCESS! 29.974196268s
[1585940920] CF-Logging-Smoke-Tests - 2 specs - 4 nodes S• SUCCESS! 1m56.090729823s
[1585940920] CF-Runtime-Smoke-Tests - 2 specs - 4 nodes S• SUCCESS! 2m37.907767486s
Ginkgo ran 3 suites in 5m4.100902481s
Test Suite Passed
Adding KubeCF Namespaces to a Rancher Project
Now that the foundation is happily running, it’s time to add it to a Rancher project for ease of visibility and management. Rancher projects allow you to group a collection of namespaces together within the Rancher UI and also allows for setting of quotas and sharing of secrets across all the underlying namespaces.
From the cluster dashboard, click on Projects/Namespaces.
As you can see from the Projects/Namespaces screen, the three KubeCF namespaces (kubecf, kubecf-eirini, and cf-operator) do not currently belong to a project. Let's fix that, starting by selecting Add Project.
For this deployment we are just going to fill in a name, leave all of the other options as default, and click Create.
Then, from the Projects/Namespaces screen, we are going to select the three KubeCF namespaces and click Move.
Select the new project you just created and confirm by selecting Move. At this point, the namespaces are added to your new project and their resources can be easily accessed from the UI.
At this point, your new foundation on top of RKE is ready to roll.