Deploying HA Cluster-API-Provider-vSphere Kubernetes Clusters with CAPI version v1alpha2
In this post I’ll cover standing up a v1alpha2 cluster-api-provider-vsphere based HA cluster on vSphere 6.7U3. There are a lot of posts out there about running Kubernetes on vSphere leveraging CAPV, but I was unable to find anything that represents a more robust HA control-plane deployment. Like this:
NOTE: This guide is for v1alpha2 clusters. Work is currently being done in the cluster-api-provider-vsphere repository to enable v1alpha3 features, and the master branch of the repository can be broken at times. I recommend using the 0.5.4 release, which maps to v1alpha2, for stability’s sake.
Cluster API Provider vSphere basics
For those unfamiliar with cluster-api-provider-vsphere, I recommend checking out the getting started guide from the cluster-api-provider-vsphere repository. For a higher-level overview of CAPV, review this blog by Chris Milstead and this blog by Eric Shanks.
As mentioned in Chris’ blog, an implementation of an initial Kubernetes control-plane node for vSphere requires KubeadmConfig, Machine, and VSphereMachine objects to be created in a cluster-api management cluster. A framework for creating these objects is provided by the quickstart documentation. Nearly all posts stop here and move on to worker node MachineDeployment configuration. In this post, we’ll cover what else is needed to deploy an HA control-plane.
What’s needed to complete the HA setup?
To configure an HA control-plane, we’ll add two additional control plane nodes and a load balancer. Here is a list of the changes needed:
- Two additional control-plane nodes.
- A load balancer in front of the Kubernetes API server on each of the three control-plane nodes.
- Static IPs for your control-plane nodes.
- KubeadmConfigs for the joining control-plane nodes.
Load balancing
Kubernetes doesn’t care about the load balancer in front of the API server, as long as it forwards TCP traffic to port 6443 (the port can be altered to taste). In my lab, I’m using Nginx to provide load balancing from a static IP address to static-IP-based control-plane nodes. The setup looks like this:
My nginx.conf looks like this:
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

stream {
    upstream kube_api {
        server 192.168.10.31:6443;
        server 192.168.10.32:6443;
        server 192.168.10.33:6443;
    }

    server {
        listen 6443;
        proxy_pass kube_api;
    }
}
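Once nginx has been reloaded with this configuration, it’s worth confirming the listener is up before pointing kubeadm at it. Here’s a hedged sketch using bash’s built-in /dev/tcp (the IPs are the ones from my lab; swap in your own):

```shell
# Probe the load balancer and each backend on TCP 6443;
# prints "open" or "closed" per address.
for ip in 192.168.10.18 192.168.10.31 192.168.10.32 192.168.10.33; do
  if timeout 2 bash -c "exec 3<>/dev/tcp/$ip/6443" 2>/dev/null; then
    echo "$ip:6443 open"
  else
    echo "$ip:6443 closed"
  fi
done
```

The backends will report closed until the nodes are provisioned; the load balancer’s 6443 should report open as soon as nginx is running.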
To enable use of this load balancer, edits are needed to the controlplane.yaml created by following the instructions here. Edit the KubeadmConfig in controlplane.yaml:
- Add apiServerCertSANS to the clusterConfiguration spec
- Add a controlPlaneEndpoint to the clusterConfiguration spec
Both should leverage the IP and port of the load balancer fronting the control-plane. An example of this full configuration from my lab is below:
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
kind: KubeadmConfig
metadata:
  name: capv-5-controlplane-0
  namespace: default
spec:
  clusterConfiguration:
    apiServer:
      extraArgs:
        cloud-provider: external
    apiServerCertSANS: # Add the apiServerCertSANS
    - 192.168.10.18 # Specify an IP address
    controllerManager:
      extraArgs:
        cloud-provider: external
    controlPlaneEndpoint: 192.168.10.18:6443 # Load balancer IP address goes here
    imageRepository: k8s.gcr.io
  initConfiguration:
    nodeRegistration:
      criSocket: /var/run/containerd/containerd.sock
      kubeletExtraArgs:
        cloud-provider: external
      name: '{{ ds.meta_data.hostname }}'
  preKubeadmCommands:
  - hostname "{{ ds.meta_data.hostname }}"
  - echo "::1 ipv6-localhost ipv6-loopback" >/etc/hosts
  - echo "127.0.0.1 localhost {{ ds.meta_data.hostname }}" >>/etc/hosts
  - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
  users:
  - name: capv
    sshAuthorizedKeys:
    - "The public side of an SSH key pair."
    sudo: ALL=(ALL) NOPASSWD:ALL
Control-Plane Node Static IP Assignment
I’m using Ubuntu images in my lab, as I found some open GitHub issues referencing static IP configuration problems in CentOS related to the upstream image-builder tooling for Kubernetes. If you’re new to the Kubernetes-on-vSphere space and can use Ubuntu, my recommendation would be to do so for now.
To configure static IPs for control-plane nodes, edit the related VSphereMachine network spec, setting dhcp4 to false and adding gateway4, ipAddrs, nameservers, and searchDomains. Here’s how this section of my controlplane.yaml ends up looking:
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: VSphereMachine
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: capv-5
    cluster.x-k8s.io/control-plane: "true"
  name: capv-5-controlplane-0
  namespace: default
spec:
  datacenter: lab
  diskGiB: 50
  memoryMiB: 4096
  network:
    devices:
    - dhcp4: false # Update this to false
      dhcp6: false
      gateway4: 192.168.10.1 # Add a gateway
      ipAddrs:
      - 192.168.10.31/24 # Specify an IP address
      nameservers: # Add nameservers
      - 192.168.10.50 # nameserver IP
      - 192.168.10.51 # nameserver IP
      networkName: management
      searchDomains: # Add searchDomains
      - timcarr.net # provide a search domain
  numCPUs: 4
  template: ubuntu-1804-kube-v1.16.3
Your controlplane.yaml file should now include configurations for Machine, VSphereMachine, and the updated KubeadmConfig. These three configurations define how the single controlplane-0 VM is configured in vSphere (Machine and VSphereMachine) and bootstrapped in Kubernetes (KubeadmConfig), leveraging our external load balancer. Applying this configuration to your CAPV-enabled management cluster should provision that single node.
Adding Additional Control Plane Nodes
For each of the two additional control-plane nodes, Machine, VSphereMachine, and KubeadmConfig objects are needed, with the KubeadmConfig specifying a joinConfiguration.
Machine settings
Each control-plane Machine configuration needs to contain the label cluster.x-k8s.io/control-plane: "true", which informs cluster-api that the machine specified is part of the referenced cluster’s control-plane. You’ll also note that the Machine object references bootstrap (KubeadmConfig) and infrastructure (VSphereMachine) objects. Here’s an example from my controlplane.yaml file, including the needed label:
---
apiVersion: cluster.x-k8s.io/v1alpha2
kind: Machine
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: capv-5
    cluster.x-k8s.io/control-plane: "true" # Add control-plane line here
  name: capv-5-controlplane-1 # Note that we're creating a new machine
  namespace: default
spec:
  bootstrap:
    configRef:
      apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
      kind: KubeadmConfig
      name: capv-5-controlplane-1
      namespace: default
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
    kind: VSphereMachine
    name: capv-5-controlplane-1
    namespace: default
  version: 1.16.3
VSphereMachine settings
Each VSphereMachine configuration requires the same cluster.x-k8s.io/control-plane: "true" label, as well as changes to specify a static IP. Here’s an example from my controlplane.yaml file:
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: VSphereMachine
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: capv-5
    cluster.x-k8s.io/control-plane: "true" # Add control-plane line here
  name: capv-5-controlplane-1
  namespace: default
spec:
  datacenter: lab
  diskGiB: 50
  memoryMiB: 2048
  network:
    devices:
    - dhcp4: false # Update this to false
      dhcp6: false
      gateway4: 192.168.10.1 # Add a gateway
      ipAddrs:
      - 192.168.10.32/24 # Specify an IP address
      nameservers: # Add nameservers
      - 192.168.10.50 # nameserver IP
      - 192.168.10.51 # nameserver IP
      networkName: management
      searchDomains: # Add searchDomains
      - timcarr.net # provide a search domain
  numCPUs: 2
  template: ubuntu-1804-kube-v1.16.3
KubeadmConfig settings
The KubeadmConfig here differs from the initial configuration for controlplane-0: it simply tells subsequent nodes that they should join the cluster using the included joinConfiguration. Here is an example of my configuration.
---
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
kind: KubeadmConfig
metadata:
  name: capv-5-controlplane-1
  namespace: default
spec:
  joinConfiguration: # Note the joinConfiguration here
    nodeRegistration:
      criSocket: /var/run/containerd/containerd.sock
      kubeletExtraArgs:
        cloud-provider: external
      name: '{{ ds.meta_data.hostname }}'
  preKubeadmCommands:
  - hostname "{{ ds.meta_data.hostname }}"
  - echo "::1 ipv6-localhost ipv6-loopback" >/etc/hosts
  - echo "127.0.0.1 localhost {{ ds.meta_data.hostname }}" >>/etc/hosts
  - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
  users:
  - name: capv
    sshAuthorizedKeys:
    - "The public side of an SSH key pair."
    sudo: ALL=(ALL) NOPASSWD:ALL
All of these configuration settings are great, but how do I find them?
Actually, this is somewhat of a complicated process at the moment. I’ll help by pointing out the v1alpha2 types available within the code on GitHub. Opening the cluster-api-provider-vsphere GitHub repository and clicking through the api and then v1alpha2 folders reveals the following types.go file:
Within that file, we can review all of the configuration options available for our virtual machine’s networking configuration:
Wrapping it up.
At this point, you should be able to copy the same settings above, modifying the object names and IP addresses in the configuration files, to provide a third node.
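If you saved the controlplane-1 objects to their own file, that duplication can be sketched with sed. The file names and the .33 address below are assumptions based on my lab layout, not part of any CAPV tooling:

```shell
# Hypothetical sketch: derive the controlplane-2 manifests from the
# controlplane-1 manifests by swapping the name suffix and the static IP.
sed -e 's/controlplane-1/controlplane-2/g' \
    -e 's/192\.168\.10\.32/192.168.10.33/g' \
    controlplane-1.yaml > controlplane-2.yaml
```

Review the result before applying it; a blind substitution will also rewrite any unrelated string that happens to match.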
Here’s a final version of my controlplane.yaml with all of these objects defined, omitting my public SSH key:
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
kind: KubeadmConfig
metadata:
  name: capv-5-controlplane-0
  namespace: default
spec:
  clusterConfiguration:
    apiServer:
      extraArgs:
        cloud-provider: external
    apiServerCertSANS:
    - 192.168.10.18
    controllerManager:
      extraArgs:
        cloud-provider: external
    controlPlaneEndpoint: 192.168.10.18:6443
    imageRepository: k8s.gcr.io
  initConfiguration:
    nodeRegistration:
      criSocket: /var/run/containerd/containerd.sock
      kubeletExtraArgs:
        cloud-provider: external
      name: '{{ ds.meta_data.hostname }}'
  preKubeadmCommands:
  - hostname "{{ ds.meta_data.hostname }}"
  - echo "::1 ipv6-localhost ipv6-loopback" >/etc/hosts
  - echo "127.0.0.1 localhost {{ ds.meta_data.hostname }}" >>/etc/hosts
  - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
  users:
  - name: capv
    sshAuthorizedKeys:
    - "The public side of an SSH key pair."
    sudo: ALL=(ALL) NOPASSWD:ALL
---
apiVersion: cluster.x-k8s.io/v1alpha2
kind: Machine
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: capv-5
    cluster.x-k8s.io/control-plane: "true"
  name: capv-5-controlplane-0
  namespace: default
spec:
  bootstrap:
    configRef:
      apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
      kind: KubeadmConfig
      name: capv-5-controlplane-0
      namespace: default
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
    kind: VSphereMachine
    name: capv-5-controlplane-0
    namespace: default
  version: 1.16.3
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: VSphereMachine
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: capv-5
    cluster.x-k8s.io/control-plane: "true"
  name: capv-5-controlplane-0
  namespace: default
spec:
  datacenter: lab
  diskGiB: 50
  memoryMiB: 2048
  network:
    devices:
    - dhcp4: false
      dhcp6: false
      gateway4: 192.168.10.1
      ipAddrs:
      - 192.168.10.31/24
      nameservers:
      - 192.168.10.50
      - 192.168.10.51
      networkName: management
      searchDomains:
      - timcarr.net
  numCPUs: 2
  template: ubuntu-1804-kube-v1.16.3
---
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
kind: KubeadmConfig
metadata:
  name: capv-5-controlplane-1
  namespace: default
spec:
  joinConfiguration:
    nodeRegistration:
      criSocket: /var/run/containerd/containerd.sock
      kubeletExtraArgs:
        cloud-provider: external
      name: '{{ ds.meta_data.hostname }}'
  preKubeadmCommands:
  - hostname "{{ ds.meta_data.hostname }}"
  - echo "::1 ipv6-localhost ipv6-loopback" >/etc/hosts
  - echo "127.0.0.1 localhost {{ ds.meta_data.hostname }}" >>/etc/hosts
  - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
  users:
  - name: capv
    sshAuthorizedKeys:
    - "The public side of an SSH key pair."
    sudo: ALL=(ALL) NOPASSWD:ALL
---
apiVersion: cluster.x-k8s.io/v1alpha2
kind: Machine
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: capv-5
    cluster.x-k8s.io/control-plane: "true"
  name: capv-5-controlplane-1
  namespace: default
spec:
  bootstrap:
    configRef:
      apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
      kind: KubeadmConfig
      name: capv-5-controlplane-1
      namespace: default
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
    kind: VSphereMachine
    name: capv-5-controlplane-1
    namespace: default
  version: 1.16.3
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: VSphereMachine
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: capv-5
    cluster.x-k8s.io/control-plane: "true"
  name: capv-5-controlplane-1
  namespace: default
spec:
  datacenter: lab
  diskGiB: 50
  memoryMiB: 2048
  network:
    devices:
    - dhcp4: false
      dhcp6: false
      gateway4: 192.168.10.1
      ipAddrs:
      - 192.168.10.32/24
      nameservers:
      - 192.168.10.50
      - 192.168.10.51
      networkName: management
      searchDomains:
      - timcarr.net
  numCPUs: 2
  template: ubuntu-1804-kube-v1.16.3
---
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
kind: KubeadmConfig
metadata:
  name: capv-5-controlplane-2
  namespace: default
spec:
  joinConfiguration:
    nodeRegistration:
      criSocket: /var/run/containerd/containerd.sock
      kubeletExtraArgs:
        cloud-provider: external
      name: '{{ ds.meta_data.hostname }}'
  preKubeadmCommands:
  - hostname "{{ ds.meta_data.hostname }}"
  - echo "::1 ipv6-localhost ipv6-loopback" >/etc/hosts
  - echo "127.0.0.1 localhost {{ ds.meta_data.hostname }}" >>/etc/hosts
  - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
  users:
  - name: capv
    sshAuthorizedKeys:
    - "The public side of an SSH key pair."
    sudo: ALL=(ALL) NOPASSWD:ALL
---
apiVersion: cluster.x-k8s.io/v1alpha2
kind: Machine
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: capv-5
    cluster.x-k8s.io/control-plane: "true"
  name: capv-5-controlplane-2
  namespace: default
spec:
  bootstrap:
    configRef:
      apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
      kind: KubeadmConfig
      name: capv-5-controlplane-2
      namespace: default
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
    kind: VSphereMachine
    name: capv-5-controlplane-2
    namespace: default
  version: 1.16.3
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: VSphereMachine
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: capv-5
    cluster.x-k8s.io/control-plane: "true"
  name: capv-5-controlplane-2
  namespace: default
spec:
  datacenter: lab
  diskGiB: 50
  memoryMiB: 2048
  network:
    devices:
    - dhcp4: false
      dhcp6: false
      gateway4: 192.168.10.1
      ipAddrs:
      - 192.168.10.33/24
      nameservers:
      - 192.168.10.50
      - 192.168.10.51
      networkName: management
      searchDomains:
      - timcarr.net
  numCPUs: 2
  template: ubuntu-1804-kube-v1.16.3
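As a quick sanity check before applying, you can count the objects in the file; this HA control plane is nine objects, three per node. The sketch below assumes the file above is saved as controlplane.yaml:

```shell
# Expect 9 "kind:" lines: 3 nodes x (KubeadmConfig, Machine, VSphereMachine).
grep -c '^kind:' controlplane.yaml
```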
In the end your vSphere environment should look like this:
One final note: I’ve set up my lab to ensure that my management VM has a static IP - you should know how to do that now! Hope this helps everyone looking to get started with Cluster-API-Provider-vSphere.