Migrating a TKG cluster control-plane endpoint from kube-vip to NSX-ALB

With the introduction of TKG 1.4, you can now use NSX Advanced Load Balancer (NSX ALB) to supply the control plane endpoint VIP instead of kube-vip. This is a great advancement, but it's not obvious how to take advantage of this feature for upgraded management and workload clusters. In this post, I'll walk through the process of migrating from kube-vip to NSX ALB for your control plane endpoint VIPs.

For reference, you can read more about kube-vip at https://kube-vip.io/.

kube-vip runs as a static pod on each control plane node, defined by the manifest at /etc/kubernetes/manifests/kube-vip.yaml.
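
To get a quick look at these pods and which node each one runs on, a simple check like the following works (the pod names will differ in your environment):

kubectl -n kube-system get po -o wide | grep kube-vip

Inspecting one of these pods shows how kube-vip is configured: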

kubectl -n kube-system get po kube-vip-tkg-mgmt-control-plane-9p6wr -o yaml

apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/config.hash: 07e79284b8211d618d8dbeb6f0e31b06
    kubernetes.io/config.mirror: 07e79284b8211d618d8dbeb6f0e31b06
    kubernetes.io/config.seen: "2021-09-11T17:32:03.689331427Z"
    kubernetes.io/config.source: file
  creationTimestamp: "2021-09-11T17:32:10Z"
  name: kube-vip-tkg-mgmt-control-plane-9p6wr
  namespace: kube-system
  ownerReferences:
  - apiVersion: v1
    controller: true
    kind: Node
    name: tkg-mgmt-control-plane-9p6wr
    uid: c3da0343-8756-46c4-a1cd-0379ac478391
  resourceVersion: "1476631"
  uid: 3a6ecf8b-17fa-4bfe-aa03-d079d2206f8c
spec:
  containers:
  - args:
    - start
    env:
    - name: vip_arp
      value: "true"
    - name: vip_leaderelection
      value: "true"
    - name: address
      value: 192.168.130.128
    - name: vip_interface
      value: eth0
    - name: vip_leaseduration
      value: "15"
    - name: vip_renewdeadline
      value: "10"
    - name: vip_retryperiod
      value: "2"
    image: projects.registry.vmware.com/tkg/kube-vip:v0.3.2_vmware.1
    imagePullPolicy: IfNotPresent
    name: kube-vip
    resources: {}
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
        - SYS_TIME
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/kubernetes/admin.conf
      name: kubeconfig
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  nodeName: tkg-mgmt-control-plane-9p6wr
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    operator: Exists
  volumes:
  - hostPath:
      path: /etc/kubernetes/admin.conf
      type: FileOrCreate
    name: kubeconfig

You can see from this output that the kube-vip pod has an environment variable named address with a value of 192.168.130.128. This is the address that was supplied as the management cluster's VSPHERE_CONTROL_PLANE_ENDPOINT value during creation.
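
If you want to pull that VIP out programmatically, a jsonpath query against the kube-vip pod will do it. This is a sketch that uses the pod name from my environment; substitute your own:

kubectl -n kube-system get po kube-vip-tkg-mgmt-control-plane-9p6wr -o jsonpath='{.spec.containers[0].env[?(@.name=="address")].value}'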

Before we get too far, we can take a look in NSX ALB to see that we only have a single virtual service configured. This is the service that is providing a load balancer address to the contour/envoy application in the workload cluster.
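
If you prefer the API to the UI, you can also list the virtual services directly from the Avi controller. This is just a sketch: it assumes basic authentication is enabled on the controller and that jq is installed, and it reuses the controller address and admin credentials that appear later in the ako-operator values:

curl -sk -u admin:'VMware1!' https://nsxalb-cluster.corp.tanzu/api/virtualservice | jq '.results[].name'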

We first need to make some static entries for any existing control plane endpoint addresses in NSX ALB. To get started, navigate to Infrastructure, Network and click the Edit button next to the network containing the control-plane endpoint (k8s-workload in this example):

Click the Edit button next to the appropriate subnet (192.168.130.0/24 in this example):

Click the Add Static IP Address Pool button:

Check the Use for VIPs radio button. Enter the management cluster control plane endpoint address as a range:

Click the Add Static IP Address Pool button again to add the IP address (again as a range) for the first workload cluster (192.168.130.129 in my example). Be sure to check the Use for VIPs radio button as well. Repeat for any additional workload clusters.

Click the Save button.

Click the Save button again.
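
If you want to double-check the result without clicking back through the UI, the same network object can be fetched from the Avi API. Again a sketch, with the same basic-auth and jq assumptions as before; adjust the network name to match yours and look for the configured subnets and static IP pools in the output:

curl -sk -u admin:'VMware1!' 'https://nsxalb-cluster.corp.tanzu/api/network?name=K8s-Workload' | jq '.results[0]'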

You will need to issue a command similar to the following to annotate the management cluster so that the control plane endpoint address is recorded on the cluster object. Replace the cluster name and IP address with appropriate values:

kubectl -n tkg-system annotate --overwrite cluster tkg-mgmt tkg.tanzu.vmware.com/cluster-controlplane-endpoint='192.168.130.128'

cluster.cluster.x-k8s.io/tkg-mgmt annotated

You can check to see that the annotation is in place:

kubectl -n tkg-system get cluster tkg-mgmt -o jsonpath='{.metadata.annotations.tkg\.tanzu\.vmware\.com\/cluster-controlplane-endpoint}'

192.168.130.128

You can repeat the annotation for any workload clusters. I only have the one, so it was just the following command:

kubectl annotate --overwrite cluster tkg-wld tkg.tanzu.vmware.com/cluster-controlplane-endpoint='192.168.130.129'

cluster.cluster.x-k8s.io/tkg-wld annotated
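
If you have several workload clusters, a small loop saves some typing. This is purely illustrative: tkg-wld and its address come from my environment, while the second entry is a hypothetical placeholder:

for entry in 'tkg-wld:192.168.130.129' 'tkg-wld2:192.168.130.130'; do
  cluster=${entry%%:*}
  endpoint=${entry##*:}
  kubectl annotate --overwrite cluster "${cluster}" tkg.tanzu.vmware.com/cluster-controlplane-endpoint="${endpoint}"
done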

If you followed the Update the AKO operator steps from my previous post, Upgrading from TKG 1.3 to 1.4 (including extensions) on vSphere, you should have an ako-operator-addon-manifest.yaml file that can be used for the next step. If you didn’t, you’ll need to go through those steps first and then make some additional changes to the ako-operator-addon-manifest.yaml file.
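
If you no longer have that file at all, one option (a sketch, not the official procedure) is to recover the current values from the existing addon secret on the management cluster and rebuild the manifest around them:

kubectl -n tkg-system get secret tkg-mgmt-ako-operator-addon -o jsonpath='{.data.values\.yaml}' | base64 -d

The decoded output is the same values.yaml block you see under stringData: in the manifest below.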

You should have an ako-operator-addon-manifest.yaml file similar to the following:

apiVersion: v1
kind: Secret
metadata:
  annotations:
    tkg.tanzu.vmware.com/addon-type: networking/ako-operator
  labels:
    clusterctl.cluster.x-k8s.io/move: ""
    tkg.tanzu.vmware.com/addon-name: ako-operator
    tkg.tanzu.vmware.com/cluster-name: tkg-mgmt
  name: tkg-mgmt-ako-operator-addon
  namespace: tkg-system
stringData:
  values.yaml: |
    #@data/values
    #@overlay/match-child-defaults missing_ok=True
    ---
    akoOperator:
      avi_enable: true
      namespace: tkg-system-networking
      cluster_name: tkg-mgmt
      config:
        avi_disable_ingress_class: true
        avi_ingress_default_ingress_controller: false
        avi_ingress_shard_vs_size: ""
        avi_ingress_service_type: ""
        avi_ingress_node_network_list: '""'
        avi_admin_credential_name: avi-controller-credentials
        avi_ca_name: avi-controller-ca
        avi_controller: nsxalb-cluster.corp.tanzu
        avi_username: admin
        avi_password: VMware1!
        avi_cloud_name: Default-Cloud
        avi_service_engine_group: Default-Group
        avi_data_network: K8s-Frontend
        avi_data_network_cidr: 192.168.220.0/23
        avi_ca_data_b64: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZhekNDQTFPZ0F3SUJBZ0lRTWZaeTA4bXV2SVZLZFpWRHo3L3JZekFOQmdrcWhraUc5dzBCQVFzRkFEQkkKTVJVd0V3WUtDWkltaVpQeUxHUUJHUllGZEdGdWVuVXhGREFTQmdvSmtpYUprL0lzWkFFWkZnUmpiM0p3TVJrdwpGd1lEVlFRREV4QkRUMDVVVWs5TVEwVk9WRVZTTFVOQk1CNFhEVEl3TURneE9URTNNakEwTkZvWERUTXdNRGd4Ck9URTNNekF6TlZvd1NERVZNQk1HQ2dtU0pvbVQ4aXhrQVJrV0JYUmhibnAxTVJRd0VnWUtDWkltaVpQeUxHUUIKR1JZRVkyOXljREVaTUJjR0ExVUVBeE1RUTA5T1ZGSlBURU5GVGxSRlVpMURRVENDQWlJd0RRWUpLb1pJaHZjTgpBUUVCQlFBRGdnSVBBRENDQWdvQ2dnSUJBTEtJZFg3NjQzUHp2dFZYbHFOSXdEdU5xK3JoY0hGMGZqUjQxNGorCjFJR1FVdVhyeWtqaFNEdGhQUCs4QkdON21CZ0hUOEFqQVMxYjk1eGM4QjBTMkZobG4zQW9SRTl6MDNHdGZzQnUKRlNCUlVWd0FpZlg2b1h1OTdXemZmaHFQdHhaZkxKWGJoT29tamxrWDZpZmZBczJUT0xVeDJPajR3MnZ5Ymh6agpsY0E3MGFpKzBTbDZheFNvM2xNWjRLa3VaMldnZkVjYURqamozMy9wVjMvYm5GSys3eWRQdHRjMlRlazV4c0k4ClhOTWlySVZ4VWlVVDRZTHk0V0xpUzIwMEpVZmJwMVpuTXZuYlE4SnYxUW5abDlXN1dtQlBjZ3hSNEFBdWIwSzQKdlpMWHU2TVhpYm9UbHprTUIvWXRoQ2tUTmxKY0traEhmNjBZUi9UNlN4MVQybnVweUJhNGRlbzVVR1B6aFJpSgpwTjM3dXFxQWRLMXFNRHBDakFSalM2VTdMZjlKS2pmaXJpTHpMZXlBalA4a2FONFRkSFNaZDBwY1FvWlN4ZXhRCjluKzRFNE1RbTRFSjREclZaQ2lsc3lMMkJkRVRjSFhLUGM3cStEYjRYTTdqUEtORzVHUDFFTVY0WG9odjU4eVoKL3JSZm1LNjRnYXI4QU1uT0tUMkFQNjgxcWRaczdsbGpPTmNYVUFMemxYNVRxSWNoWVQwRFZRbUZMWW9NQmVaegowbDIxUWpiSzBZV25QemE2WWkvTjRtNnJGYkVCNFdYaXFoWVNreHpyTXZvY1ZVZ2Q0QUFQMXZmSE5uRkVzblVSCm5Tc2lnbEZIL3hseU8zY0JGcm1vWkF4YkEyMDkxWEhXaEI0YzBtUUVJM2hPcUFCOFVvRkdCclFwbVErTGVzb0MKMUxaOUFnTUJBQUdqVVRCUE1Bc0dBMVVkRHdRRUF3SUJoakFQQmdOVkhSTUJBZjhFQlRBREFRSC9NQjBHQTFVZApEZ1FXQkJURkF4U3ZZNjRRNWFkaG04SVllY0hCQVV1b2J6QVFCZ2tyQmdFRUFZSTNGUUVFQXdJQkFEQU5CZ2txCmhraUc5dzBCQVFzRkFBT0NBZ0VBamcvdjRtSVA3Z0JWQ3c0cGVtdEduM1BTdERoL2FCOXZiV3lqQXl4U05hYUgKSDBuSUQ1cTV3b3c5dWVCaURmalRQbmhiZjNQNzY4SEc4b0wvKzlDK1ZtLzBsaUZCZCswL0RhYXlLcEFORk1MQgpCVitzMmFkV1JoUXVjTFFmWFB3dW04UnliV3Y4MndrUmtXQ0NkT0JhQXZBTXVUZ2swOFN3Skl5UWZWZ3BrM25ZCjBPd2pGd1NBYWR2ZXZmK0xvRC85TDhSOU5FdC9uNFdKZStMdEVhbW85RVZiK2wrY1lxeXh5dWJBVlkwWTZCTTIKR1hxQWgzRkVXMmFRTXB3b3VoLzVTN3c1b1NNWU42bWlZMW9qa2k4Z1BtMCs0K0NJTFBXaC9mcjJxME8vYlB0YgpUcisrblBNbVo4b3Y5ZXBOR0l1cWh0azVqYTIvSnVZK1JXNDZJUmM4UXBGMUV5VWFlMDJFNlUyVmFjczdHZ2UyCkNlU0lOa29MRkZtaUtCZkluL0hBY2hsbWU5YUw2RGxKOXdBcmVCREgzRThrSDdnUkRXYlNLMi9RRDBIcWFjK0UKZ2VHSHdwZy84T3RCT0hVTW5NN2VMT1hCSkZjSm9zV2YwWG5FZ1M0dWJnYUhncURFdThwOFBFN3JwQ3h0VU51cgp0K3gyeE9OSS9yQldnZGJwNTFsUHI3bzgxOXpQSkN2WVpxMVBwMXN0OGZiM1JsVVNXdmJRTVBGdEdBeWFCeStHCjBSZ1o5V1B0eUVZZ25IQWI1L0RxNDZzbmU5L1FuUHd3R3BqdjFzMW9FM1pGUWpodm5HaXM4K2RxUnhrM1laQWsKeWlEZ2hXN2FudHpZTDlTMUNDOHNWZ1ZPd0ZKd2ZGWHBkaWlyMzVtUWx5U0czMDFWNEZzUlYrWjBjRnA0TmkwPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t
        avi_labels: '""'
        avi_disable_static_route_sync: true
        avi_cni_plugin: antrea
        avi_control_plane_ha_provider: false
        avi_management_cluster_vip_network_name: K8s-Frontend
        avi_management_cluster_vip_network_cidr: 192.168.220.0/23
        avi_control_plane_endpoint_port: 6443
type: tkg.tanzu.vmware.com/addon

You will need to edit the file and set avi_control_plane_ha_provider to true in the config: section, and possibly update the avi_management_cluster_vip_network_name and avi_management_cluster_vip_network_cidr values. In my case, I did need to update these since my control plane endpoint addresses are in the range supplied by the K8s-Workload network, not the K8s-Frontend network. The modified file should look like the following:

apiVersion: v1
kind: Secret
metadata:
  annotations:
    tkg.tanzu.vmware.com/addon-type: networking/ako-operator
  labels:
    clusterctl.cluster.x-k8s.io/move: ""
    tkg.tanzu.vmware.com/addon-name: ako-operator
    tkg.tanzu.vmware.com/cluster-name: tkg-mgmt
  name: tkg-mgmt-ako-operator-addon
  namespace: tkg-system
stringData:
  values.yaml: |
    #@data/values
    #@overlay/match-child-defaults missing_ok=True
    ---
    akoOperator:
      avi_enable: true
      namespace: tkg-system-networking
      cluster_name: tkg-mgmt
      config:
        avi_disable_ingress_class: true
        avi_ingress_default_ingress_controller: false
        avi_ingress_shard_vs_size: ""
        avi_ingress_service_type: ""
        avi_ingress_node_network_list: '""'
        avi_admin_credential_name: avi-controller-credentials
        avi_ca_name: avi-controller-ca
        avi_controller: nsxalb-cluster.corp.tanzu
        avi_username: admin
        avi_password: VMware1!
        avi_cloud_name: Default-Cloud
        avi_service_engine_group: Default-Group
        avi_data_network: K8s-Frontend
        avi_data_network_cidr: 192.168.220.0/23
        avi_ca_data_b64: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZhekNDQTFPZ0F3SUJBZ0lRTWZaeTA4bXV2SVZLZFpWRHo3L3JZekFOQmdrcWhraUc5dzBCQVFzRkFEQkkKTVJVd0V3WUtDWkltaVpQeUxHUUJHUllGZEdGdWVuVXhGREFTQmdvSmtpYUprL0lzWkFFWkZnUmpiM0p3TVJrdwpGd1lEVlFRREV4QkRUMDVVVWs5TVEwVk9WRVZTTFVOQk1CNFhEVEl3TURneE9URTNNakEwTkZvWERUTXdNRGd4Ck9URTNNekF6TlZvd1NERVZNQk1HQ2dtU0pvbVQ4aXhrQVJrV0JYUmhibnAxTVJRd0VnWUtDWkltaVpQeUxHUUIKR1JZRVkyOXljREVaTUJjR0ExVUVBeE1RUTA5T1ZGSlBURU5GVGxSRlVpMURRVENDQWlJd0RRWUpLb1pJaHZjTgpBUUVCQlFBRGdnSVBBRENDQWdvQ2dnSUJBTEtJZFg3NjQzUHp2dFZYbHFOSXdEdU5xK3JoY0hGMGZqUjQxNGorCjFJR1FVdVhyeWtqaFNEdGhQUCs4QkdON21CZ0hUOEFqQVMxYjk1eGM4QjBTMkZobG4zQW9SRTl6MDNHdGZzQnUKRlNCUlVWd0FpZlg2b1h1OTdXemZmaHFQdHhaZkxKWGJoT29tamxrWDZpZmZBczJUT0xVeDJPajR3MnZ5Ymh6agpsY0E3MGFpKzBTbDZheFNvM2xNWjRLa3VaMldnZkVjYURqamozMy9wVjMvYm5GSys3eWRQdHRjMlRlazV4c0k4ClhOTWlySVZ4VWlVVDRZTHk0V0xpUzIwMEpVZmJwMVpuTXZuYlE4SnYxUW5abDlXN1dtQlBjZ3hSNEFBdWIwSzQKdlpMWHU2TVhpYm9UbHprTUIvWXRoQ2tUTmxKY0traEhmNjBZUi9UNlN4MVQybnVweUJhNGRlbzVVR1B6aFJpSgpwTjM3dXFxQWRLMXFNRHBDakFSalM2VTdMZjlKS2pmaXJpTHpMZXlBalA4a2FONFRkSFNaZDBwY1FvWlN4ZXhRCjluKzRFNE1RbTRFSjREclZaQ2lsc3lMMkJkRVRjSFhLUGM3cStEYjRYTTdqUEtORzVHUDFFTVY0WG9odjU4eVoKL3JSZm1LNjRnYXI4QU1uT0tUMkFQNjgxcWRaczdsbGpPTmNYVUFMemxYNVRxSWNoWVQwRFZRbUZMWW9NQmVaegowbDIxUWpiSzBZV25QemE2WWkvTjRtNnJGYkVCNFdYaXFoWVNreHpyTXZvY1ZVZ2Q0QUFQMXZmSE5uRkVzblVSCm5Tc2lnbEZIL3hseU8zY0JGcm1vWkF4YkEyMDkxWEhXaEI0YzBtUUVJM2hPcUFCOFVvRkdCclFwbVErTGVzb0MKMUxaOUFnTUJBQUdqVVRCUE1Bc0dBMVVkRHdRRUF3SUJoakFQQmdOVkhSTUJBZjhFQlRBREFRSC9NQjBHQTFVZApEZ1FXQkJURkF4U3ZZNjRRNWFkaG04SVllY0hCQVV1b2J6QVFCZ2tyQmdFRUFZSTNGUUVFQXdJQkFEQU5CZ2txCmhraUc5dzBCQVFzRkFBT0NBZ0VBamcvdjRtSVA3Z0JWQ3c0cGVtdEduM1BTdERoL2FCOXZiV3lqQXl4U05hYUgKSDBuSUQ1cTV3b3c5dWVCaURmalRQbmhiZjNQNzY4SEc4b0wvKzlDK1ZtLzBsaUZCZCswL0RhYXlLcEFORk1MQgpCVitzMmFkV1JoUXVjTFFmWFB3dW04UnliV3Y4MndrUmtXQ0NkT0JhQXZBTXVUZ2swOFN3Skl5UWZWZ3BrM25ZCjBPd2pGd1NBYWR2ZXZmK0xvRC85TDhSOU5FdC9uNFdKZStMdEVhbW85RVZiK2wrY1lxeXh5dWJBVlkwWTZCTTIKR1hxQWgzRkVXMmFRTXB3b3VoLzVTN3c1b1NNWU42bWlZMW9qa2k4Z1BtMCs0K0NJTFBXaC9mcjJxME8vYlB0YgpUcisrblBNbVo4b3Y5ZXBOR0l1cWh0azVqYTIvSnVZK1JXNDZJUmM4UXBGMUV5VWFlMDJFNlUyVmFjczdHZ2UyCkNlU0lOa29MRkZtaUtCZkluL0hBY2hsbWU5YUw2RGxKOXdBcmVCREgzRThrSDdnUkRXYlNLMi9RRDBIcWFjK0UKZ2VHSHdwZy84T3RCT0hVTW5NN2VMT1hCSkZjSm9zV2YwWG5FZ1M0dWJnYUhncURFdThwOFBFN3JwQ3h0VU51cgp0K3gyeE9OSS9yQldnZGJwNTFsUHI3bzgxOXpQSkN2WVpxMVBwMXN0OGZiM1JsVVNXdmJRTVBGdEdBeWFCeStHCjBSZ1o5V1B0eUVZZ25IQWI1L0RxNDZzbmU5L1FuUHd3R3BqdjFzMW9FM1pGUWpodm5HaXM4K2RxUnhrM1laQWsKeWlEZ2hXN2FudHpZTDlTMUNDOHNWZ1ZPd0ZKd2ZGWHBkaWlyMzVtUWx5U0czMDFWNEZzUlYrWjBjRnA0TmkwPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t
        avi_labels: '""'
        avi_disable_static_route_sync: true
        avi_cni_plugin: antrea
        avi_control_plane_ha_provider: true
        avi_management_cluster_vip_network_name: K8s-Workload
        avi_management_cluster_vip_network_cidr: 192.168.130.0/24
        avi_control_plane_endpoint_port: 6443
type: tkg.tanzu.vmware.com/addon
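
If you'd rather script these edits than make them by hand, and your file matches the example above, a few sed one-liners will do it. This assumes GNU sed; adjust the network name and CIDR to your environment:

sed -i 's/avi_control_plane_ha_provider: false/avi_control_plane_ha_provider: true/' ako-operator-addon-manifest.yaml
sed -i 's/avi_management_cluster_vip_network_name: K8s-Frontend/avi_management_cluster_vip_network_name: K8s-Workload/' ako-operator-addon-manifest.yaml
sed -i 's|avi_management_cluster_vip_network_cidr: 192.168.220.0/23|avi_management_cluster_vip_network_cidr: 192.168.130.0/24|' ako-operator-addon-manifest.yaml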

Issue the following command to apply the ako-operator-addon-manifest secret:

kubectl apply -f ako-operator-addon-manifest.yaml

You can issue the following command to validate that the changes have been put in place:

kubectl -n tkg-system get secrets ako-operator-data-values -o jsonpath={.data.values\\.yaml} | base64 -d

#@data/values
#@overlay/match-child-defaults missing_ok=True
---
akoOperator:
  avi_enable: true
  namespace: tkg-system-networking
  cluster_name: tkg-mgmt
  config:
    avi_disable_ingress_class: true
    avi_ingress_default_ingress_controller: false
    avi_ingress_shard_vs_size: ""
    avi_ingress_service_type: ""
    avi_ingress_node_network_list: '""'
    avi_admin_credential_name: avi-controller-credentials
    avi_ca_name: avi-controller-ca
    avi_controller: nsxalb-cluster.corp.tanzu
    avi_username: admin
    avi_password: VMware1!
    avi_cloud_name: Default-Cloud
    avi_service_engine_group: Default-Group
    avi_data_network: K8s-Frontend
    avi_data_network_cidr: 192.168.220.0/23
    avi_ca_data_b64: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZhekNDQTFPZ0F3SUJBZ0lRTWZaeTA4bXV2SVZLZFpWRHo3L3JZekFOQmdrcWhraUc5dzBCQVFzRkFEQkkKTVJVd0V3WUtDWkltaVpQeUxHUUJHUllGZEdGdWVuVXhGREFTQmdvSmtpYUprL0lzWkFFWkZnUmpiM0p3TVJrdwpGd1lEVlFRREV4QkRUMDVVVWs5TVEwVk9WRVZTTFVOQk1CNFhEVEl3TURneE9URTNNakEwTkZvWERUTXdNRGd4Ck9URTNNekF6TlZvd1NERVZNQk1HQ2dtU0pvbVQ4aXhrQVJrV0JYUmhibnAxTVJRd0VnWUtDWkltaVpQeUxHUUIKR1JZRVkyOXljREVaTUJjR0ExVUVBeE1RUTA5T1ZGSlBURU5GVGxSRlVpMURRVENDQWlJd0RRWUpLb1pJaHZjTgpBUUVCQlFBRGdnSVBBRENDQWdvQ2dnSUJBTEtJZFg3NjQzUHp2dFZYbHFOSXdEdU5xK3JoY0hGMGZqUjQxNGorCjFJR1FVdVhyeWtqaFNEdGhQUCs4QkdON21CZ0hUOEFqQVMxYjk1eGM4QjBTMkZobG4zQW9SRTl6MDNHdGZzQnUKRlNCUlVWd0FpZlg2b1h1OTdXemZmaHFQdHhaZkxKWGJoT29tamxrWDZpZmZBczJUT0xVeDJPajR3MnZ5Ymh6agpsY0E3MGFpKzBTbDZheFNvM2xNWjRLa3VaMldnZkVjYURqamozMy9wVjMvYm5GSys3eWRQdHRjMlRlazV4c0k4ClhOTWlySVZ4VWlVVDRZTHk0V0xpUzIwMEpVZmJwMVpuTXZuYlE4SnYxUW5abDlXN1dtQlBjZ3hSNEFBdWIwSzQKdlpMWHU2TVhpYm9UbHprTUIvWXRoQ2tUTmxKY0traEhmNjBZUi9UNlN4MVQybnVweUJhNGRlbzVVR1B6aFJpSgpwTjM3dXFxQWRLMXFNRHBDakFSalM2VTdMZjlKS2pmaXJpTHpMZXlBalA4a2FONFRkSFNaZDBwY1FvWlN4ZXhRCjluKzRFNE1RbTRFSjREclZaQ2lsc3lMMkJkRVRjSFhLUGM3cStEYjRYTTdqUEtORzVHUDFFTVY0WG9odjU4eVoKL3JSZm1LNjRnYXI4QU1uT0tUMkFQNjgxcWRaczdsbGpPTmNYVUFMemxYNVRxSWNoWVQwRFZRbUZMWW9NQmVaegowbDIxUWpiSzBZV25QemE2WWkvTjRtNnJGYkVCNFdYaXFoWVNreHpyTXZvY1ZVZ2Q0QUFQMXZmSE5uRkVzblVSCm5Tc2lnbEZIL3hseU8zY0JGcm1vWkF4YkEyMDkxWEhXaEI0YzBtUUVJM2hPcUFCOFVvRkdCclFwbVErTGVzb0MKMUxaOUFnTUJBQUdqVVRCUE1Bc0dBMVVkRHdRRUF3SUJoakFQQmdOVkhSTUJBZjhFQlRBREFRSC9NQjBHQTFVZApEZ1FXQkJURkF4U3ZZNjRRNWFkaG04SVllY0hCQVV1b2J6QVFCZ2tyQmdFRUFZSTNGUUVFQXdJQkFEQU5CZ2txCmhraUc5dzBCQVFzRkFBT0NBZ0VBamcvdjRtSVA3Z0JWQ3c0cGVtdEduM1BTdERoL2FCOXZiV3lqQXl4U05hYUgKSDBuSUQ1cTV3b3c5dWVCaURmalRQbmhiZjNQNzY4SEc4b0wvKzlDK1ZtLzBsaUZCZCswL0RhYXlLcEFORk1MQgpCVitzMmFkV1JoUXVjTFFmWFB3dW04UnliV3Y4MndrUmtXQ0NkT0JhQXZBTXVUZ2swOFN3Skl5UWZWZ3BrM25ZCjBPd2pGd1NBYWR2ZXZmK0xvRC85TDhSOU5FdC9uNFdKZStMdEVhbW85RVZiK2wrY1lxeXh5dWJBVlkwWTZCTTIKR1hxQWgzRkVXMmFRTXB3b3VoLzVTN3c1b1NNWU42bWlZMW9qa2k4Z1BtMCs0K0NJTFBXaC9mcjJxME8vYlB0YgpUcisrblBNbVo4b3Y5ZXBOR0l1cWh0azVqYTIvSnVZK1JXNDZJUmM4UXBGMUV5VWFlMDJFNlUyVmFjczdHZ2UyCkNlU0lOa29MRkZtaUtCZkluL0hBY2hsbWU5YUw2RGxKOXdBcmVCREgzRThrSDdnUkRXYlNLMi9RRDBIcWFjK0UKZ2VHSHdwZy84T3RCT0hVTW5NN2VMT1hCSkZjSm9zV2YwWG5FZ1M0dWJnYUhncURFdThwOFBFN3JwQ3h0VU51cgp0K3gyeE9OSS9yQldnZGJwNTFsUHI3bzgxOXpQSkN2WVpxMVBwMXN0OGZiM1JsVVNXdmJRTVBGdEdBeWFCeStHCjBSZ1o5V1B0eUVZZ25IQWI1L0RxNDZzbmU5L1FuUHd3R3BqdjFzMW9FM1pGUWpodm5HaXM4K2RxUnhrM1laQWsKeWlEZ2hXN2FudHpZTDlTMUNDOHNWZ1ZPd0ZKd2ZGWHBkaWlyMzVtUWx5U0czMDFWNEZzUlYrWjBjRnA0TmkwPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t
    avi_labels: '""'
    avi_disable_static_route_sync: true
    avi_cni_plugin: antrea
    avi_control_plane_ha_provider: true
    avi_management_cluster_vip_network_name: K8s-Workload
    avi_management_cluster_vip_network_cidr: 192.168.130.0/24
    avi_control_plane_endpoint_port: 6443

You will need to restart the existing ako statefulset in the avi-system namespace so that it picks up the changes:

kubectl -n avi-system rollout restart statefulset ako

statefulset.apps/ako restarted
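
If you want to wait for the restart to finish before moving on, rollout status will block until the new pod is ready:

kubectl -n avi-system rollout status statefulset ako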

After a very short time, you should see new LoadBalancer services in the tkg-system namespace (for the management cluster) and in the default namespace (for the workload cluster):

kubectl -n tkg-system get svc

NAME                                TYPE           CLUSTER-IP       EXTERNAL-IP       PORT(S)          AGE
packaging-api                       ClusterIP      100.67.55.56     <none>            443/TCP          9h
tkg-system-tkg-mgmt-control-plane   LoadBalancer   100.65.130.172   192.168.130.128   6443:31452/TCP   19m

kubectl get svc

NAME                            TYPE           CLUSTER-IP      EXTERNAL-IP       PORT(S)          AGE
default-tkg-wld-control-plane   LoadBalancer   100.67.140.21   192.168.130.129   6443:30438/TCP   19m
kubernetes                      ClusterIP      100.64.0.1      <none>            443/TCP          167d
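
As a quick sanity check that the API servers are now reachable through these VIPs, you can hit the version endpoint on one of them. On default RBAC settings this endpoint is readable anonymously, though your cluster may differ:

curl -k https://192.168.130.129:6443/version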

You will also see two new virtual services in the NSX ALB UI:

The final step is to remove the kube-vip installation.

SSH to each control plane node in the management cluster and remove the /etc/kubernetes/manifests/kube-vip.yaml file (see Connect to Cluster Nodes with SSH for instructions on ssh-ing to TKG nodes).
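
As a rough sketch, assuming the default capv user and substituting your own node address for the placeholder, the removal is a one-liner per node:

ssh capv@<control-plane-node-ip> 'sudo rm /etc/kubernetes/manifests/kube-vip.yaml'

Shortly afterwards you should see that there is no kube-vip pod running any longer: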

kubectl get po -A |grep kube-vip

<no results returned>

You will also need to edit the kubeadmcontrolplane (kcp) object to remove all references to the kube-vip deployment. If you have not enabled audit logging, supplied any custom certificates or made any other manual configuration changes to the files: section of the kcp object, you can essentially just delete the spec.kubeadmConfigSpec.files: section (a quick way to check its contents is shown after the patch example below). The simplest way to do this is via a kubectl patch command similar to the following:

kubectl patch kcp tkg-wld-control-plane --type='json' -p='[{"op": "remove", "path": "/spec/kubeadmConfigSpec/files"}]'

kubeadmcontrolplane.controlplane.cluster.x-k8s.io/tkg-wld-control-plane patched
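
To confirm that kube-vip's static pod manifest is the only entry under files: (and that removing the whole section is therefore safe), a jsonpath query like this helps. It is shown for the workload cluster's kcp, which lives in the default namespace; adjust the name and namespace for the management cluster:

kubectl get kcp tkg-wld-control-plane -o jsonpath='{.spec.kubeadmConfigSpec.files[*].path}'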

Otherwise, you can manually edit the kcp object (which will be necessary if you have enabled audit logging, supplied custom certificates or configured any other files to be created):

kubectl -n tkg-system edit kcp tkg-mgmt-control-plane

The section under files: to be removed should look similar to the following:

    - content: |
        apiVersion: v1
        kind: Pod
        metadata:
          creationTimestamp: null
          name: kube-vip
          namespace: kube-system
        spec:
          containers:
          - args:
            - start
            env:
            - name: vip_arp
              value: "true"
            - name: vip_leaderelection
              value: "true"
            - name: address
              value: 192.168.130.128
            - name: vip_interface
              value: eth0
            - name: vip_leaseduration
              value: "15"
            - name: vip_renewdeadline
              value: "10"
            - name: vip_retryperiod
              value: "2"
            image: projects.registry.vmware.com/tkg/kube-vip:v0.3.2_vmware.1
            imagePullPolicy: IfNotPresent
            name: kube-vip
            resources: {}
            securityContext:
              capabilities:
                add:
                - NET_ADMIN
                - SYS_TIME
            volumeMounts:
            - mountPath: /etc/kubernetes/admin.conf
              name: kubeconfig
          hostNetwork: true
          volumes:
          - hostPath:
              path: /etc/kubernetes/admin.conf
              type: FileOrCreate
            name: kubeconfig
        status: {}
      owner: root:root
      path: /etc/kubernetes/manifests/kube-vip.yaml

Type :wq to save the edit when you’re done.

You should see new control plane nodes get created in the vSphere Client after making this change.
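
You can also follow the rollout from kubectl rather than the vSphere Client. For the management cluster, the Cluster API machine objects live in the tkg-system namespace; watch for new control plane machines to reach the Running phase and the old ones to be deleted:

kubectl -n tkg-system get machines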

Repeat the previous few steps to remove the kube-vip pod and its definition in the kubeadmcontrolplane object (in the default namespace) for any workload clusters.
