Revisiting installing vSphere with Tanzu while using NSX Advanced Load Balancer

In my earlier post, Deploying vSphere 7.0 U2 with Tanzu while using NSX Advanced Load Balancer, I walked through the process of installing vSphere with Tanzu while using NSX Advanced Load Balancer (NSX ALB) for load balancer services in detail. While vSphere 7.0 U3 does not differ greatly from 7.0 U2, I wanted to go through some of the differences you might see while deploying it.

While I was going through the process of enabling Workload Management, the first thing that caught my eye was that the Control Plane Size page was no longer present. For reference, this is what it looked like in 7.0 U2:

Not to worry though, you’ll find this on the Review and Confirm page.

A very welcome addition was found on the Management Network page. You can choose DHCP for your management network IP addresses if you like, instead of specifying a start address and letting WCP pick the next four addresses after it (I still went with the older option).

Additionally, you can choose DHCP for your workload network (again, I went with a pool of IP addresses).

I did note that even when choosing Static for the workload network, the configuration is now all on one page, a little different from 7.0 U2, where it was spread across three pages.

Before (7.0U2):

After (7.0U3):

If you happen to be deploying vSphere with Tanzu while using NSX-T, you’ll find the Workload Network page even more different:

As noted earlier, you’ll find the ability to choose a size for your control plane nodes on the Review and Confirm page:

Once vSphere with Tanzu was configured, I moved on to setting up a namespace. While adding permissions, I noticed that there was a new Role option of “Owner”. This was introduced in vSphere 7.0 U2a (after I deployed 7.0 U2) and goes along with the new self-service namespace creation feature. This functionality allows a vSphere administrator to delegate authority for creating supervisor namespaces to DevOps users. You can read more about this feature at Provision a Self-Service Namespace Template.

Another new area (introduced before 7.0 U3, but after I had deployed 7.0 U2) is the VM Service pane. This is where you configure the VM Classes that will be available when creating clusters.

I wanted to have access to all VM classes so I selected them all.

If you don’t complete this step, when you go to create a Tanzu Kubernetes cluster (TKc), you will find that there are no vmclasses available and you cannot create a cluster.

You will also need to associate a content library to be able to get the needed node OS images.

With this done, you should be able to see both vmclasses and vmclass bindings:

kubectl get vmclass

NAME                  CPU   MEMORY   AGE
best-effort-2xlarge   8     64Gi     8m13s
best-effort-4xlarge   16    128Gi    8m13s
best-effort-8xlarge   32    128Gi    8m13s
best-effort-large     4     16Gi     8m12s
best-effort-medium    2     8Gi      7m32s
best-effort-small     2     4Gi      7m41s
best-effort-xlarge    4     32Gi     8m13s
best-effort-xsmall    2     2Gi      8m13s
guaranteed-2xlarge    8     64Gi     8m13s
guaranteed-4xlarge    16    128Gi    3m25s
guaranteed-8xlarge    32    128Gi    6m10s
guaranteed-large      4     16Gi     8m8s
guaranteed-medium     2     8Gi      7m52s
guaranteed-small      2     4Gi      8m13s
guaranteed-xlarge     4     32Gi     8m3s
guaranteed-xsmall     2     2Gi      8m11s

kubectl get virtualmachineclassbindings

NAME                  VIRTUALMACHINECLASS   AGE
best-effort-2xlarge   best-effort-2xlarge   11m
best-effort-4xlarge   best-effort-4xlarge   86s
best-effort-8xlarge   best-effort-8xlarge   11m
best-effort-large     best-effort-large     9m38s
best-effort-medium    best-effort-medium    6m54s
best-effort-small     best-effort-small     9m38s
best-effort-xlarge    best-effort-xlarge    11m
best-effort-xsmall    best-effort-xsmall    11m
guaranteed-2xlarge    guaranteed-2xlarge    11m
guaranteed-4xlarge    guaranteed-4xlarge    85s
guaranteed-8xlarge    guaranteed-8xlarge    85s
guaranteed-large      guaranteed-large      9m38s
guaranteed-medium     guaranteed-medium     85s
guaranteed-small      guaranteed-small      85s
guaranteed-xlarge     guaranteed-xlarge     9m38s
guaranteed-xsmall     guaranteed-xsmall     9m38s

And since the same content library is associated with the supervisor cluster, you can see all of the available tanzukubernetesreleases:

kubectl get tkr

NAME                                VERSION                          READY   COMPATIBLE   CREATED   UPDATES AVAILABLE
v1.16.12---vmware.1-tkg.1.da7afe7   1.16.12+vmware.1-tkg.1.da7afe7   True    True         91m       [1.17.17+vmware.1-tkg.1.d44d45a 1.16.14+vmware.1-tkg.1.ada4837]
v1.16.14---vmware.1-tkg.1.ada4837   1.16.14+vmware.1-tkg.1.ada4837   True    True         91m       [1.17.17+vmware.1-tkg.1.d44d45a]
v1.16.8---vmware.1-tkg.3.60d2ffd    1.16.8+vmware.1-tkg.3.60d2ffd    False   False        91m       [1.17.17+vmware.1-tkg.1.d44d45a 1.16.14+vmware.1-tkg.1.ada4837]
v1.17.11---vmware.1-tkg.1.15f1e18   1.17.11+vmware.1-tkg.1.15f1e18   True    True         91m       [1.18.19+vmware.1-tkg.1.17af790 1.17.17+vmware.1-tkg.1.d44d45a]
v1.17.11---vmware.1-tkg.2.ad3d374   1.17.11+vmware.1-tkg.2.ad3d374   True    True         91m       [1.18.19+vmware.1-tkg.1.17af790 1.17.17+vmware.1-tkg.1.d44d45a]
v1.17.13---vmware.1-tkg.2.2c133ed   1.17.13+vmware.1-tkg.2.2c133ed   True    True         91m       [1.18.19+vmware.1-tkg.1.17af790 1.17.17+vmware.1-tkg.1.d44d45a]
v1.17.17---vmware.1-tkg.1.d44d45a   1.17.17+vmware.1-tkg.1.d44d45a   True    True         91m       [1.18.19+vmware.1-tkg.1.17af790]
v1.17.7---vmware.1-tkg.1.154236c    1.17.7+vmware.1-tkg.1.154236c    True    True         91m       [1.18.19+vmware.1-tkg.1.17af790 1.17.17+vmware.1-tkg.1.d44d45a]
v1.17.8---vmware.1-tkg.1.5417466    1.17.8+vmware.1-tkg.1.5417466    True    True         91m       [1.18.19+vmware.1-tkg.1.17af790 1.17.17+vmware.1-tkg.1.d44d45a]
v1.18.10---vmware.1-tkg.1.3a6cd48   1.18.10+vmware.1-tkg.1.3a6cd48   True    True         91m       [1.19.14+vmware.1-tkg.1.8753786 1.18.19+vmware.1-tkg.1.17af790]
v1.18.15---vmware.1-tkg.1.600e412   1.18.15+vmware.1-tkg.1.600e412   True    True         91m       [1.19.14+vmware.1-tkg.1.8753786 1.18.19+vmware.1-tkg.1.17af790]
v1.18.15---vmware.1-tkg.2.ebf6117   1.18.15+vmware.1-tkg.2.ebf6117   True    True         91m       [1.19.14+vmware.1-tkg.1.8753786 1.18.19+vmware.1-tkg.1.17af790]
v1.18.19---vmware.1-tkg.1.17af790   1.18.19+vmware.1-tkg.1.17af790   True    True         91m       [1.19.14+vmware.1-tkg.1.8753786]
v1.18.5---vmware.1-tkg.1.c40d30d    1.18.5+vmware.1-tkg.1.c40d30d    True    True         91m       [1.19.14+vmware.1-tkg.1.8753786 1.18.19+vmware.1-tkg.1.17af790]
v1.19.11---vmware.1-tkg.1.9d9b236   1.19.11+vmware.1-tkg.1.9d9b236   True    True         91m       [1.20.9+vmware.1-tkg.1.a4cee5b 1.19.14+vmware.1-tkg.1.8753786]
v1.19.14---vmware.1-tkg.1.8753786   1.19.14+vmware.1-tkg.1.8753786   True    True         91m       [1.20.9+vmware.1-tkg.1.a4cee5b]
v1.19.7---vmware.1-tkg.1.fc82c41    1.19.7+vmware.1-tkg.1.fc82c41    True    True         91m       [1.20.9+vmware.1-tkg.1.a4cee5b 1.19.14+vmware.1-tkg.1.8753786]
v1.19.7---vmware.1-tkg.2.f52f85a    1.19.7+vmware.1-tkg.2.f52f85a    True    True         91m       [1.20.9+vmware.1-tkg.1.a4cee5b 1.19.14+vmware.1-tkg.1.8753786]
v1.20.2---vmware.1-tkg.1.1d4f79a    1.20.2+vmware.1-tkg.1.1d4f79a    True    True         91m       [1.21.2+vmware.1-tkg.1.ee25d55 1.20.9+vmware.1-tkg.1.a4cee5b]
v1.20.2---vmware.1-tkg.2.3e10706    1.20.2+vmware.1-tkg.2.3e10706    True    True         91m       [1.21.2+vmware.1-tkg.1.ee25d55 1.20.9+vmware.1-tkg.1.a4cee5b]
v1.20.7---vmware.1-tkg.1.7fb9067    1.20.7+vmware.1-tkg.1.7fb9067    True    True         91m       [1.21.2+vmware.1-tkg.1.ee25d55 1.20.9+vmware.1-tkg.1.a4cee5b]
v1.20.9---vmware.1-tkg.1.a4cee5b    1.20.9+vmware.1-tkg.1.a4cee5b    True    True         91m       [1.21.2+vmware.1-tkg.1.ee25d55]
v1.21.2---vmware.1-tkg.1.ee25d55    1.21.2+vmware.1-tkg.1.ee25d55    True    True         91m

A very welcome feature is the inclusion of a tkgserviceconfiguration CRD. This lets you specify items that you want configured on all of your clusters. The only thing I wanted included at this point was the CA certificate used by my external Harbor installation. There have been various means of doing this for Tanzu Kubernetes clusters, but this seems to be the most elegant yet. You can read more about tkgserviceconfiguration settings at Configuration Parameters for the Tanzu Kubernetes Grid Service v1alpha2 API.

apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TkgServiceConfiguration
metadata:
  name: tkg-service-configuration
spec:
  defaultCNI: antrea
  trust:
    additionalTrustedCAs:
      - name: corp.tanzu
        data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZhekNDQTFPZ0F3SUJBZ0lRTWZaeTA4bXV2SVZLZFpWRHo3L3JZekFOQmdrcWhraUc5dzBCQVFzRkFEQkkKTVJVd0V3WUtDWkltaVpQeUxHUUJHUllGZEdGdWVuVXhGREFTQmdvSmtpYUprL0lzWkFFWkZnUmpiM0p3TVJrdwpGd1lEVlFRREV4QkRUMDVVVWs5TVEwVk9WRVZTTFVOQk1CNFhEVEl3TURneE9URTNNakEwTkZvWERUTXdNRGd4Ck9URTNNekF6TlZvd1NERVZNQk1HQ2dtU0pvbVQ4aXhrQVJrV0JYUmhibnAxTVJRd0VnWUtDWkltaVpQeUxHUUIKR1JZRVkyOXljREVaTUJjR0ExVUVBeE1RUTA5T1ZGSlBURU5GVGxSRlVpMURRVENDQWlJd0RRWUpLb1pJaHZjTgpBUUVCQlFBRGdnSVBBRENDQWdvQ2dnSUJBTEtJZFg3NjQzUHp2dFZYbHFOSXdEdU5xK3JoY0hGMGZqUjQxNGorCjFJR1FVdVhyeWtqaFNEdGhQUCs4QkdON21CZ0hUOEFqQVMxYjk1eGM4QjBTMkZobG4zQW9SRTl6MDNHdGZzQnUKRlNCUlVWd0FpZlg2b1h1OTdXemZmaHFQdHhaZkxKWGJoT29tamxrWDZpZmZBczJUT0xVeDJPajR3MnZ5Ymh6agpsY0E3MGFpKzBTbDZheFNvM2xNWjRLa3VaMldnZkVjYURqamozMy9wVjMvYm5GSys3eWRQdHRjMlRlazV4c0k4ClhOTWlySVZ4VWlVVDRZTHk0V0xpUzIwMEpVZmJwMVpuTXZuYlE4SnYxUW5abDlXN1dtQlBjZ3hSNEFBdWIwSzQKdlpMWHU2TVhpYm9UbHprTUIvWXRoQ2tUTmxKY0traEhmNjBZUi9UNlN4MVQybnVweUJhNGRlbzVVR1B6aFJpSgpwTjM3dXFxQWRLMXFNRHBDakFSalM2VTdMZjlKS2pmaXJpTHpMZXlBalA4a2FONFRkSFNaZDBwY1FvWlN4ZXhRCjluKzRFNE1RbTRFSjREclZaQ2lsc3lMMkJkRVRjSFhLUGM3cStEYjRYTTdqUEtORzVHUDFFTVY0WG9odjU4eVoKL3JSZm1LNjRnYXI4QU1uT0tUMkFQNjgxcWRaczdsbGpPTmNYVUFMemxYNVRxSWNoWVQwRFZRbUZMWW9NQmVaegowbDIxUWpiSzBZV25QemE2WWkvTjRtNnJGYkVCNFdYaXFoWVNreHpyTXZvY1ZVZ2Q0QUFQMXZmSE5uRkVzblVSCm5Tc2lnbEZIL3hseU8zY0JGcm1vWkF4YkEyMDkxWEhXaEI0YzBtUUVJM2hPcUFCOFVvRkdCclFwbVErTGVzb0MKMUxaOUFnTUJBQUdqVVRCUE1Bc0dBMVVkRHdRRUF3SUJoakFQQmdOVkhSTUJBZjhFQlRBREFRSC9NQjBHQTFVZApEZ1FXQkJURkF4U3ZZNjRRNWFkaG04SVllY0hCQVV1b2J6QVFCZ2tyQmdFRUFZSTNGUUVFQXdJQkFEQU5CZ2txCmhraUc5dzBCQVFzRkFBT0NBZ0VBamcvdjRtSVA3Z0JWQ3c0cGVtdEduM1BTdERoL2FCOXZiV3lqQXl4U05hYUgKSDBuSUQ1cTV3b3c5dWVCaURmalRQbmhiZjNQNzY4SEc4b0wvKzlDK1ZtLzBsaUZCZCswL0RhYXlLcEFORk1MQgpCVitzMmFkV1JoUXVjTFFmWFB3dW04UnliV3Y4MndrUmtXQ0NkT0JhQXZBTXVUZ2swOFN3Skl5UWZWZ3BrM25ZCjBPd2pGd1NBYWR2ZXZmK0xvRC85TDhSOU5FdC9uNFdKZStMdEVhbW85RVZiK2wrY1lxeXh5dWJBVlkwWTZCTTIKR1hxQWgzRkVXMmFRTXB3b3VoLzVTN3c1b1NNWU42bWlZMW9qa2k4Z1BtMCs0K0NJTFBXaC9mcjJxME8vYlB0YgpUcisrblBNbVo4b3Y5ZXBOR0l1cWh0azVqYTIvSnVZK1JXNDZJUmM4UXBGMUV5VWFlMDJFNlUyVmFjczdHZ2UyCkNlU0lOa29MRkZtaUtCZkluL0hBY2hsbWU5YUw2RGxKOXdBcmVCREgzRThrSDdnUkRXYlNLMi9RRDBIcWFjK0UKZ2VHSHdwZy84T3RCT0hVTW5NN2VMT1hCSkZjSm9zV2YwWG5FZ1M0dWJnYUhncURFdThwOFBFN3JwQ3h0VU51cgp0K3gyeE9OSS9yQldnZGJwNTFsUHI3bzgxOXpQSkN2WVpxMVBwMXN0OGZiM1JsVVNXdmJRTVBGdEdBeWFCeStHCjBSZ1o5V1B0eUVZZ25IQWI1L0RxNDZzbmU5L1FuUHd3R3BqdjFzMW9FM1pGUWpodm5HaXM4K2RxUnhrM1laQWsKeWlEZ2hXN2FudHpZTDlTMUNDOHNWZ1ZPd0ZKd2ZGWHBkaWlyMzVtUWx5U0czMDFWNEZzUlYrWjBjRnA0TmkwPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t
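The data value is simply the base64-encoded PEM of the CA certificate, all on one line. A quick sketch of generating it (ca.crt is a placeholder filename; here a dummy file stands in for the real Harbor CA):

```shell
# "ca.crt" is a placeholder; point this at your real Harbor CA PEM instead.
printf -- '-----BEGIN CERTIFICATE-----\nMIIB...placeholder...\n-----END CERTIFICATE-----\n' > ca.crt

# -w0 disables line wrapping (GNU coreutils).
# On macOS, use: base64 -i ca.crt | tr -d '\n'
base64 -w0 ca.crt
```

Paste the resulting string into the data field of the matching additionalTrustedCAs entry.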

Once you’ve created a file like this, you just need to apply it while connected to the supervisor cluster.

You can now choose to use the older v1alpha1 API when creating a new cluster or the newer v1alpha2 API. VMware has excellent documentation on the required and optional parameters for creating clusters under either version at Provisioning Tanzu Kubernetes Clusters Using the Tanzu Kubernetes Grid Service v1alpha1 API and Provisioning Tanzu Kubernetes Clusters Using the Tanzu Kubernetes Grid Service v1alpha2 API.

I went with the v1alpha2 API and have a very simple cluster configuration file here:

apiVersion: run.tanzu.vmware.com/v1alpha2
kind: TanzuKubernetesCluster
metadata:
  name: tkgs-cluster
  namespace: tkg
spec:
  topology:
    controlPlane:
      replicas: 1
      vmClass: best-effort-small
      storageClass: k8s-policy
      tkr:
        reference:
          name: v1.21.2---vmware.1-tkg.1.ee25d55
    nodePools:
    - name: tkg-cluster-nodepool-1
      replicas: 2
      vmClass: best-effort-medium
      storageClass: k8s-policy
      tkr:
        reference:
          name: v1.21.2---vmware.1-tkg.1.ee25d55

You’ll notice the inclusion of nodePools now, which allows for different configurations of Kubernetes nodes within the same cluster. By comparison, the following is the cluster configuration I was using in 7.0U2 with the v1alpha1 API:

apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: tkgs-cluster 
  namespace: tkg
spec:
  topology:
    controlPlane:
      count: 1
      class: best-effort-small 
      storageClass: k8s-policy
    workers:
      count: 2
      class: best-effort-medium 
      storageClass: k8s-policy
  distribution:
    version: v1.19

Once my cluster was deployed I wanted to validate that the Harbor CA certificate was indeed included on the worker nodes. I was first able to check that the kubeadmconfigtemplate object had the required certificate data included:

kubectl get kubeadmconfigtemplates tkgs-cluster-tkg-cluster-nodepool-1-xd2ct -o jsonpath='{.spec.template.spec}' |jq -r '.files[] | select(.path=="/etc/ssl/certs/tkg-corp.tanzu-ca.pem")'

{
  "content": "-----BEGIN CERTIFICATE-----\nMIIFazCCA1OgAwIBAgIQMfZy08muvIVKdZVDz7/rYzANBgkqhkiG9w0BAQsFADBI\nMRUwEwYKCZImiZPyLGQBGRYFdGFuenUxFDASBgoJkiaJk/IsZAEZFgRjb3JwMRkw\nFwYDVQQDExBDT05UUk9MQ0VOVEVSLUNBMB4XDTIwMDgxOTE3MjA0NFoXDTMwMDgx\nOTE3MzAzNVowSDEVMBMGCgmSJomT8ixkARkWBXRhbnp1MRQwEgYKCZImiZPyLGQB\nGRYEY29ycDEZMBcGA1UEAxMQQ09OVFJPTENFTlRFUi1DQTCCAiIwDQYJKoZIhvcN\nAQEBBQADggIPADCCAgoCggIBALKIdX7643PzvtVXlqNIwDuNq+rhcHF0fjR414j+\n1IGQUuXrykjhSDthPP+8BGN7mBgHT8AjAS1b95xc8B0S2Fhln3AoRE9z03GtfsBu\nFSBRUVwAifX6oXu97WzffhqPtxZfLJXbhOomjlkX6iffAs2TOLUx2Oj4w2vybhzj\nlcA70ai+0Sl6axSo3lMZ4KkuZ2WgfEcaDjjj33/pV3/bnFK+7ydPttc2Tek5xsI8\nXNMirIVxUiUT4YLy4WLiS200JUfbp1ZnMvnbQ8Jv1QnZl9W7WmBPcgxR4AAub0K4\nvZLXu6MXiboTlzkMB/YthCkTNlJcKkhHf60YR/T6Sx1T2nupyBa4deo5UGPzhRiJ\npN37uqqAdK1qMDpCjARjS6U7Lf9JKjfiriLzLeyAjP8kaN4TdHSZd0pcQoZSxexQ\n9n+4E4MQm4EJ4DrVZCilsyL2BdETcHXKPc7q+Db4XM7jPKNG5GP1EMV4Xohv58yZ\n/rRfmK64gar8AMnOKT2AP681qdZs7lljONcXUALzlX5TqIchYT0DVQmFLYoMBeZz\n0l21QjbK0YWnPza6Yi/N4m6rFbEB4WXiqhYSkxzrMvocVUgd4AAP1vfHNnFEsnUR\nnSsiglFH/xlyO3cBFrmoZAxbA2091XHWhB4c0mQEI3hOqAB8UoFGBrQpmQ+LesoC\n1LZ9AgMBAAGjUTBPMAsGA1UdDwQEAwIBhjAPBgNVHRMBAf8EBTADAQH/MB0GA1Ud\nDgQWBBTFAxSvY64Q5adhm8IYecHBAUuobzAQBgkrBgEEAYI3FQEEAwIBADANBgkq\nhkiG9w0BAQsFAAOCAgEAjg/v4mIP7gBVCw4pemtGn3PStDh/aB9vbWyjAyxSNaaH\nH0nID5q5wow9ueBiDfjTPnhbf3P768HG8oL/+9C+Vm/0liFBd+0/DaayKpANFMLB\nBV+s2adWRhQucLQfXPwum8RybWv82wkRkWCCdOBaAvAMuTgk08SwJIyQfVgpk3nY\n0OwjFwSAadvevf+LoD/9L8R9NEt/n4WJe+LtEamo9EVb+l+cYqyxyubAVY0Y6BM2\nGXqAh3FEW2aQMpwouh/5S7w5oSMYN6miY1ojki8gPm0+4+CILPWh/fr2q0O/bPtb\nTr++nPMmZ8ov9epNGIuqhtk5ja2/JuY+RW46IRc8QpF1EyUae02E6U2Vacs7Gge2\nCeSINkoLFFmiKBfIn/HAchlme9aL6DlJ9wAreBDH3E8kH7gRDWbSK2/QD0Hqac+E\ngeGHwpg/8OtBOHUMnM7eLOXBJFcJosWf0XnEgS4ubgaHgqDEu8p8PE7rpCxtUNur\nt+x2xONI/rBWgdbp51lPr7o819zPJCvYZq1Pp1st8fb3RlUSWvbQMPFtGAyaBy+G\n0RgZ9WPtyEYgnHAb5/Dq46sne9/QnPwwGpjv1s1oE3ZFQjhvnGis8+dqRxk3YZAk\nyiDghW7antzYL9S1CC8sVgVOwFJwfFXpdiir35mQlySG301V4FsRV+Z0cFp4Ni0=\n-----END CERTIFICATE-----",
  "owner": "root:root",
  "path": "/etc/ssl/certs/tkg-corp.tanzu-ca.pem",
  "permissions": "0644"
}
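As a belt-and-braces check, you can also compare SHA-256 fingerprints of the CA you supplied and the copy found on the node. A minimal sketch, where a throwaway self-signed certificate stands in for the real Harbor CA and node-copy.pem stands in for a file pulled off the node (both filenames are hypothetical):

```shell
# Generate a throwaway self-signed cert to stand in for the Harbor CA.
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo-ca" \
  -keyout demo-ca.key -out demo-ca.crt -days 1 2>/dev/null

# Stand-in for the file copied off the node
# (in reality: /etc/ssl/certs/tkg-corp.tanzu-ca.pem, fetched via scp).
cp demo-ca.crt node-copy.pem

# The two fingerprints should be identical.
openssl x509 -in demo-ca.crt -noout -fingerprint -sha256
openssl x509 -in node-copy.pem -noout -fingerprint -sha256
```

If the fingerprints match, the node is trusting exactly the CA you configured.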

I was also able to ssh to one of the created nodes to confirm that this certificate is present in the system trust store:

vmware-system-user@tkgs-cluster-tkg-cluster-nodepool-1-xlfqs-5467f55b45-qjnw5 [ ~ ]$ cat /etc/ssl/certs/tkg-corp.tanzu-ca.pem

-----BEGIN CERTIFICATE-----
MIIFazCCA1OgAwIBAgIQMfZy08muvIVKdZVDz7/rYzANBgkqhkiG9w0BAQsFADBI
MRUwEwYKCZImiZPyLGQBGRYFdGFuenUxFDASBgoJkiaJk/IsZAEZFgRjb3JwMRkw
FwYDVQQDExBDT05UUk9MQ0VOVEVSLUNBMB4XDTIwMDgxOTE3MjA0NFoXDTMwMDgx
OTE3MzAzNVowSDEVMBMGCgmSJomT8ixkARkWBXRhbnp1MRQwEgYKCZImiZPyLGQB
GRYEY29ycDEZMBcGA1UEAxMQQ09OVFJPTENFTlRFUi1DQTCCAiIwDQYJKoZIhvcN
AQEBBQADggIPADCCAgoCggIBALKIdX7643PzvtVXlqNIwDuNq+rhcHF0fjR414j+
1IGQUuXrykjhSDthPP+8BGN7mBgHT8AjAS1b95xc8B0S2Fhln3AoRE9z03GtfsBu
FSBRUVwAifX6oXu97WzffhqPtxZfLJXbhOomjlkX6iffAs2TOLUx2Oj4w2vybhzj
lcA70ai+0Sl6axSo3lMZ4KkuZ2WgfEcaDjjj33/pV3/bnFK+7ydPttc2Tek5xsI8
XNMirIVxUiUT4YLy4WLiS200JUfbp1ZnMvnbQ8Jv1QnZl9W7WmBPcgxR4AAub0K4
vZLXu6MXiboTlzkMB/YthCkTNlJcKkhHf60YR/T6Sx1T2nupyBa4deo5UGPzhRiJ
pN37uqqAdK1qMDpCjARjS6U7Lf9JKjfiriLzLeyAjP8kaN4TdHSZd0pcQoZSxexQ
9n+4E4MQm4EJ4DrVZCilsyL2BdETcHXKPc7q+Db4XM7jPKNG5GP1EMV4Xohv58yZ
/rRfmK64gar8AMnOKT2AP681qdZs7lljONcXUALzlX5TqIchYT0DVQmFLYoMBeZz
0l21QjbK0YWnPza6Yi/N4m6rFbEB4WXiqhYSkxzrMvocVUgd4AAP1vfHNnFEsnUR
nSsiglFH/xlyO3cBFrmoZAxbA2091XHWhB4c0mQEI3hOqAB8UoFGBrQpmQ+LesoC
1LZ9AgMBAAGjUTBPMAsGA1UdDwQEAwIBhjAPBgNVHRMBAf8EBTADAQH/MB0GA1Ud
DgQWBBTFAxSvY64Q5adhm8IYecHBAUuobzAQBgkrBgEEAYI3FQEEAwIBADANBgkq
hkiG9w0BAQsFAAOCAgEAjg/v4mIP7gBVCw4pemtGn3PStDh/aB9vbWyjAyxSNaaH
H0nID5q5wow9ueBiDfjTPnhbf3P768HG8oL/+9C+Vm/0liFBd+0/DaayKpANFMLB
BV+s2adWRhQucLQfXPwum8RybWv82wkRkWCCdOBaAvAMuTgk08SwJIyQfVgpk3nY
0OwjFwSAadvevf+LoD/9L8R9NEt/n4WJe+LtEamo9EVb+l+cYqyxyubAVY0Y6BM2
GXqAh3FEW2aQMpwouh/5S7w5oSMYN6miY1ojki8gPm0+4+CILPWh/fr2q0O/bPtb
Tr++nPMmZ8ov9epNGIuqhtk5ja2/JuY+RW46IRc8QpF1EyUae02E6U2Vacs7Gge2
CeSINkoLFFmiKBfIn/HAchlme9aL6DlJ9wAreBDH3E8kH7gRDWbSK2/QD0Hqac+E
geGHwpg/8OtBOHUMnM7eLOXBJFcJosWf0XnEgS4ubgaHgqDEu8p8PE7rpCxtUNur
t+x2xONI/rBWgdbp51lPr7o819zPJCvYZq1Pp1st8fb3RlUSWvbQMPFtGAyaBy+G
0RgZ9WPtyEYgnHAb5/Dq46sne9/QnPwwGpjv1s1oE3ZFQjhvnGis8+dqRxk3YZAk
yiDghW7antzYL9S1CC8sVgVOwFJwfFXpdiir35mQlySG301V4FsRV+Z0cFp4Ni0=
-----END CERTIFICATE-----

One oddity that I discovered was a new virtual service created in NSX ALB for the CSI (Container Storage Interface) driver to export metrics. That in itself wasn’t so odd, but the fact that it was in an error state was:

I was able to check the alert on this object and see that there appeared to be a communication issue:

The short answer for why this is not working is that the node the virtual service points at (192.168.220.7 in this example) is not allowing traffic on the noted ports, 2112 and 2113. You can make a very unsupported change to the iptables configuration on the node in question to fix this, but it will only be temporary: if the supervisor cluster ever detects a problem that requires the node to be rebuilt, or if the node is rebuilt due to an upgrade or other configuration change, the fix will be lost. VMware is working on this and we’ll hopefully see a fix out soon.
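For reference, the unsupported, temporary workaround amounts to opening those two ports in iptables on the affected node. The rules below are my approximation of what is needed, not a VMware-provided fix, and they will not survive a node rebuild:

```shell
# Temporary, unsupported: allow inbound traffic to the CSI metrics ports.
# Run on the affected node; lost whenever the node is rebuilt or upgraded.
sudo iptables -A INPUT -p tcp --dport 2112 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 2113 -j ACCEPT
```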

When this service is working, you can browse to the noted IP/port and see what metrics are available:

http://192.168.220.2:2112/metrics

# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0.000139116
go_gc_duration_seconds{quantile="0.25"} 0.002080295
go_gc_duration_seconds{quantile="0.5"} 0.002455077
go_gc_duration_seconds{quantile="0.75"} 0.00321926
go_gc_duration_seconds{quantile="1"} 0.04137968
go_gc_duration_seconds_sum 6.507952752
go_gc_duration_seconds_count 2514
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 51
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.16.1b7"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 7.029864e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 7.948767312e+09
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.631695e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 1.5719899e+07
# HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
# TYPE go_memstats_gc_cpu_fraction gauge
go_memstats_gc_cpu_fraction 3.728570089327933e-05
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 5.457936e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 7.029864e+06
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 5.6860672e+07
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 9.330688e+06
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 24583
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 5.4321152e+07
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 6.619136e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.6337903057318528e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 1.5744482e+07
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 4800
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 16384
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 116144
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 147456
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 1.1924304e+07
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 955945
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 917504
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 917504
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 7.531828e+07
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 12
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 241.62
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 20
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 6.2509056e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.63348501812e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.405046784e+09
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 1
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 0
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
# HELP vsphere_csi_info CSI Info
# TYPE vsphere_csi_info gauge
vsphere_csi_info{version=""} 1
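If you want to pull a single value out of a scrape by hand, a small awk filter works on the Prometheus exposition format. In the sketch below, a two-line metrics.txt stands in for the output of curl -s http://192.168.220.2:2112/metrics:

```shell
# Stand-in for a saved scrape; in practice:
#   curl -s http://192.168.220.2:2112/metrics > metrics.txt
printf '# TYPE go_goroutines gauge\ngo_goroutines 51\n' > metrics.txt

# Print the value of a single metric; comment lines start with '#' so
# their first field never matches the metric name.
awk '$1 == "go_goroutines" { print $2 }' metrics.txt   # prints: 51
```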

I didn’t spend much more time on this, but I also didn’t notice much else that differed from previous versions. The changes are very welcome, and the process for deploying and configuring vSphere with Tanzu seems to get easier with each new version.

2 thoughts on “Revisiting installing vSphere with Tanzu while using NSX Advanced Load Balancer”

  1. Hi Chris,

    thanks for this great article explaining the differences between 7.0U2 and 7.0U3.

    I do have some questions:
    – how can i assign an certificate to the K8s-API-Server (for scenarios talking directly to the TKGS cluster API)?
    – do you know an orchestration method for this?
    – do you have some pointers to additional material to read?

    thanks

    Erich

    1. Hello Erich. I’ll do my best to answer your questions but I don’t think I’ll be providing much value here:

      1. I’m not aware of any way (currently) to replace the API endpoint certificate for a TKGS cluster. While it’s fairly easy to do for the supervisor cluster, the functionality to do this for TKc’s does not appear to be present. I did some asking around internally on this but you might file a support request to get a more definitive answer.

      2. While I have not tried it myself, I am aware of the ability to use vRA to automate the installation of vSphere with Tanzu and deployment of TKGS clusters. You can read a bit more about this at https://blogs.vmware.com/management/2020/08/v7-vra-tanzu.html (a little bit old) and https://veducate.co.uk/vra-deploy-tanzu-clusters/.

      3. I would always recommend starting with the official vSphere with Tanzu documentation, https://docs.vmware.com/en/VMware-vSphere/7.0/vmware-vsphere-with-tanzu/GUID-152BE7D2-E227-4DAA-B527-557B564D9718.html. William Lam also has several great blog posts at https://williamlam.com/tag/vsphere-with-tanzu. You’ll likely find lots of good content at https://tanzu.vmware.com/blog also.
