Installing Tanzu Kubernetes Grid 1.3 on vSphere with NSX Advanced Load Balancer

In my earlier post, Deploying NSX Advanced Load Balancer for use with Tanzu Kubernetes Grid and vSphere with Tanzu, I got NSX ALB up and running in a configuration that could be used for either vSphere with Tanzu or Tanzu Kubernetes Grid. In this post, I’ll go through the process of installing TKG 1.3 while using NSX ALB for Load Balancer services.

Install the tanzu CLI

One of the first things you’ll notice when you get started with TKG 1.3 is that the tkg CLI has been replaced with the tanzu CLI. It is a vast improvement over the former tkg CLI, and you should see its functionality expand to include support for other Tanzu products in the future.

The basics of installing the tanzu CLI are the same as for the former tkg CLI: download the binary appropriate to your operating system, copy it to a location in your path, and make sure it’s executable if you’re on Linux or a Mac.
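As a rough sketch for a Linux system (the bundle and binary names below are placeholders based on a 1.3.0 Linux download; adjust them for your version and OS):

tar -xvf tanzu-cli-bundle-linux-amd64.tar
# the core binary sits under cli/core/<version>/ in the extracted bundle
sudo install cli/core/v1.3.0/tanzu-core-linux_amd64 /usr/local/bin/tanzu
tanzu version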

One interesting new facet of the tanzu CLI is the concept of plugins. This allows for easy extensibility of the CLI for future functionality. You can see what plugins are currently available by running the tanzu plugin list command.

tanzu plugin list

  NAME                LATEST VERSION  DESCRIPTION                                                        REPOSITORY  VERSION  STATUS
  alpha               v1.3.0          Alpha CLI commands                                                 core                 not installed
  cluster             v1.3.0          Kubernetes cluster operations                                      core        v1.3.0   installed
  login               v1.3.0          Login to the platform                                              core        v1.3.0   installed
  pinniped-auth       v1.3.0          Pinniped authentication operations (usually not directly invoked)  core        v1.3.0   installed
  kubernetes-release  v1.3.0          Kubernetes release operations                                      core        v1.3.0   installed
  management-cluster  v1.3.0          Kubernetes management cluster operations                           tkg         v1.3.0   installed
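If a plugin shows up as not installed (the alpha plugin above, for example), it can be installed from the extracted CLI bundle. This is only a sketch, assuming the bundle was extracted to a local cli directory; the exact flags may differ between releases, so check tanzu plugin install --help:

# run from the directory containing the extracted cli/ folder
tanzu plugin install --local cli all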

Using the tanzu CLI is a little bit different than the tkg CLI but starts to make a lot more sense as you get used to it. Functions are in a hierarchical structure. For example, everything to do with creating or managing a management cluster is under the management-cluster sub-command. Similarly, everything to do with workload clusters is under the cluster sub-command.

Another welcome addition that comes with the tanzu CLI is shell completion. All that’s needed to enable it (on a Linux system, at least) is to run source <(tanzu completion bash). With this in place, you don’t need to reference a list of available sub-commands, as pressing Tab will show you what commands are available. The following is an example of pressing Tab after typing tanzu man:

ceip-participation  delete              import              register
create              get                 kubeconfig          upgrade
credentials         help                permissions
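To make completion stick across sessions, the same line can go in your shell profile (bash shown here):

echo 'source <(tanzu completion bash)' >> ~/.bashrc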

Deploy a Kubernetes node OVA

This is pretty straightforward and similar to the process used in earlier TKG versions, but you now have the choice of using pre-built Photon OS or Ubuntu images on vSphere. You can also build your own image if desired.

Once the OVA is deployed, be sure to convert it to a template or TKG won’t know that it’s available for use as a node OS.
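If you’d rather script this step, govc can mark the deployed VM as a template. A sketch, assuming govc is already pointed at your vCenter Server via its GOVC_* environment variables and that the OVA was deployed into the same VM folder used later in this post (the VM name is just an example):

# mark the deployed node OVA as a template so TKG can discover it
govc vm.markastemplate /RegionA01/vm/photon-3-kube-v1.20.4+vmware.1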

Create a management cluster via the UI

The process for building out the management cluster is also fairly similar but you’ll see that there are some new configuration options present.

 tanzu management-cluster create --ui

Validating the pre-requisites...
Serving kickstart UI at http://127.0.0.1:8080

A browser should be launched automatically and you can choose your platform for deploying TKG (vSphere in this example).

Enter the information necessary to connect to your vCenter Server and press the Connect button.

You’ll see a message similar to the following…press Continue if it looks good.

And another message, this time wanting you to confirm that you’re deploying a TKG management cluster and not trying to stand up vSphere with Tanzu.

Select an appropriate Datacenter and paste in a public key that can be used to validate SSH clients to the Kubernetes nodes.

You have a lot of choices here but I’m going with a simple and small Development cluster. This will result in a single control plane node and a single worker node. You need to specify an IP address for the Control Plane Endpoint value that is in the same subnet as used by the Kubernetes nodes. The value supplied here will end up being the address used by kube-vip for the Kubernetes API endpoint address.

Here is what I was waiting for, the configuration of NSX Advanced Load Balancer (ALB).

The first step is to supply some simple connection information and click the Verify Credentials button.

If the connection was successful, the Verify Credentials button will be greyed out and now read Verified. You should be able to choose the Cloud Name and Service Engine Group Name values from the dropdowns. The VIP Network Name needs to match the network name selected in NSX ALB for VIP addresses, just as the VIP Network CIDR value needs to match the subnet configured on the same network in NSX ALB. Supplying the certificate was easy for me since I used my own wildcard certificate. If you don’t have the NSX ALB certificate handy, you can download it from the NSX ALB UI via the Templates, Security, SSL/TLS Certificates page.
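If you’d rather pull the certificate from the command line than click through the UI, a generic openssl call against the controller will print what it presents (nothing TKG-specific here; with a self-signed controller certificate the leaf is the one you want, while a CA-signed setup would want the issuing CA instead):

# grab the controller's presented certificate in PEM form
openssl s_client -connect nsxalb-cluster.corp.tanzu:443 -showcerts </dev/null 2>/dev/null \
  | openssl x509 -outform PEM > nsxalb-controller.pem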

As a reminder, you can see the values I used for VIP Network Name and VIP Network CIDR taken from the NSX ALB UI (the K8s-Frontend row):

You can skip the Optional Metadata page unless you want to provide custom labels.

Choose an appropriate VM folder, Datastore and Cluster.

Choose an appropriate portgroup for your Kubernetes node VMs. You can leave the service and pod CIDRs as is or update them as needed.

You might have noticed that Enable Proxy Settings toggle in the previous screen. This is a very welcome addition as deploying TKG when a proxy server was in the mix was not very intuitive in earlier versions. I’m not using a proxy server but wanted to get a screenshot of what you can configure if you need to.

This was another welcome addition…the ability to configure Identity Management from the installer UI. I have written two posts, How to Configure Dex and Gangway for Active Directory Authentication in TKG and How to Deploy an OIDC-Enabled Cluster on vSphere in TKG 1.2, that deal only with configuring authentication in TKG. I’m very happy to see that those two posts won’t be needed going forward. Obviously, you can use what I’ve configured here as a primer for how to configure Active Directory integration, but you will need to customize each parameter for your deployment.

If you properly uploaded an OVA and converted it to a template, you should be able to select it here (or choose from several, if you uploaded more than one).

The ability to register a TKG management cluster with Tanzu Mission Control is new for 1.3. You can manually do it after the fact via a process similar to what I documented in Attaching a vSphere 7.0 with Tanzu supervisor cluster to Tanzu Mission Control and creating new Tanzu Kubernetes clusters (instructions coming later), or you can do it here. Once your management cluster is up and running you’ll be able to provision TKG workload clusters via the TMC UI.

Barring any internal policies prohibiting it, you should always participate in the Customer Experience Improvement Program.

If you’re happy with your configuration, click the Deploy Management Cluster button.

You might have noticed in the previous screenshot that there was a file referenced, /home/ubuntu/.tanzu/tkg/clusterconfigs/jboy134b9x.yaml, that we didn’t create. The installer actually took everything we entered in the UI and saved it. The really nice thing about this is that you can quickly create other management clusters (or recreate this one if you decide to destroy it) from this same file (I’ll show this later).

jboy134b9x.yaml
AVI_CA_DATA_B64: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZhekNDQTFPZ0F3SUJBZ0lRTWZaeTA4bXV2SVZLZFpWRHo3L3JZekFOQmdrcWhraUc5dzBCQVFzRkFEQkkKTVJVd0V3WUtDWkltaVpQeUxHUUJHUllGZEdGdWVuVXhGREFTQmdvSmtpYUprL0lzWkFFWkZnUmpiM0p3TVJrdwpGd1lEVlFRREV4QkRUMDVVVWs5TVEwVk9WRVZTTFVOQk1CNFhEVEl3TURneE9URTNNakEwTkZvWERUTXdNRGd4Ck9URTNNekF6TlZvd1NERVZNQk1HQ2dtU0pvbVQ4aXhrQVJrV0JYUmhibnAxTVJRd0VnWUtDWkltaVpQeUxHUUIKR1JZRVkyOXljREVaTUJjR0ExVUVBeE1RUTA5T1ZGSlBURU5GVGxSRlVpMURRVENDQWlJd0RRWUpLb1pJaHZjTgpBUUVCQlFBRGdnSVBBRENDQWdvQ2dnSUJBTEtJZFg3NjQzUHp2dFZYbHFOSXdEdU5xK3JoY0hGMGZqUjQxNGorCjFJR1FVdVhyeWtqaFNEdGhQUCs4QkdON21CZ0hUOEFqQVMxYjk1eGM4QjBTMkZobG4zQW9SRTl6MDNHdGZzQnUKRlNCUlVWd0FpZlg2b1h1OTdXemZmaHFQdHhaZkxKWGJoT29tamxrWDZpZmZBczJUT0xVeDJPajR3MnZ5Ymh6agpsY0E3MGFpKzBTbDZheFNvM2xNWjRLa3VaMldnZkVjYURqamozMy9wVjMvYm5GSys3eWRQdHRjMlRlazV4c0k4ClhOTWlySVZ4VWlVVDRZTHk0V0xpUzIwMEpVZmJwMVpuTXZuYlE4SnYxUW5abDlXN1dtQlBjZ3hSNEFBdWIwSzQKdlpMWHU2TVhpYm9UbHprTUIvWXRoQ2tUTmxKY0traEhmNjBZUi9UNlN4MVQybnVweUJhNGRlbzVVR1B6aFJpSgpwTjM3dXFxQWRLMXFNRHBDakFSalM2VTdMZjlKS2pmaXJpTHpMZXlBalA4a2FONFRkSFNaZDBwY1FvWlN4ZXhRCjluKzRFNE1RbTRFSjREclZaQ2lsc3lMMkJkRVRjSFhLUGM3cStEYjRYTTdqUEtORzVHUDFFTVY0WG9odjU4eVoKL3JSZm1LNjRnYXI4QU1uT0tUMkFQNjgxcWRaczdsbGpPTmNYVUFMemxYNVRxSWNoWVQwRFZRbUZMWW9NQmVaegowbDIxUWpiSzBZV25QemE2WWkvTjRtNnJGYkVCNFdYaXFoWVNreHpyTXZvY1ZVZ2Q0QUFQMXZmSE5uRkVzblVSCm5Tc2lnbEZIL3hseU8zY0JGcm1vWkF4YkEyMDkxWEhXaEI0YzBtUUVJM2hPcUFCOFVvRkdCclFwbVErTGVzb0MKMUxaOUFnTUJBQUdqVVRCUE1Bc0dBMVVkRHdRRUF3SUJoakFQQmdOVkhSTUJBZjhFQlRBREFRSC9NQjBHQTFVZApEZ1FXQkJURkF4U3ZZNjRRNWFkaG04SVllY0hCQVV1b2J6QVFCZ2tyQmdFRUFZSTNGUUVFQXdJQkFEQU5CZ2txCmhraUc5dzBCQVFzRkFBT0NBZ0VBamcvdjRtSVA3Z0JWQ3c0cGVtdEduM1BTdERoL2FCOXZiV3lqQXl4U05hYUgKSDBuSUQ1cTV3b3c5dWVCaURmalRQbmhiZjNQNzY4SEc4b0wvKzlDK1ZtLzBsaUZCZCswL0RhYXlLcEFORk1MQgpCVitzMmFkV1JoUXVjTFFmWFB3dW04UnliV3Y4MndrUmtXQ0NkT0JhQXZBTXVUZ2swOFN3Skl5UWZWZ3BrM25ZCjBPd2pGd1NBYWR2ZXZmK0xvRC85TDhSOU5FdC9uNFdKZStMdEVhbW85RVZiK2wrY1lxeXh5dWJBVlkwWTZCTTIKR1hxQWgzRkVXMmFRTXB3b3VoLzVTN3c1b1NNWU42bWlZMW9qa2k4Z1BtMCs0K0NJTFBXaC9mcjJxME8vYlB0YgpUcisrblBNbVo4b3Y5ZXBOR0l1cWh0azVqYTIvSnVZK1JXNDZJUmM4UXBGMUV5VWFlMDJFNlUyVmFjczdHZ2UyCkNlU0lOa29MRkZtaUtCZkluL0hBY2hsbWU5YUw2RGxKOXdBcmVCREgzRThrSDdnUkRXYlNLMi9RRDBIcWFjK0UKZ2VHSHdwZy84T3RCT0hVTW5NN2VMT1hCSkZjSm9zV2YwWG5FZ1M0dWJnYUhncURFdThwOFBFN3JwQ3h0VU51cgp0K3gyeE9OSS9yQldnZGJwNTFsUHI3bzgxOXpQSkN2WVpxMVBwMXN0OGZiM1JsVVNXdmJRTVBGdEdBeWFCeStHCjBSZ1o5V1B0eUVZZ25IQWI1L0RxNDZzbmU5L1FuUHd3R3BqdjFzMW9FM1pGUWpodm5HaXM4K2RxUnhrM1laQWsKeWlEZ2hXN2FudHpZTDlTMUNDOHNWZ1ZPd0ZKd2ZGWHBkaWlyMzVtUWx5U0czMDFWNEZzUlYrWjBjRnA0TmkwPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t
AVI_CLOUD_NAME: Default-Cloud
AVI_CONTROLLER: nsxalb-cluster.corp.tanzu
AVI_DATA_NETWORK: K8s-Frontend
AVI_DATA_NETWORK_CIDR: 192.168.220.0/23
AVI_ENABLE: "true"
AVI_LABELS: ""
AVI_PASSWORD: 
AVI_SERVICE_ENGINE_GROUP: Default-Group
AVI_USERNAME: admin
CLUSTER_CIDR: 100.96.0.0/11
CLUSTER_NAME: tkg-mgmt
CLUSTER_PLAN: dev
ENABLE_CEIP_PARTICIPATION: "true"
ENABLE_MHC: "true"
IDENTITY_MANAGEMENT_TYPE: ldap
INFRASTRUCTURE_PROVIDER: vsphere
LDAP_BIND_DN: cn=Administrator,cn=Users,dc=corp,dc=tanzu
LDAP_BIND_PASSWORD: 
LDAP_GROUP_SEARCH_BASE_DN: dc=corp,dc=tanzu
LDAP_GROUP_SEARCH_FILTER: (objectClass=group)
LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE: member
LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
LDAP_GROUP_SEARCH_USER_ATTRIBUTE: DN
LDAP_HOST: controlcenter.corp.tanzu:636
LDAP_ROOT_CA_DATA_B64: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZhekNDQTFPZ0F3SUJBZ0lRTWZaeTA4bXV2SVZLZFpWRHo3L3JZekFOQmdrcWhraUc5dzBCQVFzRkFEQkkKTVJVd0V3WUtDWkltaVpQeUxHUUJHUllGZEdGdWVuVXhGREFTQmdvSmtpYUprL0lzWkFFWkZnUmpiM0p3TVJrdwpGd1lEVlFRREV4QkRUMDVVVWs5TVEwVk9WRVZTTFVOQk1CNFhEVEl3TURneE9URTNNakEwTkZvWERUTXdNRGd4Ck9URTNNekF6TlZvd1NERVZNQk1HQ2dtU0pvbVQ4aXhrQVJrV0JYUmhibnAxTVJRd0VnWUtDWkltaVpQeUxHUUIKR1JZRVkyOXljREVaTUJjR0ExVUVBeE1RUTA5T1ZGSlBURU5GVGxSRlVpMURRVENDQWlJd0RRWUpLb1pJaHZjTgpBUUVCQlFBRGdnSVBBRENDQWdvQ2dnSUJBTEtJZFg3NjQzUHp2dFZYbHFOSXdEdU5xK3JoY0hGMGZqUjQxNGorCjFJR1FVdVhyeWtqaFNEdGhQUCs4QkdON21CZ0hUOEFqQVMxYjk1eGM4QjBTMkZobG4zQW9SRTl6MDNHdGZzQnUKRlNCUlVWd0FpZlg2b1h1OTdXemZmaHFQdHhaZkxKWGJoT29tamxrWDZpZmZBczJUT0xVeDJPajR3MnZ5Ymh6agpsY0E3MGFpKzBTbDZheFNvM2xNWjRLa3VaMldnZkVjYURqamozMy9wVjMvYm5GSys3eWRQdHRjMlRlazV4c0k4ClhOTWlySVZ4VWlVVDRZTHk0V0xpUzIwMEpVZmJwMVpuTXZuYlE4SnYxUW5abDlXN1dtQlBjZ3hSNEFBdWIwSzQKdlpMWHU2TVhpYm9UbHprTUIvWXRoQ2tUTmxKY0traEhmNjBZUi9UNlN4MVQybnVweUJhNGRlbzVVR1B6aFJpSgpwTjM3dXFxQWRLMXFNRHBDakFSalM2VTdMZjlKS2pmaXJpTHpMZXlBalA4a2FONFRkSFNaZDBwY1FvWlN4ZXhRCjluKzRFNE1RbTRFSjREclZaQ2lsc3lMMkJkRVRjSFhLUGM3cStEYjRYTTdqUEtORzVHUDFFTVY0WG9odjU4eVoKL3JSZm1LNjRnYXI4QU1uT0tUMkFQNjgxcWRaczdsbGpPTmNYVUFMemxYNVRxSWNoWVQwRFZRbUZMWW9NQmVaegowbDIxUWpiSzBZV25QemE2WWkvTjRtNnJGYkVCNFdYaXFoWVNreHpyTXZvY1ZVZ2Q0QUFQMXZmSE5uRkVzblVSCm5Tc2lnbEZIL3hseU8zY0JGcm1vWkF4YkEyMDkxWEhXaEI0YzBtUUVJM2hPcUFCOFVvRkdCclFwbVErTGVzb0MKMUxaOUFnTUJBQUdqVVRCUE1Bc0dBMVVkRHdRRUF3SUJoakFQQmdOVkhSTUJBZjhFQlRBREFRSC9NQjBHQTFVZApEZ1FXQkJURkF4U3ZZNjRRNWFkaG04SVllY0hCQVV1b2J6QVFCZ2tyQmdFRUFZSTNGUUVFQXdJQkFEQU5CZ2txCmhraUc5dzBCQVFzRkFBT0NBZ0VBamcvdjRtSVA3Z0JWQ3c0cGVtdEduM1BTdERoL2FCOXZiV3lqQXl4U05hYUgKSDBuSUQ1cTV3b3c5dWVCaURmalRQbmhiZjNQNzY4SEc4b0wvKzlDK1ZtLzBsaUZCZCswL0RhYXlLcEFORk1MQgpCVitzMmFkV1JoUXVjTFFmWFB3dW04UnliV3Y4MndrUmtXQ0NkT0JhQXZBTXVUZ2swOFN3Skl5UWZWZ3BrM25ZCjBPd2pGd1NBYWR2ZXZmK0xvRC85TDhSOU5FdC9uNFdKZStMdEVhbW85RVZiK2wrY1lxeXh5dWJBVlkwWTZCTTIKR1hxQWgzRkVXMmFRTXB3b3VoLzVTN3c1b1NNWU42bWlZMW9qa2k4Z1BtMCs0K0NJTFBXaC9mcjJxME8vYlB0YgpUcisrblBNbVo4b3Y5ZXBOR0l1cWh0azVqYTIvSnVZK1JXNDZJUmM4UXBGMUV5VWFlMDJFNlUyVmFjczdHZ2UyCkNlU0lOa29MRkZtaUtCZkluL0hBY2hsbWU5YUw2RGxKOXdBcmVCREgzRThrSDdnUkRXYlNLMi9RRDBIcWFjK0UKZ2VHSHdwZy84T3RCT0hVTW5NN2VMT1hCSkZjSm9zV2YwWG5FZ1M0dWJnYUhncURFdThwOFBFN3JwQ3h0VU51cgp0K3gyeE9OSS9yQldnZGJwNTFsUHI3bzgxOXpQSkN2WVpxMVBwMXN0OGZiM1JsVVNXdmJRTVBGdEdBeWFCeStHCjBSZ1o5V1B0eUVZZ25IQWI1L0RxNDZzbmU5L1FuUHd3R3BqdjFzMW9FM1pGUWpodm5HaXM4K2RxUnhrM1laQWsKeWlEZ2hXN2FudHpZTDlTMUNDOHNWZ1ZPd0ZKd2ZGWHBkaWlyMzVtUWx5U0czMDFWNEZzUlYrWjBjRnA0TmkwPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t
LDAP_USER_SEARCH_BASE_DN: cn=Users,dc=corp,dc=tanzu
LDAP_USER_SEARCH_FILTER: (objectClass=person)
LDAP_USER_SEARCH_NAME_ATTRIBUTE: userPrincipalName
LDAP_USER_SEARCH_USERNAME: userPrincipalName
OIDC_IDENTITY_PROVIDER_CLIENT_ID: ""
OIDC_IDENTITY_PROVIDER_CLIENT_SECRET: ""
OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM: ""
OIDC_IDENTITY_PROVIDER_ISSUER_URL: ""
OIDC_IDENTITY_PROVIDER_NAME: ""
OIDC_IDENTITY_PROVIDER_SCOPES: ""
OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM: ""
SERVICE_CIDR: 100.64.0.0/13
TKG_HTTP_PROXY_ENABLED: "false"
VSPHERE_CONTROL_PLANE_DISK_GIB: "20"
VSPHERE_CONTROL_PLANE_ENDPOINT: 192.168.130.128
VSPHERE_CONTROL_PLANE_MEM_MIB: "4096"
VSPHERE_CONTROL_PLANE_NUM_CPUS: "2"
VSPHERE_DATACENTER: /RegionA01
VSPHERE_DATASTORE: /RegionA01/datastore/map-vol
VSPHERE_FOLDER: /RegionA01/vm
VSPHERE_NETWORK: K8s-Workload
VSPHERE_PASSWORD: 
VSPHERE_RESOURCE_POOL: /RegionA01/host/RegionA01-MGMT/Resources
VSPHERE_SERVER: vcsa-01a.corp.tanzu
VSPHERE_SSH_AUTHORIZED_KEY: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQC5KYNeWQgVHrDHaEhBCLF1vIR0OAtUIJwjKYkY4E/5HhEu8fPFvBOIHPFTPrtkX4vzSiMFKE5WheKGQIpW3HHlRbmRPc9oe6nNKlsUfFAaJ7OKF146Gjpb7lWs/C34mjdtxSb1D/YcHSyqK5mxhyHAXPeK8lrxG5MLOJ3X2A3iUvXcBo1NdhRdLRWQmyjs16fnPx6840x9n5NqeiukFYIVhDMFErq42AkeewsWcbZQuwViSLk2cIc09eykAjaXMojCmSbjrj0kC3sbYX+HD2OWbKohTqqO6/UABtjYgTjIS4PqsXWk63dFdcxF6ukuO6ZHaiY7h3xX2rTg9pv1oT8WBR44TYgvyRp0Bhe0u2/n/PUTRfp22cOWTA2wG955g7jOd7RVGhtMHi9gFXeUS2KodO6C4XEXC7Y2qp9p9ARlNvu11QoaDyH3l0h57Me9we+3XQNuteV69TYrJnlgWecMa/x+rcaEkgr7LD61dY9sTuufttLBP2ro4EIWoBY6F1Ozvcp8lcgi/55uUGxwiKDA6gQ+UA/xtrKk60s6MvYMzOxJiUQbWYr3MJ3NSz6PJVXMvlsAac6U+vX4U9eJP6/C1YDyBaiT96cb/B9TkvpLrhPwqMZdYVomVHsdY7YriJB93MRinKaDJor1aIE/HMsMpbgFCNA7mma9x5HS/57Imw==
    admin@corp.local
VSPHERE_TLS_THUMBPRINT: DB:CC:DC:80:F9:17:DA:37:4F:AC:F7:65:6D:3D:AC:99:B8:A0:5A:BE
VSPHERE_USERNAME: administrator@vsphere.local
VSPHERE_WORKER_DISK_GIB: "20"
VSPHERE_WORKER_MEM_MIB: "4096"
VSPHERE_WORKER_NUM_CPUS: "2"

You’ll be able to follow the installation process at a high level in the UI.

What’s happening at this point is that the system from which you launched the installer has downloaded a kind node image, and it should now be running as a container.

docker ps

CONTAINER ID        IMAGE                                                         COMMAND                  CREATED             STATUS              PORTS                       NAMES
960738ac17aa        projects.registry.vmware.com/tkg/kind/node:v1.20.4_vmware.1   "/usr/local/bin/entr…"   41 seconds ago      Up 32 seconds       127.0.0.1:38463->6443/tcp   tkg-kind-c1bjuas09c6l97afuk7g-control-plane

Back in the UI, we’ve moved on to the next stage and the kind container that is now running is being configured as a bootstrap Kubernetes cluster.

We can get access to this cluster via a temporary kubeconfig file that is created under .kube-tkg/tmp.

kubectl --kubeconfig=.kube-tkg/tmp/config_Vw5ZgNkH get nodes

NAME                                          STATUS   ROLES                  AGE   VERSION
tkg-kind-c1bjuas09c6l97afuk7g-control-plane   Ready    control-plane,master   38s   v1.20.4+vmware.1

You can see that it’s still coming up at this point.

kubectl --kubeconfig=.kube-tkg/tmp/config_fuSlXpaf get po -A

NAMESPACE                           NAME                                                                  READY   STATUS              RESTARTS   AGE
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager-5bdd64499b-zfsck            0/2     ContainerCreating   0          12s
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager-7f89b8594d-p54vp        0/2     ContainerCreating   0          9s
capi-system                         capi-controller-manager-c4f5f9c76-fdpvj                               0/2     ContainerCreating   0          16s
capi-webhook-system                 capi-controller-manager-768b989cbc-nsl67                              0/2     ContainerCreating   0          19s
capi-webhook-system                 capi-kubeadm-bootstrap-controller-manager-67444bbcc9-ms5hh            0/2     ContainerCreating   0          15s
capi-webhook-system                 capi-kubeadm-control-plane-controller-manager-5466b4d4d6-dz2rn        0/2     ContainerCreating   0          11s
capi-webhook-system                 capv-controller-manager-7cb98c468d-66sh8                              0/2     ContainerCreating   0          8s
capv-system                         capv-controller-manager-767cc6b6bf-4kj8j                              0/2     ContainerCreating   0          2s
cert-manager                        cert-manager-6cbfc68c4b-wpp9j                                         1/1     Running             0          45s
cert-manager                        cert-manager-cainjector-796775c48f-4557v                              1/1     Running             0          45s
cert-manager                        cert-manager-webhook-7646d5bc94-xzdqv                                 1/1     Running             0          44s
kube-system                         coredns-68d49685bd-27vcj                                              1/1     Running             0          66s
kube-system                         coredns-68d49685bd-hhpjv                                              1/1     Running             0          66s
kube-system                         etcd-tkg-kind-c1bjuas09c6l97afuk7g-control-plane                      0/1     Running             0          84s
kube-system                         kindnet-kb84v                                                         1/1     Running             0          66s
kube-system                         kube-apiserver-tkg-kind-c1bjuas09c6l97afuk7g-control-plane            1/1     Running             0          84s
kube-system                         kube-controller-manager-tkg-kind-c1bjuas09c6l97afuk7g-control-plane   1/1     Running             0          82s
kube-system                         kube-proxy-rhg9m                                                      1/1     Running             0          66s
kube-system                         kube-scheduler-tkg-kind-c1bjuas09c6l97afuk7g-control-plane            0/1     Running             0          82s
local-path-storage                  local-path-provisioner-8b46957d4-nmrzx                                1/1     Running             0          66s

And once the pods in the bootstrap cluster are fully running, we can examine the logs in the capv-controller-manager pod to get a more detailed view of what’s happening. I like to stream these logs during installation to make sure nothing looks out of the ordinary.

kubectl --kubeconfig=.kube-tkg/tmp/config_Vw5ZgNkH -n capv-system logs capv-controller-manager-767cc6b6bf-4kj8j manager -f

tkg-system/tkg-mgmt-control-plane-8lsnp "msg"="status.ready not found"  "vmGVK"="infrastructure.cluster.x-k8s.io/v1alpha3, Kind=VSphereVM" "vmName"="tkg-mgmt-control-plane-s2vrv" "vmNamespace"="tkg-system"
I0321 12:46:35.214952       1 vspheremachine_controller.go:364] capv-controller-manager/vspheremachine-controller/tkg-system/tkg-mgmt-control-plane-8lsnp "msg"="waiting for ready state"
I0321 12:46:35.215794       1 controller.go:281] controller-runtime/controller "msg"="Successfully Reconciled" "controller"="vspheremachine" "name"="tkg-mgmt-control-plane-8lsnp" "namespace"="tkg-system"
I0321 12:46:35.239429       1 util.go:127] capv-controller-manager/vspherevm-controller/tkg-system/tkg-mgmt-control-plane-s2vrv/task-8051 "msg"="task found"  "description-id"="VirtualMachine.clone" "state"="running"
I0321 12:46:35.239607       1 util.go:133] capv-controller-manager/vspherevm-controller/tkg-system/tkg-mgmt-control-plane-s2vrv/task-8051 "msg"="task is still running"  "description-id"="VirtualMachine.clone"
I0321 12:46:35.239632       1 vspherevm_controller.go:333] capv-controller-manager/vspherevm-controller/tkg-system/tkg-mgmt-control-plane-s2vrv "msg"="VM state is not reconciled"  "actual-vm-state"="pending" "expected-vm-state"="ready"
I0321 12:46:35.240986       1 vspherevm_controller.go:248] capv-controller-manager/vspherevm-controller/tkg-system/tkg-mgmt-control-plane-s2vrv "msg"="resource patch was not required"  "local-resource-version"="1610" "remote-resource-version"="1610"
I0321 12:46:35.241098       1 controller.go:281] controller-runtime/controller "msg"="Successfully Reconciled" "controller"="vspherevm" "name"="tkg-mgmt-control-plane-s2vrv" "namespace"="tkg-system"
I0321 12:46:37.855066       1 vspherecluster_controller.go:250] capv-controller-manager/vspherecluster-controller/tkg-system/tkg-mgmt "msg"="Reconciling VSphereCluster"
I0321 12:46:37.855165       1 vspherecluster_controller.go:346] capv-controller-manager/vspherecluster-controller/tkg-system/tkg-mgmt "msg"="skipping load balancer reconciliation"  "reason"="VSphereCluster.Spec.LoadBalancerRef is nil"
I0321 12:46:37.855306       1 vspherecluster_controller.go:501] capv-controller-manager/vspherecluster-controller/tkg-system/tkg-mgmt "msg"="skipping reconcile when API server is online"  "reason"="alreadyPolling"
I0321 12:46:40.664355       1 controller.go:281] controller-runtime/controller "msg"="Successfully Reconciled" "controller"="vspherecluster" "name"="tkg-mgmt" "namespace"="tkg-system"
I0321 12:46:43.942460       1 vspherecluster_controller.go:250] capv-controller-manager/vspherecluster-controller/tkg-system/tkg-mgmt "msg"="Reconciling VSphereCluster"
I0321 12:46:43.942598       1 vspherecluster_controller.go:346] capv-controller-manager/vspherecluster-controller/tkg-system/tkg-mgmt "msg"="skipping load balancer reconciliation"  "reason"="VSphereCluster.Spec.LoadBalancerRef is nil"
I0321 12:46:43.942713       1 vspherecluster_controller.go:501] capv-controller-manager/vspherecluster-controller/tkg-system/tkg-mgmt "msg"="skipping reconcile when API server is online"  "reason"="alreadyPolling"
I0321 12:46:46.811520       1 controller.go:281] controller-runtime/controller "msg"="Successfully Reconciled" "controller"="vspherecluster" "name"="tkg-mgmt" "namespace"="tkg-system"

Back in the UI we can see that the process has moved on to actually creating the management cluster.

And if we check the vSphere Client, we’ll see that a control plane VM has been provisioned.

A short while later we’ll see that the control plane VM is functional and the rest of the cluster instantiation has started.

The worker VM is now being provisioned.

The last step is to “pivot” the management functionality from the bootstrap cluster to the new management cluster. This is a fairly quick process and should finish within a few minutes.

Success!

From the command line, we can start to inspect our new management cluster.

The tanzu management-cluster get command not only tells us the basics of our management cluster like the old tkg get mc did, but also gives us a bit of detail about the nodes and components.

tanzu management-cluster get

  NAME      NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES
  tkg-mgmt  tkg-system  running  1/1           1/1      v1.20.4+vmware.1  management


Details:

NAME                                                         READY  SEVERITY  REASON                           SINCE  MESSAGE
/tkg-mgmt                                                    True                                              42s
├─ClusterInfrastructure - VSphereCluster/tkg-mgmt            True                                              46s
├─ControlPlane - KubeadmControlPlane/tkg-mgmt-control-plane  True                                              42s
│ └─Machine/tkg-mgmt-control-plane-ffztp                     True                                              43s
└─Workers                                                                                                         
  └─MachineDeployment/tkg-mgmt-md-0                                                                               
    └─Machine/tkg-mgmt-md-0-766c78c69c-pz6dl                 True                                              46s


Providers:

  NAMESPACE                          NAME                    TYPE                    PROVIDERNAME  VERSION  WATCHNAMESPACE
  capi-kubeadm-bootstrap-system      bootstrap-kubeadm       BootstrapProvider       kubeadm       v0.3.14        
  capi-kubeadm-control-plane-system  control-plane-kubeadm   ControlPlaneProvider    kubeadm       v0.3.14        
  capi-system                        cluster-api             CoreProvider            cluster-api   v0.3.14        
  capv-system                        infrastructure-vsphere  InfrastructureProvider  vsphere       v0.7.6       

You can see that we have a new kubectl context created and set as current.

kubectl config get-contexts

CURRENT   NAME                      CLUSTER    AUTHINFO         NAMESPACE
*         tkg-mgmt-admin@tkg-mgmt   tkg-mgmt   tkg-mgmt-admin

That context is what is known as an “admin” context and does not use LDAP for authentication. In order to provide a kubeconfig file to an LDAP user, we run the following:

tanzu management-cluster kubeconfig get --export-file /tmp/ldaps-tkg-mgmt-kubeconfig

You can now access the cluster by running 'kubectl config use-context tanzu-cli-tkg-mgmt@tkg-mgmt' under path '/tmp/ldaps-tkg-mgmt-kubeconfig'

From here, we can deliver that kubeconfig file (/tmp/ldaps-tkg-mgmt-kubeconfig) to any user, and they can use it to work with the cluster.

kubectl --kubeconfig=/tmp/ldaps-tkg-mgmt-kubeconfig get nodes

The first time they run this command, a browser will open where they can login with their LDAP credentials.

If you’re successfully logged in, you’ll get a message asking you to close the browser tab.

And running the same command from before gets the user a little farther along.

kubectl --kubeconfig=/tmp/ldaps-tkg-mgmt-kubeconfig get nodes

Error from server (Forbidden): nodes is forbidden: User "tkgadmin@corp.tanzu" cannot list resource "nodes" in API group "" at the cluster scope

Since the tanzuadmins group has no privileges in the cluster, we need to create a clusterrolebinding.

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: tanzuadmins
subjects:
  - kind: Group
    name: tanzuadmins
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole # this must be Role or ClusterRole
  name: cluster-admin # this must match the name of the Role or ClusterRole you wish to bind to
  apiGroup: rbac.authorization.k8s.io

kubectl apply -f tanzuadmin-clusterrolebinding.yaml
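Before handing the kubeconfig back to the user, a quick sanity check with kubectl auth can-i (run against the exported kubeconfig) should confirm the binding took effect:

kubectl --kubeconfig=/tmp/ldaps-tkg-mgmt-kubeconfig auth can-i list nodes
# expect "yes" once the clusterrolebinding is in place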

Now the user is able to log in and work with cluster-admin privileges (or lower, if you use a different clusterrolebinding).

kubectl --kubeconfig=/tmp/ldaps-tkg-mgmt-kubeconfig get nodes

NAME                             STATUS   ROLES                  AGE     VERSION
tkg-mgmt-control-plane-ffztp     Ready    control-plane,master   8m47s   v1.20.4+vmware.1
tkg-mgmt-md-0-766c78c69c-pz6dl   Ready    <none>                 5m42s   v1.20.4+vmware.1

To give you an idea of how much work was done to get this cluster up and running with all of the functionality needed for creating workload clusters (and then some), these are the deployments that are present in a typical management cluster:

kubectl get deployments -A

NAMESPACE                           NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager       1/1     1            1           4m48s
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager   1/1     1            1           4m42s
capi-system                         capi-controller-manager                         1/1     1            1           4m54s
capi-webhook-system                 capi-controller-manager                         1/1     1            1           4m57s
capi-webhook-system                 capi-kubeadm-bootstrap-controller-manager       1/1     1            1           4m53s
capi-webhook-system                 capi-kubeadm-control-plane-controller-manager   1/1     1            1           4m46s
capi-webhook-system                 capv-controller-manager                         1/1     1            1           4m40s
capv-system                         capv-controller-manager                         1/1     1            1           4m35s
cert-manager                        cert-manager                                    1/1     1            1           8m39s
cert-manager                        cert-manager-cainjector                         1/1     1            1           8m39s
cert-manager                        cert-manager-webhook                            1/1     1            1           8m39s
kube-system                         antrea-controller                               1/1     1            1           8m10s
kube-system                         coredns                                         2/2     2            2           9m1s
kube-system                         metrics-server                                  1/1     1            1           2m31s
kube-system                         vsphere-csi-controller                          1/1     1            1           3m25s
pinniped-concierge                  pinniped-concierge                              2/2     2            2           3m38s
pinniped-supervisor                 pinniped-supervisor                             2/2     2            2           3m38s
tanzu-system-auth                   dex                                             1/1     1            1           3m37s
tkg-system-networking               ako-operator-controller-manager                 1/1     1            1           8m54s
tkg-system                          kapp-controller                                 1/1     1            1           8m54s
tkg-system                          tanzu-addons-controller-manager                 1/1     1            1           8m15s
tkr-system                          tkr-controller-manager                          1/1     1            1           8m54s

One really nice thing with creating a management cluster in TKG 1.3 is that it caches the information you enter the first time. If you ever need to go back and create another cluster (or recreate your first one) you’ll get a pop-up similar to the following where you can click the Restore Data button.

Create a management cluster via the CLI

I can’t stress enough how much easier it is to create a management cluster via the UI, but there are times when doing it via the command line may be needed. If that’s the case, you first need to run the tanzu management-cluster create command to build out the .tanzu folder structure (it might throw an error since not enough parameters were supplied; this is okay). Next, create a configuration file with all of the information needed for building your cluster. This file should look very much like the one noted earlier, /home/ubuntu/.tanzu/tkg/clusterconfigs/jboy134b9x.yaml. From there, you run a single command to start the cluster build process:

tanzu management-cluster create --file /home/ubuntu/.tanzu/tkg/clusterconfigs/jboy134b9x.yaml -y --deploy-tkg-on-vSphere7 -v 6

The process proceeds just as it did from the UI and you can follow the process at the command line and from the bootstrap cluster.

Create a workload cluster

The best way to create a workload cluster in TKG 1.3 is to create a configuration file with all of the parameters needed and then call it with the tanzu cluster create command. This allows you to easily see what you’re trying to configure and re-use the cluster definition for creating other clusters in the future.

This example is very similar to the management cluster file that was noted earlier and has just about the bare minimum of information needed to stand up a cluster.

CLUSTER_CIDR: 100.96.0.0/11
SERVICE_CIDR: 100.64.0.0/13
CLUSTER_NAME: tkg-wld
CLUSTER_PLAN: dev
NAMESPACE: default
CNI: antrea
ENABLE_MHC: "true"
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 5m
VSPHERE_CONTROL_PLANE_DISK_GIB: "20"
VSPHERE_CONTROL_PLANE_ENDPOINT: 192.168.130.129
VSPHERE_CONTROL_PLANE_MEM_MIB: "4096"
VSPHERE_CONTROL_PLANE_NUM_CPUS: "2"
VSPHERE_DATACENTER: /RegionA01
VSPHERE_DATASTORE: /RegionA01/datastore/map-vol
VSPHERE_FOLDER: /RegionA01/vm
VSPHERE_NETWORK: K8s-Workload
VSPHERE_PASSWORD: <encoded:Vk13YXJlMSE=>
VSPHERE_RESOURCE_POOL: /RegionA01/host/RegionA01-MGMT/Resources
VSPHERE_SERVER: vcsa-01a.corp.tanzu
VSPHERE_SSH_AUTHORIZED_KEY: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQC5KYNeWQgVHrDHaEhBCLF1vIR0OAtUIJwjKYkY4E/5HhEu8fPFvBOIHPFTPrtkX4vzSiMFKE5WheKGQIpW3HHlRbmRPc9oe6nNKlsUfFAaJ7OKF146Gjpb7lWs/C34mjdtxSb1D/YcHSyqK5mxhyHAXPeK8lrxG5MLOJ3X2A3iUvXcBo1NdhRdLRWQmyjs16fnPx6840x9n5NqeiukFYIVhDMFErq42AkeewsWcbZQuwViSLk2cIc09eykAjaXMojCmSbjrj0kC3sbYX+HD2OWbKohTqqO6/UABtjYgTjIS4PqsXWk63dFdcxF6ukuO6ZHaiY7h3xX2rTg9pv1oT8WBR44TYgvyRp0Bhe0u2/n/PUTRfp22cOWTA2wG955g7jOd7RVGhtMHi9gFXeUS2KodO6C4XEXC7Y2qp9p9ARlNvu11QoaDyH3l0h57Me9we+3XQNuteV69TYrJnlgWecMa/x+rcaEkgr7LD61dY9sTuufttLBP2ro4EIWoBY6F1Ozvcp8lcgi/55uUGxwiKDA6gQ+UA/xtrKk60s6MvYMzOxJiUQbWYr3MJ3NSz6PJVXMvlsAac6U+vX4U9eJP6/C1YDyBaiT96cb/B9TkvpLrhPwqMZdYVomVHsdY7YriJB93MRinKaDJor1aIE/HMsMpbgFCNA7mma9x5HS/57Imw==
    admin@corp.local
VSPHERE_TLS_THUMBPRINT: DB:CC:DC:80:F9:17:DA:37:4F:AC:F7:65:6D:3D:AC:99:B8:A0:5A:BE
VSPHERE_USERNAME: administrator@vsphere.local
VSPHERE_WORKER_DISK_GIB: "20"
VSPHERE_WORKER_MEM_MIB: "8192"
VSPHERE_WORKER_NUM_CPUS: "8"
ENABLE_AUDIT_LOGGING: false
ENABLE_DEFAULT_STORAGE_CLASS: true

The only real difference between how this cluster is created and how the management cluster was created is that I’m sizing the worker nodes slightly larger (8 CPUs, 8GB RAM). You can see this in the VSPHERE_WORKER_MEM_MIB and VSPHERE_WORKER_NUM_CPUS values. The VSPHERE_CONTROL_PLANE_ENDPOINT value is also different, as it is required to be unique for each cluster.

One last thing we need to supply is the Kubernetes version for our cluster. You can get the available versions by querying for the tanzukubernetesreleases (tkr) custom resource.

kubectl get tkr

NAME                        VERSION                   COMPATIBLE   CREATED
v1.17.16---vmware.2-tkg.1   v1.17.16+vmware.2-tkg.1   True         15m
v1.18.16---vmware.1-tkg.1   v1.18.16+vmware.1-tkg.1   True         15m
v1.19.8---vmware.1-tkg.1    v1.19.8+vmware.1-tkg.1    True         15m
v1.20.4---vmware.1-tkg.1    v1.20.4+vmware.1-tkg.1    True         15m

Since I only uploaded a single OVA and it was for the 1.20.4 version, that’s the tkr version I’m going to use.

tanzu cluster create -f /home/ubuntu/.tanzu/tkg/clusterconfigs/tkg-wld.yaml --tkr v1.20.4---vmware.1-tkg.1 -v 6

Almost immediately, we should see a new control plane VM created in the vSphere client.

And we can see that it has the default 2 CPUs and 4GB RAM.

Following along at the command line, we can see that the control plane is being created.

Using namespace from config:
Validating configuration...
Waiting for resource pinniped-federation-domain of type *unstructured.Unstructured to be up and running
Waiting for resource pinniped-supervisor-default-tls-certificate of type *v1.Secret to be up and running
no os options provided, selecting based on default os options
Creating workload cluster 'tkg-wld'...
patch cluster object with operation status:
        {
                "metadata": {
                        "annotations": {
                                "TKGOperationInfo" : "{\"Operation\":\"Create\",\"OperationStartTimestamp\":\"2021-03-21 15:45:50.400816781 +0000 UTC\",\"OperationTimeout\":1800}",
                                "TKGOperationLastObservedTimestamp" : "2021-03-21 15:45:50.400816781 +0000 UTC"
                        }
                }
        }
Waiting for cluster to be initialized...
cluster control plane is still being initialized, retrying

One interesting thing I observed was that if you don’t configure any Identity Management information while deploying the management cluster, the process of building out a workload cluster still looks for that information. You’ll see messages similar to the following (they can be ignored; the process moves along without issue):

Using namespace from config:
Validating configuration...
Waiting for resource pinniped-federation-domain of type *unstructured.Unstructured to be up and running
no matches for kind "FederationDomain" in version "config.supervisor.pinniped.dev/v1alpha1", retrying
no matches for kind "FederationDomain" in version "config.supervisor.pinniped.dev/v1alpha1", retrying
no matches for kind "FederationDomain" in version "config.supervisor.pinniped.dev/v1alpha1", retrying
no matches for kind "FederationDomain" in version "config.supervisor.pinniped.dev/v1alpha1", retrying
Failed to configure Pinniped configuration for workload cluster. Please refer to the documentation to check if you can configure pinniped on workload cluster manually
no os options provided, selecting based on default os options
Creating workload cluster 'tkg-wld'...

Once the control plane is up, we’ll see some movement at the command line until it gets to the point of creating the worker nodes.

Getting secret for cluster
Waiting for resource tkg-wld-kubeconfig of type *v1.Secret to be up and running
Waiting for cluster nodes to be available...
Waiting for resource tkg-wld of type *v1alpha3.Cluster to be up and running
Waiting for resources type *v1alpha3.MachineDeploymentList to be up and running
worker nodes are still being created for MachineDeployment 'tkg-wld-md-0', DesiredReplicas=1 Replicas=1 ReadyReplicas=0 UpdatedReplicas=1, retrying

And in the vSphere Client, a worker node is getting created.

And here we can see that the custom worker node sizing took effect since this node has 8 CPUs and 8GB RAM.

Back at the command line, it’s a very short time until the cluster is created.

Waiting for resources type *v1alpha3.MachineList to be up and running
Waiting for addons installation...
Waiting for resources type *v1alpha3.ClusterResourceSetList to be up and running
Waiting for resource antrea-controller of type *v1.Deployment to be up and running

Workload cluster 'tkg-wld' created

I knew that there would be nothing created in NSX ALB for this cluster yet but was curious to see if there was any evidence of activity. Sure enough, I found access logs from the ako (Avi Kubernetes Operator) pod in the new tkg-wld cluster. I found this on the Operations, Events page and simply searched for the cluster name.

We can use the tanzu cluster list command to see a high level view of the clusters present.

tanzu cluster list

  NAME     NAMESPACE  STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES   PLAN
  tkg-wld  default    running  1/1           1/1      v1.20.4+vmware.1  <none>  dev

And the tanzu cluster get command to see more detail on a specific cluster.

tanzu cluster get tkg-wld

  NAME     NAMESPACE  STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES
  tkg-wld  default    running  1/1           1/1      v1.20.4+vmware.1  <none>
ℹ

Details:

NAME                                                        READY  SEVERITY  REASON  SINCE  MESSAGE
/tkg-wld                                                    True                     4m58s
├─ClusterInfrastructure - VSphereCluster/tkg-wld            True                     9m30s
├─ControlPlane - KubeadmControlPlane/tkg-wld-control-plane  True                     4m58s
│ └─Machine/tkg-wld-control-plane-7tgg6                     True                     6m27s
└─Workers
  └─MachineDeployment/tkg-wld-md-0
    └─Machine/tkg-wld-md-0-84c4d7898c-k5lql                 True                     97s

Similar to the command we ran to get a kubeconfig for an LDAP user in the management cluster earlier, we can use the tanzu cluster kubeconfig get command to merge the admin user’s kubeconfig with the current kubeconfig.

tanzu cluster kubeconfig get tkg-wld --admin

ℹ  You can now access the cluster by running 'kubectl config use-context tkg-wld-admin@tkg-wld'

You should see a new context present for the new workload cluster.

kubectl config get-contexts

CURRENT   NAME                      CLUSTER    AUTHINFO          NAMESPACE
*         tkg-mgmt-admin@tkg-mgmt   tkg-mgmt   tkg-mgmt-admin
          tkg-wld-admin@tkg-wld     tkg-wld    tkg-wld-admin

We can switch to this context to proceed with any further work on the cluster.

kubectl config use-context tkg-wld-admin@tkg-wld

Switched to context "tkg-wld-admin@tkg-wld".

And just as with the management cluster, we can get an LDAP kubeconfig that can be handed off to any LDAP users (this process plays out the same as it did for the management cluster).

tanzu cluster kubeconfig get tkg-wld --export-file /tmp/ldaps-tkg-wld-kubeconfig

ℹ  You can now access the cluster by running 'kubectl config use-context tanzu-cli-tkg-wld@tkg-wld' under path '/tmp/ldaps-tkg-wld-kubeconfig'

By comparison, there are far fewer deployments running in the workload cluster. This is expected, since the management cluster is doing a lot of the heavy lifting to get the workload clusters up and running.

kubectl --kubeconfig=/tmp/ldaps-tkg-wld-kubeconfig get deployments -A

NAMESPACE            NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
kube-system          antrea-controller        1/1     1            1           15m
kube-system          coredns                  2/2     2            2           17m
kube-system          metrics-server           1/1     1            1           14m
kube-system          vsphere-csi-controller   1/1     1            1           16m
pinniped-concierge   pinniped-concierge       2/2     2            2           15m
tkg-system           kapp-controller          1/1     1            1           16m

Create a workload cluster entirely from the command line

This is not recommended, as you would essentially have to take most of the contents of your cluster definition file and set those values as environment variables.

export VSPHERE_DATACENTER='/RegionA01'
export VSPHERE_DATASTORE='/RegionA01/datastore/map-vol'
export VSPHERE_FOLDER='/RegionA01/vm'
export VSPHERE_NETWORK='K8s-Workload'
export VSPHERE_PASSWORD='<encoded:Vk13YXJlMSE=>'
export VSPHERE_RESOURCE_POOL='/RegionA01/host/RegionA01-MGMT/Resources'
export VSPHERE_SERVER='vcsa-01a.corp.tanzu'
export VSPHERE_SSH_AUTHORIZED_KEY='ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQC5KYNeWQgVHrDHaEhBCLF1vIR0OAtUIJwjKYkY4E/5HhEu8fPFvBOIHPFTPrtkX4vzSiMFKE5WheKGQIpW3HHlRbmRPc9oe6nNKlsUfFAaJ7OKF146Gjpb7lWs/C34mjdtxSb1D/YcHSyqK5mxhyHAXPeK8lrxG5MLOJ3X2A3iUvXcBo1NdhRdLRWQmyjs16fnPx6840x9n5NqeiukFYIVhDMFErq42AkeewsWcbZQuwViSLk2cIc09eykAjaXMojCmSbjrj0kC3sbYX+HD2OWbKohTqqO6/UABtjYgTjIS4PqsXWk63dFdcxF6ukuO6ZHaiY7h3xX2rTg9pv1oT8WBR44TYgvyRp0Bhe0u2/n/PUTRfp22cOWTA2wG955g7jOd7RVGhtMHi9gFXeUS2KodO6C4XEXC7Y2qp9p9ARlNvu11QoaDyH3l0h57Me9we+3XQNuteV69TYrJnlgWecMa/x+rcaEkgr7LD61dY9sTuufttLBP2ro4EIWoBY6F1Ozvcp8lcgi/55uUGxwiKDA6gQ+UA/xtrKk60s6MvYMzOxJiUQbWYr3MJ3NSz6PJVXMvlsAac6U+vX4U9eJP6/C1YDyBaiT96cb/B9TkvpLrhPwqMZdYVomVHsdY7YriJB93MRinKaDJor1aIE/HMsMpbgFCNA7mma9x5HS/57Imw==    admin@corp.local'
export VSPHERE_TLS_THUMBPRINT='DB:CC:DC:80:F9:17:DA:37:4F:AC:F7:65:6D:3D:AC:99:B8:A0:5A:BE'
export VSPHERE_USERNAME='administrator@vsphere.local'
export VSPHERE_WORKER_NUM_CPUS='8'
export VSPHERE_WORKER_DISK_GIB='40'
export VSPHERE_WORKER_MEM_MIB='8192'
export CNI='antrea'

An alternative is to put these values in the .tanzu/tkg/cluster-config.yaml file so that they can be used for any subsequent cluster creation. Values exported as environment variables will override these if needed.
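As a minimal sketch (assuming the file lives at ~/.tanzu/tkg/cluster-config.yaml, matching the paths used elsewhere in this post), setting default worker sizing there could look like this:

cat >> ~/.tanzu/tkg/cluster-config.yaml <<'EOF'
VSPHERE_WORKER_NUM_CPUS: "8"
VSPHERE_WORKER_MEM_MIB: "8192"
VSPHERE_WORKER_DISK_GIB: "40"
EOF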

From here you would run a tanzu cluster create command with a few more parameters present.

tanzu cluster create tkg-wld2 --plan dev --vsphere-controlplane-endpoint 192.168.130.130 --tkr v1.20.4---vmware.1-tkg.1 -v 6

This example creates a cluster named tkg-wld2, uses the dev plan, uses 192.168.130.130 for the Kubernetes API endpoint, and deploys Kubernetes version 1.20.4.

You can see from the worker node created that it did not use the system default of 2 CPUs and 4GB RAM but the 8 CPUs and 8GB RAM that were specified via the exported environment variables.

Deploy an application that uses a Load Balancer service

I’m not showing all the details here but I have created a WordPress application that uses a service of type LoadBalancer to make sure that NSX ALB is capable of providing Load Balancer addresses to my workload services.

apiVersion: v1
kind: Service
metadata:
  name: wordpress
  labels:
    app: wordpress
spec:
  ports:
    - port: 80
  selector:
    app: wordpress
    tier: frontend
  type: LoadBalancer
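Applying the manifest and watching the service is enough to see the VIP get assigned (the file name is just what I’d call it locally):

kubectl apply -f wordpress-service.yaml
# EXTERNAL-IP will move from <pending> to an address out of the NSX ALB VIP pool
kubectl get svc wordpress -w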

If you’re curious to see some of the communication between your cluster and NSX ALB, run kubectl -n avi-system logs ako-0 -f. Most of it is not terribly interesting but you can be on the lookout for any issues with allocating the VIP being requested.

Checking on the status of the service shows that an IP address of 192.168.220.2 has been allocated, which is in the VIP pool configured in NSX ALB.

kubectl get svc --selector=app=wordpress

NAME              TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
wordpress         LoadBalancer   100.71.160.52   192.168.220.2   80:32155/TCP   2m8s
wordpress-mysql   ClusterIP      None            <none>          3306/TCP       2m8s

Immediately afterwards we will see that two new VMs, NSX ALB Service Engines, have been created.

And back in the NSX ALB UI we see the same. Their status is red as they are not powered on yet. Once these SEs are created, subsequent LoadBalancer requests will move along much more quickly since new SEs won’t need to be provisioned (until you hit the upper limit of Services per SE).

If you were to try to access the Load Balancer IP address at this point, it would fail since there are no Service Engines up yet to handle the traffic.

Eventually these will show up as healthy after the VMs are powered on and connected to the controller.

In the NSX ALB UI, we can see that a virtual service has been configured with the IP address noted previously and is serving port 32155.

Scale a cluster

Scaling a cluster is just as easy with the tanzu command as it was with the tkg command. I’m going to scale my workload cluster from one to two worker nodes.

tanzu cluster scale tkg-wld -w 2

Successfully updated worker node machine deployment replica count for cluster tkg-wld
Workload cluster 'tkg-wld' is being scaled

A new worker node has been deployed.

And we can see it in the kubectl get nodes output as well.

kubectl get nodes

NAME                            STATUS   ROLES                  AGE   VERSION
tkg-wld-control-plane-29cns     Ready    control-plane,master   61m   v1.20.4+vmware.1
tkg-wld-md-0-84c4d7898c-8rjcp   Ready    <none>                 31s   v1.20.4+vmware.1
tkg-wld-md-0-84c4d7898c-m7bgb   Ready    <none>                 56m   v1.20.4+vmware.1

And NSX ALB reflects the change as well.
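One last note on scaling: the control plane can be scaled the same way via its own flag. The following is how I’d expect it to look in 1.3, but verify the flag names with tanzu cluster scale --help:

# scale the control plane to 3 nodes (odd counts keep etcd quorum happy)
tanzu cluster scale tkg-wld -c 3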
