In one of my earlier posts, How to install vSphere with Tanzu with vSphere Networking, I did a walkthrough of installing vSphere with Tanzu 7.0 U1 using HAProxy as the Load Balancer solution. I’m going at it again, but in 7.0 U2 we now support using NSX Advanced Load Balancer (formerly Avi Vantage) to provide Load Balancer services.
In my previous post, Deploying NSX Advanced Load Balancer for use with Tanzu Kubernetes Grid (1.3 release) and vSphere with Tanzu (7.0 U2 release), I went through the steps needed to get NSX Advanced Load Balancer (NSX ALB) up and running such that it could be used for Tanzu Kubernetes Grid (TKG) or vSphere with Tanzu. In this post, I’ll get vSphere with Tanzu installed and make use of the Load Balancer functionality afforded by NSX ALB.
Configure Workload Management
You might notice that this jumps right into configuring Workload Management and skips the needed steps of creating a content library for workload clusters (TKG Service clusters) and of configuring a Storage Policy. I’m using the same instances of each noted in How to install vSphere with Tanzu with vSphere Networking, so no real need to document their configuration again.
Navigate to the Workload Management page and click on the Get Started button.
You’ll get a warning noting that you have to have either HAProxy or NSX ALB (noted as Avi here) present in order to use the vCenter Server Network option.
Select a compatible cluster.
Select an appropriate size for your deployment.
Select an appropriate Storage Policy.
Complete the information relevant to your NSX ALB deployment. Supplying the certificate was easy for me since I used my own wildcard certificate. If you don’t have the NSX ALB certificate handy, you can download it from the NSX ALB UI via the Templates, Security, SSL/TLS Certificates page.
The Starting IP Address value will be the first of five consecutive IP addresses that can be used for Supervisor Cluster control plane nodes. Everything else here should be fairly standard.
You can leave the default IP address for Services value in place or choose something else. Enter an appropriate IP address for your DNS Server. Click the Add button under Workload Network.
Choose an appropriate portgroup (K8s-Workload in this example) and enter the Layer 3 Routing Configuration information appropriate for your environment.
If everything looks good here, move on to the next page.
Click the Add button next to Add Content Library.
Select the appropriate Content Library for serving up Kubernetes node images.
Click the Finish button to get the deployment started.
As noted in How to install vSphere with Tanzu with vSphere Networking, you’ll see a Namespaces folder get created in the Hosts and Clusters view of the vSphere Client and the Supervisor Cluster Control Plane VMs will be placed there. When the deployment is getting close to being done, you should see three VMs present.
For a short time, the Control Plane Node IP Address will be the first IP address supplied on the Workload Management configuration page.
Once the deployment is complete, the Control Plane Node IP Address will change to a VIP supplied by NSX ALB. The Config Status should also change to Running.
Now, even though the Workload Management page shows that the deployment is finished, you might find that you are not able to reach the noted IP address right away. This is likely because NSX ALB has not yet provisioned the Service Engines (SEs) that will provide the infrastructure needed to host the required Virtual Services. Once these are in place and configured properly, you’ll see them present in the NSX ALB UI on the Infrastructure, Service Engine page.
If everything is healthy, you should see a “green” service on the Applications, Virtual Services page. Note that the Address is the one supplied as the Control Plane Node IP Address on the Workload Management page in the vSphere Client.
Don’t worry if this is not green right away. In my lab, I found that it took several minutes as NSX ALB continuously updated the health of the service. It won’t go green until the overall health score is 85 or higher…even if the service is perfectly functional.
On the Applications, Dashboard page you can use the VS Tree view to see a logical representation of the traffic flow for the service. Note that the endpoints are the three Supervisor Cluster Control Plane nodes, on ports 443 and 6443.
Back in the vSphere Client you should see the two SE VMs in the inventory now (AviTanzu*).
Now that the Supervisor Cluster is up and functional, you’ll want to navigate to Administration, Licensing, Licenses, Assets, Supervisor Clusters to apply a valid license.
Replace the Certificate
The first thing to do is replace the default certificate with one that will be trusted. From the Hosts and Clusters View, select the cluster where the Supervisor Cluster is deployed and then click on the Configure tab. Under Namespaces, select Certificates.
You can click View Details to see information about the current certificate.
Click on the Actions menu and then select Generate CSR.
Enter the appropriate details for the new certificate on the Enter Info page.
Copy or download the CSR that is generated and take it to a CA to get a certificate generated.
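If you don’t have an enterprise CA handy, a lab-only way to turn the CSR into a certificate is with OpenSSL. The sketch below is illustrative, not what I did in my lab: the file names (supervisor.csr, ca.crt, and so on) are assumptions, and in the real workflow you would only sign the CSR downloaded from the vSphere Client (its private key never leaves the Supervisor Cluster), so the first stand-in step would be skipped.

```shell
# Stand-in CSR for illustration only; in practice, use the supervisor.csr
# you downloaded from the vSphere Client (its key stays in the cluster).
openssl req -new -newkey rsa:2048 -nodes \
  -subj "/CN=supervisor.corp.tanzu" \
  -keyout standin.key -out supervisor.csr

# Throwaway lab CA (you would import ca.crt into your trust store).
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=Lab Root CA" \
  -keyout ca.key -out ca.crt

# Sign the CSR; supervisor.crt is what you paste/upload in the
# Replace Certificate dialog later on.
openssl x509 -req -in supervisor.csr \
  -CA ca.crt -CAkey ca.key -CAcreateserial \
  -days 365 -out supervisor.crt
```

A certificate signed by a private lab CA like this will only be trusted by machines that have ca.crt in their trust store.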
When you have the new certificate ready, click the Actions menu and then select Replace Certificate.
Paste or upload the new certificate and click the Replace button.
If it worked, you’ll see a page similar to the following:
And navigating to https://<Control Plane Node IP Address> will not produce a certificate warning.
From this page you’ll want to copy the CLI Plugin to a location where you expect to be running kubectl commands.
Configure a Supervisor Cluster Namespace
From the Workload Management page, navigate to Namespaces and click on the Create Namespace button.
Select the appropriate cluster and network (network-1 which we created during deployment) and give the Namespace a name.
We have a little bit of information about our new Namespace here but we need to add some Permissions and Storage before we can really make use of it.
Click on the Add Permissions button.
Complete the Add Permissions page as appropriate. Bear in mind that the user(s) you specify here will be able to log in to the Supervisor Cluster via the kubectl command, and the Role assigned here dictates whether they are an administrator or have read-only access.
I’m also granting access to an AD user from the corp.tanzu domain.
On the Namespace page, click on the Add Storage button. Select an appropriate datastore.
You could also set limits on the compute and storage resources available to this Namespace via the Capacity and Usage tile but it is not necessary for this example.
Create a Tanzu Kubernetes cluster
Now we’re ready to log in and create a Tanzu Kubernetes cluster. From the system where you have copied the CLI Plugin for vSphere, issue a command similar to the following to log in (note that the IP address is what was indicated on the Workload Management page and the user, firstname.lastname@example.org, is what we specified when creating the Namespace):
kubectl vsphere login --server 192.168.220.2
Username: email@example.com
KUBECTL_VSPHERE_PASSWORD environment variable is not set. Please enter the password below
Password:
Logged in successfully.

You have access to the following contexts:
   192.168.220.2
   tkg

If the context you wish to use is not in this list, you may need to try logging in again later, or contact your cluster administrator.

To change context, use `kubectl config use-context <workload name>`
I didn’t know about the new ability to supply a password via an environment variable, but I knew it was a huge ask from many customers, as its lack made automating things very difficult.
export KUBECTL_VSPHERE_PASSWORD=VMware1!
kubectl vsphere login --server 192.168.220.2 -u firstname.lastname@example.org
Logged in successfully.

You have access to the following contexts:
   192.168.220.2
   tkg

If the context you wish to use is not in this list, you may need to try logging in again later, or contact your cluster administrator.

To change context, use `kubectl config use-context <workload name>`
That worked great! On to viewing the nodes in the cluster.
kubectl get nodes -o wide
NAME                               STATUS   ROLES    AGE   VERSION         INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                 KERNEL-VERSION       CONTAINER-RUNTIME
421d0c329c86f8a10347027a6f8ce1c4   Ready    master   13h   v1.19.1+wcp.2   192.168.130.2   <none>        VMware Photon OS/Linux   4.19.164-3.ph3-esx   containerd://1.3.3
421d202e406c7a419f64b15a0268cd89   Ready    master   13h   v1.19.1+wcp.2   192.168.130.3   <none>        VMware Photon OS/Linux   4.19.164-3.ph3-esx   containerd://1.3.3
421dc7e4475bea2c5a1e3c561e0162cd   Ready    master   14h   v1.19.1+wcp.2   192.168.130.4   <none>        VMware Photon OS/Linux   4.19.164-3.ph3-esx   containerd://1.3.3
And you should see that your context is set to the Supervisor Cluster Namespace that was created earlier.
kubectl config get-contexts
CURRENT   NAME            CLUSTER         AUTHINFO                                           NAMESPACE
          192.168.220.2   192.168.220.2   wcp:192.168.220.2:email@example.com
*         tkg             192.168.220.2   wcp:192.168.220.2:firstname.lastname@example.org   tkg
Creating a Tanzu Kubernetes cluster is as simple as kubectl apply‘ing a small bit of YAML that builds out a custom resource of kind TanzuKubernetesCluster (defined by a Custom Resource Definition, or CRD). The following is a minimally functional definition file that I have used in my lab:
apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: tkg-cluster          # name of cluster
  namespace: tkg
spec:
  topology:
    controlPlane:
      count: 1
      class: best-effort-xsmall   # vmclass to be used for master(s)
      storageClass: k8s-policy
    workers:
      count: 2
      class: best-effort-xsmall   # vmclass to be used for worker(s)
      storageClass: k8s-policy
  distribution:
    version: v1.19
A few of the main things to take away from this:
- The name of the cluster is tkg-cluster.
- The Namespace to which it will be deployed is tkg.
- There is only one control plane node and two worker nodes.
- The “class” is best-effort-xsmall, which means they will not have any resources reserved (best-effort) and will have 2 vCPU, 2GB RAM and 16GB of disk. You can read more about Virtual Machine Class Types at Virtual Machine Class Types for Tanzu Kubernetes Clusters.
- The storageClass is k8s-policy, the one we created earlier.
- The Kubernetes version is 1.19, but this shorthand will be matched to a full version that includes this short version notation.
You can read up on all of the available options at Configuration Parameters for Tanzu Kubernetes Clusters.
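To illustrate a few of those options, the same cluster definition with some commonly used optional settings filled in might look like the following. This is a sketch only, not what I deployed: the CIDR blocks are illustrative placeholders, and the defaults (which you get by omitting the settings section entirely) are usually fine.

```yaml
apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: tkg-cluster
  namespace: tkg
spec:
  topology:
    controlPlane:
      count: 1
      class: best-effort-xsmall
      storageClass: k8s-policy
    workers:
      count: 2
      class: best-effort-xsmall
      storageClass: k8s-policy
  distribution:
    version: v1.19
  settings:
    network:
      cni:
        name: antrea            # antrea is the default CNI in recent releases
      pods:
        cidrBlocks:
          - 172.20.0.0/16       # illustrative pod CIDR
      services:
        cidrBlocks:
          - 10.96.0.0/12        # illustrative service CIDR
    storage:
      defaultClass: k8s-policy  # default StorageClass inside the new cluster
```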
The only thing left to do is apply the yaml definition file.
kubectl apply -f tkg-cluster.yaml
tanzukubernetescluster.run.tanzu.vmware.com/tkg-cluster created
As with the Supervisor Cluster, you’ll see VMs getting provisioned.
And you’ll see a new structure getting created under the Namespaces folder (tkg, tkg-cluster)
I like to have one session open where I’m just watching the events from the Supervisor Cluster to make sure nothing concerning jumps out and that things are moving along as expected.
kubectl get events -w
LAST SEEN   TYPE      REASON                    OBJECT                                                      MESSAGE
2m34s       Warning   ReconcileFailure          wcpmachine/tkg-cluster-control-plane-cmscz-hqttb            vm is not yet created: vmware-system-capw-controller-manager/WCPMachine//tkg/tkg-cluster/tkg-cluster-control-plane-cmscz-hqttb
0s          Warning   ReconcileFailure          wcpmachine/tkg-cluster-control-plane-cmscz-hqttb            vm is not yet created: vmware-system-capw-controller-manager/WCPMachine//tkg/tkg-cluster/tkg-cluster-control-plane-cmscz-hqttb
4m31s       Normal    CreateK8sServiceSuccess   virtualmachineservice/tkg-cluster-control-plane-service     CreateK8sService success
4m32s       Normal    CertificateIssued         certificaterequest/tkg-cluster-extensions-ca-569859115      Certificate fetched from issuer successfully
4m32s       Normal    KeyPairVerified           issuer/tkg-cluster-extensions-ca-issuer                     Signing CA verified
51s         Normal    KeyPairVerified           issuer/tkg-cluster-extensions-ca-issuer                     Signing CA verified
4m32s       Normal    GeneratedKey              certificate/tkg-cluster-extensions-ca                       Generated a new private key
4m32s       Normal    Requested                 certificate/tkg-cluster-extensions-ca                       Created new CertificateRequest resource "tkg-cluster-extensions-ca-569859115"
4m32s       Normal    Issued                    certificate/tkg-cluster-extensions-ca                       Certificate issued successfully
4m29s       Normal    SuccessfulCreate          machineset/tkg-cluster-workers-b9n7d-69dc849cc4             Created machine "tkg-cluster-workers-b9n7d-69dc849cc4-xx5k9"
4m29s       Normal    SuccessfulCreate          machineset/tkg-cluster-workers-b9n7d-69dc849cc4             Created machine "tkg-cluster-workers-b9n7d-69dc849cc4-hnct5"
4m22s       Warning   ReconcileError            machinehealthcheck/tkg-cluster-workers-b9n7d                error creating client and cache for remote cluster: error fetching REST client config for remote cluster "tkg/tkg-cluster": failed to retrieve kubeconfig secret for Cluster tkg/tkg-cluster: secrets "tkg-cluster-kubeconfig" not found
4m29s       Normal    SuccessfulCreate          machinedeployment/tkg-cluster-workers-b9n7d                 Created MachineSet "tkg-cluster-workers-b9n7d-69dc849cc4"
4m17s       Warning   ReconcileError            machinehealthcheck/tkg-cluster-workers-b9n7d                error creating client and cache for remote cluster: error creating dynamic rest mapper for remote cluster "tkg/tkg-cluster": Get https://192.168.220.4:6443/api?timeout=10s: dial tcp 192.168.220.4:6443: connect: connection refused
7s          Warning   ReconcileError            machinehealthcheck/tkg-cluster-workers-b9n7d                error creating client and cache for remote cluster: error creating dynamic rest mapper for remote cluster "tkg/tkg-cluster": Get https://192.168.220.4:6443/api?timeout=10s: dial tcp 192.168.220.4:6443: connect: connection refused
4m27s       Warning   ReconcileFailure          wcpcluster/tkg-cluster                                      unexpected error while reconciling control plane endpoint for tkg-cluster: failed to reconcile loadbalanced endpoint for WCPCluster tkg/tkg-cluster: failed to get control plane endpoint for Cluster tkg/tkg-cluster: VirtualMachineService LB does not yet have VIP assigned: VirtualMachineService LoadBalancer does not have any Ingresses
0s          Warning   ReconcileFailure          wcpmachine/tkg-cluster-control-plane-cmscz-hqttb            vm does not have an IP address: vmware-system-capw-controller-manager/WCPMachine//tkg/tkg-cluster/tkg-cluster-control-plane-cmscz-hqttb
At the end of the deployment you should see a message similar to the following in the events:
0s Normal PhaseChanged tanzukubernetescluster/tkg-cluster cluster changes from creating phase to running phase
And you should be able to see that the cluster is provisioned via the kubectl get cluster command.
kubectl get cluster
NAME          PHASE
tkg-cluster   Provisioned
Back on the Namespace page in the vSphere Client, you’ll see that there is 1 Tanzu Kubernetes Cluster present in the Tanzu Kubernetes Grid Service panel.
And drilling down into this cluster you can see some high-level details about it.
Back at the command line, you can get at similar information by looking for a tanzukubernetescluster (tkc) resource.
kubectl get tkc
NAME          CONTROL PLANE   WORKER   DISTRIBUTION                     AGE   PHASE     TKR COMPATIBLE   UPDATES AVAILABLE
tkg-cluster   1               2        v1.19.7+vmware.1-tkg.1.fc82c41   15m   running   True
To get access to the new cluster, we’ll need to log in again and pass the --tanzu-kubernetes-cluster-namespace and --tanzu-kubernetes-cluster-name parameters:
kubectl vsphere login --server 192.168.220.2 -u email@example.com --tanzu-kubernetes-cluster-namespace tkg --tanzu-kubernetes-cluster-name tkg-cluster
Logged in successfully.

You have access to the following contexts:
   192.168.220.2
   tkg
   tkg-cluster

If the context you wish to use is not in this list, you may need to try logging in again later, or contact your cluster administrator.

To change context, use `kubectl config use-context <workload name>`
There is a new context created (tkg-cluster in this example) that should match the name of the new cluster. You should see that your context has automatically switched to it.
kubectl config get-contexts
CURRENT   NAME            CLUSTER         AUTHINFO                                           NAMESPACE
          192.168.220.2   192.168.220.2   wcp:192.168.220.2:firstname.lastname@example.org
          tkg             192.168.220.2   wcp:192.168.220.2:email@example.com                tkg
*         tkg-cluster     192.168.220.4   wcp:192.168.220.4:firstname.lastname@example.org
You can see that the nodes are present as we expect and the names match what is visible in the vSphere Client.
kubectl get nodes -o wide
NAME                                         STATUS   ROLES    AGE     VERSION            INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                 KERNEL-VERSION       CONTAINER-RUNTIME
tkg-cluster-control-plane-qc26b              Ready    master   14m     v1.19.7+vmware.1   192.168.130.5   <none>        VMware Photon OS/Linux   4.19.160-1.ph3-esx   containerd://1.4.3
tkg-cluster-workers-b9n7d-69dc849cc4-hnct5   Ready    <none>   7m59s   v1.19.7+vmware.1   192.168.130.6   <none>        VMware Photon OS/Linux   4.19.160-1.ph3-esx   containerd://1.4.3
tkg-cluster-workers-b9n7d-69dc849cc4-xx5k9   Ready    <none>   7m39s   v1.19.7+vmware.1   192.168.130.7   <none>        VMware Photon OS/Linux   4.19.160-1.ph3-esx   containerd://1.4.3
In the NSX ALB UI, we can see that a new virtual service has been created for the new cluster. The Address value matches the API endpoint address noted in the vSphere Client and in the kubectl config get-contexts output. This service only listens on port 6443 since we don’t need any GUI access.
And as with the first service created you can see a logical representation of the network flow to the control plane VM.
Deploy an application that uses a Load Balancer service
I’m not showing all the details here but I have created a WordPress application that uses a service of type LoadBalancer to make sure that NSX ALB is capable of providing Load Balancer addresses to my workload services.
apiVersion: v1
kind: Service
metadata:
  name: wordpress
  labels:
    app: wordpress
spec:
  ports:
    - port: 80
  selector:
    app: wordpress
    tier: frontend
  type: LoadBalancer
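For completeness, a frontend Deployment matching that Service’s selector might look something like the following. This is a hedged sketch rather than what I actually deployed (I noted above that I’m not showing all the details): the image tag is an assumption, and a real WordPress install would also need database environment variables and a backing MySQL instance.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
  labels:
    app: wordpress
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wordpress
      tier: frontend
  template:
    metadata:
      labels:
        app: wordpress     # these labels are what the Service's selector matches
        tier: frontend
    spec:
      containers:
        - name: wordpress
          image: wordpress:5.7-apache   # illustrative tag
          ports:
            - containerPort: 80
```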
Checking on the status of the service shows that an IP address of 192.168.220.6 has been allocated, which is in the VIP pool configured in NSX ALB.
kubectl get svc --selector=app=wordpress
NAME              TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)        AGE
wordpress         LoadBalancer   10.97.8.119   192.168.220.6   80:30205/TCP   52s
wordpress-mysql   ClusterIP      None          <none>          3306/TCP       53s
In the NSX ALB UI, we can see that a third service has been configured; it has the IP address noted previously and is serving port 80.
The tree view shows the path that traffic will take to the two worker nodes in the cluster where the application is running.
13 thoughts on “Deploying vSphere 7.0 U2 with Tanzu while using NSX Advanced Load Balancer”
I tried to provision a simple webserver app (nginx) in my TKG cluster (with ALB) and found that the image pull wasn’t successful. Did you come across this issue too?
I don’t know how to log in to the TKG worker node. Do you?
Thanks, and great article by the way.
Hello Erich. I did not have any issues with pulling images from the internet…is there anything upstream from your cluster that might be blocking traffic? Any kind of network security policy in place in your cluster that could affect this? If nothing easy presents itself as a solution I would highly recommend opening a support request with VMware.
Regarding logging into a TKG worker node, https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.3/vmware-tanzu-kubernetes-grid-13/GUID-troubleshooting-tkg-tips.html#connect-to-cluster-nodes-with-ssh-9 should provide the steps you’re looking for.
Thanks for your quick answer.
I was able to log in successfully to the Supervisor Cluster nodes (this is what I already knew how to do) and to the nodes from the guest cluster.
ALB, VCSA, DNS, and four vESXi hosts are running on a pESXi cluster, and these vLab VMs are connected with some NSX-v segments (this is my way of forcing myself to still use NSX-v). So far I have not had problems with this approach; I’ve been doing it for more than 5 years and with more than 10 vLab environments.
Inside this vLab (which I deployed according to, and with some parts of, the scripts from William Lam) I obviously use dvSwitches.
The Supervisor Control Plane VMs are able to ping external addresses. The traffic is leaving eth0 and is routed externally (just checked with pktcap-uw).
The Kubernetes cluster nodes are not able to ping external addresses (vNIC0 -> Workload Segment).
At present I guess it’s something related to my NSX-v configuration, because I see ICMP requests on my transit network (replies are missing).
This isn’t something I could honestly ask VMware Support about.
Will report back if I’m successful.
Thanks for your quick answer.
If you’re interested in knowing what I’m doing, I’d need to use a different communication format (for screenshots…).
This environment runs in one of my vLabs (hosted on a pESXi cluster using NSX-v as the segmentation concept).
It was going pretty well so far, but it’s not that easy, especially when you forget to program the reverse route path (this is what was wrong).
Now I’m able to ping external addresses from my worker node and will see if my Kubernetes app can be pulled.
Thanks for the post. Your screenshots are helping me to troubleshoot an issue getting this running in my lab.
How does your NSX ALB choose the VIP address 192.168.220.2 as the frontend for your supervisor cluster? I don’t see that address or subnet as an input in any of your configuration (perhaps I’m missing it).
My supervisor cluster and service engine VIP both come up with the same IP and network connectivity to the supervisor cluster is therefore intermittent.
“Once the deployment is complete, the Control Plane Node IP Address will change to a VIP supplied by NSX ALB.”
This never happens in my environment.
Hello Mark. You’ll need to look at my previous post, Deploying NSX Advanced Load Balancer for use with Tanzu Kubernetes Grid and vSphere with Tanzu to see where that IP came from. I have specified a block of VIP IP address as 192.168.220.0/23 with a gateway address of 192.168.220.1 and a usable range of 192.168.220.2-192.168.220.127 (named K8s-Frontend). I’m using this block for VIPs and it’s where the Supervisor Cluster is pulling an IP from.
Thank you so much for the guide. I followed the entire guide, except for switching the Avi LB to the Essentials edition, and I use a self-signed certificate on the Avi with its IP as a SAN.
But when I deploy a new workload cluster with the YAML file specified, the workers are not created. The cluster name is there but the worker nodes are not provisioned. Any ideas?
error creating client and cache for remote cluster: error fetching REST client config for remote cluster “tkg/sitecore-dev”: failed to retrieve kubeconfig secret for Cluster tkg/sitecore-dev: secrets “sitecore-dev-kubeconfig” not found
We’d very likely have to dig into the logs/events to see what is going on there. I would highly recommend getting a support request opened with VMware to better assist you.
Found the solution: since 7.0.2 you have to select VM classes (all 10 if desired) on the namespace.
Could you do a guide on how to enable Windows container support with a supervisor cluster? I would like to run Windows containers also.
I don’t believe that Windows containers are supported in vSphere with Tanzu (yet). They are supported in Tanzu Kubernetes Grid Integrated edition (TKGI); 1.9 and above support Windows workers (https://docs.pivotal.io/tkgi/1-12/support-windows-index.html). With regards to TKG, you may want to register for the Modernize Windows Apps: Introduction to Windows Containers on Kubernetes [APP1999] VMworld session (https://myevents.vmware.com/widget/vmware/vmworld2021/catalog?search.passtype=15931978752250014Azc&search.level=1517937137830003aCWu&search.product=1617723187121049eJ7h&search.track=contentTrack_applicationModernization&search=APP1999%5D&tab.contentcatalogtabs=1627421929827001vRXW).