How to install vSphere with Tanzu with vSphere Networking

As of vSphere 7.0 U1, vSphere with Kubernetes has been rebranded to vSphere with Tanzu. Additionally, you no longer need NSX-T to deploy it. There is now an option that uses HAProxy and vSphere Networking for load balancing, but it does come with some limitations…the biggest being that you can’t deploy vSphere Pods as you can with the NSX-T architecture, and you won’t be able to deploy the integrated Harbor instance. I’ll walk through the installation and configuration of this offering so you can see what’s involved. To learn more about this process you can read the vSphere with Tanzu Quick Start Guide…a supplement to the official VMware documentation at vSphere with Tanzu Configuration and Management.

I’m starting out with a vCenter Server 7.0 U1 installed and three ESXi 7.0 U1 hosts configured. I have a vDS configured with multiple portgroups to support the networking infrastructure required for vSphere with Tanzu. The first portgroup is named k8s-workload and is on VLAN 130 and the second is called k8s-frontend and is on VLAN 220. I’m configuring HAProxy with the three-NIC configuration which allows for a Frontend network to be created where users can access the environment via virtual IP addresses. You can read more about the different topology choices at System Requirements and Topologies for Setting Up a Supervisor Cluster with vSphere Networking.

Configure a Storage Policy

You’ll need to have a storage policy that can be called upon to help provision persistent volumes when the need arises. I’ve chosen to use a tag-based storage policy as it’s fairly easy to implement. You start out by launching the Assign Tag wizard on your datastore so that you can get a tag in place first:

Click the Add Tag link and enter a name for the new tag.

Click the Create New Category link and enter a name for the tag category. Leave all other settings as-is.

Click the Create button to create the tag category.

Click the Create button again to create the tag.

Click the Assign button. You should now see the tag on your datastore.

Now we can create the storage policy itself. Navigate to the Policies and Profiles page and click the Create link.

Give the new policy a name.

Click the Next button.

Select Enable tag based placement rules and click the Next button.

Select the appropriate Tag category and Tags.

Click the Next button. You can validate that the appropriate datastore is listed on the Storage compatibility page.

Click the Next button. If all looks good on the Review and finish page, click the Finish button to create the Storage Policy.
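If you prefer the CLI, the tag category, tag, and tag assignment can also be done with govc (VMware's vSphere CLI). This is a sketch only: it assumes govc is installed and the GOVC_URL/GOVC_USERNAME/GOVC_PASSWORD environment variables point at your vCenter, and the category, tag, and datastore names below are placeholders for this example.

```shell
# Hypothetical names; adjust for your environment. Assumes govc is
# installed and GOVC_URL/GOVC_USERNAME/GOVC_PASSWORD are exported.
govc tags.category.create k8s-category                    # create the tag category
govc tags.create -c k8s-category k8s-tag                  # create the tag in that category
govc tags.attach k8s-tag /Datacenter/datastore/datastore1 # assign the tag to the datastore
```

The storage policy itself is still created in the UI as shown above; govc here just covers the tagging prerequisites.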

Create a Content Library

The next thing to do is create a Content Library to house the Kubernetes node images that will be used when we build out Tanzu Kubernetes clusters. Navigate to the Content Libraries page and click the Create button.

Give the new Content Library a name and select the appropriate vCenter Server.

Click the Next button.

Select Subscribed content library and enter http://wp-content.vmware.com/v2/latest/lib.json as the subscription URL.

Click the Next button and click the Yes button if you get a pop-up asking to verify the authenticity of the subscribed host.

Select an appropriate datastore.

Click the Next button. If everything looks good on the Ready to complete page, click the Finish button. It will take at least a few minutes to sync the content library but we have some more work to do while this is happening.
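As an alternative to the wizard, a subscribed Content Library can be created from the CLI as well. A sketch, with the same caveats as before: govc configured against your vCenter, and the library name (tkg-cl) and datastore name are assumptions for this example.

```shell
# Sketch only: create the subscribed Content Library from the CLI.
# "tkg-cl" and "datastore1" are example names, not requirements.
govc library.create \
  -sub http://wp-content.vmware.com/v2/latest/lib.json \
  -ds datastore1 \
  tkg-cl
```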

Deploy HAProxy

There are a few different ways to deploy the HAProxy VM we’ll need but I chose to go with another Content Library. The initial setup process is fairly similar to the last one with the exception of choosing a Local content library instead of a Subscribed content library. Once the Content Library is created (mine is named HAProxy) you can import an HAProxy OVA, which is available at https://github.com/haproxytech/vmware-haproxy/releases. You’ll notice in this example that I’m using v0.1.6 of this OVA but the most current one is v0.1.8.

Click on your new Content Library and from the Actions menu select Import item.

You can either download the OVA from GitHub or provide a link to it in the URL field.

Click the Import button and wait a few minutes for the OVA to finish downloading/importing.

When you’re ready to move on, select New VM from This Template on the Actions menu for this Content Library.

Give the new VM a name and select the appropriate Datacenter/Folder.

Click the Next button.

Select the appropriate Cluster/Resource Pool.

Click the Next button, and click the Next button on the Review Details page. Accept the License agreement and click the Next button again.

Here’s where you’ll have to choose your architecture. As noted previously, I went with the Frontend network design.

Click the Next button.

Select an appropriate Datastore.

Click the Next button.

Set the appropriate portgroups for the Management, Workload and Frontend networks.

Click the Next button.

You’ll need to supply a fair amount of information on the Customize template page. You can see from the following screenshots that I’m putting the Management IP on the 192.168.110.0/24 network, the Workload IP on the 192.168.130.0/24 network (also on VLAN 130) and the Frontend IP on the 192.168.220.0/23 network (also on VLAN 220). The Load Balancing IP addresses are taken from a subnet of the Frontend IP range, 192.168.221.0/24. The Kubernetes nodes created as part of a Tanzu Kubernetes cluster will get their IP addresses from the Workload network.

Click the Next button. If everything looks good on the Ready to complete page, click the Finish button.

The OVA deployment will take at least a few minutes. When it’s done, power on the new VM and validate that it has the IP addresses assigned during deployment.

If you enabled SSH during deployment, you can ssh to the HAProxy VM and retrieve the certificate in use via a process similar to the following:

ssh root@192.168.110.30
The authenticity of host '192.168.110.30 (192.168.110.30)' can't be established.
ECDSA key fingerprint is SHA256:7GRP9rUyVuoBv7u8/amhlSHTJ/8pBblPKcT6OQjYHhQ.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.110.30' (ECDSA) to the list of known hosts.
 
 
Password:
 
 15:41:20 up 3 min, 0 users, load average: 0.03, 0.09, 0.04
 
tdnf update info not available yet!
root@haproxy [ ~ ]# cat /etc/haproxy/ca.crt
 
-----BEGIN CERTIFICATE-----
MIIDozCCAougAwIBAgIJALzlaxmKA5LVMA0GCSqGSIb3DQEBBQUAMG8xCzAJBgNV
BAYTAlVTMRMwEQYDVQQIDApDYWxpZm9ybmlhMRIwEAYDVQQHDAlQYWxvIEFsdG8x
DzANBgNVBAoMBlZNd2FyZTENMAsGA1UECwwEQ0FQVjEXMBUGA1UEAwwOMTkyLjE2
OC4xMTAuMzAwHhcNMjAwOTIzMjM0NzUwWhcNMzAwOTIxMjM0NzUwWjBvMQswCQYD
VQQGEwJVUzETMBEGA1UECAwKQ2FsaWZvcm5pYTESMBAGA1UEBwwJUGFsbyBBbHRv
MQ8wDQYDVQQKDAZWTXdhcmUxDTALBgNVBAsMBENBUFYxFzAVBgNVBAMMDjE5Mi4x
NjguMTEwLjMwMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA0HKu1Y9J
UJIk888K6m7besxPQtyEB/dH+3iaobO196dFroR5TaC8Nrw8HWL6znuHkLs0wrpi
SB5hGjl2+4SqoqhQP/UPNTotsKs62emb15QhkZyUL8QlEHtVCVWPOMcdCgL1wMu9
pawz56XacwwrTFLh+Ao6XNuIGonyTSkTz1yDeNNkPd56YoQTQhgt9r4F6zhz3cnV
0UFtdLUyEcq7E40w8CcSFscXik7eCn3hiEiIQDwNPWl39MjUCRfB1M3T/GfKep78
Imh+NpsbqCvbHGNgUhJNMGwFXLHDc9WQL20neSkB4Zx373bo/FKCsxg1lPgMc5A2
edNbBNoijLxa+wIDAQABo0IwQDAPBgNVHRMBAf8EBTADAQH/MA4GA1UdDwEB/wQE
AwIBhjAdBgNVHQ4EFgQUZ/h8Xsd8SxAD3sqqXYDrj3UfLeQwDQYJKoZIhvcNAQEF
BQADggEBAGWXGphcZ/MsM/kZ9Edi2ksyAR8F0EiLomXrnssXCBPE14ppLbhRtsop
Ju8zbw4BPTvYGGYZx4UKIwkVZc2NImzLWf/vn4Uk+duX+dQREo7/EygPqWASyxKY
l0lhvZEVyNO4XkBUUHsJI5T0u06rYdjt1fFN39oyixEPRypL0D9AL9ZsGb/Gd84w
6MjH/5yxETLaX5hNVTmG2Tf2/vJ9B+Y4ENXGWS5zMk+67Nh/ZPLQgb3OfY4F8H6W
cjrz9/2/esptVRzTGXwbf5RkpVXl1JkD9hBOTruYMN/sTG4SGSzqPdINEcRaLQ0Z
6Fog5HgSJNOl4XfgiJMnoJ6iH5NNiKk=
-----END CERTIFICATE-----

Be sure to save this certificate as it will be needed when configuring vSphere with Tanzu. If you did not enable SSH, the vSphere with Tanzu Quick Start Guide provides a PowerShell example that will allow you to retrieve the certificate.
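If SSH is enabled, the whole retrieval can be collapsed into a single command that saves the certificate to a local file, ready to paste into the Workload Management wizard later. The IP address is the management address from this example and root SSH access is assumed.

```shell
# Capture the HAProxy CA certificate to a local file in one step.
# 192.168.110.30 is the management IP from this example deployment.
ssh root@192.168.110.30 'cat /etc/haproxy/ca.crt' > haproxy-ca.crt
```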

Configure Workload Management

Now we can finally enable vSphere with Tanzu! Navigate to the Workload Management page and complete the information displayed as appropriate.

Click the Get Started button, and click the next Get Started button you see.

Since we don’t have NSX-T deployed, the only available option is vCenter Server Network. Make sure the appropriate vCenter Server is selected.

Click the Next button.

Select a compatible cluster.

Click the Next button.

Select an appropriate size for your deployment.

Click the Next button.

Select an appropriate Storage Policy (k8s-policy that we created earlier in this example).

Click the Next button.

Complete the information related to your HAProxy deployment. The Data plane API Address should be the management IP address with port 5556 appended, the IP Address Ranges for Virtual Servers should be part of the Load Balancer IP Ranges specified during HAProxy deployment and the Server Certificate Authority should be the HAProxy certificate obtained earlier.

Click the Next button when you’re ready to proceed.

The Starting IP Address value will be the first of five consecutive IP addresses that can be used for Supervisor Cluster control plane nodes. Everything else here should be fairly standard.

Click the Next button.

You can leave the default IP address for Services value in place or choose something else. Enter an appropriate IP address for your DNS Server.

Click the Add button under Workload Network.

Choose an appropriate portgroup (K8s-Workload in this example) and enter the Layer 3 Routing Configuration information appropriate for your environment. You can see from this example that the IP Address Ranges is a subset of the same 192.168.130.0/24 network specified for the Workload Network during HAProxy deployment.

Click the Save button and then click the Next button on the Workload Network page.

Click the Add button next to Add Content Library.

Select the appropriate Content Library for serving up Kubernetes node images (tkg-cl which we created earlier in this example).

Click the OK button and then click the Next button on the TKG Configuration page.

If everything looks good on the Review and Confirm page, click the Finish button.

Your Workload Management page will look like this for a long time…an hour or more maybe. But you can check out a few other areas to see what’s happening.

You should see a Namespaces object created with three SupervisorControlPlaneVM VMs present under it.

You’ll also see several tasks related to the deployment and configuration of these VMs.

When the first VM powers on you should see that it has the first two IP addresses from the range of five IP addresses specified earlier (starting with 192.168.110.101 in this example). The very first IP address will be a VIP that floats between the Supervisor Cluster nodes as needed while the second IP address will remain on this node. As the other two nodes are powered up they should each only have one IP address from the range.

When the deployment is finished, your Workload Management page should look like the following:

You can see that the Control Plane Node IP address is the first in the range of Virtual Server IP Addresses specified during deployment (192.168.221.2). And if you point a browser to this address…

You’ll want to download the CLI Plugin for vSphere (appropriate for your operating system) and copy it to the system where you will run kubectl commands.

At this point we have a Supervisor Cluster up and running but we can’t do much with it. The Supervisor Cluster is analogous to a management cluster in TKG…it is part of Cluster API and will allow you to create workload clusters (Tanzu Kubernetes clusters) where you can run your workloads. There is a feature called Tanzu Kubernetes Grid Service (TKGS) that is responsible for handling the creation of Tanzu Kubernetes clusters. In the NSX-T variant of vSphere with Tanzu, you can run workloads in the Supervisor Cluster since each ESXi host acts as a worker and has a process called a Spherelet running that takes on the function that the kubelet normally would. The vSphere Networking model creates a Supervisor Cluster that has only control plane nodes and no workers, so you must create a Tanzu Kubernetes cluster to be able to deploy workloads.

Create a Namespace

The next thing you’ll need to do is to create a Namespace in the Supervisor Cluster. This isn’t exactly the same thing as a standard namespace in Kubernetes as it is integrated with vSphere and vital for granting access to the cluster.

On the Workload Management page, click on the Namespaces link.

Click on the Create Namespace button.

Select the appropriate cluster and network (network-1 which we created during deployment) and give the Namespace a name.

Click on the Create button.

We have a little bit of information about our new Namespace here but we need to add some Permissions and Storage before we can really make use of it.

Click on the Add Permissions button.

Complete the Add Permissions page as appropriate. Bear in mind that the user(s) you specify here will be able to log in to the Supervisor cluster via the kubectl command, and the Role assigned here dictates whether they are an administrator or have read-only access.

Click the OK button. On the Namespace page, click on the Add Storage button.

Select an appropriate datastore.

Click the OK button.

You could also set limits on the compute and storage resources available to this Namespace via the Capacity and Usage tile but it is not necessary for this example.

Create a Tanzu Kubernetes cluster

Now we’re ready to log in and create a Tanzu Kubernetes cluster. From the system where you have copied the CLI Plugin for vSphere, issue a command similar to the following to log in (note that the IP address is what was indicated on the Workload Management page and the user, administrator@vsphere.local, is what we specified when creating the Namespace):

kubectl vsphere login --server=192.168.221.2 -u administrator@vsphere.local
 
Password:
Logged in successfully.
 
You have access to the following contexts:
   192.168.221.2
   tkg
 
If the context you wish to use is not in this list, you may need to try
logging in again later, or contact your cluster administrator.
 
To change context, use `kubectl config use-context <workload name>`

If your login fails due to a certificate error, you can add the --insecure-skip-tls-verify switch to the kubectl vsphere login command.

Now that we’re logged in to the Supervisor cluster, check which context you’re using to make sure it’s set to the Namespace we just created.

kubectl config get-contexts

CURRENT   NAME            CLUSTER         AUTHINFO                                     NAMESPACE
          192.168.221.2   192.168.221.2   wcp:192.168.221.2:administrator@corp.tanzu
*         tkg             192.168.221.2   wcp:192.168.221.2:administrator@corp.tanzu   tkg

Creating a Tanzu Kubernetes cluster is as simple as running kubectl apply against a small bit of yaml that defines an instance of a Custom Resource Definition (CRD) called TanzuKubernetesCluster. The following is a minimally functional definition file that I have used in my lab:

apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: tkg-cluster
  namespace: tkg
spec:
  topology:
    controlPlane:
      count: 1
      class: best-effort-xsmall
      storageClass: k8s-policy
    workers:
      count: 2
      class: best-effort-xsmall
      storageClass: k8s-policy
  distribution:
    version: v1.17.8+vmware.1-tkg.1.5417466
  settings:
    network:
      cni:
        name: antrea
      services:
        cidrBlocks: ["198.51.100.0/12"]
      pods:
        cidrBlocks: ["192.0.2.0/16"]
      serviceDomain: managedcluster.local

A few of the main things to take away from this:

  • The name of the cluster is tkg-cluster.
  • The Namespace to which it will be deployed is tkg.
  • There is only one control plane node and two worker nodes.
  • The “class” is best-effort-xsmall, which means the nodes will not have any resources reserved (best-effort) and will have 2 vCPU, 2GB RAM and 16GB of disk. You can read more about Virtual Machine Class Types at Virtual Machine Class Types for Tanzu Kubernetes Clusters.
  • The storageClass is k8s-policy, the one we created earlier.
  • The Kubernetes version is 1.17.8 but since this is being deployed via a VMware-signed binary, the full version is v1.17.8+vmware.1-tkg.1.5417466. You can specify just the short version if you prefer (v1.17.8 in this example). The available Kubernetes versions that can be configured are determined by the items in the subscribed Content Library created earlier.
  • The Container Networking Interface (CNI) is Antrea, an open-source CNI developed by VMware. This is new for vSphere with Tanzu as the only option available previously was Calico (which is still available but the default is now Antrea).
  • The services and pod cidrBlocks are left at default values but could be changed if you prefer.
  • The serviceDomain default is cluster.local and does not need to be changed but I have set it to managedcluster.local. Keep in mind that this will have an impact on the FQDN for any services in your cluster.
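Since the usable versions come from the subscribed Content Library, you can list what has actually synced before writing your definition file. From the Supervisor cluster context, the node images are exposed as a resource you can query (a sketch; your output will vary with the contents of your library):

```shell
# List the Kubernetes node images synced from the subscribed Content
# Library; the version shown is what you can use in distribution.version.
kubectl get virtualmachineimages
```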

When you have your Tanzu Kubernetes cluster yaml definition file just the way you want, you can deploy it via a simple kubectl apply command.

kubectl apply -f demo-applications/tkg-cluster.yaml

tanzukubernetescluster.run.tanzu.vmware.com/tkg-cluster created

That’s a bit of a lie…it takes several minutes for the cluster to really be created. You’ll have to inspect a few different places to see everything that’s happening. I like to have one session open where I’m just watching the events from the Supervisor Cluster.

kubectl get events -w

LAST SEEN   TYPE     REASON             OBJECT                                           MESSAGE
23s         Normal   SuccessfulCreate   machineset/tkg-cluster-workers-5q242-7c8554795   Created machine "tkg-cluster-workers-5q242-7c8554795-d6hgz"
23s         Normal   SuccessfulCreate   machineset/tkg-cluster-workers-5q242-7c8554795   Created machine "tkg-cluster-workers-5q242-7c8554795-vxkt8"
24s         Normal   SuccessfulCreate   machinedeployment/tkg-cluster-workers-5q242      Created MachineSet "tkg-cluster-workers-5q242-7c8554795"
0s          Warning   ReconcileFailure   wcpcluster/tkg-cluster                           failed to configure resource policy for WCPCluster tkg/tkg-cluster: failed to create Resource Policy: Internal error occurred: failed calling webhook "default.validating.virtualmachinesetresourcepolicy.vmoperator.vmware.com": Post https://vmware-system-vmop-webhook-service.vmware-system-vmop.svc:443/default-validate-vmoperator-vmware-com-v1alpha1-virtualmachinesetresourcepolicy?timeout=30s: dial tcp 10.96.0.102:443: i/o timeout
0s          Warning   ReconcileFailure   wcpcluster/tkg-cluster                           unexpected error while reconciling control plane endpoint for tkg-cluster: failed to reconcile loadbalanced endpoint for WCPCluster tkg/tkg-cluster: failed to get control plane endpoint for Cluster tkg/tkg-cluster: VirtualMachineService LB does not yet have VIP assigned: VirtualMachineService LoadBalancer does not have any Ingresses
0s          Warning   ReconcileFailure   wcpcluster/tkg-cluster                           unexpected error while reconciling control plane endpoint for tkg-cluster: failed to reconcile loadbalanced endpoint for WCPCluster tkg/tkg-cluster: failed to get control plane endpoint for Cluster tkg/tkg-cluster: VirtualMachineService LB does not yet have VIP assigned: VirtualMachineService LoadBalancer does not have any Ingresses
0s          Normal    CreateVMServiceSuccess   virtualmachineservice/tkg-cluster-control-plane-service   CreateVMService success
0s          Warning   ReconcileFailure         wcpcluster/tkg-cluster                                    unexpected error while reconciling control plane endpoint for tkg-cluster: failed to reconcile loadbalanced endpoint for WCPCluster tkg/tkg-cluster: failed to get control plane endpoint for Cluster tkg/tkg-cluster: VirtualMachineService LB does not yet have VIP assigned: VirtualMachineService LoadBalancer does not have any Ingresses
0s          Warning   ReconcileFailure         wcpcluster/tkg-cluster                                    unexpected error while reconciling control plane endpoint for tkg-cluster: failed to reconcile loadbalanced endpoint for WCPCluster tkg/tkg-cluster: failed to get control plane endpoint for Cluster tkg/tkg-cluster: VirtualMachineService LB does not yet have VIP assigned: VirtualMachineService LoadBalancer does not have any Ingresses
0s          Warning   ReconcileFailure         wcpcluster/tkg-cluster                                    unexpected error while reconciling control plane endpoint for tkg-cluster: failed to reconcile loadbalanced endpoint for WCPCluster tkg/tkg-cluster: failed to get control plane endpoint for Cluster tkg/tkg-cluster: VirtualMachineService LB does not yet have VIP assigned: VirtualMachineService LoadBalancer does not have any Ingresses
0s          Warning   ReconcileFailure         wcpcluster/tkg-cluster                                    unexpected error while reconciling control plane endpoint for tkg-cluster: failed to reconcile loadbalanced endpoint for WCPCluster tkg/tkg-cluster: failed to get control plane endpoint for Cluster tkg/tkg-cluster: VirtualMachineService LB does not yet have VIP assigned: VirtualMachineService LoadBalancer does not have any Ingresses
0s          Warning   ReconcileFailure         wcpcluster/tkg-cluster                                    unexpected error while reconciling control plane endpoint for tkg-cluster: failed to reconcile loadbalanced endpoint for WCPCluster tkg/tkg-cluster: failed to get control plane endpoint for Cluster tkg/tkg-cluster: VirtualMachineService LB does not yet have VIP assigned: VirtualMachineService LoadBalancer does not have any Ingresses
0s          Normal    Reconcile                gateway/tkg-cluster-control-plane-service                 Success
0s          Normal    Reconcile                gateway/tkg-cluster-control-plane-service                 Success

You can also watch for the tkg-cluster item to be created under the tkg Namespace in the vSphere UI, along with the control plane and worker node VMs.

If you look closely, you’ll see that this node has an IP address of 192.168.130.6, which is in the range of Workload IPs we specified during deployment.

You can also check for the existence of virtualmachines (another new CRD) via the kubectl command.

kubectl get virtualmachines

NAME                                        AGE
tkg-cluster-control-plane-gxmcg             25m
tkg-cluster-workers-5q242-7c8554795-d6hgz   7m39s
tkg-cluster-workers-5q242-7c8554795-vxkt8   7m39s
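You can also track the overall provisioning state via the TanzuKubernetesCluster resource itself, which rolls the individual machine events up into a single phase (a sketch; the columns and phase values will vary as the deployment progresses):

```shell
# Watch the cluster object itself; the phase advances to running once
# the control plane and worker nodes are all up.
kubectl get tanzukubernetescluster tkg-cluster -w
```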

When the deployment is done, you should see all VMs in the Tanzu Kubernetes cluster powered on in the vSphere client.

Back at the command line, you’ll need to log in again and specify some additional information related to the Tanzu Kubernetes cluster before we can access it. Note that we’re accessing the same IP address and using the same user as before…all settings configured on a Supervisor Cluster Namespace are inherited by any Tanzu Kubernetes clusters created within it.

kubectl vsphere login --server=192.168.221.2 -u administrator@vsphere.local --tanzu-kubernetes-cluster-name tkg-cluster --tanzu-kubernetes-cluster-namespace tkg
 
Password:
Logged in successfully.
 
You have access to the following contexts:
   192.168.221.2
   tkg
   tkg-cluster
 
If the context you wish to use is not in this list, you may need to try
logging in again later, or contact your cluster administrator.
 
To change context, use `kubectl config use-context <workload name>`

You should be dropped into the new context, tkg-cluster, which corresponds to the Tanzu Kubernetes cluster, but you should check just to be sure.

kubectl config get-contexts

CURRENT   NAME            CLUSTER         AUTHINFO                                     NAMESPACE
          192.168.221.2   192.168.221.2   wcp:192.168.221.2:administrator@corp.tanzu
          tkg             192.168.221.2   wcp:192.168.221.2:administrator@corp.tanzu   tkg
*         tkg-cluster     192.168.221.3   wcp:192.168.221.3:administrator@corp.tanzu

And we can see that we have three nodes with names derived from the name of the cluster at the Kubernetes version specified in our definition file.

kubectl get nodes

NAME                                        STATUS   ROLES    AGE   VERSION
tkg-cluster-control-plane-gxmcg             Ready    master   35m   v1.17.8+vmware.1
tkg-cluster-workers-5q242-7c8554795-d6hgz   Ready    <none>   16m   v1.17.8+vmware.1
tkg-cluster-workers-5q242-7c8554795-vxkt8   Ready    <none>   14m   v1.17.8+vmware.1

You’ve now got a Kubernetes cluster up and running, ready to start taking on workloads.
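One gotcha worth knowing before you deploy anything: Tanzu Kubernetes clusters come up with pod security policy enforcement enabled, so pods created by deployments will typically fail to start until a role binding to one of the built-in policies exists. A commonly used lab-only workaround (far too permissive for production; the binding name below is arbitrary) looks similar to the following:

```shell
# Lab-only example: allow all authenticated users to use the privileged
# pod security policy so deployment-created pods can start. The binding
# name is arbitrary; tighten the subjects for anything beyond a lab.
kubectl create clusterrolebinding default-tkg-admin-privileged-binding \
  --clusterrole=psp:vmware-system-privileged \
  --group=system:authenticated
```

With that in place, a quick nginx deployment makes a good smoke test of the new cluster.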
