It was a bit of a learning curve dealing with Azure since I’d never laid eyes on it before, but the end result, a fully functional TKG cluster, was relatively easy to achieve. We’ll walk through both installation methods, the UI and the CLI. There are a few things that need to be done or collected first on the Azure side that are common to both methods.
Configure/Document Azure resources
When you log in to Azure, navigate to Azure Active Directory and you will be presented with a page similar to the following:
Make a note of the Tenant ID value as it will be used later.
While on the Active Directory page, click on the App registrations link on the right and then click on the New registration button at the top.

Give the new registration a meaningful name and set the access permissions appropriately.

Click the Register button when you’re ready to proceed. You should be presented with a page similar to the following:

Make a note of the Application (client) ID value as it will be used later.
Navigate to Subscriptions. If you have more than one, choose which one you’ll use for deploying TKG clusters and make a note of its Subscription ID value as it will be used later.
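As an aside, if you have the Azure CLI (az) installed and are already logged in via az login, you can pull both of these IDs without clicking through the portal. A rough sketch (check az account show --help for your version):
az account show --query "{tenantId: tenantId, subscriptionId: id}" --output json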
Click on the chosen Subscription and then click on the Access control (IAM) link on the left.

Click the Add button under Add a role assignment.
A new pane will appear on the right where you can configure the new role assignment. Set the Role to Owner, set the Assign access to value to Azure AD user, group, or service principal and then type in the name of your new application (clittle-tkg in my example) in the Select field. Your application should show up in the search results and clicking on it should move it into the Selected members list.

Click the Save button.
You can click on the Role assignments tab to validate that your application is listed as an owner.
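If you have the Azure CLI handy, you can also verify the assignment from the command line with something like the following (the Client ID shown is the one from this example):
az role assignment list --assignee a48db493-6b9f-4709-a73d-b04efa9cb05a --output table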
Head back to Azure Active Directory, App registrations and then click on your application.
Click on the Certificates & Secrets link on the left.

Click on the New client secret button.
Enter a descriptive name and set the expiration as appropriate…don’t follow this example in a production environment, as you would never want your client secret to have an indefinite expiration date.

Click the Add button.
Make a note of the value for your new secret as this is the only chance you’ll get to do so. Once you log out and back in or refresh your browser, the secret value will be obfuscated.
With all of this done, we now have the four parameters that will be needed during cluster creation:
Tenant ID: b39138ca-3cee-4b4a-a4d6-cd83d9dd62f0
Client ID: a48db493-6b9f-4709-a73d-b04efa9cb05a
Client Secret: xC2Ggr214o5e~mVcUX1OUW9WLZap_-0N
Subscription ID: 477a3190-70ed-47b1-a714-4ac99deb3f32
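As an aside, if you prefer the Azure CLI over the portal, the app registration, Owner role assignment and client secret can all be created in one shot with az ad sp create-for-rbac. This is just a sketch of that alternative path (syntax can vary slightly between az versions); the appId, password and tenant fields in its output map to the Client ID, Client Secret and Tenant ID above:
az ad sp create-for-rbac --name clittle-tkg --role Owner --scopes /subscriptions/477a3190-70ed-47b1-a714-4ac99deb3f32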
Prepare for cluster creation
You’ll need to download the tkg binary. You can find the latest tkg CLI download at <link>. There are several other utilities included with the CLI tgz file and you should go ahead and extract them as well.
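For example, on a Linux jump host, extracting and installing the binary might look like the following (the archive and binary names are illustrative; match them to whatever you actually downloaded):
tar -xzf tkg-linux-amd64-v1.2.0-vmware.1.tar.gz
sudo install tkg /usr/local/bin/tkg      # assumes the extracted binary is named tkg
tkg version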
You’ll also want to create an SSH key pair if you don’t already have one. This will be used for accessing the Azure virtual machines that are created. You can do this via a simple ssh-keygen command, similar to the following:
ssh-keygen -t rsa -b 4096 -C "admin@corp.tanzu"
Enter a location to save the key pair or accept the default when prompted. You can enter a passphrase to protect the key pair or leave it blank.
Generating public/private rsa key pair.
Enter file in which to save the key (/home/clittle/.ssh/id_rsa): /tmp/id_rsa
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /tmp/id_rsa.
Your public key has been saved in /tmp/id_rsa.pub.
The key fingerprint is:
SHA256:bhBWzRxqaa4Z6Dm3WwjW906e78GIak/yryI7GfVSLxI admin@corp.tanzu
The key's randomart image is:
+---[RSA 4096]----+
| .+.. |
| . o+ |
| o = |
| +E=. |
| +.++S. |
| o.ooOoo.o |
| +o*o*.+ o |
| +ooO + . . |
| .=+++o=oo |
+----[SHA256]-----+
Be sure to copy the contents of the .pub file created as it will be used later. In this example, it is /tmp/id_rsa.pub.
Provision your management cluster via the UI
Now we’re ready to kick things off with the tkg init --ui command, which will create the management cluster. You should see output similar to the following as well as a browser window being opened.
Logs of the command execution can also be found at: /tmp/tkg-20200914T115109677021347.log
Validating the pre-requisites...
Serving kickstart UI at http://127.0.0.1:8080
It’s worth noting that you can also use the --bind switch to specify a different address and port on which to serve the UI and the --browser switch to specify which browser to use. You can get more information on these switches by running tkg init --help.
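For example, if the tkg CLI is running on a remote jump host, you could serve the UI on all interfaces with something like the following (the none value for --browser is intended to suppress auto-launching a browser; confirm both switches with tkg init --help for your version):
tkg init --ui --bind 0.0.0.0:8080 --browser none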
In your browser, you should have a new tab open that looks similar to the following:
Click the Deploy button under Microsoft Azure and then fill in the details for the deployment per the information that was collected previously. If you have an existing resource group on Azure that you’d like to use, it can be selected from the dropdown, or you can let the installer create a new one for you. The choice of region is up to you.

Click the Next button.
If you have an existing VNET you’d like to use, select the appropriate radio button and then choose it from the dropdown. You’ll also need to choose appropriate control plane and worker node subnets. For this example, we’re letting the installer create a new VNET. Select the appropriate Resource Group from the dropdown and set the VNET CIDR Block as appropriate (or leave it at the default value like I did).

Click the Next button.
You have a number of choices to make on the Management Cluster Settings page.
- Development vs. Production: This will dictate the number of control plane nodes deployed (1 vs 3).
- Instance Type: This will dictate the compute and storage characteristics of the control plane virtual machines deployed to Azure. You can read more about the different size options at Sizes for virtual machines in Azure.
- Worker Node Instance Type: This will dictate the compute and storage characteristics of the worker virtual machines deployed to Azure.
- Machine Health Checks: This will dictate whether ClusterAPI will monitor the health of the deployed virtual machines and recreate them if they are deemed unhealthy. You can read more about machine health checks at Configure a MachineHealthCheck.
You can see in this example that I’ve selected a Development deployment with the smallest node size available and have named the management cluster mgmt-azure (if you leave this blank it will auto-generate a cluster name). I’ve also left the Machine Health Checks option enabled.

Click the Next button.
The Metadata page is entirely optional so complete it as you feel is appropriate.

This is a great new feature in the 1.2 version…the ability to choose your CNI. Prior to this version, we were locked into Calico, but you can now choose Antrea, Calico or None. If you choose None, you will need to manually deploy a CNI to your cluster later. You can alter the default CIDR values or leave them at the defaults (as I did).

Click the Next button.
Choose to participate in the CEIP or not.

Click the Next button.

Click the Review Configuration button.
If everything looks good, click the Deploy Management Cluster button.
You’ll be able to follow the high-level progress in the UI.

And if you head back to Azure and navigate to the Resource groups page, you’ll see that the Resource Group that you specified has already been created (assuming you didn’t choose to use an existing one).
As with deploying a TKG cluster to other IaaSes, you should see a bootstrap cluster created locally where you can follow the progress in more detail. Finding it should be as simple as running docker ps.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
43979bd1e7c1 projects-stg.registry.vmware.com/tkg/kind/node:v1.19.0_vmware.1 "/usr/local/bin/entr…" 30 seconds ago Up 15 seconds 127.0.0.1:62095->6443/tcp tkg-kind-btbu80jnov8m8h72efvg-control-plane
You could docker exec into this container and look around, but there is a better way of getting at what is happening in the bootstrap cluster. A kubeconfig file is created under your home directory at .kube-tkg\tmp and should be named similar to config_5nh1LwxL. You can use this to see what has been deployed to the bootstrap cluster and check the logs for various components.
kubectl --kubeconfig=config_5nh1LwxL get nodes
NAME STATUS ROLES AGE VERSION
tkg-kind-btbu80jnov8m8h72efvg-control-plane Ready master 83s v1.19.0+vmware.1
kubectl --kubeconfig=config_5nh1LwxL get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager-54cf965957-rrlpz 2/2 Running 0 2m39s
capi-kubeadm-control-plane-system capi-kubeadm-control-plane-controller-manager-5dc895c778-f7d95 2/2 Running 0 2m38s
capi-system capi-controller-manager-858d56cc8f-6s5tq 2/2 Running 0 2m40s
capi-webhook-system capi-controller-manager-659679fd44-l9wx6 2/2 Running 0 2m40s
capi-webhook-system capi-kubeadm-bootstrap-controller-manager-c96f4b9d-zbgjt 2/2 Running 0 2m39s
capi-webhook-system capi-kubeadm-control-plane-controller-manager-759c99fd9f-cx7zh 2/2 Running 0 2m38s
capi-webhook-system capz-controller-manager-575c4d8b5b-tqnsk 2/2 Running 0 2m36s
capz-system capz-controller-manager-664b574684-hztn9 2/2 Running 0 2m35s
cert-manager cert-manager-b98b948d8-xgfdn 1/1 Running 0 3m44s
cert-manager cert-manager-cainjector-577b45fb7c-gn5kl 1/1 Running 0 3m45s
cert-manager cert-manager-webhook-55c5cd4dcb-zwgmb 1/1 Running 0 3m44s
kube-system coredns-774fbc4754-8kv5g 1/1 Running 0 4m19s
kube-system coredns-774fbc4754-gx7jp 1/1 Running 0 4m19s
kube-system etcd-tkg-kind-btbu80jnov8m8h72efvg-control-plane 1/1 Running 0 4m34s
kube-system kindnet-mzt2n 1/1 Running 0 4m19s
kube-system kube-apiserver-tkg-kind-btbu80jnov8m8h72efvg-control-plane 1/1 Running 0 4m34s
kube-system kube-controller-manager-tkg-kind-btbu80jnov8m8h72efvg-control-plane 1/1 Running 0 4m34s
kube-system kube-proxy-qs6lq 1/1 Running 0 4m19s
kube-system kube-scheduler-tkg-kind-btbu80jnov8m8h72efvg-control-plane 1/1 Running 0 4m34s
local-path-storage local-path-provisioner-8b46957d4-c97ch 1/1 Running 0 4m19s
You should see more activity in the UI as the deployment progresses, especially as the bootstrap cluster is being instantiated.
If you want to follow along as the bootstrap cluster starts to provision resources on Azure, you can tail the logs from the capi-controller-manager-<#######> pod in the capi-system namespace and the capz-controller-manager-<#######> pod in the capz-system namespace.
kubectl --kubeconfig=config_5nh1LwxL -n capi-system logs capi-controller-manager-858d56cc8f-6s5tq manager -f
Name":"","bootstrap":{},"infrastructureRef":{}},"status":{"bootstrapReady":false,"infrastructureReady":false}}}
I0914 19:18:30.629218 1 controller.go:159] controller-runtime/controller "msg"="Starting Controller" "controller"="machinehealthcheck"
I0914 19:18:30.629276 1 controller.go:152] controller-runtime/controller "msg"="Starting EventSource" "controller"="machinedeployment" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","selector":{},"template":{"metadata":{},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}}}},"status":{}}}
I0914 19:18:30.629220 1 controller.go:152] controller-runtime/controller "msg"="Starting EventSource" "controller"="machineset" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"controlPlaneEndpoint":{"host":"","port":0}},"status":{"infrastructureReady":false,"controlPlaneInitialized":false}}}
I0914 19:18:30.629302 1 controller.go:152] controller-runtime/controller "msg"="Starting EventSource" "controller"="machinedeployment" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"controlPlaneEndpoint":{"host":"","port":0}},"status":{"infrastructureReady":false,"controlPlaneInitialized":false}}}
I0914 19:18:30.629325 1 controller.go:159] controller-runtime/controller "msg"="Starting Controller" "controller"="machineset"
I0914 19:18:30.629348 1 controller.go:159] controller-runtime/controller "msg"="Starting Controller" "controller"="machinedeployment"
I0914 19:18:30.729478 1 controller.go:152] controller-runtime/controller "msg"="Starting EventSource" "controller"="clusterresourceset" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I0914 19:18:30.729502 1 controller.go:180] controller-runtime/controller "msg"="Starting workers" "controller"="machineset" "worker count"=10
I0914 19:18:30.729532 1 controller.go:180] controller-runtime/controller "msg"="Starting workers" "controller"="cluster" "worker count"=10
I0914 19:18:30.729540 1 controller.go:180] controller-runtime/controller "msg"="Starting workers" "controller"="machinehealthcheck" "worker count"=10
I0914 19:18:30.729557 1 controller.go:180] controller-runtime/controller "msg"="Starting workers" "controller"="machinedeployment" "worker count"=10
I0914 19:18:30.830069 1 controller.go:159] controller-runtime/controller "msg"="Starting Controller" "controller"="clusterresourceset"
I0914 19:18:30.830139 1 controller.go:180] controller-runtime/controller "msg"="Starting workers" "controller"="clusterresourceset" "worker count"=10
I0914 19:19:03.289495 1 clusterresourceset_controller.go:247] controllers/ClusterResourceSet "msg"="Applying ClusterResourceSet to cluster" "cluster-name"="tkg-mgmt-azure-20200914130906" "clusterresourceset"="tkg-mgmt-azure-20200914130906-cni-antrea" "namespace"="tkg-system"
kubectl --kubeconfig=config_hetJb3TB -n capz-system logs capz-controller-manager-664b574684-mbntx manager
I0914 23:03:11.020616 1 azuremachine_controller.go:219] controllers/AzureMachine "msg"="Reconciling AzureMachine" "AzureCluster"="tkg-mgmt-azure-20200914165745" "azureMachine"="tkg-mgmt-azure-20200914165745-control-plane-rmdzf" "cluster"="tkg-mgmt-azure-20200914165745" "machine"="tkg-mgmt-azure-20200914165745-control-plane-zkv5h" "namespace"="tkg-system"
I0914 23:03:16.551570 1 azurecluster_controller.go:154] controllers/AzureCluster "msg"="Reconciling AzureCluster" "AzureCluster"="tkg-mgmt-azure-20200914165745" "cluster"="tkg-mgmt-azure-20200914165745" "namespace"="tkg-system"
I0914 23:03:16.552852 1 azuremachine_controller.go:219] controllers/AzureMachine "msg"="Reconciling AzureMachine" "AzureCluster"="tkg-mgmt-azure-20200914165745" "azureMachine"="tkg-mgmt-azure-20200914165745-md-0-h2qws" "cluster"="tkg-mgmt-azure-20200914165745" "machine"="tkg-mgmt-azure-20200914165745-md-0-586b965c45-67n7g" "namespace"="tkg-system"
I0914 23:03:16.553459 1 azuremachine_controller.go:241] controllers/AzureMachine "msg"="Bootstrap data secret reference is not yet available" "AzureCluster"="tkg-mgmt-azure-20200914165745" "azureMachine"="tkg-mgmt-azure-20200914165745-md-0-h2qws" "cluster"="tkg-mgmt-azure-20200914165745" "machine"="tkg-mgmt-azure-20200914165745-md-0-586b965c45-67n7g" "namespace"="tkg-system"
I0914 23:03:24.096700 1 azurecluster_controller.go:154] controllers/AzureCluster "msg"="Reconciling AzureCluster" "AzureCluster"="tkg-mgmt-azure-20200914165745" "cluster"="tkg-mgmt-azure-20200914165745" "namespace"="tkg-system"
I0914 23:05:02.536764 1 helpers.go:298] controllers/AzureJSONMachine "msg"="returning early from json reconcile, no update needed" "AzureMachine"="tkg-mgmt-azure-20200914165745-control-plane-rmdzf" "cluster"="tkg-mgmt-azure-20200914165745" "namespace"="tkg-system"
I0914 23:05:02.561623 1 azuremachine_controller.go:219] controllers/AzureMachine "msg"="Reconciling AzureMachine" "AzureCluster"="tkg-mgmt-azure-20200914165745" "azureMachine"="tkg-mgmt-azure-20200914165745-control-plane-rmdzf" "cluster"="tkg-mgmt-azure-20200914165745" "machine"="tkg-mgmt-azure-20200914165745-control-plane-zkv5h" "namespace"="tkg-system"
I0914 23:05:02.561754 1 helpers.go:298] controllers/AzureJSONMachine "msg"="returning early from json reconcile, no update needed" "AzureMachine"="tkg-mgmt-azure-20200914165745-control-plane-rmdzf" "cluster"="tkg-mgmt-azure-20200914165745" "namespace"="tkg-system"
I0914 23:05:04.177117 1 helpers.go:298] controllers/AzureJSONMachine "msg"="returning early from json reconcile, no update needed" "AzureMachine"="tkg-mgmt-azure-20200914165745-control-plane-rmdzf" "cluster"="tkg-mgmt-azure-20200914165745" "namespace"="tkg-system"
E0914 23:05:04.207322 1 controller.go:248] controller-runtime/controller "msg"="Reconciler error" "error"="error patching conditions: The condition \"Ready\" was modified by a different process and this caused a merge/ChangeCondition conflict: \u0026v1alpha3.Condition{\n \tType: \"Ready\",\n- \tStatus: \"Unknown\",\n+ \tStatus: \"True\",\n \tSeverity: \"\",\n- \tLastTransitionTime: v1.Time{Time: s\"2020-09-14 23:05:02 +0000 UTC\"},\n+ \tLastTransitionTime: v1.Time{Time: s\"2020-09-14 23:05:04 +0000 UTC\"},\n \tReason: \"\",\n \tMessage: \"\",\n }\n" "controller"="azuremachine" "name"="tkg-mgmt-azure-20200914165745-control-plane-rmdzf" "namespace"="tkg-system"
I0914 23:05:04.207539 1 helpers.go:298] controllers/AzureJSONMachine "msg"="returning early from json reconcile, no update needed" "AzureMachine"="tkg-mgmt-azure-20200914165745-control-plane-rmdzf" "cluster"="tkg-mgmt-azure-20200914165745" "namespace"="tkg-system"
I0914 23:05:04.208107 1 azuremachine_controller.go:219] controllers/AzureMachine "msg"="Reconciling AzureMachine" "AzureCluster"="tkg-mgmt-azure-20200914165745" "azureMachine"="tkg-mgmt-azure-20200914165745-control-plane-rmdzf" "cluster"="tkg-mgmt-azure-20200914165745" "machine"="tkg-mgmt-azure-20200914165745-control-plane-zkv5h" "namespace"="tkg-system"
I0914 23:05:05.623932 1 azuremachine_controller.go:219] controllers/AzureMachine "msg"="Reconciling AzureMachine" "AzureCluster"="tkg-mgmt-azure-20200914165745" "azureMachine"="tkg-mgmt-azure-20200914165745-control-plane-rmdzf" "cluster"="tkg-mgmt-azure-20200914165745" "machine"="tkg-mgmt-azure-20200914165745-control-plane-zkv5h" "namespace"="tkg-system"
I0914 23:05:05.626170 1 helpers.go:298] controllers/AzureJSONMachine "msg"="returning early from json reconcile, no update needed" "AzureMachine"="tkg-mgmt-azure-20200914165745-control-plane-rmdzf" "cluster"="tkg-mgmt-azure-20200914165745" "namespace"="tkg-system"
When the deployment is finished, you should be presented with a screen similar to the following in the UI:
And back in Azure you should see numerous objects created and running now.
If you go back to the command line where you issued the tkg init --ui command, you should see output similar to the following:
Logs of the command execution can also be found at: C:\Users\clittle\AppData\Local\Temp\tkg-20200908T135007539199435.log
Validating the pre-requisites...
Serving kickstart UI at http://127.0.0.1:8080
Validating configuration...
web socket connection established
sending pending 2 logs to UI
Using infrastructure provider azure:v0.4.8
Generating cluster configuration...
Setting up bootstrapper...
Bootstrapper created. Kubeconfig: C:\Users\clittle\.kube-tkg\tmp\config_5nh1LwxL
Installing providers on bootstrapper...
Fetching providers
Installing cert-manager Version="v0.16.1"
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v0.3.9" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v0.3.9" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v0.3.9" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-azure" Version="v0.4.8" TargetNamespace="capz-system"
Start creating management cluster...
Saving management cluster kuebconfig into C:\Users\clittle/.kube/config
Installing providers on management cluster...
Fetching providers
Installing cert-manager Version="v0.16.1"
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v0.3.9" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v0.3.9" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v0.3.9" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-azure" Version="v0.4.8" TargetNamespace="capz-system"
Waiting for the management cluster to get ready for move...
Waiting for addons installation...
Moving all Cluster API objects from bootstrap cluster to management cluster...
Performing move...
Discovering Cluster API objects
Moving Cluster API objects Clusters=1
Creating objects in the target cluster
Deleting objects from the source cluster
Context set for management cluster tkg-mgmt-azure-20200908140522 as 'tkg-mgmt-azure-20200908140522-admin@tkg-mgmt-azure-20200908140522'.
Management cluster created!
You can now create your first workload cluster by running the following:
tkg create cluster [name] --kubernetes-version=[version] --plan=[plan]
At this point you can jump down to the section on creating a workload cluster if you have no intention of deploying a management cluster via the CLI.
Provision your management cluster via the CLI
The first thing to do when creating the management cluster via the CLI is to run the tkg get mc command. This will build out the .tkg folder structure that is required for all subsequent steps.
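A quick sketch of that first run follows (mc is simply shorthand for management-cluster); the command will report that no management clusters exist yet, but it scaffolds the ~/.tkg directory, including the config.yaml we edit next:
tkg get management-cluster
ls ~/.tkg/config.yaml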
With this done, open the .tkg/config.yaml file in a text editor. You should see something similar to the following:
cert-manager-timeout: 30m0s
overridesFolder: /home/clittle/.tkg/overrides
NODE_STARTUP_TIMEOUT: 20m
BASTION_HOST_ENABLED: "true"
providers:
  - name: cluster-api
    url: /home/clittle/.tkg/providers/cluster-api/v0.3.9/core-components.yaml
    type: CoreProvider
  - name: aws
    url: /home/clittle/.tkg/providers/infrastructure-aws/v0.5.5/infrastructure-components.yaml
    type: InfrastructureProvider
  - name: vsphere
    url: /home/clittle/.tkg/providers/infrastructure-vsphere/v0.7.1/infrastructure-components.yaml
    type: InfrastructureProvider
  - name: azure
    url: /home/clittle/.tkg/providers/infrastructure-azure/v0.4.8/infrastructure-components.yaml
    type: InfrastructureProvider
  - name: tkg-service-vsphere
    url: /home/clittle/.tkg/providers/infrastructure-tkg-service-vsphere/v1.0.0/unused.yaml
    type: InfrastructureProvider
  - name: kubeadm
    url: /home/clittle/.tkg/providers/bootstrap-kubeadm/v0.3.9/bootstrap-components.yaml
    type: BootstrapProvider
  - name: kubeadm
    url: /home/clittle/.tkg/providers/control-plane-kubeadm/v0.3.9/control-plane-components.yaml
    type: ControlPlaneProvider
  - name: docker
    url: /home/clittle/.tkg/providers/infrastructure-docker/v0.3.6/infrastructure-components.yaml
    type: InfrastructureProvider
images:
  all:
    repository: projects-stg.registry.vmware.com/tkg/cluster-api
  cert-manager:
    repository: projects-stg.registry.vmware.com/tkg/cert-manager
    tag: v0.16.1_vmware.1
release:
  version: v1.2.0-pre-alpha-292-gd80115e
There are a number of parameters that we’ll need to add in here to get it into the configuration we desire:
- AZURE_TENANT_ID: The Tenant ID that was noted earlier.
- AZURE_CLIENT_ID: The Client ID that was noted earlier.
- AZURE_CLIENT_SECRET: The Client Secret value that was noted earlier.
- AZURE_SUBSCRIPTION_ID: The Subscription ID value that was noted earlier.
- AZURE_LOCATION: The Azure region (location) where you would like the management cluster deployed, eastus in this example.
- AZURE_SSH_PUBLIC_KEY_B64: The base64-encoded public key from the SSH key pair that was created earlier. You can run a command similar to base64 /tmp/id_rsa.pub to get this value (see the example just after this list).
- AZURE_RESOURCE_GROUP: A new or existing resource group.
- AZURE_VNET_NAME: A new or existing VNET. If existing, you also need to specify the AZURE_CONTROL_PLANE_SUBNET_NAME, AZURE_CONTROL_PLANE_SUBNET_CIDR, AZURE_NODE_SUBNET_NAME and AZURE_NODE_SUBNET_CIDR values.
- AZURE_VNET_CIDR: The VNET CIDR.
- AZURE_CONTROL_PLANE_MACHINE_TYPE: The control plane VM type.
- AZURE_NODE_MACHINE_TYPE: The worker VM type.
- MACHINE_HEALTH_CHECK_ENABLED: Whether to monitor node health.
- CLUSTER_CIDR: The cluster pod CIDR.
- SERVICE_CIDR: The cluster service CIDR.
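For example, on a Linux host with GNU coreutils, the following produces the single-line value expected for AZURE_SSH_PUBLIC_KEY_B64 (the -w0 switch disables line wrapping; macOS base64 behaves slightly differently):
base64 -w0 /tmp/id_rsa.pub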
When the .tkg/config.yaml file is finished, it should look similar to the following:
cert-manager-timeout: 30m0s
overridesFolder: /home/clittle/.tkg/overrides
NODE_STARTUP_TIMEOUT: 20m
BASTION_HOST_ENABLED: "true"
providers:
  - name: cluster-api
    url: /home/clittle/.tkg/providers/cluster-api/v0.3.9/core-components.yaml
    type: CoreProvider
  - name: aws
    url: /home/clittle/.tkg/providers/infrastructure-aws/v0.5.5/infrastructure-components.yaml
    type: InfrastructureProvider
  - name: vsphere
    url: /home/clittle/.tkg/providers/infrastructure-vsphere/v0.7.1/infrastructure-components.yaml
    type: InfrastructureProvider
  - name: azure
    url: /home/clittle/.tkg/providers/infrastructure-azure/v0.4.8/infrastructure-components.yaml
    type: InfrastructureProvider
  - name: tkg-service-vsphere
    url: /home/clittle/.tkg/providers/infrastructure-tkg-service-vsphere/v1.0.0/unused.yaml
    type: InfrastructureProvider
  - name: kubeadm
    url: /home/clittle/.tkg/providers/bootstrap-kubeadm/v0.3.9/bootstrap-components.yaml
    type: BootstrapProvider
  - name: kubeadm
    url: /home/clittle/.tkg/providers/control-plane-kubeadm/v0.3.9/control-plane-components.yaml
    type: ControlPlaneProvider
  - name: docker
    url: /home/clittle/.tkg/providers/infrastructure-docker/v0.3.6/infrastructure-components.yaml
    type: InfrastructureProvider
images:
  all:
    repository: projects-stg.registry.vmware.com/tkg/cluster-api
  cert-manager:
    repository: projects-stg.registry.vmware.com/tkg/cert-manager
    tag: v0.16.1_vmware.1
release:
  version: v1.2.0-pre-alpha-292-gd80115e
AZURE_TENANT_ID: b39138ca-3cee-4b4a-a4d6-cd83d9dd62f0
AZURE_CLIENT_ID: a48db493-6b9f-4709-a73d-b04efa9cb05a
AZURE_CLIENT_SECRET: xC2Ggr214o5e~mVcUX1OUW9WLZap_-0N
AZURE_SUBSCRIPTION_ID: 477a3190-70ed-47b1-a714-4ac99deb3f32
AZURE_LOCATION: eastus
AZURE_SSH_PUBLIC_KEY_B64: c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFDQVFEYnV2YnRlWWMvVlJVR1dIdFdIckdHbWdKWnBrUGdMbEhqbmg0ZXZRdkFWdDU5Q0NFTWNqdC84UDU2U1dhVEhWY2ZtUVBlSGlDTVNIbE5aRmt1bWR1VXpteUx0LzlDZ2ZxS1ViL3hVZDhFM0c1b091VkhWUi9QMy9EODdZL2ovNzg3V0x4MUhXbWxxOFlMN1RBbzE2SmdiQi9URjRFYkFYRWpORjVhaURVZHBGRzFHZ05oY1hmR1pkRU8zdFlUTERJSVloTWpIeFJndUc4YldpRWdvVmIwNkRWL0NvWUtqdU1XMzNTUFI4WmozOXRxNEdJV3hWV0VpdmFmNFZ5azRqZ2YrYW5sNkR1ZW1GT09FUUxoVTI0T3lYUTNjNHJ2c2xVaEhPL1V3VisxSTViK0s1K1ltRzJHZDI5Si8xMXNVSXB0bDlnMDdRUG95OUFxb0JWTFRSdVQ0N0NPRnp0QVFMeHBxWkVuUUJoNnFDTUxRQU9icCtHNWlxMk5XTXEvczBFWDZFUEEyNUZYT2UvTk9wRzRSMGRKZXE1TDRMcTN2N091SGlZV3ZNNTdxWFh0N21yVHlRSmxjaGNPL21VcWNjL3FzOXRnUUtxbFBUVHhuOE1GTHpNdHNxMWVPdmdBemNzZEtySTVaa1dVR1AvNHJxa1J6RXNyeFF5NHBQSmVQRHpPWFBJRlJWdDJlVzQrenI1YTI1SkRZa0Zoalp6Y3Byc1VaZmZQREtrdEpsczBZcDRzdXFVc05YRTJDUWtOaU4veEo0dGExUTVOQWN0YnpuRHYzS0J3MlIrR2VWQU9uMi83anV3ako5R2dWdWdYYXFDVWkxVnRBTDI5L0w5a1Q5N3lCTEE5QkdTY3ZvU0taY29tbTU5ZVAyUHhkWXlkNmIvUjZuUGc2aEhvalE9PSBhZG1pbkBjb3JwLnRhbnp1
AZURE_RESOURCE_GROUP: cjl-tkg
AZURE_VNET_NAME: cjl-tkg
AZURE_VNET_CIDR: 10.0.0.0/8
AZURE_CONTROL_PLANE_MACHINE_TYPE: Standard_D2s_v3
AZURE_NODE_MACHINE_TYPE: Standard_D2s_v3
MACHINE_HEALTH_CHECK_ENABLED: "true"
CLUSTER_CIDR: 100.96.0.0/11
SERVICE_CIDR: 100.64.0.0/13
You may want to increase the cert-manager-timeout and/or the NODE_STARTUP_TIMEOUT values if you suspect that the deployment may run long.
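For example, bumping both values in .tkg/config.yaml might look like this (the numbers are arbitrary; pick something that suits your environment):
cert-manager-timeout: 45m0s
NODE_STARTUP_TIMEOUT: 30m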
Now we’re ready to run our tkg init command. There are a few parameters that we’ll be passing in to get this management cluster configured similarly to the one that was deployed via the UI:
- -i azure: This tells the installer that we’re deploying to Azure.
- -p dev: This selects the dev plan, which is a single control plane node and a single worker node.
- --ceip-participation true: This enrolls our cluster in the Customer Experience Improvement Program.
- --name mgmt-azure: This sets the name of the cluster to mgmt-azure.
- --cni antrea: This deploys Antrea as the CNI.
- -v 6: This sets a high verbosity for the output of the tkg command.
Our final command looks like the following:
tkg init -i azure -p dev --ceip-participation true --name mgmt-azure --cni antrea -v 6
And you’ll see a lot of output at the command line as the bootstrap cluster is created (this output is heavily truncated):
Logs of the command execution can also be found at: /tmp/tkg-20200915T124357718805799.log
Using configuration file: /home/clittle/.tkg/config.yaml
Validating the pre-requisites...
Setting up management cluster...
Validating configuration...
Using infrastructure provider azure:v0.4.8
Generating cluster configuration...
Fetching File="cluster-template-definition-dev.yaml" Provider="infrastructure-azure" Version="v0.4.8"
Setting up bootstrapper...
Fetching configuration for kind node image...
Creating kind cluster: tkg-kind-btggmrs80laenk092gng
Ensuring node image (projects-stg.registry.vmware.com/tkg/kind/node:v1.19.1_vmware.1) ...
Image: projects-stg.registry.vmware.com/tkg/kind/node:v1.19.1_vmware.1 present locally
Preparing nodes ...
Writing configuration ...
Starting control-plane ...
Installing CNI ...
Installing StorageClass ...
Waiting 2m0s for control-plane = Ready ...
Ready after 26s
Bootstrapper created. Kubeconfig: /home/clittle/.kube-tkg/tmp/config_UHMrZXOd
Checking cluster reachability...
Installing providers on bootstrapper...
Installing the clusterctl inventory CRD
Creating CustomResourceDefinition="providers.clusterctl.cluster.x-k8s.io"
Fetching providers
Fetching File="core-components.yaml" Provider="cluster-api" Version="v0.3.9"
Fetching File="bootstrap-components.yaml" Provider="bootstrap-kubeadm" Version="v0.3.9"
Fetching File="control-plane-components.yaml" Provider="control-plane-kubeadm" Version="v0.3.9"
Fetching File="infrastructure-components.yaml" Provider="infrastructure-azure" Version="v0.4.8"
Fetching File="metadata.yaml" Provider="cluster-api" Version="v0.3.9"
Fetching File="metadata.yaml" Provider="bootstrap-kubeadm" Version="v0.3.9"
Fetching File="metadata.yaml" Provider="control-plane-kubeadm" Version="v0.3.9"
Fetching File="metadata.yaml" Provider="infrastructure-azure" Version="v0.4.8"
Creating Namespace="cert-manager-test"
Installing cert-manager Version="v0.16.1"
Creating Namespace="cert-manager"
Creating CustomResourceDefinition="certificaterequests.cert-manager.io"
Creating CustomResourceDefinition="certificates.cert-manager.io"
You’ll see the same kind of activity at Azure as was noted for the UI-based install, and you’ll also be able to follow the logs in the bootstrap cluster by using the kubeconfig file under .kube-tkg/tmp and tailing the logs of the pods in the capi-system and capz-system namespaces.
When the process is finished, you should see output similar to the following:
Resuming the target cluster
Set Cluster.Spec.Paused Paused=false Cluster="mgmt-azure" Namespace="tkg-system"
Context set for management cluster mgmt-azure as 'mgmt-azure-admin@mgmt-azure'.
Deleting kind cluster: tkg-kind-btggmrs80laenk092gng
Management cluster created!
You can now create your first workload cluster by running the following:
tkg create cluster [name] --kubernetes-version=[version] --plan=[plan]
Deploy a workload cluster
Per the message at the end of the tkg init output, you can now use the tkg create cluster command to create your first workload cluster on Azure. But first, we’ll want to run a few other commands to make sure we’re really ready.
The first thing we’ll do is validate that our management cluster is really up and accessible.
tkg get mc
MANAGEMENT-CLUSTER-NAME CONTEXT-NAME STATUS
tkg-mgmt-azure-20200908140522 * tkg-mgmt-azure-20200908140522-admin@tkg-mgmt-azure-20200908140522 Success
The next thing is to make sure that our context is created and set correctly. Part of the process of creating the management cluster is to create a context for it as well, but we will still need to set it as the current one.
kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
tkg-mgmt-azure-20200908140522-admin@tkg-mgmt-azure-20200908140522 tkg-mgmt-azure-20200908140522 tkg-mgmt-azure-20200908140522-admin
kubectl config use-context tkg-mgmt-azure-20200908140522-admin@tkg-mgmt-azure-20200908140522
Switched to context "tkg-mgmt-azure-20200908140522-admin@tkg-mgmt-azure-20200908140522".
And we can now validate the configuration of our management cluster.
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
tkg-mgmt-azure-20200908140522-control-plane-6lntl Ready master 21m v1.19.0 10.0.0.4 <none> Ubuntu 18.04.5 LTS 5.4.0-1023-azure containerd://1.3.4
tkg-mgmt-azure-20200908140522-md-0-l4rkz Ready <none> 18m v1.19.0 10.1.0.4 <none> Ubuntu 18.04.5 LTS 5.4.0-1023-azure containerd://1.3.4
kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager-54cf965957-nnrbf 2/2 Running 0 12m
capi-kubeadm-control-plane-system capi-kubeadm-control-plane-controller-manager-5dc895c778-kr84f 2/2 Running 0 12m
capi-system capi-controller-manager-858d56cc8f-gfkmp 2/2 Running 0 12m
capi-webhook-system capi-controller-manager-659679fd44-lqnpv 2/2 Running 0 12m
capi-webhook-system capi-kubeadm-bootstrap-controller-manager-c96f4b9d-9zlvs 2/2 Running 0 12m
capi-webhook-system capi-kubeadm-control-plane-controller-manager-759c99fd9f-9fjnq 2/2 Running 0 12m
capi-webhook-system capz-controller-manager-575c4d8b5b-tj9t4 2/2 Running 0 12m
capz-system capz-controller-manager-664b574684-kx4fs 2/2 Running 0 12m
cert-manager cert-manager-b98b948d8-8l4kx 1/1 Running 0 20m
cert-manager cert-manager-cainjector-577b45fb7c-vmncm 1/1 Running 0 20m
cert-manager cert-manager-webhook-55c5cd4dcb-pmxdh 1/1 Running 0 20m
kube-system antrea-agent-pdcfx 2/2 Running 0 21m
kube-system antrea-agent-zmskh 2/2 Running 2 19m
kube-system antrea-controller-d5b5cd9f8-4s9kv 1/1 Running 0 21m
kube-system coredns-85fc8659b-mf6tn 1/1 Running 0 21m
kube-system coredns-85fc8659b-rwgcf 1/1 Running 0 21m
kube-system etcd-tkg-mgmt-azure-20200908140522-control-plane-6lntl 1/1 Running 0 21m
kube-system kube-apiserver-tkg-mgmt-azure-20200908140522-control-plane-6lntl 1/1 Running 0 21m
kube-system kube-controller-manager-tkg-mgmt-azure-20200908140522-control-plane-6lntl 1/1 Running 0 21m
kube-system kube-proxy-hwd7b 1/1 Running 0 21m
kube-system kube-proxy-lnfdf 1/1 Running 0 19m
kube-system kube-scheduler-tkg-mgmt-azure-20200908140522-control-plane-6lntl 1/1 Running 0 21m
Everything is looking good so we can proceed with creating a workload cluster.
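Since Machine Health Checks were left enabled, you can also confirm that the corresponding MachineHealthCheck object exists on the management cluster. This is just a quick sanity check and the exact names and namespaces will vary:
kubectl get machinehealthchecks -A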
If you want your workload cluster nodes to end up in the same Azure Resource Group and use the same sizing as your management cluster, you can proceed straight to running your tkg create cluster command. Otherwise, you’ll want to open the .tkg/config.yaml file and make some changes. Specific parameters to pay attention to are:
AZURE_RESOURCE_GROUP
AZURE_VNET_NAME
AZURE_VNET_CIDR
AZURE_CONTROL_PLANE_MACHINE_TYPE
AZURE_NODE_MACHINE_TYPE
MACHINE_HEALTH_CHECK_ENABLED
If the management cluster was deployed via the UI, you won’t see the AZURE_RESOURCE_GROUP, AZURE_VNET_NAME and AZURE_VNET_CIDR parameters, and the default will be to create new ones with the same name as the workload cluster.
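As an illustration, overriding a few of these in .tkg/config.yaml before creating the workload cluster could look like the following (the values here are made up for this example):
AZURE_RESOURCE_GROUP: cjl-tkg-workload
AZURE_VNET_NAME: cjl-tkg-workload
AZURE_NODE_MACHINE_TYPE: Standard_D4s_v3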
When you’re ready, you can run a command similar to the following to kick off the deployment:
tkg create cluster azure-wld -p dev
As you can see, I’m taking a very simple approach to how I want this cluster created. I’m naming it azure-wld and using an unaltered dev plan (one control plane node and one worker node). You can use lots of other parameters to fine-tune the cluster, similar to what was done with the tkg init command. You can run tkg create cluster --help to see all of the available options.
The output from this command is very sparse by default.
Logs of the command execution can also be found at: C:\Users\clittle\AppData\Local\Temp\tkg-20200908T144203017991719.log
Validating configuration...
Creating workload cluster 'azure-wld'...
Waiting for cluster to be initialized...
Waiting for cluster nodes to be available...
Waiting for addons installation...
Workload cluster 'azure-wld' created
You can pass the -v # switch to the tkg create cluster command to increase the output verbosity, where # is a number…higher is more verbose.
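For example, the same deployment with fairly verbose output would look like:
tkg create cluster azure-wld -p dev -v 6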
When the deployment is done, you can run the tkg get clusters command to see that your cluster is created successfully.
tkg get clusters
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES
azure-wld default running 1/1 1/1 v1.19.0+vmware.1 workload
And back in Azure you should see a new Resource Group created whose name matches the name of your workload cluster. There should be several items in it (very similar to how the Resource Group for the management cluster looks).

We can now use the tkg get credentials command to populate our kubeconfig file with the context information for the new workload cluster.
tkg get credentials azure-wld
Credentials of workload cluster 'azure-wld' have been saved
You can now access the cluster by running 'kubectl config use-context azure-wld-admin@azure-wld'
kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
azure-wld-admin@azure-wld azure-wld azure-wld-admin
* tkg-mgmt-azure-20200908140522-admin@tkg-mgmt-azure-20200908140522 tkg-mgmt-azure-20200908140522 tkg-mgmt-azure-20200908140522-admin
The context can be changed to the workload cluster context and we can then start using kubectl to manage the cluster.
kubectl config use-context azure-wld-admin@azure-wld
Switched to context "azure-wld-admin@azure-wld".
kubectl get nodes
NAME STATUS ROLES AGE VERSION
azure-wld-control-plane-6j9jd Ready master 7m19s v1.19.0
azure-wld-md-0-fbpls Ready <none> 5m8s v1.19.0
If you find that you need direct access to these nodes (or the nodes in the management cluster), you can use the SSH key pair to get to them. Unfortunately, the kubectl get nodes -o wide command does not show the external IP address:
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
azure-wld-control-plane-mjl28 Ready master 10m v1.19.1 10.0.0.4 <none> Ubuntu 18.04.5 LTS 5.4.0-1025-azure containerd://1.3.4
azure-wld-md-0-pgqp5 Ready <none> 8m24s v1.19.1 10.1.0.4 <none> Ubuntu 18.04.5 LTS 5.4.0-1025-azure containerd://1.3.4
However, you can get the external IP address of each node in the Azure UI if you navigate to Resource Groups, click on the resource group that maps to your cluster, and then click on the appropriate virtual machine.
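If you have the Azure CLI installed, a quicker option is to list the IP addresses for the virtual machines in the cluster’s resource group, something like the following (this assumes the resource group is named after the workload cluster, as it is in this example):
az vm list-ip-addresses --resource-group azure-wld --output table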
Then you can issue a command similar to the following to access the node:
ssh -i /tmp/id_rsa capi@52.226.47.63
The authenticity of host '52.226.47.63 (52.226.47.63)' can't be established.
ECDSA key fingerprint is SHA256:8kVqR/u8VhdT/YQPGT4HWllUIeDeXnt153MAhuBBcks.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '52.226.47.63' (ECDSA) to the list of known hosts.
Enter passphrase for key '/tmp/id_rsa':
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-1025-azure x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
System information as of Tue Sep 15 21:15:53 UTC 2020
System load: 0.13 Processes: 168
Usage of /: 3.8% of 123.88GB Users logged in: 0
Memory usage: 15% IP address for eth0: 10.0.0.4
Swap usage: 0% IP address for antrea-gw0: 100.96.0.1
* Kubernetes 1.19 is out! Get it in one command with:
sudo snap install microk8s --channel=1.19 --classic
https://microk8s.io/ has docs and details.
0 packages can be updated.
0 updates are security updates.
The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.
capi@azure-wld-control-plane-qct2g:~$