I’ve written a few posts in the past about using packages (previously extensions) within the context of TKG clusters installed on vSphere (Upgrading from TKG 1.3 to 1.4 (including extensions) on vSphere, How to configure external-dns with Microsoft DNS in TKG 1.3 (plus Harbor and Contour), Working with TKG Extensions and Shared Services in TKG 1.2). The same package framework is available for TKGS clusters in vSphere with Tanzu as well. In this post, I’ll walk through installing most of them.
Packages provide extra functionality to a TKGS cluster. There are many different kinds of packages that can be installed for different purposes: ingress, dynamic DNS record creation, certificate services and more.
Install the tanzu CLI
You can download the tanzu CLI from Customer Connect. You’ll find it under the Tanzu Kubernetes Grid product, and there are OS-specific versions available. You should have a .tar.gz file whose contents you’ll need to extract on a system with access to the cluster.
tar -zxvf tanzu-cli-bundle-linux-amd64.tar.gz
cli/
cli/core/
cli/core/v0.28.1/
cli/core/v0.28.1/tanzu-core-linux_amd64
cli/tanzu-framework-plugins-standalone-linux-amd64.tar.gz
cli/tanzu-framework-plugins-context-linux-amd64.tar.gz
cli/ytt-linux-amd64-v0.43.1+vmware.1.gz
cli/kapp-linux-amd64-v0.53.2+vmware.1.gz
cli/imgpkg-linux-amd64-v0.31.1+vmware.1.gz
cli/kbld-linux-amd64-v0.35.1+vmware.1.gz
cli/vendir-linux-amd64-v0.30.1+vmware.1.gz
This is obviously for a Linux system, so the tanzu-core-linux_amd64 file needs to be made executable (and moved to a location that is more easily accessible).
mv cli/core/v0.28.1/tanzu-core-linux_amd64 /usr/local/bin/tanzu
chmod +x /usr/local/bin/tanzu
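As a quick sanity check, you can ask the CLI to report its version (your exact build details may differ from mine):

tanzu version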
You can now run the tanzu init command to do the initial configuration of the tanzu CLI.
tanzu init
ℹ Checking for required plugins...
ℹ Installing plugin 'secret:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'telemetry:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'isolated-cluster:v0.28.1'
ℹ Installing plugin 'login:v0.28.1'
ℹ Installing plugin 'management-cluster:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'package:v0.28.1' with target 'kubernetes'
ℹ Installing plugin 'pinniped-auth:v0.28.1'
ℹ Successfully installed all required plugins
✔ successfully initialized CLI
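If you want to see exactly which plugins were installed, the tanzu plugin list command will show them along with their versions and status:

tanzu plugin list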
Prepare the TKGS cluster for package installation
Use the kubectl vsphere login command to access the TKGS cluster.
kubectl vsphere login --server wcp.corp.vmw -u vmwadmin@corp.vmw --tanzu-kubernetes-cluster-namespace tkg2-cluster-namespace --tanzu-kubernetes-cluster-name tkg2-cluster-1
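The login command adds a context for the workload cluster to your kubeconfig. Make sure you’re working against that cluster before proceeding (the context name should match the cluster name, tkg2-cluster-1 in my case):

kubectl config use-context tkg2-cluster-1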
A PackageRepository is needed as it defines where the packages can be downloaded from. Create a PackageRepository spec file (packagerepo.yaml) and then deploy it to the cluster:
apiVersion: packaging.carvel.dev/v1alpha1
kind: PackageRepository
metadata:
  name: tanzu-standard
  namespace: tkg-system
spec:
  fetch:
    imgpkgBundle:
      image: projects.registry.vmware.com/tkg/packages/standard/repo:v1.6.0
Note: You can get the proper image value from the VMware Install Package Repository documentation.
kubectl apply -f packagerepo.yaml
Check to make sure that the PackageRepository object is created:
kubectl get packagerepositories -A
NAMESPACE NAME AGE DESCRIPTION
tkg-system tanzu-standard 33s Reconcile succeeded
tanzu package repository list -A
NAMESPACE NAME SOURCE STATUS
tkg-system tanzu-standard (imgpkg) projects.registry.vmware.com/tkg/packages/standard/repo:v1.6.0 Reconcile succeeded
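With the repository reconciled, you can also have the tanzu CLI enumerate everything the repository offers:

tanzu package available list -A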
Now you’re ready to start installing packages.
Install the cert-manager package
cert-manager is a prerequisite for pretty much every other package, so it needs to be installed first.
Create the cert-manager namespace:
kubectl create ns cert-manager
Before you can install the cert-manager package, you need to know which version to install. You can get a listing of all available packages and their versions by querying the packages CRD.
kubectl -n tkg-system get packages
NAME PACKAGEMETADATA NAME VERSION AGE
cert-manager.tanzu.vmware.com.1.1.0+vmware.1-tkg.2 cert-manager.tanzu.vmware.com 1.1.0+vmware.1-tkg.2 2m31s
cert-manager.tanzu.vmware.com.1.1.0+vmware.2-tkg.1 cert-manager.tanzu.vmware.com 1.1.0+vmware.2-tkg.1 2m31s
cert-manager.tanzu.vmware.com.1.5.3+vmware.2-tkg.1 cert-manager.tanzu.vmware.com 1.5.3+vmware.2-tkg.1 2m31s
cert-manager.tanzu.vmware.com.1.5.3+vmware.4-tkg.1 cert-manager.tanzu.vmware.com 1.5.3+vmware.4-tkg.1 2m31s
cert-manager.tanzu.vmware.com.1.7.2+vmware.1-tkg.1 cert-manager.tanzu.vmware.com 1.7.2+vmware.1-tkg.1 2m31s
contour.tanzu.vmware.com.1.17.1+vmware.1-tkg.1 contour.tanzu.vmware.com 1.17.1+vmware.1-tkg.1 2m31s
contour.tanzu.vmware.com.1.17.2+vmware.1-tkg.2 contour.tanzu.vmware.com 1.17.2+vmware.1-tkg.2 2m31s
contour.tanzu.vmware.com.1.17.2+vmware.1-tkg.3 contour.tanzu.vmware.com 1.17.2+vmware.1-tkg.3 2m31s
contour.tanzu.vmware.com.1.18.2+vmware.1-tkg.1 contour.tanzu.vmware.com 1.18.2+vmware.1-tkg.1 2m31s
contour.tanzu.vmware.com.1.20.2+vmware.1-tkg.1 contour.tanzu.vmware.com 1.20.2+vmware.1-tkg.1 2m31s
external-dns.tanzu.vmware.com.0.10.0+vmware.1-tkg.1 external-dns.tanzu.vmware.com 0.10.0+vmware.1-tkg.1 2m31s
external-dns.tanzu.vmware.com.0.10.0+vmware.1-tkg.2 external-dns.tanzu.vmware.com 0.10.0+vmware.1-tkg.2 2m31s
external-dns.tanzu.vmware.com.0.11.0+vmware.1-tkg.2 external-dns.tanzu.vmware.com 0.11.0+vmware.1-tkg.2 2m31s
external-dns.tanzu.vmware.com.0.8.0+vmware.1-tkg.1 external-dns.tanzu.vmware.com 0.8.0+vmware.1-tkg.1 2m31s
fluent-bit.tanzu.vmware.com.1.7.5+vmware.1-tkg.1 fluent-bit.tanzu.vmware.com 1.7.5+vmware.1-tkg.1 2m31s
fluent-bit.tanzu.vmware.com.1.7.5+vmware.2-tkg.1 fluent-bit.tanzu.vmware.com 1.7.5+vmware.2-tkg.1 2m31s
fluent-bit.tanzu.vmware.com.1.8.15+vmware.1-tkg.1 fluent-bit.tanzu.vmware.com 1.8.15+vmware.1-tkg.1 2m31s
fluxcd-helm-controller.tanzu.vmware.com.0.21.0+vmware.1-tkg.1 fluxcd-helm-controller.tanzu.vmware.com 0.21.0+vmware.1-tkg.1 2m31s
fluxcd-kustomize-controller.tanzu.vmware.com.0.24.4+vmware.1-tkg.1 fluxcd-kustomize-controller.tanzu.vmware.com 0.24.4+vmware.1-tkg.1 2m31s
fluxcd-source-controller.tanzu.vmware.com.0.24.4+vmware.1-tkg.1 fluxcd-source-controller.tanzu.vmware.com 0.24.4+vmware.1-tkg.1 2m31s
fluxcd-source-controller.tanzu.vmware.com.0.24.4+vmware.1-tkg.2 fluxcd-source-controller.tanzu.vmware.com 0.24.4+vmware.1-tkg.2 2m31s
fluxcd-source-controller.tanzu.vmware.com.0.24.4+vmware.1-tkg.4 fluxcd-source-controller.tanzu.vmware.com 0.24.4+vmware.1-tkg.4 2m31s
grafana.tanzu.vmware.com.7.5.16+vmware.1-tkg.1 grafana.tanzu.vmware.com 7.5.16+vmware.1-tkg.1 2m31s
grafana.tanzu.vmware.com.7.5.7+vmware.1-tkg.1 grafana.tanzu.vmware.com 7.5.7+vmware.1-tkg.1 2m30s
grafana.tanzu.vmware.com.7.5.7+vmware.2-tkg.1 grafana.tanzu.vmware.com 7.5.7+vmware.2-tkg.1 2m30s
harbor.tanzu.vmware.com.2.2.3+vmware.1-tkg.1 harbor.tanzu.vmware.com 2.2.3+vmware.1-tkg.1 2m30s
harbor.tanzu.vmware.com.2.2.3+vmware.1-tkg.2 harbor.tanzu.vmware.com 2.2.3+vmware.1-tkg.2 2m30s
harbor.tanzu.vmware.com.2.3.3+vmware.1-tkg.1 harbor.tanzu.vmware.com 2.3.3+vmware.1-tkg.1 2m30s
harbor.tanzu.vmware.com.2.5.3+vmware.1-tkg.1 harbor.tanzu.vmware.com 2.5.3+vmware.1-tkg.1 2m30s
multus-cni.tanzu.vmware.com.3.7.1+vmware.1-tkg.1 multus-cni.tanzu.vmware.com 3.7.1+vmware.1-tkg.1 2m30s
multus-cni.tanzu.vmware.com.3.7.1+vmware.2-tkg.1 multus-cni.tanzu.vmware.com 3.7.1+vmware.2-tkg.1 2m30s
multus-cni.tanzu.vmware.com.3.7.1+vmware.2-tkg.2 multus-cni.tanzu.vmware.com 3.7.1+vmware.2-tkg.2 2m30s
multus-cni.tanzu.vmware.com.3.8.0+vmware.1-tkg.1 multus-cni.tanzu.vmware.com 3.8.0+vmware.1-tkg.1 2m30s
prometheus.tanzu.vmware.com.2.27.0+vmware.1-tkg.1 prometheus.tanzu.vmware.com 2.27.0+vmware.1-tkg.1 2m29s
prometheus.tanzu.vmware.com.2.27.0+vmware.2-tkg.1 prometheus.tanzu.vmware.com 2.27.0+vmware.2-tkg.1 2m29s
prometheus.tanzu.vmware.com.2.36.2+vmware.1-tkg.1 prometheus.tanzu.vmware.com 2.36.2+vmware.1-tkg.1 2m29s
whereabouts.tanzu.vmware.com.0.5.1+vmware.2-tkg.1 whereabouts.tanzu.vmware.com 0.5.1+vmware.2-tkg.1 2m29s
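If you only care about a single package, the tanzu CLI can produce a similar but filtered view. For example, to see just the cert-manager versions:

tanzu package available list cert-manager.tanzu.vmware.com -A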
We’ll go with the most recent version, 1.7.2, for this installation and install it to the cert-manager namespace that was created earlier.
tanzu package install cert-manager -p cert-manager.tanzu.vmware.com -n cert-manager -v 1.7.2+vmware.1-tkg.1
tanzu package install cert-manager output
10:33:00AM: Creating service account 'cert-manager-cert-manager-sa'
10:33:00AM: Creating cluster admin role 'cert-manager-cert-manager-cluster-role'
10:33:00AM: Creating cluster role binding 'cert-manager-cert-manager-cluster-rolebinding'
10:33:00AM: Creating overlay secrets
10:33:00AM: Creating package install resource
10:33:00AM: Waiting for PackageInstall reconciliation for 'cert-manager'
10:33:00AM: Fetch started (3s ago)
10:33:03AM: Fetching
| apiVersion: vendir.k14s.io/v1alpha1
| directories:
| - contents:
| - imgpkgBundle:
| image: projects.registry.vmware.com/tkg/packages/standard/cert-manager@sha256:e7711b3ce0f05ece458d43c4ddb57f3ff3b98fe562d83b07db4b095d6789c292
| path: .
| path: "0"
| kind: LockConfig
|
10:33:03AM: Fetch succeeded
10:33:03AM: Template succeeded
10:33:03AM: Deploy started (2s ago)
10:33:05AM: Deploying
| Target cluster 'https://10.96.0.1:443' (nodes: tkg2-cluster-1-5vs4x-lflzx, 2+)
| Changes
| Namespace Name Kind Age Op Op st. Wait to Rs Ri
| (cluster) cert-manager Namespace 10m update - reconcile ok -
| ^ cert-manager-cainjector ClusterRole - create - reconcile - -
| ^ cert-manager-cainjector ClusterRoleBinding - create - reconcile - -
| ^ cert-manager-controller-approve:cert-manager-io ClusterRole - create - reconcile - -
| ^ cert-manager-controller-approve:cert-manager-io ClusterRoleBinding - create - reconcile - -
| ^ cert-manager-controller-certificates ClusterRole - create - reconcile - -
| ^ cert-manager-controller-certificates ClusterRoleBinding - create - reconcile - -
| ^ cert-manager-controller-certificatesigningrequests ClusterRole - create - reconcile - -
| ^ cert-manager-controller-certificatesigningrequests ClusterRoleBinding - create - reconcile - -
| ^ cert-manager-controller-challenges ClusterRole - create - reconcile - -
| ^ cert-manager-controller-challenges ClusterRoleBinding - create - reconcile - -
| ^ cert-manager-controller-clusterissuers ClusterRole - create - reconcile - -
| ^ cert-manager-controller-clusterissuers ClusterRoleBinding - create - reconcile - -
| ^ cert-manager-controller-ingress-shim ClusterRole - create - reconcile - -
| ^ cert-manager-controller-ingress-shim ClusterRoleBinding - create - reconcile - -
| ^ cert-manager-controller-issuers ClusterRole - create - reconcile - -
| ^ cert-manager-controller-issuers ClusterRoleBinding - create - reconcile - -
| ^ cert-manager-controller-orders ClusterRole - create - reconcile - -
| ^ cert-manager-controller-orders ClusterRoleBinding - create - reconcile - -
| ^ cert-manager-edit ClusterRole - create - reconcile - -
| ^ cert-manager-view ClusterRole - create - reconcile - -
| ^ cert-manager-webhook MutatingWebhookConfiguration - create - reconcile - -
| ^ cert-manager-webhook ValidatingWebhookConfiguration - create - reconcile - -
| ^ cert-manager-webhook:subjectaccessreviews ClusterRole - create - reconcile - -
| ^ cert-manager-webhook:subjectaccessreviews ClusterRoleBinding - create - reconcile - -
| ^ certificaterequests.cert-manager.io CustomResourceDefinition - create - reconcile - -
| ^ certificates.cert-manager.io CustomResourceDefinition - create - reconcile - -
| ^ challenges.acme.cert-manager.io CustomResourceDefinition - create - reconcile - -
| ^ clusterissuers.cert-manager.io CustomResourceDefinition - create - reconcile - -
| ^ issuers.cert-manager.io CustomResourceDefinition - create - reconcile - -
| ^ orders.acme.cert-manager.io CustomResourceDefinition - create - reconcile - -
| cert-manager cert-manager Deployment - create - reconcile - -
| ^ cert-manager Service - create - reconcile - -
| ^ cert-manager ServiceAccount - create - reconcile - -
| ^ cert-manager-cainjector Deployment - create - reconcile - -
| ^ cert-manager-cainjector ServiceAccount - create - reconcile - -
| ^ cert-manager-webhook ConfigMap - create - reconcile - -
| ^ cert-manager-webhook Deployment - create - reconcile - -
| ^ cert-manager-webhook Service - create - reconcile - -
| ^ cert-manager-webhook ServiceAccount - create - reconcile - -
| ^ cert-manager-webhook:dynamic-serving Role - create - reconcile - -
| ^ cert-manager-webhook:dynamic-serving RoleBinding - create - reconcile - -
| kube-system cert-manager-cainjector:leaderelection Role - create - reconcile - -
| ^ cert-manager-cainjector:leaderelection RoleBinding - create - reconcile - -
| ^ cert-manager:leaderelection Role - create - reconcile - -
| ^ cert-manager:leaderelection RoleBinding - create - reconcile - -
| Op: 45 create, 0 delete, 1 update, 0 noop, 0 exists
| Wait to: 46 reconcile, 0 delete, 0 noop
| 5:33:07PM: ---- applying 23 changes [0/46 done] ----
| 5:33:07PM: create validatingwebhookconfiguration/cert-manager-webhook (admissionregistration.k8s.io/v1) cluster
| 5:33:07PM: create clusterrole/cert-manager-controller-challenges (rbac.authorization.k8s.io/v1) cluster
| 5:33:08PM: create customresourcedefinition/certificaterequests.cert-manager.io (apiextensions.k8s.io/v1) cluster
| 5:33:08PM: create customresourcedefinition/certificates.cert-manager.io (apiextensions.k8s.io/v1) cluster
| 5:33:09PM: update namespace/cert-manager (v1) cluster
| 5:33:09PM: create customresourcedefinition/challenges.acme.cert-manager.io (apiextensions.k8s.io/v1) cluster
| 5:33:09PM: create clusterrole/cert-manager-controller-issuers (rbac.authorization.k8s.io/v1) cluster
| 5:33:09PM: create clusterrole/cert-manager-cainjector (rbac.authorization.k8s.io/v1) cluster
| 5:33:09PM: create clusterrole/cert-manager-controller-certificates (rbac.authorization.k8s.io/v1) cluster
| 5:33:09PM: create clusterrole/cert-manager-controller-clusterissuers (rbac.authorization.k8s.io/v1) cluster
| 5:33:09PM: create clusterrole/cert-manager-controller-orders (rbac.authorization.k8s.io/v1) cluster
| 5:33:09PM: create clusterrole/cert-manager-controller-certificatesigningrequests (rbac.authorization.k8s.io/v1) cluster
| 5:33:09PM: create clusterrole/cert-manager-controller-ingress-shim (rbac.authorization.k8s.io/v1) cluster
| 5:33:09PM: create clusterrole/cert-manager-view (rbac.authorization.k8s.io/v1) cluster
| 5:33:09PM: create clusterrole/cert-manager-edit (rbac.authorization.k8s.io/v1) cluster
| 5:33:09PM: create clusterrole/cert-manager-controller-approve:cert-manager-io (rbac.authorization.k8s.io/v1) cluster
| 5:33:09PM: create clusterrole/cert-manager-webhook:subjectaccessreviews (rbac.authorization.k8s.io/v1) cluster
| 5:33:09PM: create role/cert-manager-cainjector:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
| 5:33:09PM: create role/cert-manager:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
| 5:33:09PM: create mutatingwebhookconfiguration/cert-manager-webhook (admissionregistration.k8s.io/v1) cluster
| 5:33:09PM: create customresourcedefinition/issuers.cert-manager.io (apiextensions.k8s.io/v1) cluster
| 5:33:09PM: create customresourcedefinition/clusterissuers.cert-manager.io (apiextensions.k8s.io/v1) cluster
| 5:33:10PM: create customresourcedefinition/orders.acme.cert-manager.io (apiextensions.k8s.io/v1) cluster
| 5:33:10PM: ---- waiting on 23 changes [0/46 done] ----
| 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-challenges (rbac.authorization.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile validatingwebhookconfiguration/cert-manager-webhook (admissionregistration.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile customresourcedefinition/orders.acme.cert-manager.io (apiextensions.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile customresourcedefinition/challenges.acme.cert-manager.io (apiextensions.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-issuers (rbac.authorization.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile namespace/cert-manager (v1) cluster
| 5:33:10PM: ok: reconcile customresourcedefinition/certificaterequests.cert-manager.io (apiextensions.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile clusterrole/cert-manager-cainjector (rbac.authorization.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-clusterissuers (rbac.authorization.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-certificates (rbac.authorization.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile customresourcedefinition/certificates.cert-manager.io (apiextensions.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile clusterrole/cert-manager-edit (rbac.authorization.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-orders (rbac.authorization.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-ingress-shim (rbac.authorization.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-certificatesigningrequests (rbac.authorization.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile clusterrole/cert-manager-view (rbac.authorization.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile clusterrole/cert-manager-webhook:subjectaccessreviews (rbac.authorization.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-approve:cert-manager-io (rbac.authorization.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile mutatingwebhookconfiguration/cert-manager-webhook (admissionregistration.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile role/cert-manager:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
| 5:33:10PM: ok: reconcile role/cert-manager-cainjector:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
| 5:33:10PM: ok: reconcile customresourcedefinition/clusterissuers.cert-manager.io (apiextensions.k8s.io/v1) cluster
| 5:33:10PM: ok: reconcile customresourcedefinition/issuers.cert-manager.io (apiextensions.k8s.io/v1) cluster
| 5:33:10PM: ---- applying 5 changes [23/46 done] ----
| 5:33:10PM: create role/cert-manager-webhook:dynamic-serving (rbac.authorization.k8s.io/v1) namespace: cert-manager
| 5:33:10PM: create configmap/cert-manager-webhook (v1) namespace: cert-manager
| 5:33:10PM: create serviceaccount/cert-manager (v1) namespace: cert-manager
| 5:33:11PM: create serviceaccount/cert-manager-webhook (v1) namespace: cert-manager
| 5:33:11PM: create serviceaccount/cert-manager-cainjector (v1) namespace: cert-manager
| 5:33:11PM: ---- waiting on 5 changes [23/46 done] ----
| 5:33:11PM: ok: reconcile serviceaccount/cert-manager-webhook (v1) namespace: cert-manager
| 5:33:11PM: ok: reconcile role/cert-manager-webhook:dynamic-serving (rbac.authorization.k8s.io/v1) namespace: cert-manager
| 5:33:11PM: ok: reconcile serviceaccount/cert-manager-cainjector (v1) namespace: cert-manager
| 5:33:11PM: ok: reconcile serviceaccount/cert-manager (v1) namespace: cert-manager
| 5:33:11PM: ok: reconcile configmap/cert-manager-webhook (v1) namespace: cert-manager
| 5:33:11PM: ---- applying 13 changes [28/46 done] ----
| 5:33:11PM: create clusterrolebinding/cert-manager-controller-challenges (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: create clusterrolebinding/cert-manager-controller-issuers (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: create clusterrolebinding/cert-manager-cainjector (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: create clusterrolebinding/cert-manager-controller-clusterissuers (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: create rolebinding/cert-manager-webhook:dynamic-serving (rbac.authorization.k8s.io/v1) namespace: cert-manager
| 5:33:11PM: create clusterrolebinding/cert-manager-controller-certificates (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: create clusterrolebinding/cert-manager-controller-ingress-shim (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: create clusterrolebinding/cert-manager-controller-orders (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: create clusterrolebinding/cert-manager-controller-certificatesigningrequests (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: create clusterrolebinding/cert-manager-controller-approve:cert-manager-io (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: create rolebinding/cert-manager-cainjector:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
| 5:33:11PM: create clusterrolebinding/cert-manager-webhook:subjectaccessreviews (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: create rolebinding/cert-manager:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
| 5:33:11PM: ---- waiting on 13 changes [28/46 done] ----
| 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-clusterissuers (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-challenges (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: ok: reconcile rolebinding/cert-manager-webhook:dynamic-serving (rbac.authorization.k8s.io/v1) namespace: cert-manager
| 5:33:11PM: ok: reconcile rolebinding/cert-manager:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
| 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-issuers (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-cainjector (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-certificates (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: ok: reconcile rolebinding/cert-manager-cainjector:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
| 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-webhook:subjectaccessreviews (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-certificatesigningrequests (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-ingress-shim (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-approve:cert-manager-io (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-orders (rbac.authorization.k8s.io/v1) cluster
| 5:33:11PM: ---- applying 5 changes [41/46 done] ----
| 5:33:12PM: create service/cert-manager (v1) namespace: cert-manager
| 5:33:12PM: create service/cert-manager-webhook (v1) namespace: cert-manager
| 5:33:13PM: create deployment/cert-manager-cainjector (apps/v1) namespace: cert-manager
| 5:33:13PM: create deployment/cert-manager (apps/v1) namespace: cert-manager
| 5:33:13PM: create deployment/cert-manager-webhook (apps/v1) namespace: cert-manager
| 5:33:13PM: ---- waiting on 5 changes [41/46 done] ----
| 5:33:13PM: ok: reconcile service/cert-manager-webhook (v1) namespace: cert-manager
| 5:33:13PM: ok: reconcile service/cert-manager (v1) namespace: cert-manager
| 5:33:13PM: ongoing: reconcile deployment/cert-manager-cainjector (apps/v1) namespace: cert-manager
| 5:33:13PM: ^ Waiting for 1 unavailable replicas
| 5:33:13PM: L ok: waiting on replicaset/cert-manager-cainjector-79bf859fb7 (apps/v1) namespace: cert-manager
| 5:33:13PM: L ongoing: waiting on pod/cert-manager-cainjector-79bf859fb7-hnzc4 (v1) namespace: cert-manager
| 5:33:13PM: ^ Pending: ContainerCreating
| 5:33:13PM: ongoing: reconcile deployment/cert-manager (apps/v1) namespace: cert-manager
| 5:33:13PM: ^ Waiting for generation 2 to be observed
| 5:33:13PM: L ok: waiting on replicaset/cert-manager-6c69844b6b (apps/v1) namespace: cert-manager
| 5:33:13PM: L ongoing: waiting on pod/cert-manager-6c69844b6b-bnmcp (v1) namespace: cert-manager
| 5:33:13PM: ^ Pending: ContainerCreating
| 5:33:13PM: ongoing: reconcile deployment/cert-manager-webhook (apps/v1) namespace: cert-manager
| 5:33:13PM: ^ Waiting for generation 2 to be observed
| 5:33:13PM: L ok: waiting on replicaset/cert-manager-webhook-599749fbd5 (apps/v1) namespace: cert-manager
| 5:33:13PM: L ongoing: waiting on pod/cert-manager-webhook-599749fbd5-5c2wx (v1) namespace: cert-manager
| 5:33:13PM: ^ Pending: ContainerCreating
| 5:33:13PM: ---- waiting on 3 changes [43/46 done] ----
| 5:33:13PM: ongoing: reconcile deployment/cert-manager-webhook (apps/v1) namespace: cert-manager
| 5:33:13PM: ^ Waiting for 1 unavailable replicas
| 5:33:13PM: L ok: waiting on replicaset/cert-manager-webhook-599749fbd5 (apps/v1) namespace: cert-manager
| 5:33:13PM: L ongoing: waiting on pod/cert-manager-webhook-599749fbd5-5c2wx (v1) namespace: cert-manager
| 5:33:13PM: ^ Pending: ContainerCreating
| 5:33:13PM: ongoing: reconcile deployment/cert-manager (apps/v1) namespace: cert-manager
| 5:33:13PM: ^ Waiting for 1 unavailable replicas
| 5:33:13PM: L ok: waiting on replicaset/cert-manager-6c69844b6b (apps/v1) namespace: cert-manager
| 5:33:13PM: L ongoing: waiting on pod/cert-manager-6c69844b6b-bnmcp (v1) namespace: cert-manager
| 5:33:13PM: ^ Pending: ContainerCreating
| 5:33:27PM: ok: reconcile deployment/cert-manager (apps/v1) namespace: cert-manager
| 5:33:27PM: ---- waiting on 2 changes [44/46 done] ----
| 5:33:33PM: ongoing: reconcile deployment/cert-manager-webhook (apps/v1) namespace: cert-manager
| 5:33:33PM: ^ Waiting for 1 unavailable replicas
| 5:33:33PM: L ok: waiting on replicaset/cert-manager-webhook-599749fbd5 (apps/v1) namespace: cert-manager
| 5:33:33PM: L ongoing: waiting on pod/cert-manager-webhook-599749fbd5-5c2wx (v1) namespace: cert-manager
| 5:33:33PM: ^ Condition Ready is not True (False)
| 5:33:37PM: ok: reconcile deployment/cert-manager-cainjector (apps/v1) namespace: cert-manager
| 5:33:37PM: ---- waiting on 1 changes [45/46 done] ----
| 5:33:37PM: ongoing: reconcile deployment/cert-manager-webhook (apps/v1) namespace: cert-manager
| 5:33:37PM: ^ Waiting for 1 unavailable replicas
| 5:33:37PM: L ok: waiting on replicaset/cert-manager-webhook-599749fbd5 (apps/v1) namespace: cert-manager
| 5:33:37PM: L ok: waiting on pod/cert-manager-webhook-599749fbd5-5c2wx (v1) namespace: cert-manager
| 5:33:38PM: ok: reconcile deployment/cert-manager-webhook (apps/v1) namespace: cert-manager
| 5:33:38PM: ---- applying complete [46/46 done] ----
| 5:33:38PM: ---- waiting complete [46/46 done] ----
| Succeeded
10:33:38AM: Deploy succeeded (1s ago)
You can use either the kubectl or the tanzu command to query the installation of the cert-manager package.
kubectl -n cert-manager get packageinstalls
NAME PACKAGE NAME PACKAGE VERSION DESCRIPTION AGE
cert-manager cert-manager.tanzu.vmware.com 1.7.2+vmware.1-tkg.1 Reconcile succeeded 106s
tanzu package installed list -n cert-manager
NAME PACKAGE-NAME PACKAGE-VERSION STATUS
cert-manager cert-manager.tanzu.vmware.com 1.7.2+vmware.1-tkg.1 Reconcile succeeded
tanzu package installed get -n cert-manager cert-manager
NAMESPACE: cert-manager
NAME: cert-manager
PACKAGE-NAME: cert-manager.tanzu.vmware.com
PACKAGE-VERSION: 1.7.2+vmware.1-tkg.1
STATUS: Reconcile succeeded
CONDITIONS: - type: ReconcileSucceeded
status: "True"
reason: ""
message: ""
You can also check the cert-manager namespace for created resources.
kubectl -n cert-manager get all
NAME READY STATUS RESTARTS AGE
pod/cert-manager-6c69844b6b-bnmcp 1/1 Running 0 16m
pod/cert-manager-cainjector-79bf859fb7-hnzc4 1/1 Running 0 16m
pod/cert-manager-webhook-599749fbd5-5c2wx 1/1 Running 0 16m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cert-manager ClusterIP 10.105.81.68 <none> 9402/TCP 16m
service/cert-manager-webhook ClusterIP 10.96.177.102 <none> 443/TCP 16m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/cert-manager 1/1 1 1 16m
deployment.apps/cert-manager-cainjector 1/1 1 1 16m
deployment.apps/cert-manager-webhook 1/1 1 1 16m
NAME DESIRED CURRENT READY AGE
replicaset.apps/cert-manager-6c69844b6b 1 1 1 16m
replicaset.apps/cert-manager-cainjector-79bf859fb7 1 1 1 16m
replicaset.apps/cert-manager-webhook-599749fbd5 1 1 1 16m
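If you want to go one step beyond checking that the pods are running, a quick (and entirely optional) smoke test is to create a throwaway self-signed Issuer; if the cert-manager webhook admits it, the installation is healthy. The names here are arbitrary:

cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: selfsigned-test
  namespace: default
spec:
  selfSigned: {}
EOF

kubectl -n default delete issuer selfsigned-test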
Install the Contour package
It’s always a good idea to install Contour early on as it will provide the needed ingress for other packages.
With most packages, you have to define a “data-values” specification that determines much of how the package will be configured. You can read up on the available parameters for each package at Installing Tanzu Packages on TKG 2 Clusters on Supervisor. The following is the contour-data-values.yaml file used for this installation:
infrastructure_provider: vsphere
namespace: tanzu-system-ingress
contour:
  configFileContents: {}
  useProxyProtocol: false
  replicas: 2
  pspNames: "vmware-system-restricted"
  logLevel: info
envoy:
  service:
    type: LoadBalancer
    annotations: {}
    nodePorts:
      http: null
      https: null
    externalTrafficPolicy: Cluster
    disableWait: false
  hostPorts:
    enable: true
    http: 80
    https: 443
  hostNetwork: false
  terminationGracePeriodSeconds: 300
  logLevel: info
  pspNames: null
certificates:
  duration: 8760h
  renewBefore: 360h
There is very little that is customized here. The most important point is that the envoy service is of type LoadBalancer. This means that it will be realized by NSX and will be accessible from outside of the cluster (very important for ingress).
Note: You can use the imgpkg command (included in the tanzu CLI download) to pull a default data-values yaml file, via a command similar to the following:
imgpkg pull -b $(kubectl -n tkg-system get packages contour.tanzu.vmware.com.1.20.2+vmware.1-tkg.1 -o jsonpath='{.spec.template.spec.fetch[0].imgpkgBundle.image}') -o /tmp/contour
The default file in this example would be located at /tmp/contour/config/values.yaml.
You can use this same process to get default data-values yaml files for other packages as well.
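For example, to pull the defaults for the external-dns package that gets installed later in this post:

imgpkg pull -b $(kubectl -n tkg-system get packages external-dns.tanzu.vmware.com.0.11.0+vmware.1-tkg.2 -o jsonpath='{.spec.template.spec.fetch[0].imgpkgBundle.image}') -o /tmp/external-dns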
Just as was done for cert-manager, a unique namespace should be created for the Contour package.
kubectl create ns tanzu-system-ingress
You will also need to specify a version of Contour to deploy. From the earlier kubectl -n tkg-system get packages output, you can see the available Contour versions.
contour.tanzu.vmware.com.1.17.1+vmware.1-tkg.1 contour.tanzu.vmware.com 1.17.1+vmware.1-tkg.1 2m31s
contour.tanzu.vmware.com.1.17.2+vmware.1-tkg.2 contour.tanzu.vmware.com 1.17.2+vmware.1-tkg.2 2m31s
contour.tanzu.vmware.com.1.17.2+vmware.1-tkg.3 contour.tanzu.vmware.com 1.17.2+vmware.1-tkg.3 2m31s
contour.tanzu.vmware.com.1.18.2+vmware.1-tkg.1 contour.tanzu.vmware.com 1.18.2+vmware.1-tkg.1 2m31s
contour.tanzu.vmware.com.1.20.2+vmware.1-tkg.1 contour.tanzu.vmware.com 1.20.2+vmware.1-tkg.1 2m31s
The latest version is 1.20.2 so we’ll install that.
tanzu package install contour -p contour.tanzu.vmware.com -v 1.20.2+vmware.1-tkg.1 --values-file contour-data-values.yaml -n tanzu-system-ingress
The output is skipped for this and the rest of the packages as it’s very similar to the output for the cert-manager package. When the installation is done you should see output similar to the following:
10:49:47AM: Deploy succeeded
You should also validate that the package installation was successful and the necessary objects were created.
tanzu package installed list -n tanzu-system-ingress
NAME PACKAGE-NAME PACKAGE-VERSION STATUS
contour contour.tanzu.vmware.com 1.20.2+vmware.1-tkg.1 Reconcile succeeded
kubectl -n tanzu-system-ingress get all
NAME READY STATUS RESTARTS AGE
pod/contour-5c6cb8f577-pg5j6 1/1 Running 0 3m5s
pod/contour-5c6cb8f577-vq5dt 1/1 Running 0 3m5s
pod/envoy-fl4rx 2/2 Running 0 3m6s
pod/envoy-tltqf 2/2 Running 0 3m6s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/contour ClusterIP 10.96.189.3 <none> 8001/TCP 3m5s
service/envoy LoadBalancer 10.101.72.152 10.40.14.70 80:30412/TCP,443:30108/TCP 3m5s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/envoy 2 2 2 2 2 <none> 3m7s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/contour 2/2 2 2 3m5s
NAME DESIRED CURRENT READY AGE
replicaset.apps/contour-5c6cb8f577 2 2 2 3m5s
You can see that the envoy service has an external IP address of 10.40.14.70, clearly in the range specified for ingress during Workload Management configuration.
In NSX, you can see a new load balancer for this address:

Looking at the server pool for this load balancer, you can see that there are two members, the envoy pods.

Since the envoy pods take on the IP addresses of the worker nodes on which they run, these same IPs can be seen in the cluster by querying the nodes.
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
tkg2-cluster-1-5vs4x-lflzx Ready control-plane,master 125m v1.23.8+vmware.2 10.244.0.114 <none> VMware Photon OS/Linux 4.19.256-4.ph3-esx containerd://1.6.6
tkg2-cluster-1-tkg2-cluster-1-nodepool-1-6hgnv-685f8bb498-kjjsk Ready <none> 121m v1.23.8+vmware.2 10.244.0.116 <none> VMware Photon OS/Linux 4.19.256-4.ph3-esx containerd://1.6.6
tkg2-cluster-1-tkg2-cluster-1-nodepool-1-6hgnv-685f8bb498-mx58w Ready <none> 121m v1.23.8+vmware.2 10.244.0.115 <none> VMware Photon OS/Linux 4.19.256-4.ph3-esx containerd://1.6.6
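As a quick test that Envoy is actually answering on the load balancer address, you can curl it directly. With no HTTPProxy or Ingress resources defined yet, a 404 from Envoy is the expected (and healthy) response:

curl -i http://10.40.14.70/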
Install the Service Discovery package
The service discovery package (external-dns) is a great one to install as it allows DNS records to be created automatically for systems with an ingress component (or an HTTPProxy resource in the case of Contour). ExternalDNS is an open source project that has been included with TKG since version 1.3. It synchronizes exposed Kubernetes Services and Ingresses with DNS providers. vSphere with Tanzu can use external-dns to assist with service discovery as it will automatically create DNS records for httpproxy resources created via Contour. AWS (Route53), Azure, and RFC2136 (BIND) are currently supported, but we’re using RFC2136 since this is what is needed to work with Microsoft DNS.
You can read more about this specific package in my earlier post, How to configure external-dns with Microsoft DNS in TKG 1.3 (plus Harbor and Contour).
Create the namespace:
kubectl create ns tanzu-system-service-discovery
A ConfigMap is needed that defines the Kerberos configuration the service discovery package will use. This is fairly generic; the only things customized were the domain/realm name and the kdc/admin_server addresses:
apiVersion: v1
kind: ConfigMap
metadata:
  name: krb5.conf
  namespace: tanzu-system-service-discovery
data:
  krb5.conf: |
    [logging]
    default = FILE:/var/log/krb5libs.log
    kdc = FILE:/var/log/krb5kdc.log
    admin_server = FILE:/var/log/kadmind.log
    [libdefaults]
    dns_lookup_realm = false
    ticket_lifetime = 24h
    renew_lifetime = 7d
    forwardable = true
    rdns = false
    pkinit_anchors = /etc/pki/tls/certs/ca-bundle.crt
    default_ccache_name = KEYRING:persistent:%{uid}
    default_realm = CORP.VMW
    [realms]
    CORP.VMW = {
      kdc = controlcenter.corp.vmw
      admin_server = controlcenter.corp.vmw
    }
    [domain_realm]
    corp.vmw = CORP.VMW
    .corp.vmw = CORP.VMW
Create the ConfigMap:
kubectl apply -f external-dns-krb5-cm.yaml
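You can confirm that it landed in the right namespace:

kubectl -n tanzu-system-service-discovery get configmap krb5.conf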
A data-values specification is also needed for the service discovery package (external-dns-data-values.yaml):
namespace: tanzu-system-service-discovery
deployment:
  args:
  - --provider=rfc2136
  - --rfc2136-host=controlcenter.corp.vmw
  - --rfc2136-port=53
  - --rfc2136-zone=corp.vmw
  - --rfc2136-gss-tsig
  - --rfc2136-kerberos-realm=corp.vmw
  - --rfc2136-kerberos-username=administrator
  - --rfc2136-kerberos-password=VMware1!
  - --rfc2136-tsig-axfr
  - --source=service
  - --source=ingress
  - --source=contour-httpproxy
  - --domain-filter=corp.vmw
  - --txt-owner-id=k8s
  - --txt-prefix=external-dns-
  - --registry=txt
  - --policy=upsert-only
  env: []
  securityContext: {}
  volumeMounts:
  - name: kerberos-config-volume
    mountPath: /etc/krb5.conf
    subPath: krb5.conf
  volumes:
  - name: kerberos-config-volume
    configMap:
      defaultMode: 420
      name: krb5.conf
I’ve made a number of changes to this file to allow external-dns to communicate with my Microsoft DNS implementation:

- set the rfc2136-host value to controlcenter.corp.vmw, the FQDN of my AD/DNS server
- set the rfc2136-zone, rfc2136-kerberos-realm, and domain-filter values to corp.vmw, the DNS zone name
- set the rfc2136-kerberos-username value to administrator, the admin user in my AD domain
- set the rfc2136-kerberos-password value to VMware1!, the administrator user’s password in my AD domain
- added contour-httpproxy as a source value, telling external-dns to also watch for httpproxy resources created via Contour
- specified the volumeMounts and volumes stanzas to point to the krb5 ConfigMap created earlier
The available versions were shown earlier:
external-dns.tanzu.vmware.com.0.10.0+vmware.1-tkg.1 external-dns.tanzu.vmware.com 0.10.0+vmware.1-tkg.1 2m31s
external-dns.tanzu.vmware.com.0.10.0+vmware.1-tkg.2 external-dns.tanzu.vmware.com 0.10.0+vmware.1-tkg.2 2m31s
external-dns.tanzu.vmware.com.0.11.0+vmware.1-tkg.2 external-dns.tanzu.vmware.com 0.11.0+vmware.1-tkg.2 2m31s
external-dns.tanzu.vmware.com.0.8.0+vmware.1-tkg.1 external-dns.tanzu.vmware.com 0.8.0+vmware.1-tkg.1 2m31s
The package can be installed with the latest version, 0.11.0.
tanzu package install external-dns -p external-dns.tanzu.vmware.com -n tanzu-system-service-discovery -v 0.11.0+vmware.1-tkg.2 --values-file external-dns-data-values.yaml
Check that the components have been successfully installed.
tanzu package installed list -n tanzu-system-service-discovery
NAME PACKAGE-NAME PACKAGE-VERSION STATUS
external-dns external-dns.tanzu.vmware.com 0.11.0+vmware.1-tkg.2 Reconcile succeeded
kubectl -n tanzu-system-service-discovery get all
NAME READY STATUS RESTARTS AGE
pod/external-dns-77d947745-tcjz9 1/1 Running 0 63s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/external-dns 1/1 1 1 63s
NAME DESIRED CURRENT READY AGE
replicaset.apps/external-dns-77d947745 1 1 1 63s
You’ll be able to check that this package is functional when you install other packages that have an ingress.
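One thing you can do right away, though, is tail the external-dns logs; once a resource with a matching FQDN shows up, the corresponding record creation will be logged here:

kubectl -n tanzu-system-service-discovery logs deployment/external-dns -f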
Install the Prometheus package
Create the namespace:
kubectl create ns tanzu-system-monitoring
Get the desired version:
prometheus.tanzu.vmware.com.2.27.0+vmware.1-tkg.1 prometheus.tanzu.vmware.com 2.27.0+vmware.1-tkg.1 2m29s
prometheus.tanzu.vmware.com.2.27.0+vmware.2-tkg.1 prometheus.tanzu.vmware.com 2.27.0+vmware.2-tkg.1 2m29s
prometheus.tanzu.vmware.com.2.36.2+vmware.1-tkg.1 prometheus.tanzu.vmware.com 2.36.2+vmware.1-tkg.1 2m29s
We’ll use 2.36.2, the latest version.
Create the data-values specification.
prometheus-data-values.yaml
alertmanager:
  config:
    alertmanager_yml: |
      global: {}
      receivers:
      - name: default-receiver
      templates:
      - '/etc/alertmanager/templates/*.tmpl'
      route:
        group_interval: 5m
        group_wait: 10s
        receiver: default-receiver
        repeat_interval: 3h
  deployment:
    replicas: 1
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    updateStrategy: Recreate
  pvc:
    accessMode: ReadWriteOnce
    storage: 2Gi
    storageClassName: k8s-policy
  service:
    port: 80
    targetPort: 9093
    type: ClusterIP
ingress:
  alertmanager_prefix: /alertmanager/
  alertmanagerServicePort: 80
  enabled: true
  prometheus_prefix: /
  prometheusServicePort: 80
  tlsCertificate:
    tls.crt: |
      -----BEGIN CERTIFICATE-----
      MIIHxTCCBa2gAwIBAgITIgAAAAQnSpH7QfxTKAAAAAAABDANBgkqhkiG9w0BAQsF
      ADBMMRMwEQYKCZImiZPyLGQBGRYDdm13MRQwEgYKCZImiZPyLGQBGRYEY29ycDEf
      MB0GA1UEAxMWY29udHJvbGNlbnRlci5jb3JwLnZtdzAeFw0yMjAzMjExOTQ1MzNa
      Fw0zMjAzMTgxOTQ1MzNaMG0xCzAJBgNVBAYTAlVTMRMwEQYDVQQIEwpDYWxpZm9y
      bmlhMRIwEAYDVQQHEwlQYWxvIEFsdG8xDzANBgNVBAoTBlZNd2FyZTEPMA0GA1UE
      CxMGVk13YXJlMRMwEQYDVQQDDAoqLmNvcnAudm13MIICIjANBgkqhkiG9w0BAQEF
      AAOCAg8AMIICCgKCAgEAzhh0/CNdiskBkkzZzrhBbKiQEzekxnw0uCgLCpkmQzIe
      m98jgt7LgS2dQFf5vABpTqIX11t4EoY/1gmsrz1IjBv26VrIjSLFmGppRgh+7NPY
      LbXVB6AbrgtlC+a89d14h37l30khZ62H2KTvmUojihUSeYZfi50IMsW/UPr8Z94V
      xhI2bTy/PDL/6sd+qmTpqaZplooEQBn18D0VqJhbTCK+kbdxgM01/8PcZoTcjAEN
      XcUR64p3H0KG0QeQn8r81nOsEihNBUCsrc6PiqPDRGtOJ+BsUhQy+BVmk/dIwnqM
      PJN7cTElz3h+ScJdUr0vPBfZjkI/glRk4GugFomg/qdHvt0OVsLG9efBbiZJo8/X
      PWEcRppCYE+HexbiVwKCML2i75nsELdH4+UVmfrywhAyZf4hvlz2/w5LQudPrfHz
      qPZ76NLmUkkK7buPGsktbtMS4n8fUGsr23Y4UxvZNRrvb6aPp3gSylDrzsV4dTQb
      PiLxzrZymiYw8tReMchCXbb9LGoN50xYbq2nAyUPhxnH0HOuw415shfkLTTGx1ms
      Ht+NeTUbavSh4dBa6EKFfKbYQ9Fukw3XkeMClX0Mdo/EQemivLGO+hIxoDIexkQs
      kXU2y/1uRhCOW5Rcsjr3EgwXGHENr8WUIut8wG1QoW8lg0ma73ErIY37dtFORGcC
      AwEAAaOCAn0wggJ5MA4GA1UdDwEB/wQEAwIFoDATBgNVHSUEDDAKBggrBgEFBQcD
      ATAVBgNVHREEDjAMggoqLmNvcnAudm13MB0GA1UdDgQWBBQHTiEY3UJ11GdfNS9U
      iba5lC0f3zAfBgNVHSMEGDAWgBTGg/OWUAlp52UyrZaNhYHgXyQZxzCB1wYDVR0f
      BIHPMIHMMIHJoIHGoIHDhoHAbGRhcDovLy9DTj1jb250cm9sY2VudGVyLmNvcnAu
      dm13LENOPWNvbnRyb2xjZW50ZXIsQ049Q0RQLENOPVB1YmxpYyUyMEtleSUyMFNl
      cnZpY2VzLENOPVNlcnZpY2VzLENOPUNvbmZpZ3VyYXRpb24sREM9Y29ycCxEQz12
      bXc/Y2VydGlmaWNhdGVSZXZvY2F0aW9uTGlzdD9iYXNlP29iamVjdENsYXNzPWNS
      TERpc3RyaWJ1dGlvblBvaW50MIHFBggrBgEFBQcBAQSBuDCBtTCBsgYIKwYBBQUH
      MAKGgaVsZGFwOi8vL0NOPWNvbnRyb2xjZW50ZXIuY29ycC52bXcsQ049QUlBLENO
      PVB1YmxpYyUyMEtleSUyMFNlcnZpY2VzLENOPVNlcnZpY2VzLENOPUNvbmZpZ3Vy
      YXRpb24sREM9Y29ycCxEQz12bXc/Y0FDZXJ0aWZpY2F0ZT9iYXNlP29iamVjdENs
      YXNzPWNlcnRpZmljYXRpb25BdXRob3JpdHkwPAYJKwYBBAGCNxUHBC8wLQYlKwYB
      BAGCNxUIgfaGVYGazCSBvY8jgZr5P5S9VzyB26Mxg6eFPwIBZAIBAjAbBgkrBgEE
      AYI3FQoEDjAMMAoGCCsGAQUFBwMBMA0GCSqGSIb3DQEBCwUAA4ICAQCCxqhinZTi
      NWHGvTgms+KdmNfIFR/R6GTFF8bO/bgfRw5pVeSDurEechjlO2hRDWOHn4H2+fIQ
      vN4I6cjEbVnDBbVrRkbCfNLD1Wjj/zz65pZ8PmjQUiFl9L4HxaUH5sF/3Jrylsu0
      M2oITByEfl5WpfC0oyB2/9nKYLPLOK5j2OicHxEino1RPAPdOk5gU5c6+Ed74kVh
      KtgK3r8Lc4/IhwfmSDgGl1DmEkzv/u+0bQTOOH1fVKSl3p9+YieADc3s2SJWFF0F
      mKElJHlZEeAg+MI16zNbQhowZE2SE+b9VTGK9KDkCmYGRjpHc61onQNTzIH5rDFx
      /0aBOGp3+tdA+QEI8VgpQlaa0BtsKyY3l/DAg7I42x4Zv9ta7vZUe81v11PqdSJQ
      v7NOriJkRNPneErH2QNsbi00p+TlUFzOl85AEtlB722/fDHbDGSxngDAhImCv4uC
      xPnwVRo94AI0A6ol0FEct2z2wgehQKQbwcskNpOE7wryd+/yqrm+Z5o0crfaPPwX
      uEwDtQjCRM+8wDWBcQnvyOJe374nFLpGpX8tZLjOg2U0wwdsfAxtYGZUAN0V0xm0
      YYsIjp7/f+Pk1DjzWx8JIAbzItKLucDreAmmDXqk+DrBP9LYqtmjB0n7nSErgK8G
      sA3kGCJdOkI0kgF10gsinaouG2jVlwNOsw==
      -----END CERTIFICATE-----
    tls.key: |
      -----BEGIN PRIVATE KEY-----
      MIIJRAIBADANBgkqhkiG9w0BAQEFAASCCS4wggkqAgEAAoICAQDOGHT8I12KyQGS
      TNnOuEFsqJATN6TGfDS4KAsKmSZDMh6b3yOC3suBLZ1AV/m8AGlOohfXW3gShj/W
      CayvPUiMG/bpWsiNIsWYamlGCH7s09gttdUHoBuuC2UL5rz13XiHfuXfSSFnrYfY
      pO+ZSiOKFRJ5hl+LnQgyxb9Q+vxn3hXGEjZtPL88Mv/qx36qZOmppmmWigRAGfXw
      PRWomFtMIr6Rt3GAzTX/w9xmhNyMAQ1dxRHrincfQobRB5CfyvzWc6wSKE0FQKyt
      zo+Ko8NEa04n4GxSFDL4FWaT90jCeow8k3txMSXPeH5Jwl1SvS88F9mOQj+CVGTg
      a6AWiaD+p0e+3Q5Wwsb158FuJkmjz9c9YRxGmkJgT4d7FuJXAoIwvaLvmewQt0fj
      5RWZ+vLCEDJl/iG+XPb/DktC50+t8fOo9nvo0uZSSQrtu48ayS1u0xLifx9Qayvb
      djhTG9k1Gu9vpo+neBLKUOvOxXh1NBs+IvHOtnKaJjDy1F4xyEJdtv0sag3nTFhu
      racDJQ+HGcfQc67DjXmyF+QtNMbHWawe3415NRtq9KHh0FroQoV8pthD0W6TDdeR
      4wKVfQx2j8RB6aK8sY76EjGgMh7GRCyRdTbL/W5GEI5blFyyOvcSDBcYcQ2vxZQi
      63zAbVChbyWDSZrvcSshjft20U5EZwIDAQABAoICAHmP83DFa2dxKHwi2FYWWIC+
      7Dxplcd9e5skA1889lSsO2G1PDz1LRQE07wgKC28EGFROr7MNQa4KO8WxcSXYTND
      S2BZK/ITkHlWSsIEQNlwGxLbLcxRpAIEtpVOhCaBe5ZwQyZw/EMrF/WxU6IXGN9Z
      jowftjujZDKOcUpSwI6DcFRkabYFHsdjTZAuG4hl/W0TuzQQNHGa3nXVkfDf7Pn7
      hGxux4QxhqhV3qqZs3zhIgEtPGSyR5EorFyfGa8nC/tyPwx2uPdgLnpWXFRqQ8MX
      iAH9XecMAwRRmy+rrD8KCa2xUB5z3tmBOPxIqMMk07eeWbSPXuaA4P9+e+7PPyXl
      BEZ/6Y5wICrd2WrzJQOUrLQpbiSRZi7tTxUxYCKWXmB93RDjG8yl5qXZZ5G173PK
      hGHH8KyUWD6tW0ytFEW714IaqJN8ffTCkThnveWKOd/jW3/ffxxShMj9+FnnUEfp
      dfQCq3rZEafoJX/A98TOibq/f5Rogky3D3azhs6gz/+NBterX6+7U2OnSK4cPAGr
      KPn0HJT99gnojkFO32L0N8QriXSdY5gpTw+ZGzgOu78WO5JwvCd+LENNCMfIQckh
      Jm/GM+0sTG03bhlQwyfJeV43ETyarGfiRuwFzgbvH24Ie6iSPSrIu5oqFfDp2+El
      EzAC9kqo4rmCoUQcX5ABAoIBAQD/egDF3mu7yVXQx8EFSKAsTi/Bq6z4gmBtQdN3
      g4MbdlfZ5/rrQZ1pbm2VlqLYz0hab4V0rtJKFKAg4UdSxZTnuR3tZCgSjNd4wy8b
      Meuwyls/IMKPmcrKmBWSxesQRs/wMlKPWYv7SUgJhgHqiPuDP74s1FaYdnPeVjKO
      BDABdzk3MJaU/FEPqHs+0RhjhHOaVm4v2T+/AYIFZ0Gvm+uSlhjlfse83Ui2WV5b
      0DUWS0Gl80UCkPloDWpDa21bcpmcYqrBUnppYz4XPvzNjnYZflubJU7ZGoEB3DNw
      u8IlVidH2/c86eGKvCNihs01f3Oxg8XpEB9W6mFjhDVE7FnJAoIBAQDOhI2+tcG+
      oQybL3IhvvTYnpxPh9GU9qxCEsb71qR+xmXXdIAf9ZPtuL+iet2B60CICcn94gtn
      ntqm4mSjJ9qg7CVnwCLWRhSqGE7Ha+4xnAQgRIpaVhRYThS0ZEL2GbYpezkc78Ql
      ZKDWJLAz/4sGzQcbBt+xVM2962bTFVSWRdeXfe6BoRbn5f9V9ZA7wSE5porC1k1h
      C6HiApOXmZMryILGrrE2YxO/S4cA+TVwH2/dU8W4Ti50EcYvMUwLf3hE3hZwIUvR
      HSecQBcb5fp3mZpnvaZjNTx7vKh6jjRXfmSRhI2dbSZZh6k+5NFP5C28A4Hkhuh6
      lpAGQxQ2h8SvAoIBAQDUTtBbn3aKbUvaoFZBDNTHXQaE/SVWtApsYZraJDmNVfC2
      DvnQDgxBtNpuyOt2H/Rx62HN0QbDN5bHHFAIclhHpehAAs7mc5MRMatw/zBuEAx6
      TsBBVD5Z1L+A5Odu9FoTs842gOU6o/CwsWPgQ4w4y31Ahgmc1DuAVsPWj5ZRcYHj
      4oYRNAotaAdb8apB8a2cYh1ZuEIoeplR4jiNNpczj3cLKSvWQVMO7v/ibwnfCBV7
      UspT0qThmtxnQNx1daxAcSKUW/WMpUPRT7AJJ03v67k3Gm8HLuZs5FD/a5lxK8Kj
      DiLNxVOA1s7VL09UGSHNMMQE5jgVI9xhNlqKd5w5AoIBAQCe0y639szUMMOjLbAW
      5+ciGYmZWJkEeVktT4ec8wx7O1Xjh4NqENH9x1IKQXfNjQGKHg0spgWjYXZDVmWT
      XPk1PafezNN9+1O1JRChKg58NMKvlkbZBs6KwzIFMf6VilygNlZMPNGa+HMBfiHN
      O8DOMCxAyt6KYPACGeJwgD0XfQs7ROyC4ULegfIHR93vNq64ya55/ZpxAiMz0Et2
      EfQvffullYBQlY4AVrOzOfWxD1xW2TB8eBQdy/WhIcacKSJzxGF5RwIqBsQJ1Phw
      ykQAay9mjWJDdhPYDdV8u5ThnSD3EPxgkCsoO78b0ZpwWMobiI8DFAYDEXwedMQ8
      09mdAoIBAQCxsvbB38jiPPqsF77jxWzogVdPR6xO6oLkVcvu4KI1Fq0wFdDH4eBJ
      gOxV8e+wsQBVO1XXP4i3UhbCq7gPgh+YJWWy66RKiuwpIPf2yvGbbUgBpp73MxOx
      ycerjh7LRS56gAAJ6d5hyYf293E5QEAOZh+Niuz3m1xMytdvG7IefX3TMe0GMgZh
      I5djpaHvjgB6qHeVsLuUWjC2FPXtrz2STay08tq/Pc5g+57bTfOyHo0BZ3C/uhZa
      l1NzswracGQIzo03zk/X3Z6P2YOea4BkZ0Iwh34wOHJnTkfEeSx6y+oSFMcFRthT
      yfFCZUk/sVCc/C1a4VigczXftUGiRrTR
      -----END PRIVATE KEY-----
    ca.crt: |
      -----BEGIN CERTIFICATE-----
      MIIFczCCA1ugAwIBAgIQTYJITQ3SZ4BBS9UzXfJIuTANBgkqhkiG9w0BAQsFADBM
      MRMwEQYKCZImiZPyLGQBGRYDdm13MRQwEgYKCZImiZPyLGQBGRYEY29ycDEfMB0G
      A1UEAxMWY29udHJvbGNlbnRlci5jb3JwLnZtdzAeFw0yMjAzMjExOTE3MjhaFw0z
      NzAzMjExOTI3MjNaMEwxEzARBgoJkiaJk/IsZAEZFgN2bXcxFDASBgoJkiaJk/Is
      ZAEZFgRjb3JwMR8wHQYDVQQDExZjb250cm9sY2VudGVyLmNvcnAudm13MIICIjAN
      BgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEA2OYKxckOjhgufWu1YEnatvJ1M127
      gwPbFNj11/dICXaPe+mjN1Hce0PiS2QaaeAe8kH+mOKRa2JjaGdXr6rOiB80KZOR
      uw0GzSJyL5w7ewR+NJf31YO62BD/mt3sHeMnCXmSBxOQvb0nGkhTr1y+rDpvxJ87
      zNczgfN54to6S379wjOsC4bkHLnMJ5EtJG78pPqX1+1wcVOURNJ6y9BcejLnoy/y
      CFpXKOVxKHzy2nnsitAuBb+hD+Jxw8/jFQUhxH0VlgyfXCQdegasSA9RHtZtfpVs
      hshisjkSlvQmbsEknBZrAfBVIYidwt3w050jVhiUs5Ql6vDotY6Gqtzzgq0obv6P
      7E9NPej3BzhPSIUyqnpf57UWI4zUiRJvbSu/J2MCBKHwYfzke1cnvLA7viDEdB9+
      /Htk9aG9/1B6ddDfafrcSOWtkTfHWYLv21o3Uwoh9W5OpK9JikZu/PqnpZkUi+2C
      L+WCww/BS1yhQwVif6PqUMeSLz3jtq3w6R/ruUMlO+0E5//bskDT6QGxBgcvMF9n
      Dl+u0uqHKOdiUvOXBtF139HKUrZsq0m3WPoel2/p+cVVJYsyJG/rRpeh1g/X0cB3
      9EuTjX6vnrT+IS8ZfAaoHzpmgh1vGu2r2xgPq2E8x4ji9FGV8YTjAs60Nw7YxKUW
      Wgj+YNpxP2SxFqUCAwEAAaNRME8wCwYDVR0PBAQDAgGGMA8GA1UdEwEB/wQFMAMB
      Af8wHQYDVR0OBBYEFMaD85ZQCWnnZTKtlo2FgeBfJBnHMBAGCSsGAQQBgjcVAQQD
      AgEAMA0GCSqGSIb3DQEBCwUAA4ICAQAutXwOtsmYcbj/bs3Mydx0Di9m+6UVTEZd
      ORRrTus/BL/TNryO7zo2beczGPK26MwqhmUZiaF61jRb36kxmFPVx2uV2np4LbQj
      5MrxtPzf2XXy4b7ADqQpLgu4rR3mZiXGmzUoV17hmAhyfSU1qm4FssXGK2ypWsQs
      BwsKX4DsIijJJZbXwKFaauq0LtnkgeGWdoEFFWAH0yJWPbz9h+ovlCxq0DBiG00l
      brnY90sqpoiWTxMKNCXDDhNjvtxO3kQIDQVvbNMCEbmYG+RrWQHtvufw97RK/cTL
      9dKFSblIIizMINVwM/gqtlVVvWP1EFaUy0xG5bvOO+SCe+TlA7rz4/RORqqE5Ugg
      7F8fWz+o6BM/qf/Kwh+WN42dyR1rOsFqEVNamZLjrAzgwjQ/nquRRMl2cK6yg6Fq
      d0O42wwYPpLUEFv4xe4a3kpRvvhshNkzR4IacbmaUlnzmlewoFXVueEblviBHJoV
      1OUC6qfLkCjfCEv470Kr5vDe5Y/l/7j8EYj7a/wa2++kq+7xd+bj/DDed85fm3Yk
      dhfp7bGXKm4KbPLzkSpiYWbE+EbArLtIk62exjcJvJPdoxMTxgbdelzl/snPLrdg
      w0oGuTTBfxSMKs767N3G1q5tz0mwFpIqIQtXUSmaJ+9p7IkpWcThLnyYYo1IpWm/
      ZHtjzZMQVA==
      -----END CERTIFICATE-----
  virtual_host_fqdn: prometheus.corp.vmw
kube_state_metrics:
  deployment:
    replicas: 1
  service:
    port: 80
    targetPort: 8080
    telemetryPort: 81
    telemetryTargetPort: 8081
    type: ClusterIP
namespace: tanzu-system-monitoring
node_exporter:
  daemonset:
    hostNetwork: false
    updatestrategy: RollingUpdate
  service:
    port: 9100
    targetPort: 9100
    type: ClusterIP
prometheus:
  config:
    alerting_rules_yml: |
      {}
    alerts_yml: |
      {}
    prometheus_yml: |
      global:
        evaluation_interval: 1m
        scrape_interval: 1m
        scrape_timeout: 10s
      rule_files:
      - /etc/config/alerting_rules.yml
      - /etc/config/recording_rules.yml
      - /etc/config/alerts
      - /etc/config/rules
      scrape_configs:
      - job_name: 'prometheus'
        scrape_interval: 5s
        static_configs:
        - targets: ['localhost:9090']
      - job_name: 'kube-state-metrics'
        static_configs:
        - targets: ['prometheus-kube-state-metrics.prometheus.svc.cluster.local:8080']
      - job_name: 'node-exporter'
        static_configs:
        - targets: ['prometheus-node-exporter.prometheus.svc.cluster.local:9100']
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name
      - job_name: kubernetes-nodes-cadvisor
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - replacement: kubernetes.default.svc:443
          target_label: __address__
        - regex: (.+)
          replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor
          source_labels:
          - __meta_kubernetes_node_name
          target_label: __metrics_path__
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      - job_name: kubernetes-apiservers
        kubernetes_sd_configs:
        - role: endpoints
        relabel_configs:
        - action: keep
          regex: default;kubernetes;https
          source_labels:
          - __meta_kubernetes_namespace
          - __meta_kubernetes_service_name
          - __meta_kubernetes_endpoint_port_name
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      alerting:
        alertmanagers:
        - scheme: http
          static_configs:
          - targets:
            - alertmanager.prometheus.svc:80
        - kubernetes_sd_configs:
          - role: pod
          relabel_configs:
          - source_labels: [__meta_kubernetes_namespace]
            regex: default
            action: keep
          - source_labels: [__meta_kubernetes_pod_label_app]
            regex: prometheus
            action: keep
          - source_labels: [__meta_kubernetes_pod_label_component]
            regex: alertmanager
            action: keep
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_probe]
            regex: .*
            action: keep
          - source_labels: [__meta_kubernetes_pod_container_port_number]
            regex:
            action: drop
    recording_rules_yml: |
      groups:
      - name: kube-apiserver.rules
        interval: 3m
        rules:
        - expr: |2
            (
              (
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[1d]))
                -
                (
                  (
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[1d]))
                    or
                    vector(0)
                  )
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[1d]))
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[1d]))
                )
              )
              +
              # errors
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[1d]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[1d]))
          labels:
            verb: read
          record: apiserver_request:burnrate1d
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[1h]))
                -
                (
                  (
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[1h]))
                    or
                    vector(0)
                  )
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[1h]))
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[1h]))
                )
              )
              +
              # errors
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[1h]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[1h]))
          labels:
            verb: read
          record: apiserver_request:burnrate1h
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[2h]))
                -
                (
                  (
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[2h]))
                    or
                    vector(0)
                  )
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[2h]))
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[2h]))
                )
              )
              +
              # errors
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[2h]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[2h]))
          labels:
            verb: read
          record: apiserver_request:burnrate2h
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[30m]))
                -
                (
                  (
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[30m]))
                    or
                    vector(0)
                  )
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[30m]))
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[30m]))
                )
              )
              +
              # errors
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[30m]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[30m]))
          labels:
            verb: read
          record: apiserver_request:burnrate30m
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[3d]))
                -
                (
                  (
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[3d]))
                    or
                    vector(0)
                  )
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[3d]))
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[3d]))
                )
              )
              +
              # errors
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[3d]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[3d]))
          labels:
            verb: read
          record: apiserver_request:burnrate3d
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[5m]))
                -
                (
                  (
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[5m]))
                    or
                    vector(0)
                  )
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[5m]))
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[5m]))
                )
              )
              +
              # errors
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[5m]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[5m]))
          labels:
            verb: read
          record: apiserver_request:burnrate5m
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[6h]))
                -
                (
                  (
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[6h]))
                    or
                    vector(0)
                  )
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[6h]))
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[6h]))
                )
              )
              +
              # errors
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[6h]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[6h]))
          labels:
            verb: read
          record: apiserver_request:burnrate6h
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[1d]))
                -
                sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[1d]))
              )
              +
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[1d]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[1d]))
          labels:
            verb: write
          record: apiserver_request:burnrate1d
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[1h]))
                -
                sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[1h]))
              )
              +
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[1h]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[1h]))
          labels:
            verb: write
          record: apiserver_request:burnrate1h
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[2h]))
                -
                sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[2h]))
              )
              +
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[2h]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[2h]))
          labels:
            verb: write
          record: apiserver_request:burnrate2h
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[30m]))
                -
                sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[30m]))
              )
              +
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[30m]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[30m]))
          labels:
            verb: write
          record: apiserver_request:burnrate30m
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[3d]))
                -
                sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[3d]))
              )
              +
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[3d]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[3d]))
          labels:
            verb: write
          record: apiserver_request:burnrate3d
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[5m]))
                -
                sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[5m]))
              )
              +
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[5m]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[5m]))
          labels:
            verb: write
          record: apiserver_request:burnrate5m
        - expr: |2
            (
              (
                # too slow
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[6h]))
                -
                sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[6h]))
              )
              +
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[6h]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[6h]))
          labels:
            verb: write
          record: apiserver_request:burnrate6h
        - expr: |
            sum by (code,resource) (rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[5m]))
          labels:
            verb: read
          record: code_resource:apiserver_request_total:rate5m
        - expr: |
            sum by (code,resource) (rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[5m]))
          labels:
            verb: write
          record: code_resource:apiserver_request_total:rate5m
        - expr: |
            histogram_quantile(0.99, sum by (le, resource) (rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET"}[5m]))) > 0
          labels:
            quantile: "0.99"
            verb: read
          record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
        - expr: |
            histogram_quantile(0.99, sum by (le, resource) (rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[5m]))) > 0
          labels:
            quantile: "0.99"
            verb: write
          record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
        - expr: |2
            sum(rate(apiserver_request_duration_seconds_sum{subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod)
/
sum(rate(apiserver_request_duration_seconds_count{subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod)
record: cluster:apiserver_request_duration_seconds:mean5m
- expr: |
histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod))
labels:
quantile: "0.99"
record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
- expr: |
histogram_quantile(0.9, sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod))
labels:
quantile: "0.9"
record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
- expr: |
histogram_quantile(0.5, sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod))
labels:
quantile: "0.5"
record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
- interval: 3m
name: kube-apiserver-availability.rules
rules:
- expr: |2
1 - (
(
# write too slow
sum(increase(apiserver_request_duration_seconds_count{verb=~"POST|PUT|PATCH|DELETE"}[30d]))
-
sum(increase(apiserver_request_duration_seconds_bucket{verb=~"POST|PUT|PATCH|DELETE",le="1"}[30d]))
) +
(
# read too slow
sum(increase(apiserver_request_duration_seconds_count{verb=~"LIST|GET"}[30d]))
-
(
(
sum(increase(apiserver_request_duration_seconds_bucket{verb=~"LIST|GET",scope=~"resource|",le="0.1"}[30d]))
or
vector(0)
)
+
sum(increase(apiserver_request_duration_seconds_bucket{verb=~"LIST|GET",scope="namespace",le="0.5"}[30d]))
+
sum(increase(apiserver_request_duration_seconds_bucket{verb=~"LIST|GET",scope="cluster",le="5"}[30d]))
)
) +
# errors
sum(code:apiserver_request_total:increase30d{code=~"5.."} or vector(0))
)
/
sum(code:apiserver_request_total:increase30d)
labels:
verb: all
record: apiserver_request:availability30d
- expr: |2
1 - (
sum(increase(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[30d]))
-
(
# too slow
(
sum(increase(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[30d]))
or
vector(0)
)
+
sum(increase(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[30d]))
+
sum(increase(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[30d]))
)
+
# errors
sum(code:apiserver_request_total:increase30d{verb="read",code=~"5.."} or vector(0))
)
/
sum(code:apiserver_request_total:increase30d{verb="read"})
labels:
verb: read
record: apiserver_request:availability30d
- expr: |2
1 - (
(
# too slow
sum(increase(apiserver_request_duration_seconds_count{verb=~"POST|PUT|PATCH|DELETE"}[30d]))
-
sum(increase(apiserver_request_duration_seconds_bucket{verb=~"POST|PUT|PATCH|DELETE",le="1"}[30d]))
)
+
# errors
sum(code:apiserver_request_total:increase30d{verb="write",code=~"5.."} or vector(0))
)
/
sum(code:apiserver_request_total:increase30d{verb="write"})
labels:
verb: write
record: apiserver_request:availability30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="LIST",code=~"2.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="GET",code=~"2.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="POST",code=~"2.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PUT",code=~"2.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PATCH",code=~"2.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="DELETE",code=~"2.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="LIST",code=~"3.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="GET",code=~"3.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="POST",code=~"3.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PUT",code=~"3.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PATCH",code=~"3.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="DELETE",code=~"3.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="LIST",code=~"4.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="GET",code=~"4.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="POST",code=~"4.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PUT",code=~"4.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PATCH",code=~"4.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="DELETE",code=~"4.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="LIST",code=~"5.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="GET",code=~"5.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="POST",code=~"5.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PUT",code=~"5.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PATCH",code=~"5.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="DELETE",code=~"5.."}[30d]))
record: code_verb:apiserver_request_total:increase30d
- expr: |
sum by (code) (code_verb:apiserver_request_total:increase30d{verb=~"LIST|GET"})
labels:
verb: read
record: code:apiserver_request_total:increase30d
- expr: |
sum by (code) (code_verb:apiserver_request_total:increase30d{verb=~"POST|PUT|PATCH|DELETE"})
labels:
verb: write
record: code:apiserver_request_total:increase30d
rules_yml: |
{}
deployment:
configmapReload:
containers:
args:
- --volume-dir=/etc/config
- --webhook-url=http://127.0.0.1:9090/-/reload
containers:
args:
- --storage.tsdb.retention.time=42d
- --config.file=/etc/config/prometheus.yml
- --storage.tsdb.path=/data
- --web.console.libraries=/etc/prometheus/console_libraries
- --web.console.templates=/etc/prometheus/consoles
- --web.enable-lifecycle
replicas: 1
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
updateStrategy: Recreate
pvc:
accessMode: ReadWriteOnce
storage: 20Gi
storageClassName: k8s-policy
service:
port: 80
targetPort: 9090
type: ClusterIP
pushgateway:
deployment:
replicas: 1
service:
port: 9091
targetPort: 9091
type: ClusterIP
Some important things to note from this specification:
- Ingress is enabled (ingress: enabled: true).
- Ingress is configured for URLs ending in /alertmanager/ (alertmanager_prefix:) and / (prometheus_prefix:).
- The FQDN for Prometheus is prometheus.corp.vmw (virtual_host_fqdn:).
- A custom certificate is supplied under the ingress section (tls.crt, tls.key, ca.crt). This is the wildcard certificate created for use in my environment (it works for anything ending in corp.vmw); see the quick validation sketch below.
- The pvc for alertmanager is 2GB and will be created under the k8s-policy storageClass.
- The pvc for prometheus is 20GB and will be created under the k8s-policy storageClass.
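Before installing, it's worth confirming that the certificate and key pasted into the data values actually pair up, since a mismatch would only surface later as a reconcile or TLS failure. A minimal sketch, assuming you still have the PEM files that were pasted into the spec saved locally (tls.crt and tls.key are illustrative file names):
# the two digests must match if the key belongs to the certificate
openssl x509 -noout -modulus -in tls.crt | openssl md5
openssl rsa -noout -modulus -in tls.key | openssl md5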
The Prometheus package can be installed.
tanzu package install prometheus -p prometheus.tanzu.vmware.com -v 2.36.2+vmware.1-tkg.1 --values-file prometheus-data-values.yaml -n tanzu-system-monitoring
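The install can take a minute or two to reconcile. If you want to watch progress while it does, one option (using the same tanzu CLI installed earlier) is:
tanzu package installed get prometheus -n tanzu-system-monitoring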
Since this package creates pvcs, you will see some related storage activity in the vSphere Client.

Check to see that the necessary components have been successfully deployed:
tanzu package installed list -n tanzu-system-monitoring
NAME PACKAGE-NAME PACKAGE-VERSION STATUS
prometheus prometheus.tanzu.vmware.com 2.36.2+vmware.1-tkg.1 Reconcile succeeded
kubectl -n tanzu-system-monitoring get all
NAME READY STATUS RESTARTS AGE
pod/alertmanager-d56577ff5-mlq6k 1/1 Running 0 2m1s
pod/prometheus-kube-state-metrics-7b48c44779-62mk2 1/1 Running 0 2m2s
pod/prometheus-node-exporter-5qtrs 1/1 Running 0 2m2s
pod/prometheus-node-exporter-bv5q9 1/1 Running 0 2m2s
pod/prometheus-node-exporter-dpmxz 1/1 Running 0 2m2s
pod/prometheus-pushgateway-5cc854c88b-b4wmr 1/1 Running 0 119s
pod/prometheus-server-5b6c8b5444-nvxgt 2/2 Running 0 2m1s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager ClusterIP 10.111.1.128 <none> 80/TCP 2m1s
service/prometheus-kube-state-metrics ClusterIP None <none> 80/TCP,81/TCP 2m2s
service/prometheus-node-exporter ClusterIP 10.109.6.230 <none> 9100/TCP 2m1s
service/prometheus-pushgateway ClusterIP 10.97.135.94 <none> 9091/TCP 2m2s
service/prometheus-server ClusterIP 10.102.5.183 <none> 80/TCP 2m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/prometheus-node-exporter 3 3 3 3 3 <none> 2m2s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/alertmanager 1/1 1 1 2m1s
deployment.apps/prometheus-kube-state-metrics 1/1 1 1 2m2s
deployment.apps/prometheus-pushgateway 1/1 1 1 2m
deployment.apps/prometheus-server 1/1 1 1 2m1s
NAME DESIRED CURRENT READY AGE
replicaset.apps/alertmanager-d56577ff5 1 1 1 2m1s
replicaset.apps/prometheus-kube-state-metrics-7b48c44779 1 1 1 2m2s
replicaset.apps/prometheus-pushgateway-5cc854c88b 1 1 1 2m
replicaset.apps/prometheus-server-5b6c8b5444 1 1 1 2m1s
You can see the 2GB and 20GB pvcs created for alertmanager and prometheus:
kubectl -n tanzu-system-monitoring get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
alertmanager Bound pvc-a53f7091-9823-4b70-a9b4-c3d7a1e27a4b 2Gi RWO k8s-policy 2m30s
prometheus-server Bound pvc-41745d1d-9401-41d7-b44d-ba430ecc5cda 20Gi RWO k8s-policy 2m30s
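If either pvc were to sit in a Pending state instead of Bound, the provisioning events usually explain why. A generic troubleshooting step (not needed here, shown for completeness):
kubectl -n tanzu-system-monitoring describe pvc prometheus-server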
And since this package has an ingress configured, you will see an httpproxy resource in the tanzu-system-monitoring namespace:
kubectl -n tanzu-system-monitoring get httpproxy
NAME FQDN TLS SECRET STATUS STATUS DESCRIPTION
prometheus-httpproxy prometheus.corp.vmw prometheus-tls valid Valid HTTPProxy
To confirm that Service Discovery (external-dns) picked this up and created the necessary record, you can check the DNS application for the new entry.

This record is using the same IP address that was seen for the envoy service earlier, 10.40.14.70 (provided by an NSX load balancer). You can point a browser to https://prometheus.corp.vmw to check that the Prometheus package is working, that the configured certificate is in place, that the ingress is routing properly, and that the DNS record resolves to the correct address.

Everything looks as it should and there is no certificate warning.
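The same checks can also be done from the command line. A quick sketch, assuming the new record has propagated to your resolver (-k is used because curl may not trust the lab CA):
nslookup prometheus.corp.vmw
curl -k https://prometheus.corp.vmw/-/healthy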
Install the Grafana package
The next package we'll install is Grafana. It's very similar to Prometheus in that it uses an ingress and has a pvc.
Create the namespace.
kubectl create ns tanzu-system-dashboards
Determine the version to use (e.g. by listing the available Package resources with kubectl -n tkg-system get packages | grep grafana; the output below is trimmed to the grafana entries):
grafana.tanzu.vmware.com.7.5.16+vmware.1-tkg.1 grafana.tanzu.vmware.com 7.5.16+vmware.1-tkg.1 2m31s
grafana.tanzu.vmware.com.7.5.7+vmware.1-tkg.1 grafana.tanzu.vmware.com 7.5.7+vmware.1-tkg.1 2m30s
grafana.tanzu.vmware.com.7.5.7+vmware.2-tkg.1 grafana.tanzu.vmware.com 7.5.7+vmware.2-tkg.1 2m30s
Again, we’ll use the latest version, 7.5.16.
Create a data-values specification for the Grafana package:
grafana-data-values.yaml
namespace: tanzu-system-dashboards
grafana:
deployment:
replicas: 1
updateStrategy: Recreate
service:
type: NodePort
port: 80
targetPort: 3000
config:
grafana_ini: |
[analytics]
check_for_updates = false
[grafana_net]
url = https://grafana.com
[log]
mode = console
[paths]
data = /var/lib/grafana/data
logs = /var/log/grafana
plugins = /var/lib/grafana/plugins
provisioning = /etc/grafana/provisioning
datasource_yaml: |-
apiVersion: 1
datasources:
- name: Prometheus
type: prometheus
url: prometheus-server.tanzu-system-monitoring.svc.cluster.local
access: proxy
isDefault: true
dashboardProvider_yaml: |-
apiVersion: 1
providers:
- name: 'sidecarDashboardProvider'
orgId: 1
folder: ''
folderUid: ''
type: file
disableDeletion: false
updateIntervalSeconds: 10
allowUiUpdates: false
options:
path: /tmp/dashboards
foldersFromFilesStructure: true
pvc:
annotations: {}
storageClassName: k8s-policy
accessMode: ReadWriteOnce
storage: "2Gi"
secret:
type: "Opaque"
admin_password: "Vk13YXJlMSE="
ingress:
enabled: true
virtual_host_fqdn: "grafana.corp.vmw"
prefix: "/"
servicePort: 80
tlsCertificate:
tls.crt: |
-----BEGIN CERTIFICATE-----
MIIHxTCCBa2gAwIBAgITIgAAAAQnSpH7QfxTKAAAAAAABDANBgkqhkiG9w0BAQsF
ADBMMRMwEQYKCZImiZPyLGQBGRYDdm13MRQwEgYKCZImiZPyLGQBGRYEY29ycDEf
MB0GA1UEAxMWY29udHJvbGNlbnRlci5jb3JwLnZtdzAeFw0yMjAzMjExOTQ1MzNa
Fw0zMjAzMTgxOTQ1MzNaMG0xCzAJBgNVBAYTAlVTMRMwEQYDVQQIEwpDYWxpZm9y
bmlhMRIwEAYDVQQHEwlQYWxvIEFsdG8xDzANBgNVBAoTBlZNd2FyZTEPMA0GA1UE
CxMGVk13YXJlMRMwEQYDVQQDDAoqLmNvcnAudm13MIICIjANBgkqhkiG9w0BAQEF
AAOCAg8AMIICCgKCAgEAzhh0/CNdiskBkkzZzrhBbKiQEzekxnw0uCgLCpkmQzIe
m98jgt7LgS2dQFf5vABpTqIX11t4EoY/1gmsrz1IjBv26VrIjSLFmGppRgh+7NPY
LbXVB6AbrgtlC+a89d14h37l30khZ62H2KTvmUojihUSeYZfi50IMsW/UPr8Z94V
xhI2bTy/PDL/6sd+qmTpqaZplooEQBn18D0VqJhbTCK+kbdxgM01/8PcZoTcjAEN
XcUR64p3H0KG0QeQn8r81nOsEihNBUCsrc6PiqPDRGtOJ+BsUhQy+BVmk/dIwnqM
PJN7cTElz3h+ScJdUr0vPBfZjkI/glRk4GugFomg/qdHvt0OVsLG9efBbiZJo8/X
PWEcRppCYE+HexbiVwKCML2i75nsELdH4+UVmfrywhAyZf4hvlz2/w5LQudPrfHz
qPZ76NLmUkkK7buPGsktbtMS4n8fUGsr23Y4UxvZNRrvb6aPp3gSylDrzsV4dTQb
PiLxzrZymiYw8tReMchCXbb9LGoN50xYbq2nAyUPhxnH0HOuw415shfkLTTGx1ms
Ht+NeTUbavSh4dBa6EKFfKbYQ9Fukw3XkeMClX0Mdo/EQemivLGO+hIxoDIexkQs
kXU2y/1uRhCOW5Rcsjr3EgwXGHENr8WUIut8wG1QoW8lg0ma73ErIY37dtFORGcC
AwEAAaOCAn0wggJ5MA4GA1UdDwEB/wQEAwIFoDATBgNVHSUEDDAKBggrBgEFBQcD
ATAVBgNVHREEDjAMggoqLmNvcnAudm13MB0GA1UdDgQWBBQHTiEY3UJ11GdfNS9U
iba5lC0f3zAfBgNVHSMEGDAWgBTGg/OWUAlp52UyrZaNhYHgXyQZxzCB1wYDVR0f
BIHPMIHMMIHJoIHGoIHDhoHAbGRhcDovLy9DTj1jb250cm9sY2VudGVyLmNvcnAu
dm13LENOPWNvbnRyb2xjZW50ZXIsQ049Q0RQLENOPVB1YmxpYyUyMEtleSUyMFNl
cnZpY2VzLENOPVNlcnZpY2VzLENOPUNvbmZpZ3VyYXRpb24sREM9Y29ycCxEQz12
bXc/Y2VydGlmaWNhdGVSZXZvY2F0aW9uTGlzdD9iYXNlP29iamVjdENsYXNzPWNS
TERpc3RyaWJ1dGlvblBvaW50MIHFBggrBgEFBQcBAQSBuDCBtTCBsgYIKwYBBQUH
MAKGgaVsZGFwOi8vL0NOPWNvbnRyb2xjZW50ZXIuY29ycC52bXcsQ049QUlBLENO
PVB1YmxpYyUyMEtleSUyMFNlcnZpY2VzLENOPVNlcnZpY2VzLENOPUNvbmZpZ3Vy
YXRpb24sREM9Y29ycCxEQz12bXc/Y0FDZXJ0aWZpY2F0ZT9iYXNlP29iamVjdENs
YXNzPWNlcnRpZmljYXRpb25BdXRob3JpdHkwPAYJKwYBBAGCNxUHBC8wLQYlKwYB
BAGCNxUIgfaGVYGazCSBvY8jgZr5P5S9VzyB26Mxg6eFPwIBZAIBAjAbBgkrBgEE
AYI3FQoEDjAMMAoGCCsGAQUFBwMBMA0GCSqGSIb3DQEBCwUAA4ICAQCCxqhinZTi
NWHGvTgms+KdmNfIFR/R6GTFF8bO/bgfRw5pVeSDurEechjlO2hRDWOHn4H2+fIQ
vN4I6cjEbVnDBbVrRkbCfNLD1Wjj/zz65pZ8PmjQUiFl9L4HxaUH5sF/3Jrylsu0
M2oITByEfl5WpfC0oyB2/9nKYLPLOK5j2OicHxEino1RPAPdOk5gU5c6+Ed74kVh
KtgK3r8Lc4/IhwfmSDgGl1DmEkzv/u+0bQTOOH1fVKSl3p9+YieADc3s2SJWFF0F
mKElJHlZEeAg+MI16zNbQhowZE2SE+b9VTGK9KDkCmYGRjpHc61onQNTzIH5rDFx
/0aBOGp3+tdA+QEI8VgpQlaa0BtsKyY3l/DAg7I42x4Zv9ta7vZUe81v11PqdSJQ
v7NOriJkRNPneErH2QNsbi00p+TlUFzOl85AEtlB722/fDHbDGSxngDAhImCv4uC
xPnwVRo94AI0A6ol0FEct2z2wgehQKQbwcskNpOE7wryd+/yqrm+Z5o0crfaPPwX
uEwDtQjCRM+8wDWBcQnvyOJe374nFLpGpX8tZLjOg2U0wwdsfAxtYGZUAN0V0xm0
YYsIjp7/f+Pk1DjzWx8JIAbzItKLucDreAmmDXqk+DrBP9LYqtmjB0n7nSErgK8G
sA3kGCJdOkI0kgF10gsinaouG2jVlwNOsw==
-----END CERTIFICATE-----
tls.key: |
-----BEGIN PRIVATE KEY-----
MIIJRAIBADANBgkqhkiG9w0BAQEFAASCCS4wggkqAgEAAoICAQDOGHT8I12KyQGS
TNnOuEFsqJATN6TGfDS4KAsKmSZDMh6b3yOC3suBLZ1AV/m8AGlOohfXW3gShj/W
CayvPUiMG/bpWsiNIsWYamlGCH7s09gttdUHoBuuC2UL5rz13XiHfuXfSSFnrYfY
pO+ZSiOKFRJ5hl+LnQgyxb9Q+vxn3hXGEjZtPL88Mv/qx36qZOmppmmWigRAGfXw
PRWomFtMIr6Rt3GAzTX/w9xmhNyMAQ1dxRHrincfQobRB5CfyvzWc6wSKE0FQKyt
zo+Ko8NEa04n4GxSFDL4FWaT90jCeow8k3txMSXPeH5Jwl1SvS88F9mOQj+CVGTg
a6AWiaD+p0e+3Q5Wwsb158FuJkmjz9c9YRxGmkJgT4d7FuJXAoIwvaLvmewQt0fj
5RWZ+vLCEDJl/iG+XPb/DktC50+t8fOo9nvo0uZSSQrtu48ayS1u0xLifx9Qayvb
djhTG9k1Gu9vpo+neBLKUOvOxXh1NBs+IvHOtnKaJjDy1F4xyEJdtv0sag3nTFhu
racDJQ+HGcfQc67DjXmyF+QtNMbHWawe3415NRtq9KHh0FroQoV8pthD0W6TDdeR
4wKVfQx2j8RB6aK8sY76EjGgMh7GRCyRdTbL/W5GEI5blFyyOvcSDBcYcQ2vxZQi
63zAbVChbyWDSZrvcSshjft20U5EZwIDAQABAoICAHmP83DFa2dxKHwi2FYWWIC+
7Dxplcd9e5skA1889lSsO2G1PDz1LRQE07wgKC28EGFROr7MNQa4KO8WxcSXYTND
S2BZK/ITkHlWSsIEQNlwGxLbLcxRpAIEtpVOhCaBe5ZwQyZw/EMrF/WxU6IXGN9Z
jowftjujZDKOcUpSwI6DcFRkabYFHsdjTZAuG4hl/W0TuzQQNHGa3nXVkfDf7Pn7
hGxux4QxhqhV3qqZs3zhIgEtPGSyR5EorFyfGa8nC/tyPwx2uPdgLnpWXFRqQ8MX
iAH9XecMAwRRmy+rrD8KCa2xUB5z3tmBOPxIqMMk07eeWbSPXuaA4P9+e+7PPyXl
BEZ/6Y5wICrd2WrzJQOUrLQpbiSRZi7tTxUxYCKWXmB93RDjG8yl5qXZZ5G173PK
hGHH8KyUWD6tW0ytFEW714IaqJN8ffTCkThnveWKOd/jW3/ffxxShMj9+FnnUEfp
dfQCq3rZEafoJX/A98TOibq/f5Rogky3D3azhs6gz/+NBterX6+7U2OnSK4cPAGr
KPn0HJT99gnojkFO32L0N8QriXSdY5gpTw+ZGzgOu78WO5JwvCd+LENNCMfIQckh
Jm/GM+0sTG03bhlQwyfJeV43ETyarGfiRuwFzgbvH24Ie6iSPSrIu5oqFfDp2+El
EzAC9kqo4rmCoUQcX5ABAoIBAQD/egDF3mu7yVXQx8EFSKAsTi/Bq6z4gmBtQdN3
g4MbdlfZ5/rrQZ1pbm2VlqLYz0hab4V0rtJKFKAg4UdSxZTnuR3tZCgSjNd4wy8b
Meuwyls/IMKPmcrKmBWSxesQRs/wMlKPWYv7SUgJhgHqiPuDP74s1FaYdnPeVjKO
BDABdzk3MJaU/FEPqHs+0RhjhHOaVm4v2T+/AYIFZ0Gvm+uSlhjlfse83Ui2WV5b
0DUWS0Gl80UCkPloDWpDa21bcpmcYqrBUnppYz4XPvzNjnYZflubJU7ZGoEB3DNw
u8IlVidH2/c86eGKvCNihs01f3Oxg8XpEB9W6mFjhDVE7FnJAoIBAQDOhI2+tcG+
oQybL3IhvvTYnpxPh9GU9qxCEsb71qR+xmXXdIAf9ZPtuL+iet2B60CICcn94gtn
ntqm4mSjJ9qg7CVnwCLWRhSqGE7Ha+4xnAQgRIpaVhRYThS0ZEL2GbYpezkc78Ql
ZKDWJLAz/4sGzQcbBt+xVM2962bTFVSWRdeXfe6BoRbn5f9V9ZA7wSE5porC1k1h
C6HiApOXmZMryILGrrE2YxO/S4cA+TVwH2/dU8W4Ti50EcYvMUwLf3hE3hZwIUvR
HSecQBcb5fp3mZpnvaZjNTx7vKh6jjRXfmSRhI2dbSZZh6k+5NFP5C28A4Hkhuh6
lpAGQxQ2h8SvAoIBAQDUTtBbn3aKbUvaoFZBDNTHXQaE/SVWtApsYZraJDmNVfC2
DvnQDgxBtNpuyOt2H/Rx62HN0QbDN5bHHFAIclhHpehAAs7mc5MRMatw/zBuEAx6
TsBBVD5Z1L+A5Odu9FoTs842gOU6o/CwsWPgQ4w4y31Ahgmc1DuAVsPWj5ZRcYHj
4oYRNAotaAdb8apB8a2cYh1ZuEIoeplR4jiNNpczj3cLKSvWQVMO7v/ibwnfCBV7
UspT0qThmtxnQNx1daxAcSKUW/WMpUPRT7AJJ03v67k3Gm8HLuZs5FD/a5lxK8Kj
DiLNxVOA1s7VL09UGSHNMMQE5jgVI9xhNlqKd5w5AoIBAQCe0y639szUMMOjLbAW
5+ciGYmZWJkEeVktT4ec8wx7O1Xjh4NqENH9x1IKQXfNjQGKHg0spgWjYXZDVmWT
XPk1PafezNN9+1O1JRChKg58NMKvlkbZBs6KwzIFMf6VilygNlZMPNGa+HMBfiHN
O8DOMCxAyt6KYPACGeJwgD0XfQs7ROyC4ULegfIHR93vNq64ya55/ZpxAiMz0Et2
EfQvffullYBQlY4AVrOzOfWxD1xW2TB8eBQdy/WhIcacKSJzxGF5RwIqBsQJ1Phw
ykQAay9mjWJDdhPYDdV8u5ThnSD3EPxgkCsoO78b0ZpwWMobiI8DFAYDEXwedMQ8
09mdAoIBAQCxsvbB38jiPPqsF77jxWzogVdPR6xO6oLkVcvu4KI1Fq0wFdDH4eBJ
gOxV8e+wsQBVO1XXP4i3UhbCq7gPgh+YJWWy66RKiuwpIPf2yvGbbUgBpp73MxOx
ycerjh7LRS56gAAJ6d5hyYf293E5QEAOZh+Niuz3m1xMytdvG7IefX3TMe0GMgZh
I5djpaHvjgB6qHeVsLuUWjC2FPXtrz2STay08tq/Pc5g+57bTfOyHo0BZ3C/uhZa
l1NzswracGQIzo03zk/X3Z6P2YOea4BkZ0Iwh34wOHJnTkfEeSx6y+oSFMcFRthT
yfFCZUk/sVCc/C1a4VigczXftUGiRrTR
-----END PRIVATE KEY-----
ca.crt: |
-----BEGIN CERTIFICATE-----
MIIFczCCA1ugAwIBAgIQTYJITQ3SZ4BBS9UzXfJIuTANBgkqhkiG9w0BAQsFADBM
MRMwEQYKCZImiZPyLGQBGRYDdm13MRQwEgYKCZImiZPyLGQBGRYEY29ycDEfMB0G
A1UEAxMWY29udHJvbGNlbnRlci5jb3JwLnZtdzAeFw0yMjAzMjExOTE3MjhaFw0z
NzAzMjExOTI3MjNaMEwxEzARBgoJkiaJk/IsZAEZFgN2bXcxFDASBgoJkiaJk/Is
ZAEZFgRjb3JwMR8wHQYDVQQDExZjb250cm9sY2VudGVyLmNvcnAudm13MIICIjAN
BgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEA2OYKxckOjhgufWu1YEnatvJ1M127
gwPbFNj11/dICXaPe+mjN1Hce0PiS2QaaeAe8kH+mOKRa2JjaGdXr6rOiB80KZOR
uw0GzSJyL5w7ewR+NJf31YO62BD/mt3sHeMnCXmSBxOQvb0nGkhTr1y+rDpvxJ87
zNczgfN54to6S379wjOsC4bkHLnMJ5EtJG78pPqX1+1wcVOURNJ6y9BcejLnoy/y
CFpXKOVxKHzy2nnsitAuBb+hD+Jxw8/jFQUhxH0VlgyfXCQdegasSA9RHtZtfpVs
hshisjkSlvQmbsEknBZrAfBVIYidwt3w050jVhiUs5Ql6vDotY6Gqtzzgq0obv6P
7E9NPej3BzhPSIUyqnpf57UWI4zUiRJvbSu/J2MCBKHwYfzke1cnvLA7viDEdB9+
/Htk9aG9/1B6ddDfafrcSOWtkTfHWYLv21o3Uwoh9W5OpK9JikZu/PqnpZkUi+2C
L+WCww/BS1yhQwVif6PqUMeSLz3jtq3w6R/ruUMlO+0E5//bskDT6QGxBgcvMF9n
Dl+u0uqHKOdiUvOXBtF139HKUrZsq0m3WPoel2/p+cVVJYsyJG/rRpeh1g/X0cB3
9EuTjX6vnrT+IS8ZfAaoHzpmgh1vGu2r2xgPq2E8x4ji9FGV8YTjAs60Nw7YxKUW
Wgj+YNpxP2SxFqUCAwEAAaNRME8wCwYDVR0PBAQDAgGGMA8GA1UdEwEB/wQFMAMB
Af8wHQYDVR0OBBYEFMaD85ZQCWnnZTKtlo2FgeBfJBnHMBAGCSsGAQQBgjcVAQQD
AgEAMA0GCSqGSIb3DQEBCwUAA4ICAQAutXwOtsmYcbj/bs3Mydx0Di9m+6UVTEZd
ORRrTus/BL/TNryO7zo2beczGPK26MwqhmUZiaF61jRb36kxmFPVx2uV2np4LbQj
5MrxtPzf2XXy4b7ADqQpLgu4rR3mZiXGmzUoV17hmAhyfSU1qm4FssXGK2ypWsQs
BwsKX4DsIijJJZbXwKFaauq0LtnkgeGWdoEFFWAH0yJWPbz9h+ovlCxq0DBiG00l
brnY90sqpoiWTxMKNCXDDhNjvtxO3kQIDQVvbNMCEbmYG+RrWQHtvufw97RK/cTL
9dKFSblIIizMINVwM/gqtlVVvWP1EFaUy0xG5bvOO+SCe+TlA7rz4/RORqqE5Ugg
7F8fWz+o6BM/qf/Kwh+WN42dyR1rOsFqEVNamZLjrAzgwjQ/nquRRMl2cK6yg6Fq
d0O42wwYPpLUEFv4xe4a3kpRvvhshNkzR4IacbmaUlnzmlewoFXVueEblviBHJoV
1OUC6qfLkCjfCEv470Kr5vDe5Y/l/7j8EYj7a/wa2++kq+7xd+bj/DDed85fm3Yk
dhfp7bGXKm4KbPLzkSpiYWbE+EbArLtIk62exjcJvJPdoxMTxgbdelzl/snPLrdg
w0oGuTTBfxSMKs767N3G1q5tz0mwFpIqIQtXUSmaJ+9p7IkpWcThLnyYYo1IpWm/
ZHtjzZMQVA==
-----END CERTIFICATE-----
Some important things to note from this specification:
- Ingress is enabled (ingress: enabled: true).
- Ingress is configured for URLs ending in / (prefix:).
- The FQDN for Grafana is grafana.corp.vmw (virtual_host_fqdn:).
- A custom certificate is supplied under the ingress section (tls.crt, tls.key, ca.crt). This is the wildcard cert for the lab (it works for anything ending in corp.vmw).
- The pvc for grafana is 2GB and will be created under the k8s-policy storageClass.
- The admin password for the Grafana UI is VMware1! (base64-encoded as Vk13YXJlMSE= under grafana: secret: admin_password:); a quick way to generate this value is shown below.
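If you want a different admin password, the value just needs to be the base64 encoding of the plain-text string. For example:
echo -n 'VMware1!' | base64
Vk13YXJlMSE=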
The Grafana package can be installed.
tanzu package install grafana -p grafana.tanzu.vmware.com -v 7.5.16+vmware.1-tkg.1 --values-file grafana-data-values.yaml -n tanzu-system-dashboards
In the vSphere Client, you'll see the same storage-related activity when the Grafana pvc is created as was observed during the Prometheus package deployment.
Check to see that the necessary components have been successfully deployed:
tanzu package installed list -n tanzu-system-dashboards
NAME PACKAGE-NAME PACKAGE-VERSION STATUS
grafana grafana.tanzu.vmware.com 7.5.16+vmware.1-tkg.1 Reconcile succeeded
kubectl -n tanzu-system-dashboards get all
NAME READY STATUS RESTARTS AGE
pod/grafana-594559bc55-9mzzn 2/2 Running 0 2m18s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/grafana NodePort 10.98.84.82 <none> 80:31110/TCP 2m18s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/grafana 1/1 1 1 2m18s
NAME DESIRED CURRENT READY AGE
replicaset.apps/grafana-594559bc55 1 1 1 2m18s
kubectl -n tanzu-system-dashboards get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
grafana-pvc Bound pvc-243405bc-93e3-4570-bfdc-8451cbe503af 2Gi RWO k8s-policy 2m41s
kubectl -n tanzu-system-dashboards get httpproxy
NAME FQDN TLS SECRET STATUS STATUS DESCRIPTION
grafana-httpproxy grafana.corp.vmw grafana-tls valid Valid HTTPProxy

You can log in to Grafana at https://grafana.corp.vmw with the admin username and password configured in the data-values file (VMware1!).
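If the page doesn't come up, a quick way to confirm that Grafana itself is healthy, independent of the browser, is its health endpoint (again with -k, since curl may not trust the lab CA):
curl -k https://grafana.corp.vmw/api/health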

Once logged in, you can navigate to Dashboards > Manage and then access the TKG Kubernetes cluster monitoring (via Prometheus) dashboard.

Install the fluent-bit package
You need a destination for fluent-bit log forwarding and my lab is using vRealize Log Insight (vRLI). By default, vRLI is listening for syslog traffic on port 514 (UDP), so this is what will be used when configuring fluent-bit.
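Before deploying fluent-bit, you can optionally confirm that the syslog destination is reachable by sending a test message from a machine on the same network. A sketch using the util-linux logger utility (flag names can vary slightly between distributions):
logger -n vrli-01a.corp.vmw -P 514 -d "fluent-bit pre-check from tkg2-cluster-1"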
Create the namespace.
kubectl create ns tanzu-system-logging
Create the data-values specification.
fluent-bit-data-values.yaml
namespace: "tanzu-system-logging"
fluent_bit:
config:
service: |
[Service]
Flush 1
Log_Level info
Daemon off
Parsers_File parsers.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
outputs: |
[OUTPUT]
Name stdout
Match *
[OUTPUT]
Name syslog
Match kube.*
Host vrli-01a.corp.vmw
Port 514
Mode udp
Syslog_Format rfc5424
Syslog_Hostname_key tkg2_cluster-1
Syslog_Appname_key pod_name
Syslog_Procid_key container_name
Syslog_Message_key message
Syslog_SD_key k8s
Syslog_SD_key labels
Syslog_SD_key annotations
Syslog_SD_key tkg
[OUTPUT]
Name syslog
Match kube_systemd.*
Host vrli-01a.corp.vmw
Port 514
Mode udp
Syslog_Format rfc5424
Syslog_Hostname_key tkg2_cluster-1
Syslog_Appname_key tkg2_instance
Syslog_Message_key MESSAGE
Syslog_SD_key systemd
inputs: |
[INPUT]
Name tail
Path /var/log/containers/*.log
Parser docker
Tag kube.*
Mem_Buf_Limit 5MB
Skip_Long_Lines On
[INPUT]
Name tail
Tag audit.*
Path /var/log/audit/audit.log
Parser logfmt
DB /var/log/flb_system_audit.db
Mem_Buf_Limit 50MB
Refresh_Interval 10
Skip_Long_Lines On
[INPUT]
Name systemd
Tag kube_systemd.*
Path /var/log/journal
DB /var/log/flb_kube_systemd.db
Systemd_Filter _SYSTEMD_UNIT=kubelet.service
Systemd_Filter _SYSTEMD_UNIT=containerd.service
Read_From_Tail On
Strip_Underscores On
[INPUT]
Name tail
Tag apiserver_audit.*
Path /var/log/kubernetes/audit.log
Parser json
DB /var/log/flb_kube_audit.db
Mem_Buf_Limit 50MB
Refresh_Interval 10
Skip_Long_Lines On
filters: |
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc:443
Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
Kube_Tag_Prefix kube.var.log.containers.
Merge_Log On
Merge_Log_Key log_processed
K8S-Logging.Parser On
K8S-Logging.Exclude On
[FILTER]
Name record_modifier
Match *
Record tkg_cluster tkg-wld-01a
Record tkg_instance tkg-mgmt
[FILTER]
Name nest
Match kube.*
Operation nest
Wildcard tkg_instance*
Nest_Under tkg
[FILTER]
Name nest
Match kube_systemd.*
Operation nest
Wildcard SYSTEMD*
Nest_Under systemd
[FILTER]
Name modify
Match kube.*
Copy kubernetes k8s
[FILTER]
Name nest
Match kube.*
Operation lift
Nested_Under kubernetes
parsers: |
[PARSER]
Name apache
Format regex
Regex ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
There is very little that is customized in this specification (the parsers section above is truncated; the remainder follows the package defaults):
- Host: vrli-01a.corp.vmw (the vRLI FQDN)
- Syslog_Hostname_key: tkg2_cluster-1 (arbitrary, but it helps to identify where the logs came from and should be unique if there are multiple fluent-bit installations for multiple clusters)
- Syslog_Appname_key: tkg2_instance (again, arbitrary but will help to identify where the logs came from)
The fluent-bit package can be installed.
tanzu package install fluent-bit -p fluent-bit.tanzu.vmware.com -v 1.8.15+vmware.1-tkg.1 --values-file fluent-bit-data-values.yaml -n tanzu-system-logging
Check to see that the necessary components have been successfully deployed:
tanzu package installed list -n tanzu-system-logging
NAME PACKAGE-NAME PACKAGE-VERSION STATUS
fluent-bit fluent-bit.tanzu.vmware.com 1.8.15+vmware.1-tkg.1 Reconcile succeeded
kubectl -n tanzu-system-logging get all
NAME READY STATUS RESTARTS AGE
pod/fluent-bit-964bh 1/1 Running 0 4m55s
pod/fluent-bit-lmdvk 1/1 Running 0 4m55s
pod/fluent-bit-p28qh 1/1 Running 0 4m55s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/fluent-bit 3 3 3 3 3 <none> 4m55s
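Before checking vRLI, you can tail one of the fluent-bit pods to confirm that records are being read and shipped. Since stdout is one of the configured outputs, forwarded records are echoed in the pod logs:
kubectl -n tanzu-system-logging logs daemonset/fluent-bit --tail=20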
You can quickly validate that data is seen in vRLI on the Overview page:

You can also navigate to Explore Logs to get a more detailed view of the logs coming in to vRLI from the TKGS cluster.

Harbor
One of the other popular packages is the Harbor package. I didn't install it since I had already enabled the Image Registry (Harbor) service on the supervisor cluster. The installation will look very similar to the Prometheus and Grafana installations since it can (and should) use a custom certificate, has pvcs, and uses an ingress. The harbor-data-values.yaml file will look very similar to the one in the Deploy Harbor section of my earlier post, How to configure external-dns with Microsoft DNS in TKG 1.3 (plus Harbor and Contour).