Installing Packages to a TKGS cluster in vSphere 8 with Tanzu

I’ve written a few posts in the past about using packages (previously called extensions) with TKG clusters installed on vSphere: Upgrading from TKG 1.3 to 1.4 (including extensions) on vSphere; How to configure external-dns with Microsoft DNS in TKG 1.3 (plus Harbor and Contour); and Working with TKG Extensions and Shared Services in TKG 1.2. The same package framework is available for TKGS clusters in vSphere with Tanzu as well. In this post, I’ll walk through installing most of the available packages.

Packages provide extra functionality to the TKGS cluster. There are many different kinds of packages that can be installed for different purposes: ingress, dynamic DNS record creation, certificate services, and so on.

Install the tanzu CLI

You can download the tanzu CLI from Customer Connect. You’ll find it under the Tanzu Kubernetes Grid product and there are OS-specific versions available. You should have a .tar.gz file whose contents you’ll need to extract on a system with access to the cluster.

tar -zxvf tanzu-cli-bundle-linux-amd64.tar.gz
 
cli/
cli/core/
cli/core/v0.28.1/
cli/core/v0.28.1/tanzu-core-linux_amd64
cli/tanzu-framework-plugins-standalone-linux-amd64.tar.gz
cli/tanzu-framework-plugins-context-linux-amd64.tar.gz
cli/ytt-linux-amd64-v0.43.1+vmware.1.gz
cli/kapp-linux-amd64-v0.53.2+vmware.1.gz
cli/imgpkg-linux-amd64-v0.31.1+vmware.1.gz
cli/kbld-linux-amd64-v0.35.1+vmware.1.gz
cli/vendir-linux-amd64-v0.30.1+vmware.1.gz

This bundle is for a Linux system, so the tanzu-core-linux_amd64 file needs to be moved to a location on your PATH and made executable.

mv cli/core/v0.28.1/tanzu-core-linux_amd64 /usr/local/bin/tanzu
chmod +x /usr/local/bin/tanzu
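
The version directory in the bundle (v0.28.1 here) changes from release to release, so the path in the mv command above has to be adjusted each time. As a minimal sketch, and assuming a Linux system with find and install available, the two steps could be folded into one helper that locates the binary on its own. The function name install_tanzu_core is made up for this example and is not part of the tanzu CLI.

```shell
#!/bin/sh
# Hypothetical helper: copy the tanzu core binary out of the extracted bundle
# and mark it executable in one step.
# usage: install_tanzu_core <extracted-cli-dir> <destination-path>
install_tanzu_core() {
  bundle_dir="$1"
  dest="$2"
  # The version directory differs between bundles, so search for the binary
  # instead of hard-coding the path.
  core_bin=$(find "$bundle_dir/core" -type f -name 'tanzu-core-linux_amd64' | head -n 1)
  if [ -z "$core_bin" ]; then
    echo "tanzu-core-linux_amd64 not found under $bundle_dir/core" >&2
    return 1
  fi
  # install(1) copies the file and sets its mode, replacing mv + chmod.
  install -m 0755 "$core_bin" "$dest"
}
```

With the extracted bundle in the current directory, it would be called like this (root privileges are typically needed to write to /usr/local/bin):

install_tanzu_core cli /usr/local/bin/tanzu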

You can now run the tanzu init command to do the initial configuration of the tanzu CLI.

tanzu init
 
ℹ  Checking for required plugins...
ℹ  Installing plugin 'secret:v0.28.1' with target 'kubernetes'
ℹ  Installing plugin 'telemetry:v0.28.1' with target 'kubernetes'
ℹ  Installing plugin 'isolated-cluster:v0.28.1'
ℹ  Installing plugin 'login:v0.28.1'
ℹ  Installing plugin 'management-cluster:v0.28.1' with target 'kubernetes'
ℹ  Installing plugin 'package:v0.28.1' with target 'kubernetes'
ℹ  Installing plugin 'pinniped-auth:v0.28.1'
ℹ  Successfully installed all required plugins
✔  successfully initialized CLI

Prepare the TKGS cluster for package installation

Use the kubectl vsphere login command to access the TKGS cluster.

kubectl vsphere login --server wcp.corp.vmw -u vmwadmin@corp.vmw --tanzu-kubernetes-cluster-namespace tkg2-cluster-namespace --tanzu-kubernetes-cluster-name tkg2-cluster-1

A Package Repository is needed as it defines where the packages can be downloaded from.

Create a PackageRepository spec file (packagerepo.yaml in this example) and then deploy it to the cluster:

apiVersion: packaging.carvel.dev/v1alpha1
kind: PackageRepository
metadata:
  name: tanzu-standard
  namespace: tkg-system
spec:
  fetch:
    imgpkgBundle:
      image: projects.registry.vmware.com/tkg/packages/standard/repo:v1.6.0

Note: You can get the proper image value from the VMware Install Package Repository documentation.

kubectl apply -f packagerepo.yaml
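
Since the repository tag is the only value in that spec likely to change between releases, the file can also be generated rather than edited by hand. This is a rough sketch only: render_package_repo is a hypothetical helper name, and the manifest it prints is the same one shown above with the tag passed in as an argument.

```shell
#!/bin/sh
# Hypothetical helper: print a PackageRepository manifest for a given repo tag.
# usage: render_package_repo <tag>
render_package_repo() {
  repo_tag="$1"
  # The heredoc body is the manifest from this post; only the tag is variable.
  cat <<EOF
apiVersion: packaging.carvel.dev/v1alpha1
kind: PackageRepository
metadata:
  name: tanzu-standard
  namespace: tkg-system
spec:
  fetch:
    imgpkgBundle:
      image: projects.registry.vmware.com/tkg/packages/standard/repo:${repo_tag}
EOF
}
```

To reproduce the file used in this post:

render_package_repo v1.6.0 > packagerepo.yaml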

Check to make sure that the PackageRepository object is created:

kubectl get packagerepositories -A
NAMESPACE    NAME             AGE   DESCRIPTION
tkg-system   tanzu-standard   33s   Reconcile succeeded

tanzu package repository list -A
NAMESPACE   NAME            SOURCE                                                                   STATUS
tkg-system  tanzu-standard  (imgpkg) projects.registry.vmware.com/tkg/packages/standard/repo:v1.6.0  Reconcile succeeded
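
The repository can take a little while to reach "Reconcile succeeded" after it is applied. If you are scripting these steps, a small polling loop can hold things up until that status appears. The sketch below assumes nothing beyond a POSIX shell and grep; wait_until_reconciled is a hypothetical name, and the function simply reruns whatever status command you hand it and looks for the "Reconcile succeeded" text seen in the output above.

```shell
#!/bin/sh
# Hypothetical helper: rerun a status command until its output contains
# "Reconcile succeeded", giving up after a fixed number of attempts.
# usage: wait_until_reconciled <attempts> <delay-seconds> <command> [args...]
wait_until_reconciled() {
  attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$attempts" ]; do
    # grep -q only sets the exit status; the command's own output is discarded.
    if "$@" 2>/dev/null | grep -q 'Reconcile succeeded'; then
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$attempt" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  echo "gave up after $attempts attempts: $*" >&2
  return 1
}
```

For example, to check every 10 seconds for up to 30 attempts:

wait_until_reconciled 30 10 kubectl -n tkg-system get packagerepositories tanzu-standard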

Now you’re ready to start installing packages.

Install the cert-manager package

cert-manager is a prerequisite for pretty much every other package, so it needs to be installed first.

Create the cert-manager namespace:

kubectl create ns cert-manager

Before you can install the cert-manager package, you need to know which version to install. You can get a listing of all available packages and their versions by querying the packages CRD.

kubectl -n tkg-system get packages
NAME                                                                 PACKAGEMETADATA NAME                           VERSION                 AGE
cert-manager.tanzu.vmware.com.1.1.0+vmware.1-tkg.2                   cert-manager.tanzu.vmware.com                  1.1.0+vmware.1-tkg.2    2m31s
cert-manager.tanzu.vmware.com.1.1.0+vmware.2-tkg.1                   cert-manager.tanzu.vmware.com                  1.1.0+vmware.2-tkg.1    2m31s
cert-manager.tanzu.vmware.com.1.5.3+vmware.2-tkg.1                   cert-manager.tanzu.vmware.com                  1.5.3+vmware.2-tkg.1    2m31s
cert-manager.tanzu.vmware.com.1.5.3+vmware.4-tkg.1                   cert-manager.tanzu.vmware.com                  1.5.3+vmware.4-tkg.1    2m31s
cert-manager.tanzu.vmware.com.1.7.2+vmware.1-tkg.1                   cert-manager.tanzu.vmware.com                  1.7.2+vmware.1-tkg.1    2m31s
contour.tanzu.vmware.com.1.17.1+vmware.1-tkg.1                       contour.tanzu.vmware.com                       1.17.1+vmware.1-tkg.1   2m31s
contour.tanzu.vmware.com.1.17.2+vmware.1-tkg.2                       contour.tanzu.vmware.com                       1.17.2+vmware.1-tkg.2   2m31s
contour.tanzu.vmware.com.1.17.2+vmware.1-tkg.3                       contour.tanzu.vmware.com                       1.17.2+vmware.1-tkg.3   2m31s
contour.tanzu.vmware.com.1.18.2+vmware.1-tkg.1                       contour.tanzu.vmware.com                       1.18.2+vmware.1-tkg.1   2m31s
contour.tanzu.vmware.com.1.20.2+vmware.1-tkg.1                       contour.tanzu.vmware.com                       1.20.2+vmware.1-tkg.1   2m31s
external-dns.tanzu.vmware.com.0.10.0+vmware.1-tkg.1                  external-dns.tanzu.vmware.com                  0.10.0+vmware.1-tkg.1   2m31s
external-dns.tanzu.vmware.com.0.10.0+vmware.1-tkg.2                  external-dns.tanzu.vmware.com                  0.10.0+vmware.1-tkg.2   2m31s
external-dns.tanzu.vmware.com.0.11.0+vmware.1-tkg.2                  external-dns.tanzu.vmware.com                  0.11.0+vmware.1-tkg.2   2m31s
external-dns.tanzu.vmware.com.0.8.0+vmware.1-tkg.1                   external-dns.tanzu.vmware.com                  0.8.0+vmware.1-tkg.1    2m31s
fluent-bit.tanzu.vmware.com.1.7.5+vmware.1-tkg.1                     fluent-bit.tanzu.vmware.com                    1.7.5+vmware.1-tkg.1    2m31s
fluent-bit.tanzu.vmware.com.1.7.5+vmware.2-tkg.1                     fluent-bit.tanzu.vmware.com                    1.7.5+vmware.2-tkg.1    2m31s
fluent-bit.tanzu.vmware.com.1.8.15+vmware.1-tkg.1                    fluent-bit.tanzu.vmware.com                    1.8.15+vmware.1-tkg.1   2m31s
fluxcd-helm-controller.tanzu.vmware.com.0.21.0+vmware.1-tkg.1        fluxcd-helm-controller.tanzu.vmware.com        0.21.0+vmware.1-tkg.1   2m31s
fluxcd-kustomize-controller.tanzu.vmware.com.0.24.4+vmware.1-tkg.1   fluxcd-kustomize-controller.tanzu.vmware.com   0.24.4+vmware.1-tkg.1   2m31s
fluxcd-source-controller.tanzu.vmware.com.0.24.4+vmware.1-tkg.1      fluxcd-source-controller.tanzu.vmware.com      0.24.4+vmware.1-tkg.1   2m31s
fluxcd-source-controller.tanzu.vmware.com.0.24.4+vmware.1-tkg.2      fluxcd-source-controller.tanzu.vmware.com      0.24.4+vmware.1-tkg.2   2m31s
fluxcd-source-controller.tanzu.vmware.com.0.24.4+vmware.1-tkg.4      fluxcd-source-controller.tanzu.vmware.com      0.24.4+vmware.1-tkg.4   2m31s
grafana.tanzu.vmware.com.7.5.16+vmware.1-tkg.1                       grafana.tanzu.vmware.com                       7.5.16+vmware.1-tkg.1   2m31s
grafana.tanzu.vmware.com.7.5.7+vmware.1-tkg.1                        grafana.tanzu.vmware.com                       7.5.7+vmware.1-tkg.1    2m30s
grafana.tanzu.vmware.com.7.5.7+vmware.2-tkg.1                        grafana.tanzu.vmware.com                       7.5.7+vmware.2-tkg.1    2m30s
harbor.tanzu.vmware.com.2.2.3+vmware.1-tkg.1                         harbor.tanzu.vmware.com                        2.2.3+vmware.1-tkg.1    2m30s
harbor.tanzu.vmware.com.2.2.3+vmware.1-tkg.2                         harbor.tanzu.vmware.com                        2.2.3+vmware.1-tkg.2    2m30s
harbor.tanzu.vmware.com.2.3.3+vmware.1-tkg.1                         harbor.tanzu.vmware.com                        2.3.3+vmware.1-tkg.1    2m30s
harbor.tanzu.vmware.com.2.5.3+vmware.1-tkg.1                         harbor.tanzu.vmware.com                        2.5.3+vmware.1-tkg.1    2m30s
multus-cni.tanzu.vmware.com.3.7.1+vmware.1-tkg.1                     multus-cni.tanzu.vmware.com                    3.7.1+vmware.1-tkg.1    2m30s
multus-cni.tanzu.vmware.com.3.7.1+vmware.2-tkg.1                     multus-cni.tanzu.vmware.com                    3.7.1+vmware.2-tkg.1    2m30s
multus-cni.tanzu.vmware.com.3.7.1+vmware.2-tkg.2                     multus-cni.tanzu.vmware.com                    3.7.1+vmware.2-tkg.2    2m30s
multus-cni.tanzu.vmware.com.3.8.0+vmware.1-tkg.1                     multus-cni.tanzu.vmware.com                    3.8.0+vmware.1-tkg.1    2m30s
prometheus.tanzu.vmware.com.2.27.0+vmware.1-tkg.1                    prometheus.tanzu.vmware.com                    2.27.0+vmware.1-tkg.1   2m29s
prometheus.tanzu.vmware.com.2.27.0+vmware.2-tkg.1                    prometheus.tanzu.vmware.com                    2.27.0+vmware.2-tkg.1   2m29s
prometheus.tanzu.vmware.com.2.36.2+vmware.1-tkg.1                    prometheus.tanzu.vmware.com                    2.36.2+vmware.1-tkg.1   2m29s
whereabouts.tanzu.vmware.com.0.5.1+vmware.2-tkg.1                    whereabouts.tanzu.vmware.com                   0.5.1+vmware.2-tkg.1    2m29s

We’ll go with the most recent version, 1.7.2, for this installation and install it to the cert-manager namespace that was created earlier.

tanzu package install cert-manager -p cert-manager.tanzu.vmware.com -n cert-manager -v 1.7.2+vmware.1-tkg.1
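
Picking the newest version out of that long listing by eye is easy to get wrong, especially for packages like external-dns where 0.8.0 is listed after 0.11.0. As a sketch, the version can be pulled out of the kubectl output with awk and a version-aware sort. This assumes a sort that supports the -V option (GNU coreutils does), and latest_package_version is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl -n tkg-system get packages` output on
# stdin and print the highest version of one package.
# usage: latest_package_version <package-metadata-name>
latest_package_version() {
  # Column 2 is the package metadata name and column 3 is the version.
  # sort -V orders versions numerically, so 0.11.0 sorts after 0.8.0.
  awk -v pkg="$1" '$2 == pkg { print $3 }' | sort -V | tail -n 1
}
```

Against the listing shown earlier, the following prints 1.7.2+vmware.1-tkg.1, which is the value passed to -v in the install command:

kubectl -n tkg-system get packages | latest_package_version cert-manager.tanzu.vmware.com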

Output of tanzu package install cert-manager:

10:33:00AM: Creating service account 'cert-manager-cert-manager-sa'
10:33:00AM: Creating cluster admin role 'cert-manager-cert-manager-cluster-role'
10:33:00AM: Creating cluster role binding 'cert-manager-cert-manager-cluster-rolebinding'
10:33:00AM: Creating overlay secrets
10:33:00AM: Creating package install resource
10:33:00AM: Waiting for PackageInstall reconciliation for 'cert-manager'
10:33:00AM: Fetch started (3s ago)
10:33:03AM: Fetching
            | apiVersion: vendir.k14s.io/v1alpha1
            | directories:
            | - contents:
            |   - imgpkgBundle:
            |       image: projects.registry.vmware.com/tkg/packages/standard/cert-manager@sha256:e7711b3ce0f05ece458d43c4ddb57f3ff3b98fe562d83b07db4b095d6789c292
            |     path: .
            |   path: "0"
            | kind: LockConfig
            |
10:33:03AM: Fetch succeeded
10:33:03AM: Template succeeded
10:33:03AM: Deploy started (2s ago)
10:33:05AM: Deploying
            | Target cluster 'https://10.96.0.1:443' (nodes: tkg2-cluster-1-5vs4x-lflzx, 2+)
            | Changes
            | Namespace     Name                                                Kind                            Age  Op      Op st.  Wait to    Rs  Ri
            | (cluster)     cert-manager                                        Namespace                       10m  update  -       reconcile  ok  -
            | ^             cert-manager-cainjector                             ClusterRole                     -    create  -       reconcile  -   -
            | ^             cert-manager-cainjector                             ClusterRoleBinding              -    create  -       reconcile  -   -
            | ^             cert-manager-controller-approve:cert-manager-io     ClusterRole                     -    create  -       reconcile  -   -
            | ^             cert-manager-controller-approve:cert-manager-io     ClusterRoleBinding              -    create  -       reconcile  -   -
            | ^             cert-manager-controller-certificates                ClusterRole                     -    create  -       reconcile  -   -
            | ^             cert-manager-controller-certificates                ClusterRoleBinding              -    create  -       reconcile  -   -
            | ^             cert-manager-controller-certificatesigningrequests  ClusterRole                     -    create  -       reconcile  -   -
            | ^             cert-manager-controller-certificatesigningrequests  ClusterRoleBinding              -    create  -       reconcile  -   -
            | ^             cert-manager-controller-challenges                  ClusterRole                     -    create  -       reconcile  -   -
            | ^             cert-manager-controller-challenges                  ClusterRoleBinding              -    create  -       reconcile  -   -
            | ^             cert-manager-controller-clusterissuers              ClusterRole                     -    create  -       reconcile  -   -
            | ^             cert-manager-controller-clusterissuers              ClusterRoleBinding              -    create  -       reconcile  -   -
            | ^             cert-manager-controller-ingress-shim                ClusterRole                     -    create  -       reconcile  -   -
            | ^             cert-manager-controller-ingress-shim                ClusterRoleBinding              -    create  -       reconcile  -   -
            | ^             cert-manager-controller-issuers                     ClusterRole                     -    create  -       reconcile  -   -
            | ^             cert-manager-controller-issuers                     ClusterRoleBinding              -    create  -       reconcile  -   -
            | ^             cert-manager-controller-orders                      ClusterRole                     -    create  -       reconcile  -   -
            | ^             cert-manager-controller-orders                      ClusterRoleBinding              -    create  -       reconcile  -   -
            | ^             cert-manager-edit                                   ClusterRole                     -    create  -       reconcile  -   -
            | ^             cert-manager-view                                   ClusterRole                     -    create  -       reconcile  -   -
            | ^             cert-manager-webhook                                MutatingWebhookConfiguration    -    create  -       reconcile  -   -
            | ^             cert-manager-webhook                                ValidatingWebhookConfiguration  -    create  -       reconcile  -   -
            | ^             cert-manager-webhook:subjectaccessreviews           ClusterRole                     -    create  -       reconcile  -   -
            | ^             cert-manager-webhook:subjectaccessreviews           ClusterRoleBinding              -    create  -       reconcile  -   -
            | ^             certificaterequests.cert-manager.io                 CustomResourceDefinition        -    create  -       reconcile  -   -
            | ^             certificates.cert-manager.io                        CustomResourceDefinition        -    create  -       reconcile  -   -
            | ^             challenges.acme.cert-manager.io                     CustomResourceDefinition        -    create  -       reconcile  -   -
            | ^             clusterissuers.cert-manager.io                      CustomResourceDefinition        -    create  -       reconcile  -   -
            | ^             issuers.cert-manager.io                             CustomResourceDefinition        -    create  -       reconcile  -   -
            | ^             orders.acme.cert-manager.io                         CustomResourceDefinition        -    create  -       reconcile  -   -
            | cert-manager  cert-manager                                        Deployment                      -    create  -       reconcile  -   -
            | ^             cert-manager                                        Service                         -    create  -       reconcile  -   -
            | ^             cert-manager                                        ServiceAccount                  -    create  -       reconcile  -   -
            | ^             cert-manager-cainjector                             Deployment                      -    create  -       reconcile  -   -
            | ^             cert-manager-cainjector                             ServiceAccount                  -    create  -       reconcile  -   -
            | ^             cert-manager-webhook                                ConfigMap                       -    create  -       reconcile  -   -
            | ^             cert-manager-webhook                                Deployment                      -    create  -       reconcile  -   -
            | ^             cert-manager-webhook                                Service                         -    create  -       reconcile  -   -
            | ^             cert-manager-webhook                                ServiceAccount                  -    create  -       reconcile  -   -
            | ^             cert-manager-webhook:dynamic-serving                Role                            -    create  -       reconcile  -   -
            | ^             cert-manager-webhook:dynamic-serving                RoleBinding                     -    create  -       reconcile  -   -
            | kube-system   cert-manager-cainjector:leaderelection              Role                            -    create  -       reconcile  -   -
            | ^             cert-manager-cainjector:leaderelection              RoleBinding                     -    create  -       reconcile  -   -
            | ^             cert-manager:leaderelection                         Role                            -    create  -       reconcile  -   -
            | ^             cert-manager:leaderelection                         RoleBinding                     -    create  -       reconcile  -   -
            | Op:      45 create, 0 delete, 1 update, 0 noop, 0 exists
            | Wait to: 46 reconcile, 0 delete, 0 noop
            | 5:33:07PM: ---- applying 23 changes [0/46 done] ----
            | 5:33:07PM: create validatingwebhookconfiguration/cert-manager-webhook (admissionregistration.k8s.io/v1) cluster
            | 5:33:07PM: create clusterrole/cert-manager-controller-challenges (rbac.authorization.k8s.io/v1) cluster
            | 5:33:08PM: create customresourcedefinition/certificaterequests.cert-manager.io (apiextensions.k8s.io/v1) cluster
            | 5:33:08PM: create customresourcedefinition/certificates.cert-manager.io (apiextensions.k8s.io/v1) cluster
            | 5:33:09PM: update namespace/cert-manager (v1) cluster
            | 5:33:09PM: create customresourcedefinition/challenges.acme.cert-manager.io (apiextensions.k8s.io/v1) cluster
            | 5:33:09PM: create clusterrole/cert-manager-controller-issuers (rbac.authorization.k8s.io/v1) cluster
            | 5:33:09PM: create clusterrole/cert-manager-cainjector (rbac.authorization.k8s.io/v1) cluster
            | 5:33:09PM: create clusterrole/cert-manager-controller-certificates (rbac.authorization.k8s.io/v1) cluster
            | 5:33:09PM: create clusterrole/cert-manager-controller-clusterissuers (rbac.authorization.k8s.io/v1) cluster
            | 5:33:09PM: create clusterrole/cert-manager-controller-orders (rbac.authorization.k8s.io/v1) cluster
            | 5:33:09PM: create clusterrole/cert-manager-controller-certificatesigningrequests (rbac.authorization.k8s.io/v1) cluster
            | 5:33:09PM: create clusterrole/cert-manager-controller-ingress-shim (rbac.authorization.k8s.io/v1) cluster
            | 5:33:09PM: create clusterrole/cert-manager-view (rbac.authorization.k8s.io/v1) cluster
            | 5:33:09PM: create clusterrole/cert-manager-edit (rbac.authorization.k8s.io/v1) cluster
            | 5:33:09PM: create clusterrole/cert-manager-controller-approve:cert-manager-io (rbac.authorization.k8s.io/v1) cluster
            | 5:33:09PM: create clusterrole/cert-manager-webhook:subjectaccessreviews (rbac.authorization.k8s.io/v1) cluster
            | 5:33:09PM: create role/cert-manager-cainjector:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
            | 5:33:09PM: create role/cert-manager:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
            | 5:33:09PM: create mutatingwebhookconfiguration/cert-manager-webhook (admissionregistration.k8s.io/v1) cluster
            | 5:33:09PM: create customresourcedefinition/issuers.cert-manager.io (apiextensions.k8s.io/v1) cluster
            | 5:33:09PM: create customresourcedefinition/clusterissuers.cert-manager.io (apiextensions.k8s.io/v1) cluster
            | 5:33:10PM: create customresourcedefinition/orders.acme.cert-manager.io (apiextensions.k8s.io/v1) cluster
            | 5:33:10PM: ---- waiting on 23 changes [0/46 done] ----
            | 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-challenges (rbac.authorization.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile validatingwebhookconfiguration/cert-manager-webhook (admissionregistration.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile customresourcedefinition/orders.acme.cert-manager.io (apiextensions.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile customresourcedefinition/challenges.acme.cert-manager.io (apiextensions.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-issuers (rbac.authorization.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile namespace/cert-manager (v1) cluster
            | 5:33:10PM: ok: reconcile customresourcedefinition/certificaterequests.cert-manager.io (apiextensions.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile clusterrole/cert-manager-cainjector (rbac.authorization.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-clusterissuers (rbac.authorization.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-certificates (rbac.authorization.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile customresourcedefinition/certificates.cert-manager.io (apiextensions.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile clusterrole/cert-manager-edit (rbac.authorization.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-orders (rbac.authorization.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-ingress-shim (rbac.authorization.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-certificatesigningrequests (rbac.authorization.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile clusterrole/cert-manager-view (rbac.authorization.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile clusterrole/cert-manager-webhook:subjectaccessreviews (rbac.authorization.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile clusterrole/cert-manager-controller-approve:cert-manager-io (rbac.authorization.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile mutatingwebhookconfiguration/cert-manager-webhook (admissionregistration.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile role/cert-manager:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
            | 5:33:10PM: ok: reconcile role/cert-manager-cainjector:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
            | 5:33:10PM: ok: reconcile customresourcedefinition/clusterissuers.cert-manager.io (apiextensions.k8s.io/v1) cluster
            | 5:33:10PM: ok: reconcile customresourcedefinition/issuers.cert-manager.io (apiextensions.k8s.io/v1) cluster
            | 5:33:10PM: ---- applying 5 changes [23/46 done] ----
            | 5:33:10PM: create role/cert-manager-webhook:dynamic-serving (rbac.authorization.k8s.io/v1) namespace: cert-manager
            | 5:33:10PM: create configmap/cert-manager-webhook (v1) namespace: cert-manager
            | 5:33:10PM: create serviceaccount/cert-manager (v1) namespace: cert-manager
            | 5:33:11PM: create serviceaccount/cert-manager-webhook (v1) namespace: cert-manager
            | 5:33:11PM: create serviceaccount/cert-manager-cainjector (v1) namespace: cert-manager
            | 5:33:11PM: ---- waiting on 5 changes [23/46 done] ----
            | 5:33:11PM: ok: reconcile serviceaccount/cert-manager-webhook (v1) namespace: cert-manager
            | 5:33:11PM: ok: reconcile role/cert-manager-webhook:dynamic-serving (rbac.authorization.k8s.io/v1) namespace: cert-manager
            | 5:33:11PM: ok: reconcile serviceaccount/cert-manager-cainjector (v1) namespace: cert-manager
            | 5:33:11PM: ok: reconcile serviceaccount/cert-manager (v1) namespace: cert-manager
            | 5:33:11PM: ok: reconcile configmap/cert-manager-webhook (v1) namespace: cert-manager
            | 5:33:11PM: ---- applying 13 changes [28/46 done] ----
            | 5:33:11PM: create clusterrolebinding/cert-manager-controller-challenges (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: create clusterrolebinding/cert-manager-controller-issuers (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: create clusterrolebinding/cert-manager-cainjector (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: create clusterrolebinding/cert-manager-controller-clusterissuers (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: create rolebinding/cert-manager-webhook:dynamic-serving (rbac.authorization.k8s.io/v1) namespace: cert-manager
            | 5:33:11PM: create clusterrolebinding/cert-manager-controller-certificates (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: create clusterrolebinding/cert-manager-controller-ingress-shim (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: create clusterrolebinding/cert-manager-controller-orders (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: create clusterrolebinding/cert-manager-controller-certificatesigningrequests (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: create clusterrolebinding/cert-manager-controller-approve:cert-manager-io (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: create rolebinding/cert-manager-cainjector:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
            | 5:33:11PM: create clusterrolebinding/cert-manager-webhook:subjectaccessreviews (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: create rolebinding/cert-manager:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
            | 5:33:11PM: ---- waiting on 13 changes [28/46 done] ----
            | 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-clusterissuers (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-challenges (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: ok: reconcile rolebinding/cert-manager-webhook:dynamic-serving (rbac.authorization.k8s.io/v1) namespace: cert-manager
            | 5:33:11PM: ok: reconcile rolebinding/cert-manager:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
            | 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-issuers (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-cainjector (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-certificates (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: ok: reconcile rolebinding/cert-manager-cainjector:leaderelection (rbac.authorization.k8s.io/v1) namespace: kube-system
            | 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-webhook:subjectaccessreviews (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-certificatesigningrequests (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-ingress-shim (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-approve:cert-manager-io (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: ok: reconcile clusterrolebinding/cert-manager-controller-orders (rbac.authorization.k8s.io/v1) cluster
            | 5:33:11PM: ---- applying 5 changes [41/46 done] ----
            | 5:33:12PM: create service/cert-manager (v1) namespace: cert-manager
            | 5:33:12PM: create service/cert-manager-webhook (v1) namespace: cert-manager
            | 5:33:13PM: create deployment/cert-manager-cainjector (apps/v1) namespace: cert-manager
            | 5:33:13PM: create deployment/cert-manager (apps/v1) namespace: cert-manager
            | 5:33:13PM: create deployment/cert-manager-webhook (apps/v1) namespace: cert-manager
            | 5:33:13PM: ---- waiting on 5 changes [41/46 done] ----
            | 5:33:13PM: ok: reconcile service/cert-manager-webhook (v1) namespace: cert-manager
            | 5:33:13PM: ok: reconcile service/cert-manager (v1) namespace: cert-manager
            | 5:33:13PM: ongoing: reconcile deployment/cert-manager-cainjector (apps/v1) namespace: cert-manager
            | 5:33:13PM:  ^ Waiting for 1 unavailable replicas
            | 5:33:13PM:  L ok: waiting on replicaset/cert-manager-cainjector-79bf859fb7 (apps/v1) namespace: cert-manager
            | 5:33:13PM:  L ongoing: waiting on pod/cert-manager-cainjector-79bf859fb7-hnzc4 (v1) namespace: cert-manager
            | 5:33:13PM:     ^ Pending: ContainerCreating
            | 5:33:13PM: ongoing: reconcile deployment/cert-manager (apps/v1) namespace: cert-manager
            | 5:33:13PM:  ^ Waiting for generation 2 to be observed
            | 5:33:13PM:  L ok: waiting on replicaset/cert-manager-6c69844b6b (apps/v1) namespace: cert-manager
            | 5:33:13PM:  L ongoing: waiting on pod/cert-manager-6c69844b6b-bnmcp (v1) namespace: cert-manager
            | 5:33:13PM:     ^ Pending: ContainerCreating
            | 5:33:13PM: ongoing: reconcile deployment/cert-manager-webhook (apps/v1) namespace: cert-manager
            | 5:33:13PM:  ^ Waiting for generation 2 to be observed
            | 5:33:13PM:  L ok: waiting on replicaset/cert-manager-webhook-599749fbd5 (apps/v1) namespace: cert-manager
            | 5:33:13PM:  L ongoing: waiting on pod/cert-manager-webhook-599749fbd5-5c2wx (v1) namespace: cert-manager
            | 5:33:13PM:     ^ Pending: ContainerCreating
            | 5:33:13PM: ---- waiting on 3 changes [43/46 done] ----
            | 5:33:13PM: ongoing: reconcile deployment/cert-manager-webhook (apps/v1) namespace: cert-manager
            | 5:33:13PM:  ^ Waiting for 1 unavailable replicas
            | 5:33:13PM:  L ok: waiting on replicaset/cert-manager-webhook-599749fbd5 (apps/v1) namespace: cert-manager
            | 5:33:13PM:  L ongoing: waiting on pod/cert-manager-webhook-599749fbd5-5c2wx (v1) namespace: cert-manager
            | 5:33:13PM:     ^ Pending: ContainerCreating
            | 5:33:13PM: ongoing: reconcile deployment/cert-manager (apps/v1) namespace: cert-manager
            | 5:33:13PM:  ^ Waiting for 1 unavailable replicas
            | 5:33:13PM:  L ok: waiting on replicaset/cert-manager-6c69844b6b (apps/v1) namespace: cert-manager
            | 5:33:13PM:  L ongoing: waiting on pod/cert-manager-6c69844b6b-bnmcp (v1) namespace: cert-manager
            | 5:33:13PM:     ^ Pending: ContainerCreating
            | 5:33:27PM: ok: reconcile deployment/cert-manager (apps/v1) namespace: cert-manager
            | 5:33:27PM: ---- waiting on 2 changes [44/46 done] ----
            | 5:33:33PM: ongoing: reconcile deployment/cert-manager-webhook (apps/v1) namespace: cert-manager
            | 5:33:33PM:  ^ Waiting for 1 unavailable replicas
            | 5:33:33PM:  L ok: waiting on replicaset/cert-manager-webhook-599749fbd5 (apps/v1) namespace: cert-manager
            | 5:33:33PM:  L ongoing: waiting on pod/cert-manager-webhook-599749fbd5-5c2wx (v1) namespace: cert-manager
            | 5:33:33PM:     ^ Condition Ready is not True (False)
            | 5:33:37PM: ok: reconcile deployment/cert-manager-cainjector (apps/v1) namespace: cert-manager
            | 5:33:37PM: ---- waiting on 1 changes [45/46 done] ----
            | 5:33:37PM: ongoing: reconcile deployment/cert-manager-webhook (apps/v1) namespace: cert-manager
            | 5:33:37PM:  ^ Waiting for 1 unavailable replicas
            | 5:33:37PM:  L ok: waiting on replicaset/cert-manager-webhook-599749fbd5 (apps/v1) namespace: cert-manager
            | 5:33:37PM:  L ok: waiting on pod/cert-manager-webhook-599749fbd5-5c2wx (v1) namespace: cert-manager
            | 5:33:38PM: ok: reconcile deployment/cert-manager-webhook (apps/v1) namespace: cert-manager
            | 5:33:38PM: ---- applying complete [46/46 done] ----
            | 5:33:38PM: ---- waiting complete [46/46 done] ----
            | Succeeded
10:33:38AM: Deploy succeeded (1s ago)

You can use either the kubectl or the tanzu command to query the installation of the cert-manager package.

kubectl -n cert-manager get packageinstalls
NAME           PACKAGE NAME                    PACKAGE VERSION        DESCRIPTION           AGE
cert-manager   cert-manager.tanzu.vmware.com   1.7.2+vmware.1-tkg.1   Reconcile succeeded   106s

tanzu package installed list -n cert-manager
 
  NAME          PACKAGE-NAME                   PACKAGE-VERSION       STATUS
  cert-manager  cert-manager.tanzu.vmware.com  1.7.2+vmware.1-tkg.1  Reconcile succeeded

tanzu package installed get -n cert-manager cert-manager
 
NAMESPACE:          cert-manager
NAME:               cert-manager
PACKAGE-NAME:       cert-manager.tanzu.vmware.com
PACKAGE-VERSION:    1.7.2+vmware.1-tkg.1
STATUS:             Reconcile succeeded
CONDITIONS:         - type: ReconcileSucceeded
  status: "True"
  reason: ""
  message: ""

You can also check the cert-manager namespace for created resources.

kubectl -n cert-manager get all
NAME                                           READY   STATUS    RESTARTS   AGE
pod/cert-manager-6c69844b6b-bnmcp              1/1     Running   0          16m
pod/cert-manager-cainjector-79bf859fb7-hnzc4   1/1     Running   0          16m
pod/cert-manager-webhook-599749fbd5-5c2wx      1/1     Running   0          16m
 
NAME                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/cert-manager           ClusterIP   10.105.81.68    <none>        9402/TCP   16m
service/cert-manager-webhook   ClusterIP   10.96.177.102   <none>        443/TCP    16m
 
NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cert-manager              1/1     1            1           16m
deployment.apps/cert-manager-cainjector   1/1     1            1           16m
deployment.apps/cert-manager-webhook      1/1     1            1           16m
 
NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/cert-manager-6c69844b6b              1         1         1       16m
replicaset.apps/cert-manager-cainjector-79bf859fb7   1         1         1       16m
replicaset.apps/cert-manager-webhook-599749fbd5      1         1         1       16m
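
If you end up installing several packages, checking each one by eye gets tedious. The following is a minimal sketch (not part of the tanzu CLI) of a shell helper that reads kubectl get packageinstalls output and prints any row that is not yet showing Reconcile succeeded. The function name unreconciled is my own invention, and it assumes the column layout shown above.

```shell
# Read `kubectl get packageinstalls` output on stdin and print every row whose
# description is not "Reconcile succeeded". Returns 1 if any such row exists.
# The function name is hypothetical; it is not part of kubectl or tanzu.
unreconciled() {
  awk '
    # Skip header rows (with or without -A) and blank lines.
    $1 == "NAME" || $1 == "NAMESPACE" || NF == 0 { next }
    # Anything that has not reconciled is printed and remembered.
    !/Reconcile succeeded/ { print; bad = 1 }
    END { exit (bad ? 1 : 0) }
  '
}
```

You could run it as kubectl get packageinstalls -A | unreconciled and treat a non-zero exit status as "something is still reconciling or has failed".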

Install the Contour package

It’s always a good idea to install Contour early on as it will provide the needed ingress for other packages.

With most packages, you have to define a “data-values” specification that determines much of how the package will be configured. You can read about the available parameters for each package at Installing Tanzu Packages on TKG 2 Clusters on Supervisor. The following was saved as contour-data-values.yaml:

infrastructure_provider: vsphere
namespace: tanzu-system-ingress
contour:
 configFileContents: {}
 useProxyProtocol: false
 replicas: 2
 pspNames: "vmware-system-restricted"
 logLevel: info
envoy:
 service:
   type: LoadBalancer
   annotations: {}
   nodePorts:
     http: null
     https: null
   externalTrafficPolicy: Cluster
   disableWait: false
 hostPorts:
   enable: true
   http: 80
   https: 443
 hostNetwork: false
 terminationGracePeriodSeconds: 300
 logLevel: info
 pspNames: null
certificates:
 duration: 8760h
 renewBefore: 360h

There is very little that is customized here. The most important point is that the envoy service is type LoadBalancer. This means that it will be realized by NSX and accessible from outside of the cluster (very important for ingress).

Note: You can use the imgpkg command (included in the tanzu CLI download) to pull a default data-values yaml file, via a command similar to the following:

imgpkg pull -b $(kubectl -n tkg-system get packages contour.tanzu.vmware.com.1.20.2+vmware.1-tkg.1 -o jsonpath='{.spec.template.spec.fetch[0].imgpkgBundle.image}') -o /tmp/contour

The default file in this example would be located at /tmp/contour/config/values.yaml. You can use this same process to get default data-values yaml files for other packages as well.
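
Since the Package resource name is just the package name and the version joined with a dot, the imgpkg step can be wrapped up for reuse. This is only a sketch under that assumption; the function names package_resource_name and pull_default_values are hypothetical, and the second one needs kubectl access to the cluster and imgpkg on the PATH.

```shell
# Build the Package resource name used by kubectl: <package-name>.<version>.
# (Hypothetical helper; the naming pattern is taken from the listing in this post.)
package_resource_name() {
  printf '%s.%s\n' "$1" "$2"
}

# Pull the default data-values bundle for a package into a directory.
# $1 = package name, $2 = package version, $3 = output directory.
pull_default_values() {
  image=$(kubectl -n tkg-system get packages "$(package_resource_name "$1" "$2")" \
    -o jsonpath='{.spec.template.spec.fetch[0].imgpkgBundle.image}') || return 1
  imgpkg pull -b "$image" -o "$3"
}
```

For example, pull_default_values contour.tanzu.vmware.com 1.20.2+vmware.1-tkg.1 /tmp/contour should be equivalent to the single command shown above.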

Just as was done for cert-manager, a unique namespace should be created for the Contour package.

kubectl create ns tanzu-system-ingress

You will also need to specify a version of Contour to deploy. From the earlier kubectl -n tkg-system get packages output, you can see the available Contour versions.

contour.tanzu.vmware.com.1.17.1+vmware.1-tkg.1                       contour.tanzu.vmware.com                       1.17.1+vmware.1-tkg.1   2m31s
contour.tanzu.vmware.com.1.17.2+vmware.1-tkg.2                       contour.tanzu.vmware.com                       1.17.2+vmware.1-tkg.2   2m31s
contour.tanzu.vmware.com.1.17.2+vmware.1-tkg.3                       contour.tanzu.vmware.com                       1.17.2+vmware.1-tkg.3   2m31s
contour.tanzu.vmware.com.1.18.2+vmware.1-tkg.1                       contour.tanzu.vmware.com                       1.18.2+vmware.1-tkg.1   2m31s
contour.tanzu.vmware.com.1.20.2+vmware.1-tkg.1                       contour.tanzu.vmware.com                       1.20.2+vmware.1-tkg.1   2m31s

The latest version is 1.20.2 so we’ll install that.

tanzu package install contour -p contour.tanzu.vmware.com -v 1.20.2+vmware.1-tkg.1 --values-file contour-data-values.yaml -n tanzu-system-ingress

The output is skipped for this and the rest of the packages as it’s very similar to the output for the cert-manager package. When the installation is done, you should see output similar to the following:

10:49:47AM: Deploy succeeded

You should also validate that the package installation was successful and the necessary objects were created.

tanzu package installed list -n tanzu-system-ingress
 
  NAME     PACKAGE-NAME              PACKAGE-VERSION        STATUS
  contour  contour.tanzu.vmware.com  1.20.2+vmware.1-tkg.1  Reconcile succeeded

kubectl -n tanzu-system-ingress get all
NAME                           READY   STATUS    RESTARTS   AGE
pod/contour-5c6cb8f577-pg5j6   1/1     Running   0          3m5s
pod/contour-5c6cb8f577-vq5dt   1/1     Running   0          3m5s
pod/envoy-fl4rx                2/2     Running   0          3m6s
pod/envoy-tltqf                2/2     Running   0          3m6s
 
NAME              TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/contour   ClusterIP      10.96.189.3     <none>        8001/TCP                     3m5s
service/envoy     LoadBalancer   10.101.72.152   10.40.14.70   80:30412/TCP,443:30108/TCP   3m5s
 
NAME                   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/envoy   2         2         2       2            2           <none>          3m7s
 
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/contour   2/2     2            2           3m5s
 
NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/contour-5c6cb8f577   2         2         2       3m5s

You can see that the envoy service has an external IP address of 10.40.14.70, clearly in the range specified for ingress during Workload Management configuration.
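
If you would rather not eyeball it, a check like this can be scripted. The sketch below is plain shell arithmetic and nothing Tanzu-specific. The function names are hypothetical, and the CIDR in the usage note is only a placeholder since the actual ingress range depends on your Workload Management configuration.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed when the address in $1 falls inside the CIDR block in $2.
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  prefix=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}
```

The address itself can be read with kubectl -n tanzu-system-ingress get svc envoy -o jsonpath='{.status.loadBalancer.ingress[0].ip}' and then passed in, for example ip_in_cidr 10.40.14.70 10.40.14.64/27, where 10.40.14.64/27 stands in for whatever ingress range you configured.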

In NSX, you can see a new load balancer for this address:

Looking at the server pool for this load balancer, you can see that there are two members, the envoy pods.

Since the envoy pods take on the IP addresses of the worker nodes on which they run, these IPs can be seen in the cluster by querying the nodes.

kubectl get nodes -o wide
NAME                                                              STATUS   ROLES                  AGE    VERSION            INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                 KERNEL-VERSION       CONTAINER-RUNTIME
tkg2-cluster-1-5vs4x-lflzx                                        Ready    control-plane,master   125m   v1.23.8+vmware.2   10.244.0.114   <none>        VMware Photon OS/Linux   4.19.256-4.ph3-esx   containerd://1.6.6
tkg2-cluster-1-tkg2-cluster-1-nodepool-1-6hgnv-685f8bb498-kjjsk   Ready    <none>                 121m   v1.23.8+vmware.2   10.244.0.116   <none>        VMware Photon OS/Linux   4.19.256-4.ph3-esx   containerd://1.6.6
tkg2-cluster-1-tkg2-cluster-1-nodepool-1-6hgnv-685f8bb498-mx58w   Ready    <none>                 121m   v1.23.8+vmware.2   10.244.0.115   <none>        VMware Photon OS/Linux   4.19.256-4.ph3-esx   containerd://1.6.6

Install the Service Discovery package

The service discovery package (external-DNS) is a great one to install as it allows DNS records to be created automatically for systems with an ingress component (or an httpproxy component in the case of Contour). External-DNS is an open source project that has been included with TKG since version 1.3. It synchronizes exposed Kubernetes Services and Ingresses with DNS providers, and vSphere with Tanzu can use it to assist with service discovery by automatically creating DNS records for httpproxy resources created via Contour. AWS (Route53), Azure, and RFC2136 (BIND) are currently supported, but we’re using RFC2136 since this is what is needed to work with Microsoft DNS.

You can read more about this specific package in my earlier post, How to configure external-dns with Microsoft DNS in TKG 1.3 (plus Harbor and Contour).

Create the namespace:

kubectl create ns tanzu-system-service-discovery

A configmap is needed that defines the kerberos configuration that the service discovery package will use…this is fairly generic and the only things customized were the domain/realm name and the kdc/admin_server addresses:

apiVersion: v1
kind: ConfigMap
metadata:
  name: krb5.conf
  namespace: tanzu-system-service-discovery
data:
  krb5.conf: |
    [logging]
    default = FILE:/var/log/krb5libs.log
    kdc = FILE:/var/log/krb5kdc.log
    admin_server = FILE:/var/log/kadmind.log
 
    [libdefaults]
    dns_lookup_realm = false
    ticket_lifetime = 24h
    renew_lifetime = 7d
    forwardable = true
    rdns = false
    pkinit_anchors = /etc/pki/tls/certs/ca-bundle.crt
    default_ccache_name = KEYRING:persistent:%{uid}
 
    default_realm = CORP.VMW
 
    [realms]
    CORP.VMW = {
      kdc = controlcenter.corp.vmw
      admin_server = controlcenter.corp.vmw
    }
 
    [domain_realm]
    corp.vmw = CORP.VMW
    .corp.vmw = CORP.VMW

Create the ConfigMap:

kubectl apply -f external-dns-krb5-cm.yaml

A data-values specification will also be needed for the service discovery package (external-dns-data-values.yaml in this example).

namespace: tanzu-system-service-discovery
deployment:
  args:
    - --provider=rfc2136
    - --rfc2136-host=controlcenter.corp.vmw
    - --rfc2136-port=53
    - --rfc2136-zone=corp.vmw
    - --rfc2136-gss-tsig
    - --rfc2136-kerberos-realm=corp.vmw
    - --rfc2136-kerberos-username=administrator
    - --rfc2136-kerberos-password=VMware1!
    - --rfc2136-tsig-axfr
    - --source=service
    - --source=ingress
    - --source=contour-httpproxy
    - --domain-filter=corp.vmw
    - --txt-owner-id=k8s
    - --txt-prefix=external-dns-
    - --registry=txt
    - --policy=upsert-only
  env: []
  securityContext: {}
  volumeMounts:
  - name: kerberos-config-volume
    mountPath: /etc/krb5.conf
    subPath: krb5.conf
  volumes:
  - name: kerberos-config-volume
    configMap:
      defaultMode: 420
      name: krb5.conf

I’ve made a number of changes to this file to allow for external DNS to communicate with my Microsoft DNS implementation.

  • set the rfc2136-host value to controlcenter.corp.vmw, the FQDN of my AD/DNS server
  • set the rfc2136-zone, rfc2136-kerberos-realm, and domain-filter values to corp.vmw, the DNS zone name
  • set the rfc2136-kerberos-username value to administrator, the admin user in my AD domain
  • set the rfc2136-kerberos-password value to VMware1!, the administrator user password in my AD domain
  • added contour-httpproxy as a source value, telling external DNS to watch httpproxy resources in addition to service and ingress objects
  • specified the volumeMounts and volumes stanzas to point to the krb5 configMap created earlier

The available versions were shown earlier:

external-dns.tanzu.vmware.com.0.10.0+vmware.1-tkg.1                  external-dns.tanzu.vmware.com                  0.10.0+vmware.1-tkg.1   2m31s
external-dns.tanzu.vmware.com.0.10.0+vmware.1-tkg.2                  external-dns.tanzu.vmware.com                  0.10.0+vmware.1-tkg.2   2m31s
external-dns.tanzu.vmware.com.0.11.0+vmware.1-tkg.2                  external-dns.tanzu.vmware.com                  0.11.0+vmware.1-tkg.2   2m31s
external-dns.tanzu.vmware.com.0.8.0+vmware.1-tkg.1                   external-dns.tanzu.vmware.com                  0.8.0+vmware.1-tkg.1    2m31s

The package can be installed with the latest version, 0.11.0.

tanzu package install external-dns -p external-dns.tanzu.vmware.com -n tanzu-system-service-discovery -v 0.11.0+vmware.1-tkg.2 --values-file external-dns-data-values.yaml
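
One thing to watch for when picking the latest version: the listing is sorted as text, so 0.8.0 shows up after 0.11.0. A version-aware sort avoids a mistake here. This is a rough sketch assuming GNU sort (for the -V flag) and the column layout of the kubectl -n tkg-system get packages output shown above. The function names are hypothetical.

```shell
# From `kubectl -n tkg-system get packages` output on stdin, print the version
# column (third field) for the package metadata name given in $1.
versions_for() {
  awk -v pkg="$1" '$2 == pkg { print $3 }'
}

# Print the highest version from a list of versions on stdin.
# Assumes GNU sort, whose -V flag compares numeric components as numbers.
latest_version() {
  sort -V | tail -n 1
}
```

Chained together, this would look like kubectl -n tkg-system get packages | versions_for external-dns.tanzu.vmware.com | latest_version.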

Check that the components have been successfully installed.

tanzu package installed list -n tanzu-system-service-discovery
 
  NAME          PACKAGE-NAME                   PACKAGE-VERSION        STATUS
  external-dns  external-dns.tanzu.vmware.com  0.11.0+vmware.1-tkg.2  Reconcile succeeded

kubectl -n tanzu-system-service-discovery get all
NAME                               READY   STATUS    RESTARTS   AGE
pod/external-dns-77d947745-tcjz9   1/1     Running   0          63s
 
NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/external-dns   1/1     1            1           63s
 
NAME                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/external-dns-77d947745   1         1         1       63s

You’ll be able to check that this package is functional when you install other packages that have an ingress.
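
If you don’t want to wait for another package, a throwaway HTTPProxy resource should be enough to exercise external-DNS. The following is only a sketch: the function name, the dns-test resource name, the default namespace and the values passed in are all hypothetical, and the backing Service must already exist in your cluster.

```shell
# Write a minimal Contour HTTPProxy manifest to the file given as $1.
# $2 is the FQDN to publish and $3 is the name of an existing Service;
# both are illustrative values, not anything created earlier in this post.
write_test_httpproxy() {
  cat > "$1" <<EOF
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: dns-test
  namespace: default
spec:
  virtualhost:
    fqdn: $2
  routes:
  - services:
    - name: $3
      port: 80
EOF
}
```

After running something like write_test_httpproxy httpproxy-test.yaml test.corp.vmw my-service and kubectl apply -f httpproxy-test.yaml, the external-dns logs (kubectl -n tanzu-system-service-discovery logs deployment/external-dns) and an nslookup of the FQDN should show whether a record was created.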

Install the Prometheus package

Create the namespace:

kubectl create ns tanzu-system-monitoring

Get the desired version:

prometheus.tanzu.vmware.com.2.27.0+vmware.1-tkg.1                    prometheus.tanzu.vmware.com                    2.27.0+vmware.1-tkg.1   2m29s
prometheus.tanzu.vmware.com.2.27.0+vmware.2-tkg.1                    prometheus.tanzu.vmware.com                    2.27.0+vmware.2-tkg.1   2m29s
prometheus.tanzu.vmware.com.2.36.2+vmware.1-tkg.1                    prometheus.tanzu.vmware.com                    2.36.2+vmware.1-tkg.1   2m29s

We’ll use 2.36.2, the latest version.

Create the data-values specification.

prometheus-data-values.yaml

alertmanager:
  config:
    alertmanager_yml: |
      global: {}
      receivers:
      - name: default-receiver
      templates:
      - '/etc/alertmanager/templates/*.tmpl'
      route:
        group_interval: 5m
        group_wait: 10s
        receiver: default-receiver
        repeat_interval: 3h
  deployment:
    replicas: 1
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    updateStrategy: Recreate
  pvc:
    accessMode: ReadWriteOnce
    storage: 2Gi
    storageClassName: k8s-policy
  service:
    port: 80
    targetPort: 9093
    type: ClusterIP
ingress:
  alertmanager_prefix: /alertmanager/
  alertmanagerServicePort: 80
  enabled: true
  prometheus_prefix: /
  prometheusServicePort: 80
  tlsCertificate:
    ca.crt: ca
    tls.crt:  |
      -----BEGIN CERTIFICATE-----
      MIIHxTCCBa2gAwIBAgITIgAAAAQnSpH7QfxTKAAAAAAABDANBgkqhkiG9w0BAQsF
      ADBMMRMwEQYKCZImiZPyLGQBGRYDdm13MRQwEgYKCZImiZPyLGQBGRYEY29ycDEf
      MB0GA1UEAxMWY29udHJvbGNlbnRlci5jb3JwLnZtdzAeFw0yMjAzMjExOTQ1MzNa
      Fw0zMjAzMTgxOTQ1MzNaMG0xCzAJBgNVBAYTAlVTMRMwEQYDVQQIEwpDYWxpZm9y
      bmlhMRIwEAYDVQQHEwlQYWxvIEFsdG8xDzANBgNVBAoTBlZNd2FyZTEPMA0GA1UE
      CxMGVk13YXJlMRMwEQYDVQQDDAoqLmNvcnAudm13MIICIjANBgkqhkiG9w0BAQEF
      AAOCAg8AMIICCgKCAgEAzhh0/CNdiskBkkzZzrhBbKiQEzekxnw0uCgLCpkmQzIe
      m98jgt7LgS2dQFf5vABpTqIX11t4EoY/1gmsrz1IjBv26VrIjSLFmGppRgh+7NPY
      LbXVB6AbrgtlC+a89d14h37l30khZ62H2KTvmUojihUSeYZfi50IMsW/UPr8Z94V
      xhI2bTy/PDL/6sd+qmTpqaZplooEQBn18D0VqJhbTCK+kbdxgM01/8PcZoTcjAEN
      XcUR64p3H0KG0QeQn8r81nOsEihNBUCsrc6PiqPDRGtOJ+BsUhQy+BVmk/dIwnqM
      PJN7cTElz3h+ScJdUr0vPBfZjkI/glRk4GugFomg/qdHvt0OVsLG9efBbiZJo8/X
      PWEcRppCYE+HexbiVwKCML2i75nsELdH4+UVmfrywhAyZf4hvlz2/w5LQudPrfHz
      qPZ76NLmUkkK7buPGsktbtMS4n8fUGsr23Y4UxvZNRrvb6aPp3gSylDrzsV4dTQb
      PiLxzrZymiYw8tReMchCXbb9LGoN50xYbq2nAyUPhxnH0HOuw415shfkLTTGx1ms
      Ht+NeTUbavSh4dBa6EKFfKbYQ9Fukw3XkeMClX0Mdo/EQemivLGO+hIxoDIexkQs
      kXU2y/1uRhCOW5Rcsjr3EgwXGHENr8WUIut8wG1QoW8lg0ma73ErIY37dtFORGcC
      AwEAAaOCAn0wggJ5MA4GA1UdDwEB/wQEAwIFoDATBgNVHSUEDDAKBggrBgEFBQcD
      ATAVBgNVHREEDjAMggoqLmNvcnAudm13MB0GA1UdDgQWBBQHTiEY3UJ11GdfNS9U
      iba5lC0f3zAfBgNVHSMEGDAWgBTGg/OWUAlp52UyrZaNhYHgXyQZxzCB1wYDVR0f
      BIHPMIHMMIHJoIHGoIHDhoHAbGRhcDovLy9DTj1jb250cm9sY2VudGVyLmNvcnAu
      dm13LENOPWNvbnRyb2xjZW50ZXIsQ049Q0RQLENOPVB1YmxpYyUyMEtleSUyMFNl
      cnZpY2VzLENOPVNlcnZpY2VzLENOPUNvbmZpZ3VyYXRpb24sREM9Y29ycCxEQz12
      bXc/Y2VydGlmaWNhdGVSZXZvY2F0aW9uTGlzdD9iYXNlP29iamVjdENsYXNzPWNS
      TERpc3RyaWJ1dGlvblBvaW50MIHFBggrBgEFBQcBAQSBuDCBtTCBsgYIKwYBBQUH
      MAKGgaVsZGFwOi8vL0NOPWNvbnRyb2xjZW50ZXIuY29ycC52bXcsQ049QUlBLENO
      PVB1YmxpYyUyMEtleSUyMFNlcnZpY2VzLENOPVNlcnZpY2VzLENOPUNvbmZpZ3Vy
      YXRpb24sREM9Y29ycCxEQz12bXc/Y0FDZXJ0aWZpY2F0ZT9iYXNlP29iamVjdENs
      YXNzPWNlcnRpZmljYXRpb25BdXRob3JpdHkwPAYJKwYBBAGCNxUHBC8wLQYlKwYB
      BAGCNxUIgfaGVYGazCSBvY8jgZr5P5S9VzyB26Mxg6eFPwIBZAIBAjAbBgkrBgEE
      AYI3FQoEDjAMMAoGCCsGAQUFBwMBMA0GCSqGSIb3DQEBCwUAA4ICAQCCxqhinZTi
      NWHGvTgms+KdmNfIFR/R6GTFF8bO/bgfRw5pVeSDurEechjlO2hRDWOHn4H2+fIQ
      vN4I6cjEbVnDBbVrRkbCfNLD1Wjj/zz65pZ8PmjQUiFl9L4HxaUH5sF/3Jrylsu0
      M2oITByEfl5WpfC0oyB2/9nKYLPLOK5j2OicHxEino1RPAPdOk5gU5c6+Ed74kVh
      KtgK3r8Lc4/IhwfmSDgGl1DmEkzv/u+0bQTOOH1fVKSl3p9+YieADc3s2SJWFF0F
      mKElJHlZEeAg+MI16zNbQhowZE2SE+b9VTGK9KDkCmYGRjpHc61onQNTzIH5rDFx
      /0aBOGp3+tdA+QEI8VgpQlaa0BtsKyY3l/DAg7I42x4Zv9ta7vZUe81v11PqdSJQ
      v7NOriJkRNPneErH2QNsbi00p+TlUFzOl85AEtlB722/fDHbDGSxngDAhImCv4uC
      xPnwVRo94AI0A6ol0FEct2z2wgehQKQbwcskNpOE7wryd+/yqrm+Z5o0crfaPPwX
      uEwDtQjCRM+8wDWBcQnvyOJe374nFLpGpX8tZLjOg2U0wwdsfAxtYGZUAN0V0xm0
      YYsIjp7/f+Pk1DjzWx8JIAbzItKLucDreAmmDXqk+DrBP9LYqtmjB0n7nSErgK8G
      sA3kGCJdOkI0kgF10gsinaouG2jVlwNOsw==
      -----END CERTIFICATE-----
    tls.key: |
      -----BEGIN PRIVATE KEY-----
      MIIJRAIBADANBgkqhkiG9w0BAQEFAASCCS4wggkqAgEAAoICAQDOGHT8I12KyQGS
      TNnOuEFsqJATN6TGfDS4KAsKmSZDMh6b3yOC3suBLZ1AV/m8AGlOohfXW3gShj/W
      CayvPUiMG/bpWsiNIsWYamlGCH7s09gttdUHoBuuC2UL5rz13XiHfuXfSSFnrYfY
      pO+ZSiOKFRJ5hl+LnQgyxb9Q+vxn3hXGEjZtPL88Mv/qx36qZOmppmmWigRAGfXw
      PRWomFtMIr6Rt3GAzTX/w9xmhNyMAQ1dxRHrincfQobRB5CfyvzWc6wSKE0FQKyt
      zo+Ko8NEa04n4GxSFDL4FWaT90jCeow8k3txMSXPeH5Jwl1SvS88F9mOQj+CVGTg
      a6AWiaD+p0e+3Q5Wwsb158FuJkmjz9c9YRxGmkJgT4d7FuJXAoIwvaLvmewQt0fj
      5RWZ+vLCEDJl/iG+XPb/DktC50+t8fOo9nvo0uZSSQrtu48ayS1u0xLifx9Qayvb
      djhTG9k1Gu9vpo+neBLKUOvOxXh1NBs+IvHOtnKaJjDy1F4xyEJdtv0sag3nTFhu
      racDJQ+HGcfQc67DjXmyF+QtNMbHWawe3415NRtq9KHh0FroQoV8pthD0W6TDdeR
      4wKVfQx2j8RB6aK8sY76EjGgMh7GRCyRdTbL/W5GEI5blFyyOvcSDBcYcQ2vxZQi
      63zAbVChbyWDSZrvcSshjft20U5EZwIDAQABAoICAHmP83DFa2dxKHwi2FYWWIC+
      7Dxplcd9e5skA1889lSsO2G1PDz1LRQE07wgKC28EGFROr7MNQa4KO8WxcSXYTND
      S2BZK/ITkHlWSsIEQNlwGxLbLcxRpAIEtpVOhCaBe5ZwQyZw/EMrF/WxU6IXGN9Z
      jowftjujZDKOcUpSwI6DcFRkabYFHsdjTZAuG4hl/W0TuzQQNHGa3nXVkfDf7Pn7
      hGxux4QxhqhV3qqZs3zhIgEtPGSyR5EorFyfGa8nC/tyPwx2uPdgLnpWXFRqQ8MX
      iAH9XecMAwRRmy+rrD8KCa2xUB5z3tmBOPxIqMMk07eeWbSPXuaA4P9+e+7PPyXl
      BEZ/6Y5wICrd2WrzJQOUrLQpbiSRZi7tTxUxYCKWXmB93RDjG8yl5qXZZ5G173PK
      hGHH8KyUWD6tW0ytFEW714IaqJN8ffTCkThnveWKOd/jW3/ffxxShMj9+FnnUEfp
      dfQCq3rZEafoJX/A98TOibq/f5Rogky3D3azhs6gz/+NBterX6+7U2OnSK4cPAGr
      KPn0HJT99gnojkFO32L0N8QriXSdY5gpTw+ZGzgOu78WO5JwvCd+LENNCMfIQckh
      Jm/GM+0sTG03bhlQwyfJeV43ETyarGfiRuwFzgbvH24Ie6iSPSrIu5oqFfDp2+El
      EzAC9kqo4rmCoUQcX5ABAoIBAQD/egDF3mu7yVXQx8EFSKAsTi/Bq6z4gmBtQdN3
      g4MbdlfZ5/rrQZ1pbm2VlqLYz0hab4V0rtJKFKAg4UdSxZTnuR3tZCgSjNd4wy8b
      Meuwyls/IMKPmcrKmBWSxesQRs/wMlKPWYv7SUgJhgHqiPuDP74s1FaYdnPeVjKO
      BDABdzk3MJaU/FEPqHs+0RhjhHOaVm4v2T+/AYIFZ0Gvm+uSlhjlfse83Ui2WV5b
      0DUWS0Gl80UCkPloDWpDa21bcpmcYqrBUnppYz4XPvzNjnYZflubJU7ZGoEB3DNw
      u8IlVidH2/c86eGKvCNihs01f3Oxg8XpEB9W6mFjhDVE7FnJAoIBAQDOhI2+tcG+
      oQybL3IhvvTYnpxPh9GU9qxCEsb71qR+xmXXdIAf9ZPtuL+iet2B60CICcn94gtn
      ntqm4mSjJ9qg7CVnwCLWRhSqGE7Ha+4xnAQgRIpaVhRYThS0ZEL2GbYpezkc78Ql
      ZKDWJLAz/4sGzQcbBt+xVM2962bTFVSWRdeXfe6BoRbn5f9V9ZA7wSE5porC1k1h
      C6HiApOXmZMryILGrrE2YxO/S4cA+TVwH2/dU8W4Ti50EcYvMUwLf3hE3hZwIUvR
      HSecQBcb5fp3mZpnvaZjNTx7vKh6jjRXfmSRhI2dbSZZh6k+5NFP5C28A4Hkhuh6
      lpAGQxQ2h8SvAoIBAQDUTtBbn3aKbUvaoFZBDNTHXQaE/SVWtApsYZraJDmNVfC2
      DvnQDgxBtNpuyOt2H/Rx62HN0QbDN5bHHFAIclhHpehAAs7mc5MRMatw/zBuEAx6
      TsBBVD5Z1L+A5Odu9FoTs842gOU6o/CwsWPgQ4w4y31Ahgmc1DuAVsPWj5ZRcYHj
      4oYRNAotaAdb8apB8a2cYh1ZuEIoeplR4jiNNpczj3cLKSvWQVMO7v/ibwnfCBV7
      UspT0qThmtxnQNx1daxAcSKUW/WMpUPRT7AJJ03v67k3Gm8HLuZs5FD/a5lxK8Kj
      DiLNxVOA1s7VL09UGSHNMMQE5jgVI9xhNlqKd5w5AoIBAQCe0y639szUMMOjLbAW
      5+ciGYmZWJkEeVktT4ec8wx7O1Xjh4NqENH9x1IKQXfNjQGKHg0spgWjYXZDVmWT
      XPk1PafezNN9+1O1JRChKg58NMKvlkbZBs6KwzIFMf6VilygNlZMPNGa+HMBfiHN
      O8DOMCxAyt6KYPACGeJwgD0XfQs7ROyC4ULegfIHR93vNq64ya55/ZpxAiMz0Et2
      EfQvffullYBQlY4AVrOzOfWxD1xW2TB8eBQdy/WhIcacKSJzxGF5RwIqBsQJ1Phw
      ykQAay9mjWJDdhPYDdV8u5ThnSD3EPxgkCsoO78b0ZpwWMobiI8DFAYDEXwedMQ8
      09mdAoIBAQCxsvbB38jiPPqsF77jxWzogVdPR6xO6oLkVcvu4KI1Fq0wFdDH4eBJ
      gOxV8e+wsQBVO1XXP4i3UhbCq7gPgh+YJWWy66RKiuwpIPf2yvGbbUgBpp73MxOx
      ycerjh7LRS56gAAJ6d5hyYf293E5QEAOZh+Niuz3m1xMytdvG7IefX3TMe0GMgZh
      I5djpaHvjgB6qHeVsLuUWjC2FPXtrz2STay08tq/Pc5g+57bTfOyHo0BZ3C/uhZa
      l1NzswracGQIzo03zk/X3Z6P2YOea4BkZ0Iwh34wOHJnTkfEeSx6y+oSFMcFRthT
      yfFCZUk/sVCc/C1a4VigczXftUGiRrTR
      -----END PRIVATE KEY-----
    ca.crt: |
      -----BEGIN CERTIFICATE-----
      MIIFczCCA1ugAwIBAgIQTYJITQ3SZ4BBS9UzXfJIuTANBgkqhkiG9w0BAQsFADBM
      MRMwEQYKCZImiZPyLGQBGRYDdm13MRQwEgYKCZImiZPyLGQBGRYEY29ycDEfMB0G
      A1UEAxMWY29udHJvbGNlbnRlci5jb3JwLnZtdzAeFw0yMjAzMjExOTE3MjhaFw0z
      NzAzMjExOTI3MjNaMEwxEzARBgoJkiaJk/IsZAEZFgN2bXcxFDASBgoJkiaJk/Is
      ZAEZFgRjb3JwMR8wHQYDVQQDExZjb250cm9sY2VudGVyLmNvcnAudm13MIICIjAN
      BgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEA2OYKxckOjhgufWu1YEnatvJ1M127
      gwPbFNj11/dICXaPe+mjN1Hce0PiS2QaaeAe8kH+mOKRa2JjaGdXr6rOiB80KZOR
      uw0GzSJyL5w7ewR+NJf31YO62BD/mt3sHeMnCXmSBxOQvb0nGkhTr1y+rDpvxJ87
      zNczgfN54to6S379wjOsC4bkHLnMJ5EtJG78pPqX1+1wcVOURNJ6y9BcejLnoy/y
      CFpXKOVxKHzy2nnsitAuBb+hD+Jxw8/jFQUhxH0VlgyfXCQdegasSA9RHtZtfpVs
      hshisjkSlvQmbsEknBZrAfBVIYidwt3w050jVhiUs5Ql6vDotY6Gqtzzgq0obv6P
      7E9NPej3BzhPSIUyqnpf57UWI4zUiRJvbSu/J2MCBKHwYfzke1cnvLA7viDEdB9+
      /Htk9aG9/1B6ddDfafrcSOWtkTfHWYLv21o3Uwoh9W5OpK9JikZu/PqnpZkUi+2C
      L+WCww/BS1yhQwVif6PqUMeSLz3jtq3w6R/ruUMlO+0E5//bskDT6QGxBgcvMF9n
      Dl+u0uqHKOdiUvOXBtF139HKUrZsq0m3WPoel2/p+cVVJYsyJG/rRpeh1g/X0cB3
      9EuTjX6vnrT+IS8ZfAaoHzpmgh1vGu2r2xgPq2E8x4ji9FGV8YTjAs60Nw7YxKUW
      Wgj+YNpxP2SxFqUCAwEAAaNRME8wCwYDVR0PBAQDAgGGMA8GA1UdEwEB/wQFMAMB
      Af8wHQYDVR0OBBYEFMaD85ZQCWnnZTKtlo2FgeBfJBnHMBAGCSsGAQQBgjcVAQQD
      AgEAMA0GCSqGSIb3DQEBCwUAA4ICAQAutXwOtsmYcbj/bs3Mydx0Di9m+6UVTEZd
      ORRrTus/BL/TNryO7zo2beczGPK26MwqhmUZiaF61jRb36kxmFPVx2uV2np4LbQj
      5MrxtPzf2XXy4b7ADqQpLgu4rR3mZiXGmzUoV17hmAhyfSU1qm4FssXGK2ypWsQs
      BwsKX4DsIijJJZbXwKFaauq0LtnkgeGWdoEFFWAH0yJWPbz9h+ovlCxq0DBiG00l
      brnY90sqpoiWTxMKNCXDDhNjvtxO3kQIDQVvbNMCEbmYG+RrWQHtvufw97RK/cTL
      9dKFSblIIizMINVwM/gqtlVVvWP1EFaUy0xG5bvOO+SCe+TlA7rz4/RORqqE5Ugg
      7F8fWz+o6BM/qf/Kwh+WN42dyR1rOsFqEVNamZLjrAzgwjQ/nquRRMl2cK6yg6Fq
      d0O42wwYPpLUEFv4xe4a3kpRvvhshNkzR4IacbmaUlnzmlewoFXVueEblviBHJoV
      1OUC6qfLkCjfCEv470Kr5vDe5Y/l/7j8EYj7a/wa2++kq+7xd+bj/DDed85fm3Yk
      dhfp7bGXKm4KbPLzkSpiYWbE+EbArLtIk62exjcJvJPdoxMTxgbdelzl/snPLrdg
      w0oGuTTBfxSMKs767N3G1q5tz0mwFpIqIQtXUSmaJ+9p7IkpWcThLnyYYo1IpWm/
      ZHtjzZMQVA==
      -----END CERTIFICATE-----
  virtual_host_fqdn: prometheus.corp.vmw
kube_state_metrics:
  deployment:
    replicas: 1
  service:
    port: 80
    targetPort: 8080
    telemetryPort: 81
    telemetryTargetPort: 8081
    type: ClusterIP
namespace: tanzu-system-monitoring
node_exporter:
  daemonset:
    hostNetwork: false
    updatestrategy: RollingUpdate
  service:
    port: 9100
    targetPort: 9100
    type: ClusterIP
prometheus:
  config:
    alerting_rules_yml: |
      {}
    alerts_yml: |
      {}
    prometheus_yml: |
      global:
        evaluation_interval: 1m
        scrape_interval: 1m
        scrape_timeout: 10s
      rule_files:
      - /etc/config/alerting_rules.yml
      - /etc/config/recording_rules.yml
      - /etc/config/alerts
      - /etc/config/rules
      scrape_configs:
      - job_name: 'prometheus'
        scrape_interval: 5s
        static_configs:
        - targets: ['localhost:9090']
      - job_name: 'kube-state-metrics'
        static_configs:
        - targets: ['prometheus-kube-state-metrics.prometheus.svc.cluster.local:8080']
 
      - job_name: 'node-exporter'
        static_configs:
        - targets: ['prometheus-node-exporter.prometheus.svc.cluster.local:9100']
 
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name
      - job_name: kubernetes-nodes-cadvisor
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - replacement: kubernetes.default.svc:443
          target_label: __address__
        - regex: (.+)
          replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor
          source_labels:
          - __meta_kubernetes_node_name
          target_label: __metrics_path__
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      - job_name: kubernetes-apiservers
        kubernetes_sd_configs:
        - role: endpoints
        relabel_configs:
        - action: keep
          regex: default;kubernetes;https
          source_labels:
          - __meta_kubernetes_namespace
          - __meta_kubernetes_service_name
          - __meta_kubernetes_endpoint_port_name
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      alerting:
        alertmanagers:
        - scheme: http
          static_configs:
          - targets:
            - alertmanager.prometheus.svc:80
        - kubernetes_sd_configs:
            - role: pod
          relabel_configs:
          - source_labels: [__meta_kubernetes_namespace]
            regex: default
            action: keep
          - source_labels: [__meta_kubernetes_pod_label_app]
            regex: prometheus
            action: keep
          - source_labels: [__meta_kubernetes_pod_label_component]
            regex: alertmanager
            action: keep
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_probe]
            regex: .*
            action: keep
          - source_labels: [__meta_kubernetes_pod_container_port_number]
            regex:
            action: drop
    recording_rules_yml: |
      groups:
        - name: kube-apiserver.rules
          interval: 3m
          rules:
          - expr: |2
              (
                (
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[1d]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[1d]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[1d]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[1d]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[1d]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[1d]))
            labels:
              verb: read
            record: apiserver_request:burnrate1d
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[1h]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[1h]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[1h]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[1h]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[1h]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[1h]))
            labels:
              verb: read
            record: apiserver_request:burnrate1h
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[2h]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[2h]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[2h]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[2h]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[2h]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[2h]))
            labels:
              verb: read
            record: apiserver_request:burnrate2h
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[30m]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[30m]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[30m]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[30m]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[30m]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[30m]))
            labels:
              verb: read
            record: apiserver_request:burnrate30m
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[3d]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[3d]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[3d]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[3d]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[3d]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[3d]))
            labels:
              verb: read
            record: apiserver_request:burnrate3d
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[5m]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[5m]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[5m]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[5m]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[5m]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[5m]))
            labels:
              verb: read
            record: apiserver_request:burnrate5m
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[6h]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[6h]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[6h]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[6h]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[6h]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[6h]))
            labels:
              verb: read
            record: apiserver_request:burnrate6h
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[1d]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[1d]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[1d]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[1d]))
            labels:
              verb: write
            record: apiserver_request:burnrate1d
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[1h]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[1h]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[1h]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[1h]))
            labels:
              verb: write
            record: apiserver_request:burnrate1h
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[2h]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[2h]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[2h]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[2h]))
            labels:
              verb: write
            record: apiserver_request:burnrate2h
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[30m]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[30m]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[30m]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[30m]))
            labels:
              verb: write
            record: apiserver_request:burnrate30m
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[3d]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[3d]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[3d]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[3d]))
            labels:
              verb: write
            record: apiserver_request:burnrate3d
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[5m]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[5m]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[5m]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[5m]))
            labels:
              verb: write
            record: apiserver_request:burnrate5m
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[6h]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[6h]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[6h]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[6h]))
            labels:
              verb: write
            record: apiserver_request:burnrate6h
          - expr: |
              sum by (code,resource) (rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[5m]))
            labels:
              verb: read
            record: code_resource:apiserver_request_total:rate5m
          - expr: |
              sum by (code,resource) (rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[5m]))
            labels:
              verb: write
            record: code_resource:apiserver_request_total:rate5m
          - expr: |
              histogram_quantile(0.99, sum by (le, resource) (rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET"}[5m]))) > 0
            labels:
              quantile: "0.99"
              verb: read
            record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
          - expr: |
              histogram_quantile(0.99, sum by (le, resource) (rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[5m]))) > 0
            labels:
              quantile: "0.99"
              verb: write
            record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
          - expr: |2
              sum(rate(apiserver_request_duration_seconds_sum{subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod)
              /
              sum(rate(apiserver_request_duration_seconds_count{subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod)
            record: cluster:apiserver_request_duration_seconds:mean5m
          - expr: |
              histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod))
            labels:
              quantile: "0.99"
            record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
          - expr: |
              histogram_quantile(0.9, sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod))
            labels:
              quantile: "0.9"
            record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
          - expr: |
              histogram_quantile(0.5, sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod))
            labels:
              quantile: "0.5"
            record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
        - interval: 3m
          name: kube-apiserver-availability.rules
          rules:
          - expr: |2
              1 - (
                (
                  # write too slow
                  sum(increase(apiserver_request_duration_seconds_count{verb=~"POST|PUT|PATCH|DELETE"}[30d]))
                  -
                  sum(increase(apiserver_request_duration_seconds_bucket{verb=~"POST|PUT|PATCH|DELETE",le="1"}[30d]))
                ) +
                (
                  # read too slow
                  sum(increase(apiserver_request_duration_seconds_count{verb=~"LIST|GET"}[30d]))
                  -
                  (
                    (
                      sum(increase(apiserver_request_duration_seconds_bucket{verb=~"LIST|GET",scope=~"resource|",le="0.1"}[30d]))
                      or
                      vector(0)
                    )
                    +
                    sum(increase(apiserver_request_duration_seconds_bucket{verb=~"LIST|GET",scope="namespace",le="0.5"}[30d]))
                    +
                    sum(increase(apiserver_request_duration_seconds_bucket{verb=~"LIST|GET",scope="cluster",le="5"}[30d]))
                  )
                ) +
                # errors
                sum(code:apiserver_request_total:increase30d{code=~"5.."} or vector(0))
              )
              /
              sum(code:apiserver_request_total:increase30d)
            labels:
              verb: all
            record: apiserver_request:availability30d
          - expr: |2
              1 - (
                sum(increase(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[30d]))
                -
                (
                  # too slow
                  (
                    sum(increase(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[30d]))
                    or
                    vector(0)
                  )
                  +
                  sum(increase(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[30d]))
                  +
                  sum(increase(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[30d]))
                )
                +
                # errors
                sum(code:apiserver_request_total:increase30d{verb="read",code=~"5.."} or vector(0))
              )
              /
              sum(code:apiserver_request_total:increase30d{verb="read"})
            labels:
              verb: read
            record: apiserver_request:availability30d
          - expr: |2
              1 - (
                (
                  # too slow
                  sum(increase(apiserver_request_duration_seconds_count{verb=~"POST|PUT|PATCH|DELETE"}[30d]))
                  -
                  sum(increase(apiserver_request_duration_seconds_bucket{verb=~"POST|PUT|PATCH|DELETE",le="1"}[30d]))
                )
                +
                # errors
                sum(code:apiserver_request_total:increase30d{verb="write",code=~"5.."} or vector(0))
              )
              /
              sum(code:apiserver_request_total:increase30d{verb="write"})
            labels:
              verb: write
            record: apiserver_request:availability30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="LIST",code=~"2.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="GET",code=~"2.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="POST",code=~"2.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PUT",code=~"2.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PATCH",code=~"2.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="DELETE",code=~"2.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="LIST",code=~"3.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="GET",code=~"3.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="POST",code=~"3.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PUT",code=~"3.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PATCH",code=~"3.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="DELETE",code=~"3.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="LIST",code=~"4.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="GET",code=~"4.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="POST",code=~"4.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PUT",code=~"4.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PATCH",code=~"4.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="DELETE",code=~"4.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="LIST",code=~"5.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="GET",code=~"5.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="POST",code=~"5.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PUT",code=~"5.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PATCH",code=~"5.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="DELETE",code=~"5.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code) (code_verb:apiserver_request_total:increase30d{verb=~"LIST|GET"})
            labels:
              verb: read
            record: code:apiserver_request_total:increase30d
          - expr: |
              sum by (code) (code_verb:apiserver_request_total:increase30d{verb=~"POST|PUT|PATCH|DELETE"})
            labels:
              verb: write
            record: code:apiserver_request_total:increase30d
    rules_yml: |
      {}
  deployment:
    configmapReload:
      containers:
        args:
          - --volume-dir=/etc/config
          - --webhook-url=http://127.0.0.1:9090/-/reload
    containers:
      args:
        - --storage.tsdb.retention.time=42d
        - --config.file=/etc/config/prometheus.yml
        - --storage.tsdb.path=/data
        - --web.console.libraries=/etc/prometheus/console_libraries
        - --web.console.templates=/etc/prometheus/consoles
        - --web.enable-lifecycle
    replicas: 1
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    updateStrategy: Recreate
  pvc:
    accessMode: ReadWriteOnce
    storage: 20Gi
    storageClassName: k8s-policy
  service:
    port: 80
    targetPort: 9090
    type: ClusterIP
pushgateway:
  deployment:
    replicas: 1
  service:
    port: 9091
    targetPort: 9091
    type: ClusterIP

Some important things to note from this specification:

  • Ingress is enabled (ingress: enabled: true).
  • Ingress routes the path prefix /alertmanager/ (alertmanager_prefix:) to Alertmanager and / (prometheus_prefix:) to Prometheus.
  • The FQDN for Prometheus is prometheus.corp.vmw (virtual_host_fqdn:).
  • A custom certificate is supplied under the ingress section (tls.crt, tls.key, ca.crt). This is the wildcard certificate created for use in my environment (works for anything ending in corp.vmw).
  • The PVC for Alertmanager is 2Gi and will be created under the k8s-policy storageClass.
  • The PVC for Prometheus is 20Gi and will be created under the k8s-policy storageClass.

The Prometheus package can now be installed:

tanzu package install prometheus -p prometheus.tanzu.vmware.com -v 2.36.2+vmware.1-tkg.1 --values-file prometheus-data-values.yaml -n tanzu-system-monitoring
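
If you script the install or want to re-check it later, you can block until the package has reconciled. This is a minimal sketch, assuming the install is backed by a kapp-controller PackageInstall resource named prometheus and that kubectl is pointed at the TKGS cluster. The function name wait_for_package and the 5-minute default timeout are illustrative choices, not part of the tanzu CLI:

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait until a PackageInstall reports ReconcileSucceeded.
# Usage: wait_for_package <name> <namespace> [timeout]
wait_for_package() {
  local name="$1" namespace="$2" timeout="${3:-5m}"
  kubectl wait "packageinstall/${name}" \
    --namespace "${namespace}" \
    --for=condition=ReconcileSucceeded \
    --timeout="${timeout}"
}

# Example (run against the cluster):
#   wait_for_package prometheus tanzu-system-monitoring
```

kubectl wait exits non-zero if the condition is not met before the timeout, so the helper can gate later steps in a script.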

Since this package has PVCs, you will see the corresponding volume-creation activity in the vSphere Client.

Check to see that the necessary components have been successfully deployed:

tanzu package installed list -n tanzu-system-monitoring
 
  NAME        PACKAGE-NAME                 PACKAGE-VERSION        STATUS
  prometheus  prometheus.tanzu.vmware.com  2.36.2+vmware.1-tkg.1  Reconcile succeeded

kubectl -n tanzu-system-monitoring get all
NAME                                                 READY   STATUS    RESTARTS   AGE
pod/alertmanager-d56577ff5-mlq6k                     1/1     Running   0          2m1s
pod/prometheus-kube-state-metrics-7b48c44779-62mk2   1/1     Running   0          2m2s
pod/prometheus-node-exporter-5qtrs                   1/1     Running   0          2m2s
pod/prometheus-node-exporter-bv5q9                   1/1     Running   0          2m2s
pod/prometheus-node-exporter-dpmxz                   1/1     Running   0          2m2s
pod/prometheus-pushgateway-5cc854c88b-b4wmr          1/1     Running   0          119s
pod/prometheus-server-5b6c8b5444-nvxgt               2/2     Running   0          2m1s
 
NAME                                    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
service/alertmanager                    ClusterIP   10.111.1.128   <none>        80/TCP          2m1s
service/prometheus-kube-state-metrics   ClusterIP   None           <none>        80/TCP,81/TCP   2m2s
service/prometheus-node-exporter        ClusterIP   10.109.6.230   <none>        9100/TCP        2m1s
service/prometheus-pushgateway          ClusterIP   10.97.135.94   <none>        9091/TCP        2m2s
service/prometheus-server               ClusterIP   10.102.5.183   <none>        80/TCP          2m
 
NAME                                      DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/prometheus-node-exporter   3         3         3       3            3           <none>          2m2s
 
NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/alertmanager                    1/1     1            1           2m1s
deployment.apps/prometheus-kube-state-metrics   1/1     1            1           2m2s
deployment.apps/prometheus-pushgateway          1/1     1            1           2m
deployment.apps/prometheus-server               1/1     1            1           2m1s
 
NAME                                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/alertmanager-d56577ff5                     1         1         1       2m1s
replicaset.apps/prometheus-kube-state-metrics-7b48c44779   1         1         1       2m2s
replicaset.apps/prometheus-pushgateway-5cc854c88b          1         1         1       2m
replicaset.apps/prometheus-server-5b6c8b5444               1         1         1       2m1s

You can see the 2Gi and 20Gi PVCs created for Alertmanager and Prometheus:

kubectl -n tanzu-system-monitoring get pvc
NAME                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
alertmanager        Bound    pvc-a53f7091-9823-4b70-a9b4-c3d7a1e27a4b   2Gi        RWO            k8s-policy     2m30s
prometheus-server   Bound    pvc-41745d1d-9401-41d7-b44d-ba430ecc5cda   20Gi       RWO            k8s-policy     2m30s
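
With more than a couple of PVCs, it can be easier to list only the ones that are not yet bound. This is a minimal sketch, assuming the default kubectl get pvc table layout shown above (name in the first column, status in the second). The helper name unbound_pvcs is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of PVCs whose STATUS is not "Bound".
# Reads `kubectl get pvc` output (including the header row) on stdin.
unbound_pvcs() {
  # Skip the header (NR > 1) and compare the STATUS column.
  awk 'NR > 1 && $2 != "Bound" { print $1 }'
}

# Example (run against the cluster):
#   kubectl -n tanzu-system-monitoring get pvc | unbound_pvcs
```

No output means every PVC in the namespace is bound.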

And since this package has an ingress configured, you will see an httpproxy resource in the tanzu-system-monitoring namespace:

kubectl -n tanzu-system-monitoring get httpproxy
NAME                   FQDN                  TLS SECRET       STATUS   STATUS DESCRIPTION
prometheus-httpproxy   prometheus.corp.vmw   prometheus-tls   valid    Valid HTTPProxy

To confirm that Service Discovery (external-dns) picked this up and created the necessary record, check your DNS server for a new prometheus.corp.vmw record.

This record uses the same IP address that was seen for the envoy service earlier, 10.40.14.70 (provided by an NSX LB). Point a browser to https://prometheus.corp.vmw to check that the Prometheus package is working, that the configured certificate is in place, that the ingress is working properly, and that the DNS record resolves to the correct address.

Everything looks as it should and there is no certificate warning.
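
The same checks can be made from a shell instead of a browser. This is a minimal sketch, assuming a Linux client that uses the same DNS server and has getent and curl available. The helper name resolves_to is hypothetical, and ca.crt is assumed to be a local copy of the CA certificate that signed the wildcard certificate:

```shell
#!/usr/bin/env bash
# Hypothetical helper: confirm that a hostname resolves to the expected address.
# Usage: resolves_to <fqdn> <expected-ip>
resolves_to() {
  local fqdn="$1" expected="$2" actual
  # getent consults the system resolver; the first field is the address.
  actual="$(getent hosts "${fqdn}" | awk '{ print $1; exit }')"
  [ "${actual}" = "${expected}" ]
}

# Example (run from a machine that uses the same DNS server):
#   resolves_to prometheus.corp.vmw 10.40.14.70 && echo "DNS record OK"
#   # Print only the HTTP status code; curl fails if the certificate
#   # does not validate against the supplied CA.
#   curl --cacert ca.crt -sS -o /dev/null -w '%{http_code}\n' https://prometheus.corp.vmw/
```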

Install the Grafana package

The next package we’ll install is Grafana, which is very similar to Prometheus in that it uses ingress and has a PVC.

Create the namespace.

kubectl create ns tanzu-system-dashboards

Determine the version to use:

grafana.tanzu.vmware.com.7.5.16+vmware.1-tkg.1                       grafana.tanzu.vmware.com                       7.5.16+vmware.1-tkg.1   2m31s
grafana.tanzu.vmware.com.7.5.7+vmware.1-tkg.1                        grafana.tanzu.vmware.com                       7.5.7+vmware.1-tkg.1    2m30s
grafana.tanzu.vmware.com.7.5.7+vmware.2-tkg.1                        grafana.tanzu.vmware.com                       7.5.7+vmware.2-tkg.1    2m30s

Again, we’ll use the latest version, 7.5.16.
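
Picking the latest version by eye is easy with three entries, but it can also be scripted. This is a minimal sketch, assuming the listing has the version in the third column as shown above and that GNU sort (for the -V version sort) is available. The helper name latest_version is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the highest version from a package listing.
# Reads lines on stdin with the version string in the third column.
latest_version() {
  # sort -V compares numeric segments as numbers, so 7.5.16 sorts after 7.5.7.
  awk '{ print $3 }' | sort -V | tail -n 1
}

# Example: pipe the saved listing into the helper
#   latest_version < grafana-versions.txt
```

A plain lexical sort would rank 7.5.16 below 7.5.7, which is why the version-aware sort is used here.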

Create a data-values specification for the Grafana package:

grafana-data-values.yaml

namespace: tanzu-system-dashboards
grafana:
  deployment:
    replicas: 1
    updateStrategy: Recreate
  service:
    type: NodePort
    port: 80
    targetPort: 3000
  config:
    grafana_ini: |
      [analytics]
      check_for_updates = false
      [grafana_net]
      url = https://grafana.com
      [log]
      mode = console
      [paths]
      data = /var/lib/grafana/data
      logs = /var/log/grafana
      plugins = /var/lib/grafana/plugins
      provisioning = /etc/grafana/provisioning
    datasource_yaml: |-
      apiVersion: 1
      datasources:
        - name: Prometheus
          type: prometheus
          url: prometheus-server.tanzu-system-monitoring.svc.cluster.local
          access: proxy
          isDefault: true
    dashboardProvider_yaml: |-
      apiVersion: 1
      providers:
        - name: 'sidecarDashboardProvider'
          orgId: 1
          folder: ''
          folderUid: ''
          type: file
          disableDeletion: false
          updateIntervalSeconds: 10
          allowUiUpdates: false
          options:
            path: /tmp/dashboards
            foldersFromFilesStructure: true
  pvc:
    annotations: {}
    storageClassName: k8s-policy
    accessMode: ReadWriteOnce
    storage: "2Gi"
  secret:
    type: "Opaque"
    admin_password: "Vk13YXJlMSE="
ingress:
  enabled: true
  virtual_host_fqdn: "grafana.corp.vmw"
  prefix: "/"
  servicePort: 80
  tlsCertificate:
    tls.crt: |
      -----BEGIN CERTIFICATE-----
      MIIHxTCCBa2gAwIBAgITIgAAAAQnSpH7QfxTKAAAAAAABDANBgkqhkiG9w0BAQsF
      ADBMMRMwEQYKCZImiZPyLGQBGRYDdm13MRQwEgYKCZImiZPyLGQBGRYEY29ycDEf
      MB0GA1UEAxMWY29udHJvbGNlbnRlci5jb3JwLnZtdzAeFw0yMjAzMjExOTQ1MzNa
      Fw0zMjAzMTgxOTQ1MzNaMG0xCzAJBgNVBAYTAlVTMRMwEQYDVQQIEwpDYWxpZm9y
      bmlhMRIwEAYDVQQHEwlQYWxvIEFsdG8xDzANBgNVBAoTBlZNd2FyZTEPMA0GA1UE
      CxMGVk13YXJlMRMwEQYDVQQDDAoqLmNvcnAudm13MIICIjANBgkqhkiG9w0BAQEF
      AAOCAg8AMIICCgKCAgEAzhh0/CNdiskBkkzZzrhBbKiQEzekxnw0uCgLCpkmQzIe
      m98jgt7LgS2dQFf5vABpTqIX11t4EoY/1gmsrz1IjBv26VrIjSLFmGppRgh+7NPY
      LbXVB6AbrgtlC+a89d14h37l30khZ62H2KTvmUojihUSeYZfi50IMsW/UPr8Z94V
      xhI2bTy/PDL/6sd+qmTpqaZplooEQBn18D0VqJhbTCK+kbdxgM01/8PcZoTcjAEN
      XcUR64p3H0KG0QeQn8r81nOsEihNBUCsrc6PiqPDRGtOJ+BsUhQy+BVmk/dIwnqM
      PJN7cTElz3h+ScJdUr0vPBfZjkI/glRk4GugFomg/qdHvt0OVsLG9efBbiZJo8/X
      PWEcRppCYE+HexbiVwKCML2i75nsELdH4+UVmfrywhAyZf4hvlz2/w5LQudPrfHz
      qPZ76NLmUkkK7buPGsktbtMS4n8fUGsr23Y4UxvZNRrvb6aPp3gSylDrzsV4dTQb
      PiLxzrZymiYw8tReMchCXbb9LGoN50xYbq2nAyUPhxnH0HOuw415shfkLTTGx1ms
      Ht+NeTUbavSh4dBa6EKFfKbYQ9Fukw3XkeMClX0Mdo/EQemivLGO+hIxoDIexkQs
      kXU2y/1uRhCOW5Rcsjr3EgwXGHENr8WUIut8wG1QoW8lg0ma73ErIY37dtFORGcC
      AwEAAaOCAn0wggJ5MA4GA1UdDwEB/wQEAwIFoDATBgNVHSUEDDAKBggrBgEFBQcD
      ATAVBgNVHREEDjAMggoqLmNvcnAudm13MB0GA1UdDgQWBBQHTiEY3UJ11GdfNS9U
      iba5lC0f3zAfBgNVHSMEGDAWgBTGg/OWUAlp52UyrZaNhYHgXyQZxzCB1wYDVR0f
      BIHPMIHMMIHJoIHGoIHDhoHAbGRhcDovLy9DTj1jb250cm9sY2VudGVyLmNvcnAu
      dm13LENOPWNvbnRyb2xjZW50ZXIsQ049Q0RQLENOPVB1YmxpYyUyMEtleSUyMFNl
      cnZpY2VzLENOPVNlcnZpY2VzLENOPUNvbmZpZ3VyYXRpb24sREM9Y29ycCxEQz12
      bXc/Y2VydGlmaWNhdGVSZXZvY2F0aW9uTGlzdD9iYXNlP29iamVjdENsYXNzPWNS
      TERpc3RyaWJ1dGlvblBvaW50MIHFBggrBgEFBQcBAQSBuDCBtTCBsgYIKwYBBQUH
      MAKGgaVsZGFwOi8vL0NOPWNvbnRyb2xjZW50ZXIuY29ycC52bXcsQ049QUlBLENO
      PVB1YmxpYyUyMEtleSUyMFNlcnZpY2VzLENOPVNlcnZpY2VzLENOPUNvbmZpZ3Vy
      YXRpb24sREM9Y29ycCxEQz12bXc/Y0FDZXJ0aWZpY2F0ZT9iYXNlP29iamVjdENs
      YXNzPWNlcnRpZmljYXRpb25BdXRob3JpdHkwPAYJKwYBBAGCNxUHBC8wLQYlKwYB
      BAGCNxUIgfaGVYGazCSBvY8jgZr5P5S9VzyB26Mxg6eFPwIBZAIBAjAbBgkrBgEE
      AYI3FQoEDjAMMAoGCCsGAQUFBwMBMA0GCSqGSIb3DQEBCwUAA4ICAQCCxqhinZTi
      NWHGvTgms+KdmNfIFR/R6GTFF8bO/bgfRw5pVeSDurEechjlO2hRDWOHn4H2+fIQ
      vN4I6cjEbVnDBbVrRkbCfNLD1Wjj/zz65pZ8PmjQUiFl9L4HxaUH5sF/3Jrylsu0
      M2oITByEfl5WpfC0oyB2/9nKYLPLOK5j2OicHxEino1RPAPdOk5gU5c6+Ed74kVh
      KtgK3r8Lc4/IhwfmSDgGl1DmEkzv/u+0bQTOOH1fVKSl3p9+YieADc3s2SJWFF0F
      mKElJHlZEeAg+MI16zNbQhowZE2SE+b9VTGK9KDkCmYGRjpHc61onQNTzIH5rDFx
      /0aBOGp3+tdA+QEI8VgpQlaa0BtsKyY3l/DAg7I42x4Zv9ta7vZUe81v11PqdSJQ
      v7NOriJkRNPneErH2QNsbi00p+TlUFzOl85AEtlB722/fDHbDGSxngDAhImCv4uC
      xPnwVRo94AI0A6ol0FEct2z2wgehQKQbwcskNpOE7wryd+/yqrm+Z5o0crfaPPwX
      uEwDtQjCRM+8wDWBcQnvyOJe374nFLpGpX8tZLjOg2U0wwdsfAxtYGZUAN0V0xm0
      YYsIjp7/f+Pk1DjzWx8JIAbzItKLucDreAmmDXqk+DrBP9LYqtmjB0n7nSErgK8G
      sA3kGCJdOkI0kgF10gsinaouG2jVlwNOsw==
      -----END CERTIFICATE-----
    tls.key: |
      -----BEGIN PRIVATE KEY-----
      MIIJRAIBADANBgkqhkiG9w0BAQEFAASCCS4wggkqAgEAAoICAQDOGHT8I12KyQGS
      TNnOuEFsqJATN6TGfDS4KAsKmSZDMh6b3yOC3suBLZ1AV/m8AGlOohfXW3gShj/W
      CayvPUiMG/bpWsiNIsWYamlGCH7s09gttdUHoBuuC2UL5rz13XiHfuXfSSFnrYfY
      pO+ZSiOKFRJ5hl+LnQgyxb9Q+vxn3hXGEjZtPL88Mv/qx36qZOmppmmWigRAGfXw
      PRWomFtMIr6Rt3GAzTX/w9xmhNyMAQ1dxRHrincfQobRB5CfyvzWc6wSKE0FQKyt
      zo+Ko8NEa04n4GxSFDL4FWaT90jCeow8k3txMSXPeH5Jwl1SvS88F9mOQj+CVGTg
      a6AWiaD+p0e+3Q5Wwsb158FuJkmjz9c9YRxGmkJgT4d7FuJXAoIwvaLvmewQt0fj
      5RWZ+vLCEDJl/iG+XPb/DktC50+t8fOo9nvo0uZSSQrtu48ayS1u0xLifx9Qayvb
      djhTG9k1Gu9vpo+neBLKUOvOxXh1NBs+IvHOtnKaJjDy1F4xyEJdtv0sag3nTFhu
      racDJQ+HGcfQc67DjXmyF+QtNMbHWawe3415NRtq9KHh0FroQoV8pthD0W6TDdeR
      4wKVfQx2j8RB6aK8sY76EjGgMh7GRCyRdTbL/W5GEI5blFyyOvcSDBcYcQ2vxZQi
      63zAbVChbyWDSZrvcSshjft20U5EZwIDAQABAoICAHmP83DFa2dxKHwi2FYWWIC+
      7Dxplcd9e5skA1889lSsO2G1PDz1LRQE07wgKC28EGFROr7MNQa4KO8WxcSXYTND
      S2BZK/ITkHlWSsIEQNlwGxLbLcxRpAIEtpVOhCaBe5ZwQyZw/EMrF/WxU6IXGN9Z
      jowftjujZDKOcUpSwI6DcFRkabYFHsdjTZAuG4hl/W0TuzQQNHGa3nXVkfDf7Pn7
      hGxux4QxhqhV3qqZs3zhIgEtPGSyR5EorFyfGa8nC/tyPwx2uPdgLnpWXFRqQ8MX
      iAH9XecMAwRRmy+rrD8KCa2xUB5z3tmBOPxIqMMk07eeWbSPXuaA4P9+e+7PPyXl
      BEZ/6Y5wICrd2WrzJQOUrLQpbiSRZi7tTxUxYCKWXmB93RDjG8yl5qXZZ5G173PK
      hGHH8KyUWD6tW0ytFEW714IaqJN8ffTCkThnveWKOd/jW3/ffxxShMj9+FnnUEfp
      dfQCq3rZEafoJX/A98TOibq/f5Rogky3D3azhs6gz/+NBterX6+7U2OnSK4cPAGr
      KPn0HJT99gnojkFO32L0N8QriXSdY5gpTw+ZGzgOu78WO5JwvCd+LENNCMfIQckh
      Jm/GM+0sTG03bhlQwyfJeV43ETyarGfiRuwFzgbvH24Ie6iSPSrIu5oqFfDp2+El
      EzAC9kqo4rmCoUQcX5ABAoIBAQD/egDF3mu7yVXQx8EFSKAsTi/Bq6z4gmBtQdN3
      g4MbdlfZ5/rrQZ1pbm2VlqLYz0hab4V0rtJKFKAg4UdSxZTnuR3tZCgSjNd4wy8b
      Meuwyls/IMKPmcrKmBWSxesQRs/wMlKPWYv7SUgJhgHqiPuDP74s1FaYdnPeVjKO
      BDABdzk3MJaU/FEPqHs+0RhjhHOaVm4v2T+/AYIFZ0Gvm+uSlhjlfse83Ui2WV5b
      0DUWS0Gl80UCkPloDWpDa21bcpmcYqrBUnppYz4XPvzNjnYZflubJU7ZGoEB3DNw
      u8IlVidH2/c86eGKvCNihs01f3Oxg8XpEB9W6mFjhDVE7FnJAoIBAQDOhI2+tcG+
      oQybL3IhvvTYnpxPh9GU9qxCEsb71qR+xmXXdIAf9ZPtuL+iet2B60CICcn94gtn
      ntqm4mSjJ9qg7CVnwCLWRhSqGE7Ha+4xnAQgRIpaVhRYThS0ZEL2GbYpezkc78Ql
      ZKDWJLAz/4sGzQcbBt+xVM2962bTFVSWRdeXfe6BoRbn5f9V9ZA7wSE5porC1k1h
      C6HiApOXmZMryILGrrE2YxO/S4cA+TVwH2/dU8W4Ti50EcYvMUwLf3hE3hZwIUvR
      HSecQBcb5fp3mZpnvaZjNTx7vKh6jjRXfmSRhI2dbSZZh6k+5NFP5C28A4Hkhuh6
      lpAGQxQ2h8SvAoIBAQDUTtBbn3aKbUvaoFZBDNTHXQaE/SVWtApsYZraJDmNVfC2
      DvnQDgxBtNpuyOt2H/Rx62HN0QbDN5bHHFAIclhHpehAAs7mc5MRMatw/zBuEAx6
      TsBBVD5Z1L+A5Odu9FoTs842gOU6o/CwsWPgQ4w4y31Ahgmc1DuAVsPWj5ZRcYHj
      4oYRNAotaAdb8apB8a2cYh1ZuEIoeplR4jiNNpczj3cLKSvWQVMO7v/ibwnfCBV7
      UspT0qThmtxnQNx1daxAcSKUW/WMpUPRT7AJJ03v67k3Gm8HLuZs5FD/a5lxK8Kj
      DiLNxVOA1s7VL09UGSHNMMQE5jgVI9xhNlqKd5w5AoIBAQCe0y639szUMMOjLbAW
      5+ciGYmZWJkEeVktT4ec8wx7O1Xjh4NqENH9x1IKQXfNjQGKHg0spgWjYXZDVmWT
      XPk1PafezNN9+1O1JRChKg58NMKvlkbZBs6KwzIFMf6VilygNlZMPNGa+HMBfiHN
      O8DOMCxAyt6KYPACGeJwgD0XfQs7ROyC4ULegfIHR93vNq64ya55/ZpxAiMz0Et2
      EfQvffullYBQlY4AVrOzOfWxD1xW2TB8eBQdy/WhIcacKSJzxGF5RwIqBsQJ1Phw
      ykQAay9mjWJDdhPYDdV8u5ThnSD3EPxgkCsoO78b0ZpwWMobiI8DFAYDEXwedMQ8
      09mdAoIBAQCxsvbB38jiPPqsF77jxWzogVdPR6xO6oLkVcvu4KI1Fq0wFdDH4eBJ
      gOxV8e+wsQBVO1XXP4i3UhbCq7gPgh+YJWWy66RKiuwpIPf2yvGbbUgBpp73MxOx
      ycerjh7LRS56gAAJ6d5hyYf293E5QEAOZh+Niuz3m1xMytdvG7IefX3TMe0GMgZh
      I5djpaHvjgB6qHeVsLuUWjC2FPXtrz2STay08tq/Pc5g+57bTfOyHo0BZ3C/uhZa
      l1NzswracGQIzo03zk/X3Z6P2YOea4BkZ0Iwh34wOHJnTkfEeSx6y+oSFMcFRthT
      yfFCZUk/sVCc/C1a4VigczXftUGiRrTR
      -----END PRIVATE KEY-----
    ca.crt: |
      -----BEGIN CERTIFICATE-----
      MIIFczCCA1ugAwIBAgIQTYJITQ3SZ4BBS9UzXfJIuTANBgkqhkiG9w0BAQsFADBM
      MRMwEQYKCZImiZPyLGQBGRYDdm13MRQwEgYKCZImiZPyLGQBGRYEY29ycDEfMB0G
      A1UEAxMWY29udHJvbGNlbnRlci5jb3JwLnZtdzAeFw0yMjAzMjExOTE3MjhaFw0z
      NzAzMjExOTI3MjNaMEwxEzARBgoJkiaJk/IsZAEZFgN2bXcxFDASBgoJkiaJk/Is
      ZAEZFgRjb3JwMR8wHQYDVQQDExZjb250cm9sY2VudGVyLmNvcnAudm13MIICIjAN
      BgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEA2OYKxckOjhgufWu1YEnatvJ1M127
      gwPbFNj11/dICXaPe+mjN1Hce0PiS2QaaeAe8kH+mOKRa2JjaGdXr6rOiB80KZOR
      uw0GzSJyL5w7ewR+NJf31YO62BD/mt3sHeMnCXmSBxOQvb0nGkhTr1y+rDpvxJ87
      zNczgfN54to6S379wjOsC4bkHLnMJ5EtJG78pPqX1+1wcVOURNJ6y9BcejLnoy/y
      CFpXKOVxKHzy2nnsitAuBb+hD+Jxw8/jFQUhxH0VlgyfXCQdegasSA9RHtZtfpVs
      hshisjkSlvQmbsEknBZrAfBVIYidwt3w050jVhiUs5Ql6vDotY6Gqtzzgq0obv6P
      7E9NPej3BzhPSIUyqnpf57UWI4zUiRJvbSu/J2MCBKHwYfzke1cnvLA7viDEdB9+
      /Htk9aG9/1B6ddDfafrcSOWtkTfHWYLv21o3Uwoh9W5OpK9JikZu/PqnpZkUi+2C
      L+WCww/BS1yhQwVif6PqUMeSLz3jtq3w6R/ruUMlO+0E5//bskDT6QGxBgcvMF9n
      Dl+u0uqHKOdiUvOXBtF139HKUrZsq0m3WPoel2/p+cVVJYsyJG/rRpeh1g/X0cB3
      9EuTjX6vnrT+IS8ZfAaoHzpmgh1vGu2r2xgPq2E8x4ji9FGV8YTjAs60Nw7YxKUW
      Wgj+YNpxP2SxFqUCAwEAAaNRME8wCwYDVR0PBAQDAgGGMA8GA1UdEwEB/wQFMAMB
      Af8wHQYDVR0OBBYEFMaD85ZQCWnnZTKtlo2FgeBfJBnHMBAGCSsGAQQBgjcVAQQD
      AgEAMA0GCSqGSIb3DQEBCwUAA4ICAQAutXwOtsmYcbj/bs3Mydx0Di9m+6UVTEZd
      ORRrTus/BL/TNryO7zo2beczGPK26MwqhmUZiaF61jRb36kxmFPVx2uV2np4LbQj
      5MrxtPzf2XXy4b7ADqQpLgu4rR3mZiXGmzUoV17hmAhyfSU1qm4FssXGK2ypWsQs
      BwsKX4DsIijJJZbXwKFaauq0LtnkgeGWdoEFFWAH0yJWPbz9h+ovlCxq0DBiG00l
      brnY90sqpoiWTxMKNCXDDhNjvtxO3kQIDQVvbNMCEbmYG+RrWQHtvufw97RK/cTL
      9dKFSblIIizMINVwM/gqtlVVvWP1EFaUy0xG5bvOO+SCe+TlA7rz4/RORqqE5Ugg
      7F8fWz+o6BM/qf/Kwh+WN42dyR1rOsFqEVNamZLjrAzgwjQ/nquRRMl2cK6yg6Fq
      d0O42wwYPpLUEFv4xe4a3kpRvvhshNkzR4IacbmaUlnzmlewoFXVueEblviBHJoV
      1OUC6qfLkCjfCEv470Kr5vDe5Y/l/7j8EYj7a/wa2++kq+7xd+bj/DDed85fm3Yk
      dhfp7bGXKm4KbPLzkSpiYWbE+EbArLtIk62exjcJvJPdoxMTxgbdelzl/snPLrdg
      w0oGuTTBfxSMKs767N3G1q5tz0mwFpIqIQtXUSmaJ+9p7IkpWcThLnyYYo1IpWm/
      ZHtjzZMQVA==
      -----END CERTIFICATE-----

Some important things to note from this specification:

  • ingress is enabled (ingress: enabled: true)
  • ingress is configured for all URL paths beginning with / (prefix:).
  • The FQDN for Grafana is grafana.corp.vmw (virtual_host_fqdn:)
  • A custom certificate is supplied under the ingress section (tls.crt, tls.key, ca.crt). This is the wildcard cert for the lab (works for anything ending in corp.vmw).
  • The PVC for Grafana is 2Gi and will be created under the k8s-policy storageClass.
  • The admin password for the Grafana UI is VMware1! (base64 encoded as Vk13YXJlMSE= under grafana: secret: admin_password:).
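Since the certificate, key and CA are pasted into the data-values file by hand, it can be worth confirming that the certificate and private key actually belong together before installing. The following is a minimal sketch, assuming the openssl CLI is available and that the PEM blocks have been saved to files. The function name cert_matches_key and the file names used below are placeholders of mine, not anything the package requires.

```shell
# Succeeds (exit 0) when the certificate in $1 and the private key in $2
# share the same public key. Both arguments are paths to PEM files.
cert_matches_key() {
  # Hash the public key embedded in the certificate...
  cert_hash=$(openssl x509 -in "$1" -noout -pubkey | openssl sha256)
  # ...and the public key derived from the private key.
  key_hash=$(openssl pkey -in "$2" -pubout | openssl sha256)
  # Guard against both commands failing and producing identical empty output.
  [ -n "$cert_hash" ] && [ "$cert_hash" = "$key_hash" ]
}
```

With hypothetical file names, running cert_matches_key wildcard.crt wildcard.key && echo "certificate and key match" would print the message only when the pair is consistent.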

The Grafana package can be installed.

tanzu package install grafana -p grafana.tanzu.vmware.com -v 7.5.16+vmware.1-tkg.1 --values-file grafana-data-values.yaml -n tanzu-system-dashboards

In the vSphere Client, you’ll see the same storage-related activity as was observed during the Prometheus package deployment when the Grafana pvc is created.

Check to see that the necessary components have been successfully deployed:

tanzu package installed list -n tanzu-system-dashboards
 
  NAME     PACKAGE-NAME              PACKAGE-VERSION        STATUS
  grafana  grafana.tanzu.vmware.com  7.5.16+vmware.1-tkg.1  Reconcile succeeded

kubectl -n tanzu-system-dashboards get all
NAME                           READY   STATUS    RESTARTS   AGE
pod/grafana-594559bc55-9mzzn   2/2     Running   0          2m18s
 
NAME              TYPE       CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
service/grafana   NodePort   10.98.84.82   <none>        80:31110/TCP   2m18s
 
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/grafana   1/1     1            1           2m18s
 
NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/grafana-594559bc55   1         1         1       2m18s

kubectl -n tanzu-system-dashboards get pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
grafana-pvc   Bound    pvc-243405bc-93e3-4570-bfdc-8451cbe503af   2Gi        RWO            k8s-policy     2m41s

kubectl -n tanzu-system-dashboards get httpproxy
NAME                FQDN               TLS SECRET    STATUS   STATUS DESCRIPTION
grafana-httpproxy   grafana.corp.vmw   grafana-tls   valid    Valid HTTPProxy

You can log in to Grafana at https://grafana.corp.vmw with the admin username and password configured in the data-values file (VMware1!).
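The admin_password value in the data-values file is simply the base64 encoding of the plain-text password. As a small sketch (the function names are my own), this is one way to produce or double-check such a value from a shell:

```shell
# Encode a plain-text password for the admin_password field.
# printf is used instead of echo so that no trailing newline gets encoded.
encode_password() {
  printf '%s' "$1" | base64
}

# Decode an existing value to see which password a data-values file carries.
decode_password() {
  printf '%s' "$1" | base64 --decode
}

encode_password 'VMware1!'   # prints Vk13YXJlMSE=
```

Keep in mind that base64 is an encoding, not encryption, so the data-values file should be treated as sensitive.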

Once logged in, you can navigate to Dashboards > Manage and then open the TKG Kubernetes cluster monitoring (via Prometheus) dashboard.

Install the fluent-bit package

You need a destination for fluent-bit log forwarding, and my lab is using vRealize Log Insight (vRLI). By default, vRLI listens for syslog traffic on port 514 (UDP), so this is what will be used when configuring fluent-bit.
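Before pointing fluent-bit at vRLI, it can help to send a single hand-built syslog message to confirm that the destination is reachable and that the message shows up in the UI. This is a rough sketch, assuming bash (for its /dev/udp pseudo-device). The function names and the syslog-test app name are placeholders I made up. UDP is fire-and-forget, so a clean exit only means the datagram was sent, not that vRLI received it.

```shell
# Build a minimal RFC 5424 line:
#   <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
# $1 = hostname field, $2 = app name, $3 = message text
rfc5424_line() {
  # PRI 14 = facility 1 (user-level) * 8 + severity 6 (informational)
  printf '<14>1 %s %s %s - - - %s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" "$3"
}

# Send one test message over UDP. $1 = syslog server, $2 = port.
# The /dev/udp/HOST/PORT redirection is a bash feature, not a real device.
send_test_syslog() {
  rfc5424_line "$(hostname)" "syslog-test" "test message before fluent-bit install" \
    > "/dev/udp/$1/$2"
}
```

In this lab, the call would be send_test_syslog vrli-01a.corp.vmw 514, after which the message should be searchable in vRLI.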

Create the namespace.

kubectl create ns tanzu-system-logging

Create the data-values specification.

fluent-bit-data-values.yaml

namespace: "tanzu-system-logging"
fluent_bit:
  config:
    service: |
      [Service]
        Flush         1
        Log_Level     info
        Daemon        off
        Parsers_File  parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020
    outputs: |
      [OUTPUT]
        Name              stdout
        Match             *
 
      [OUTPUT]
        Name   syslog
        Match  kube.*
        Host   vrli-01a.corp.vmw
        Port   514
        Mode   udp
        Syslog_Format        rfc5424
        Syslog_Hostname_key  tkg2_cluster-1
        Syslog_Appname_key   pod_name
        Syslog_Procid_key    container_name
        Syslog_Message_key   message
        Syslog_SD_key        k8s
        Syslog_SD_key        labels
        Syslog_SD_key        annotations
        Syslog_SD_key        tkg
 
      [OUTPUT]
        Name   syslog
        Match  kube_systemd.*
        Host   vrli-01a.corp.vmw
        Port   514
        Mode   udp
        Syslog_Format        rfc5424
        Syslog_Hostname_key  tkg2_cluster-1
        Syslog_Appname_key   tkg2_instance
        Syslog_Message_key   MESSAGE
        Syslog_SD_key        systemd
 
    inputs: |
      [INPUT]
        Name tail
        Path /var/log/containers/*.log
        Parser docker
        Tag kube.*
        Mem_Buf_Limit 5MB
        Skip_Long_Lines On
 
      [INPUT]
        Name              tail
        Tag               audit.*
        Path              /var/log/audit/audit.log
        Parser            logfmt
        DB                /var/log/flb_system_audit.db
        Mem_Buf_Limit     50MB
        Refresh_Interval  10
        Skip_Long_Lines   On
 
      [INPUT]
        Name                systemd
        Tag                 kube_systemd.*
        Path                /var/log/journal
        DB                  /var/log/flb_kube_systemd.db
        Systemd_Filter      _SYSTEMD_UNIT=kubelet.service
        Systemd_Filter      _SYSTEMD_UNIT=containerd.service
        Read_From_Tail      On
        Strip_Underscores   On
 
      [INPUT]
        Name              tail
        Tag               apiserver_audit.*
        Path              /var/log/kubernetes/audit.log
        Parser            json
        DB                /var/log/flb_kube_audit.db
        Mem_Buf_Limit     50MB
        Refresh_Interval  10
        Skip_Long_Lines   On
 
    filters: |
      [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.var.log.containers.
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
 
      [FILTER]
        Name                record_modifier
        Match               *
        Record tkg_cluster tkg-wld-01a
        Record tkg_instance tkg-mgmt
 
      [FILTER]
        Name                  nest
        Match                 kube.*
        Operation             nest
        Wildcard              tkg_instance*
        Nest_Under            tkg
 
      [FILTER]
        Name                  nest
        Match                 kube_systemd.*
        Operation             nest
        Wildcard              SYSTEMD*
        Nest_Under            systemd
 
      [FILTER]
        Name                  modify
        Match                 kube.*
        Copy                  kubernetes k8s
 
      [FILTER]
        Name                  nest
        Match                 kube.*
        Operation             lift
        Nested_Under          kubernetes
 
    parsers: |
      [PARSER]
        Name   apache
        Format regex
        Regex  ^(?[^ ]*) [^ ]* (?[^ ]*) \[(?


There is very little that is customized in this specification.

  • Host: vrli-01a.corp.vmw (the vRLI FQDN)
  • Syslog_Hostname_key: tkg2_cluster-1 (arbitrary but helps to identify where the logs came from and should be unique if there are multiple fluent-bit installations for multiple clusters)
  • Syslog_Appname_key: tkg2_instance (again, arbitrary but will help to identify where the logs came from)

The fluent-bit package can be installed.

tanzu package install fluent-bit -p fluent-bit.tanzu.vmware.com -v 1.8.15+vmware.1-tkg.1 --values-file fluent-bit-data-values.yaml -n tanzu-system-logging

Check to see that the necessary components have been successfully deployed:

tanzu package installed list -n tanzu-system-logging
 
  NAME        PACKAGE-NAME                 PACKAGE-VERSION        STATUS
  fluent-bit  fluent-bit.tanzu.vmware.com  1.8.15+vmware.1-tkg.1  Reconcile succeeded

kubectl -n tanzu-system-logging get all
NAME                   READY   STATUS    RESTARTS   AGE
pod/fluent-bit-964bh   1/1     Running   0          4m55s
pod/fluent-bit-lmdvk   1/1     Running   0          4m55s
pod/fluent-bit-p28qh   1/1     Running   0          4m55s
 
NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/fluent-bit   3         3         3       3            3           <none>          4m55s
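Since fluent-bit runs as a DaemonSet, the thing to confirm is that the READY count matches DESIRED, i.e. one healthy pod per node. Here is a minimal sketch of that comparison, assuming the default table output of kubectl get daemonset is piped in on stdin (the daemonset_ready function name is hypothetical):

```shell
# Reads the table printed by `kubectl get daemonset` on stdin.
# Exits 0 only if at least one row is present and every row has DESIRED == READY.
daemonset_ready() {
  awk '
    NR == 1 { next }                      # skip the header row
    NF == 0 { next }                      # skip blank lines
    { seen = 1; if ($2 != $4) bad = 1 }   # $2 is DESIRED, $4 is READY
    END { if (seen && !bad) exit 0; exit 1 }
  '
}
```

It could be used as kubectl -n tanzu-system-logging get daemonset fluent-bit | daemonset_ready && echo "all fluent-bit pods are ready".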

You can quickly validate that data is arriving in vRLI on the Overview page.

You can also navigate to Explore Logs to get a more detailed view of the logs coming in to vRLI from the TKGS cluster.

Harbor

One of the other popular packages is the Harbor package. I didn’t install it as I had already enabled the Image Registry (Harbor) service on the supervisor cluster. The installation will look very similar to the Prometheus and Grafana installations, as it can (and should) use a custom certificate and has PVCs and an ingress. The harbor-data-values.yaml file will look very similar to the one in the Deploy Harbor section of my earlier post, How to configure external-dns with Microsoft DNS in TKG 1.3 (plus Harbor and Contour).
