How to Configure Fluent Bit and Kafka in Tanzu Kubernetes Grid

Tanzu Kubernetes Grid provides several different Fluent Bit manifest files to help you deploy and configure Fluent Bit for use with Splunk, Elasticsearch, Kafka and a generic HTTP endpoint. In this post, I’ll walk through not only the Fluent Bit configuration that VMware has documented but also the deployment of Kafka in a TKG cluster. You’ll notice that much of this post is similar to How to Configure Fluent Bit and Splunk in Tanzu Kubernetes Grid, since the two share most of the same Fluent Bit configuration.

In addition to the tkg cli utility and the OVA files needed to stand up a TKG cluster, you’ll need to download the extensions file as well from https://my.vmware.com/web/vmware/downloads/details?downloadGroup=TKG-112&productId=988&rPId=46507. This contains the manifest files needed to deploy/configure the authentication, log forwarding and ingress solutions that VMware supports for TKG.

I’ll be using Helm to install Kafka, so you’ll want to make sure you have the Helm client installed and pointed at your TKG cluster if you want to follow along with these steps.

I have a storage class named k8s-policy, which maps to an NFS volume that is mounted on all of my ESXi hosts and is accessible to Kubernetes via vSphere CNS.

Once you have a TKG cluster up and running, the first step will be to extract the contents of the extensions bundle. You should see a folder structure similar to the following:

ls tkg-extensions-v1.1.0/ 
authentication cert-manager ingress logging

We’ll be working in the logging section so you can start to focus on the manifest files in this location and its sub-directories.

The first step to deploying Fluent Bit in a TKG cluster is to create the tanzu-system-logging namespace and the needed RBAC components. These steps are the same regardless of the logging backend. You can read more about this step in detail at Create Namespace and RBAC Components. You should inspect the manifest files prior to applying them to make sure that you’re okay with the RBAC objects being created. Note: If you’ve already configured Fluent Bit for Splunk, you can skip applying these four manifests.

kubectl apply -f tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/00-fluent-bit-namespace.yaml 
kubectl apply -f tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/01-fluent-bit-service-account.yaml 
kubectl apply -f tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/02-fluent-bit-role.yaml 
kubectl apply -f tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/03-fluent-bit-role-binding.yaml
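
If you want a quick overview of what each manifest creates before applying it, one option is to grep for the kind lines. This is only a minimal sketch: list_kinds is a helper name I made up for illustration, and it assumes each object declares its kind at the start of a line, which is the usual layout for Kubernetes manifests.

```shell
# list_kinds is a hypothetical helper, not part of the TKG extensions bundle.
# It prints every top-level "kind:" line, prefixed with the file it came from.
list_kinds() {
  grep -H '^kind:' "$@"
}

# Example against the four manifests from this step (run from the directory
# that contains the extracted bundle):
#   list_kinds tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/0[0-3]-*.yaml
```

This is no substitute for reading the role and role binding in full, but it makes it easy to spot an object type you did not expect.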

I decided to deploy Kafka via the Bitnami Helm chart. Note that I chose to place all components in the tanzu-system-logging namespace purely as a personal preference; there is no requirement to do so.

The first step is to add the Bitnami Helm repo:

helm repo add bitnami https://charts.bitnami.com/bitnami 
"bitnami" has been added to your repositories

Once the repo is added, you can use Helm to deploy Kafka. I’m setting a few specific options in the following command:

  • I’m setting the namespace to tanzu-system-logging.
  • I’m setting both storage class values (global.storageClass and persistence.storageClass) to k8s-policy.
  • I’m setting deleteTopicEnable to true only because I went through a lot of trial and error and liked having the ability to clean up my mistakes.
  • I’m setting autoCreateTopicsEnable to true to avoid having to create any topics manually.

helm install --namespace tanzu-system-logging kafka --set global.storageClass=k8s-policy --set deleteTopicEnable=true --set autoCreateTopicsEnable=true --set persistence.storageClass=k8s-policy bitnami/kafka 
NAME: kafka 
LAST DEPLOYED: Fri May 29 12:46:26 2020 
NAMESPACE: tanzu-system-logging 
STATUS: deployed 
REVISION: 1 
TEST SUITE: None 
NOTES: 
** Please be patient while the chart is being deployed ** 
Kafka can be accessed via port 9092 on the following DNS name from within your cluster: 
    kafka.tanzu-system-logging.svc.cluster.local 
To create a a pod that you can use as a Kafka client run the following command: 
    kubectl run kafka-client --rm --tty -i --restart='Never' --image docker.io/bitnami/kafka:2.5.0-debian-10-r29 --namespace tanzu-system-logging --command -- bash
    PRODUCER: 
        kafka-console-producer.sh --broker-list 127.0.0.1:9092 --topic test 
    CONSUMER: 
        kafka-console-consumer.sh --bootstrap-server 127.0.0.1:9092 --topic test --from-beginning
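
If you would rather not carry a long list of --set flags around, the same four settings can live in a values file. The sketch below only restates the values from my command above; the file name kafka-values.yaml is an arbitrary choice, and the value names are the ones the chart accepted when I ran this, so check them against the chart version you are installing.

```shell
# Write the same four settings to a values file (the file name is arbitrary).
cat > kafka-values.yaml <<'EOF'
global:
  storageClass: k8s-policy
persistence:
  storageClass: k8s-policy
deleteTopicEnable: true
autoCreateTopicsEnable: true
EOF

# Then install with -f instead of the --set flags:
#   helm install --namespace tanzu-system-logging kafka -f kafka-values.yaml bitnami/kafka
```

Keeping the file alongside your other cluster configuration also gives you a record of how the release was installed.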

As with our Splunk deployment, you should now see a number of Kafka items in the tanzu-system-logging namespace:

kubectl -n tanzu-system-logging get po,svc,pvc,pv,secrets,cm 
NAME                                  READY   STATUS    RESTARTS   AGE 
pod/kafka-0                           1/1     Running   0          1d 
pod/kafka-zookeeper-0                 1/1     Running   0          1d 
NAME                                    TYPE           CLUSTER-IP       EXTERNAL-IP               PORT(S)                      AGE 
service/kafka                           ClusterIP      100.69.1.17      <none>                    9092/TCP                     1d 
service/kafka-headless                  ClusterIP      None             <none>                    9092/TCP,9093/TCP            1d 
service/kafka-zookeeper                 ClusterIP      100.69.13.129    <none>                    2181/TCP,2888/TCP,3888/TCP   1d 
service/kafka-zookeeper-headless        ClusterIP      None             <none>                    2181/TCP,2888/TCP,3888/TCP   1d
NAME                                           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/data-kafka-0             Bound    pvc-4688e2dd-2194-4e11-b0ed-bebd598797ed   8Gi        RWO            k8s-policy     1d
persistentvolumeclaim/data-kafka-zookeeper-0   Bound    pvc-6e775ec6-e28b-428c-8c3b-7bcb1cf9a06a   8Gi        RWO            k8s-policy     1d
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                         STORAGECLASS   REASON   AGE
persistentvolume/pvc-4688e2dd-2194-4e11-b0ed-bebd598797ed   8Gi        RWO            Delete           Bound    tanzu-system-logging/data-kafka-0             k8s-policy              1d
persistentvolume/pvc-6e775ec6-e28b-428c-8c3b-7bcb1cf9a06a   8Gi        RWO            Delete           Bound    tanzu-system-logging/data-kafka-zookeeper-0   k8s-policy              1d
NAME                                 TYPE                                  DATA   AGE 
secret/kafka-token-9d7qg             kubernetes.io/service-account-token   3      1d 
secret/sh.helm.release.v1.kafka.v1   helm.sh/release.v1                    1      1d 
NAME                                   DATA   AGE 
configmap/kafka-scripts                1      1d

Unlike the Splunk deployment, there are no login credentials to retrieve for this Kafka deployment: the secrets listed above are only the service-account token and the Helm release record, and the client commands later in this post connect without a username or password.

There are a few Fluent Bit pieces to configure before we can make use of Kafka.

The tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/output/kafka/04-fluent-bit-configmap.yaml file has the instance-specific configuration information that we’ll need to provide to allow Fluent Bit to forward logs to our Kafka deployment. In my example, I am changing the following from the default values:

  • Setting the Cluster name to vsphere-test (since that’s the name of my TKG workload cluster).
  • Setting the Instance name to vsphere (this was an arbitrary choice).
  • Setting the Kafka Broker Service name to kafka, the name of the Kafka service.
  • Setting the Kafka Topic name to vsphere-test (again, since that’s the name of my TKG workload cluster).

sed -i 's/<TKG_CLUSTER_NAME>/vsphere-test/' tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/output/kafka/04-fluent-bit-configmap.yaml 
sed -i 's/<TKG_INSTANCE_NAME>/vsphere/' tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/output/kafka/04-fluent-bit-configmap.yaml 
sed -i 's/<KAFKA_BROKER_SERVICE_NAME>/kafka/' tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/output/kafka/04-fluent-bit-configmap.yaml 
sed -i 's/<KAFKA_TOPIC_NAME>/vsphere-test/' tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/output/kafka/04-fluent-bit-configmap.yaml
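
Since a missed placeholder would end up verbatim in the Fluent Bit configuration, it can be worth checking that none are left before applying the file. Here is a rough sketch of that check. The helper name no_placeholders_left is my own invention, and the demo file is a two-line stand-in that I made up so the example can run anywhere; it is not the contents of the real ConfigMap.

```shell
# Hypothetical helper: succeeds only if no <UPPER_CASE> placeholder is left
# in the file; any remaining ones are printed with their line numbers.
no_placeholders_left() {
  ! grep -n '<[A-Z_][A-Z_]*>' "$1"
}

# Stand-in file for demonstration only; the real ConfigMap has many more lines.
demo=$(mktemp)
printf 'Cluster <TKG_CLUSTER_NAME>\nTopics <KAFKA_TOPIC_NAME>\n' > "$demo"

# Same style of substitution as above, written with one -e per placeholder and
# a redirect instead of -i so it behaves the same with GNU and BSD sed.
sed -e 's/<TKG_CLUSTER_NAME>/vsphere-test/' \
    -e 's/<KAFKA_TOPIC_NAME>/vsphere-test/' "$demo" > "$demo.out"

if no_placeholders_left "$demo.out"; then
  echo "all placeholders replaced"
fi
```

Against the real file, the check would be no_placeholders_left tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/output/kafka/04-fluent-bit-configmap.yaml, assuming the bundle's placeholders all follow the <UPPER_CASE> pattern seen in the four above.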

The only other file we need to be concerned with is tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/output/kafka/05-fluent-bit-ds.yaml but it’s fine in the default configuration.

Now we can deploy Fluent Bit and see what data is getting sent to Kafka:

kubectl apply -f tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/output/kafka/04-fluent-bit-configmap.yaml 
kubectl apply -f tkg-extensions-v1.1.0/logging/fluent-bit/vsphere/output/kafka/05-fluent-bit-ds.yaml

While there is no Kafka UI (unless you install a third-party one), we can test that data is being received from within a test pod.

Create a test pod with the kafka client utilities installed:

kubectl -n tanzu-system-logging run kafka-client --rm --tty -i --restart='Never' --image docker.io/bitnami/kafka:2.5.0-debian-10-r29 --command -- bash

By default, no topics are created, but the Fluent Bit pods should have caused the vsphere-test topic to be auto-created. You can check this by running the following command in the test pod (here I connect to the ZooKeeper service directly to query for topics):

/opt/bitnami/kafka/bin/kafka-topics.sh --list --zookeeper kafka-zookeeper-0.kafka-zookeeper-headless.tanzu-system-logging.svc.cluster.local:2181 
vsphere-test
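
If you want to script this check rather than eyeball the list, a small filter can turn it into a pass/fail test. This is a sketch only: topic_exists is a hypothetical helper name, and the commented example assumes kafka-topics.sh accepts --bootstrap-server, which as far as I know is the case from Kafka 2.2 onward (the image used here is 2.5.0).

```shell
# Hypothetical helper: reads topic names on stdin, one per line, and succeeds
# if one of them is exactly the name given as the first argument.
topic_exists() {
  grep -qxF -- "$1"
}

# Inside the test pod, with the broker address used in this post:
#   kafka-topics.sh --list --bootstrap-server kafka:9092 | topic_exists vsphere-test && echo "topic present"

# Offline demonstration with made-up input:
printf 'other-topic\nvsphere-test\n' | topic_exists vsphere-test && echo "topic present"
```

The -x and -F flags make grep match the whole line literally, so a topic such as vsphere-test-2 would not count as a match.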

You can also watch live events come in to the vsphere-test topic by running the following command in the test pod (if your Kafka service and/or port are different from what I’ve configured, you’ll need to replace kafka:9092 with appropriate values, as well as ensure the topic name is correct):

kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic vsphere-test 
{"@timestamp":1590758248.163003,"stream":"stdout","logtag":"F","message":"{\"caller\":\"main.go:267\",\"event\":\"startUpdate\",\"msg\":\"start of service update\",\"service\":\"kube-system/cloud-controller-manager\",\"ts\":\"2020-05-29T13:17:28.162225165Z\"}","kubernetes":{"pod_name":"speaker-kk4km","namespace_name":"metallb-system","pod_id":"c962a8be-ad93-4b1d-ab8c-089b3bc70cbf","labels":{"app":"metallb","component":"speaker","controller-revision-hash":"5b585bbb4b","pod-template-generation":"1"},"annotations":{"prometheus.io/port":"7472","prometheus.io/scrape":"true"},"host":"vsphere-test-md-0-78dc4b86-jflhn","container_name":"speaker","docker_id":"1e20c6adf4cea2da531f7e740d15ea536b08566cafc634ec0d66a7981fa56484","container_hash":"2b74eca0f25e946e9a1dc4b94b9da067b1fec4244364d266283dfbbab546a629"},"tkg_cluster":"vsphere-test","tkg_instance":"vsphere"}
{"@timestamp":1590758248.163088,"stream":"stdout","logtag":"F","message":"{\"caller\":\"main.go:271\",\"event\":\"endUpdate\",\"msg\":\"end of service update\",\"service\":\"kube-system/cloud-controller-manager\",\"ts\":\"2020-05-29T13:17:28.162360144Z\"}","kubernetes":{"pod_name":"speaker-kk4km","namespace_name":"metallb-system","pod_id":"c962a8be-ad93-4b1d-ab8c-089b3bc70cbf","labels":{"app":"metallb","component":"speaker","controller-revision-hash":"5b585bbb4b","pod-template-generation":"1"},"annotations":{"prometheus.io/port":"7472","prometheus.io/scrape":"true"},"host":"vsphere-test-md-0-78dc4b86-jflhn","container_name":"speaker","docker_id":"1e20c6adf4cea2da531f7e740d15ea536b08566cafc634ec0d66a7981fa56484","container_hash":"2b74eca0f25e946e9a1dc4b94b9da067b1fec4244364d266283dfbbab546a629"},"tkg_cluster":"vsphere-test","tkg_instance":"vsphere"}
{"@timestamp":1590758248.403326,"stream":"stdout","logtag":"F","message":"{\"caller\":\"main.go:202\",\"component\":\"MemberList\",\"msg\":\"net.go:785: [DEBUG] memberlist: Initiating push/pull sync with: 192.168.100.104:7946\",\"ts\":\"2020-05-29T13:17:28.403011638Z\"}","kubernetes":{"pod_name":"speaker-kk4km","namespace_name":"metallb-system","pod_id":"c962a8be-ad93-4b1d-ab8c-089b3bc70cbf","labels":{"app":"metallb","component":"speaker","controller-revision-hash":"5b585bbb4b","pod-template-generation":"1"},"annotations":{"prometheus.io/port":"7472","prometheus.io/scrape":"true"},"host":"vsphere-test-md-0-78dc4b86-jflhn","container_name":"speaker","docker_id":"1e20c6adf4cea2da531f7e740d15ea536b08566cafc634ec0d66a7981fa56484","container_hash":"2b74eca0f25e946e9a1dc4b94b9da067b1fec4244364d266283dfbbab546a629"},"tkg_cluster":"vsphere-test","tkg_instance":"vsphere"}
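
Each record is a single long JSON line, which is hard to scan by eye. For a quick look at one field, such as which namespace or pod the events come from, a grep and cut pipeline is enough. This is a rough sketch rather than a real JSON parser: extract_field is a hypothetical helper, it only handles simple string fields like the Kubernetes metadata and the tkg_* values, and the sample record is a shortened version of the output above.

```shell
# extract_field is a hypothetical helper for quick inspection only; it is not
# a JSON parser. It prints the value of every unescaped "name":"value" string
# pair found on stdin.
extract_field() {
  grep -o "\"$1\":\"[^\"]*\"" | cut -d'"' -f4
}

# Shortened record based on the output above.
sample='{"stream":"stdout","kubernetes":{"pod_name":"speaker-kk4km","namespace_name":"metallb-system"},"tkg_cluster":"vsphere-test","tkg_instance":"vsphere"}'

printf '%s\n' "$sample" | extract_field namespace_name   # prints metallb-system
printf '%s\n' "$sample" | extract_field tkg_cluster      # prints vsphere-test
```

In the test pod you could define the function in the same shell and feed it from the consumer, for example kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic vsphere-test --from-beginning --max-messages 5 | extract_field pod_name, assuming grep and cut are available in the image. For anything beyond a spot check, a proper JSON tool is the better choice.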

In my next post, I’ll walk through a similar configuration but using Elasticsearch as the log receiver.
