This might seem like an odd topic, but we recently had a request for it because the Splunk instance in use was not configured to use TLS. It would also prove beneficial for a lab or POC where you might not want to go through the effort of using TLS (though it’s arguably not much effort, depending on how you’ve deployed Splunk).
You can read up on how extensions and shared services work in TKG 1.2 in my earlier post, Working with TKG Extensions and Shared Services in TKG 1.2. I did not cover fluent-bit in that post but did go over it and the various output options in three even earlier posts: How to Configure Fluent Bit and Splunk in Tanzu Kubernetes Grid, How to Configure Fluent Bit and Elasticsearch in Tanzu Kubernetes Grid and How to Configure Fluent Bit and Kafka in Tanzu Kubernetes Grid.
One of the key pieces to understand for making customizations to the extensions is ytt, the YAML Templating Tool. It is part of the Carvel tools, formerly k14s, and is a new inclusion with the 1.2 version of TKG. ytt allows us to quickly and easily create YAML templates where we can set values and patch existing YAML files in an automated fashion.
The fluent-bit extension in TKG 1.2 will use TLS for communication with Splunk and there is no obvious way of changing this. The documentation for configuring the fluent-bit extension only covers changing the following values: host, port and token. If you’ve ever looked into a fluent-bit configmap that is used for sending logs to Splunk, you’ll quickly see that there are dozens more options that could be changed.
We can start out largely following the instructions for deploying the fluent-bit extension as documented but need to stop and make some changes when we get to the Deploy the Fluent Bit Extension step. Before we can actually deploy the extension, we need to create a ytt overlay file and modify the extension manifest. The overlay file is what will allow us to pass in parameters that are not readily available for change.
When fluent-bit is deployed via the fluent-bit extension, a configmap named fluent-bit-config is created that is the primary configuration source for the fluent-bit pods. The portions of this configmap that we can easily change (via the fluent-bit-data-values.yaml file) are the Splunk host, port and token (as noted earlier). If we take a look at the portion of this configmap where these values are set, we’ll quickly see that there are also two TLS parameters:
  output-splunk.conf: |
    [OUTPUT]
        Name splunk
        Match *
        Host 192.168.110.10
        Port 8088
        Splunk_Token 03d998ac-2a6f-4173-b445-a35bc693963d
        TLS On
        TLS.Verify Off
You can see from this example that my Host, Port and Splunk_Token values are set but also that TLS is set to On and TLS.Verify is set to Off. If I were using a self-signed certificate in Splunk, this would probably be fine but since I’m not using TLS at all, we need to get this TLS value to be Off.
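To check which TLS setting a cluster is actually running with, the relevant line can be pulled out of the configmap text. The following is a minimal sketch, not part of TKG or fluent-bit: the function name splunk_tls_setting is made up for illustration, and it assumes the key/value layout shown above.

```shell
# Hypothetical helper (not part of TKG or fluent-bit): reads the text of an
# output-splunk.conf section on stdin and prints the value of the plain
# "TLS" key. The "TLS.Verify" key is skipped because the first field must
# match "TLS" exactly.
splunk_tls_setting() {
  awk '$1 == "TLS" { print $2 }'
}
```

Against a live cluster it could be fed with something like kubectl -n tanzu-system-logging get cm fluent-bit-config -o jsonpath='{.data.output-splunk\.conf}' | splunk_tls_setting, which should print On for the default configuration shown above.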
Back to the ytt overlay file mentioned earlier. It should look like the following:
#@ load("@ytt:overlay", "overlay")
#@overlay/match by=overlay.subset({"metadata":{"name":"fluent-bit-config"}})
---
data:
  #@overlay/match missing_ok=True
  output-splunk.conf: |
    [OUTPUT]
        Name splunk
        Match *
        Host 192.168.110.10
        Port 8088
        Splunk_Token 03d998ac-2a6f-4173-b445-a35bc693963d
        TLS Off
        TLS.Verify Off
Some points to note here:
- The first line loads the ytt overlay module, which is what makes this an overlay file.
- The second line notes that we should only be matching the fluent-bit-config section of the configuration (the fluent-bit-config configmap, actually).
- The rest supplies the entire contents of the output-splunk.conf file, including the already specified host, port and token values. If the entire contents are not specified here, any missing values will not show up in the fluent-bit-config configmap.
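Since the overlay has to repeat the host, port and token values, one way to avoid hand-copying them is to generate the file from shell variables. This is only a sketch: the variable names SPLUNK_HOST, SPLUNK_PORT and SPLUNK_TOKEN are hypothetical, and the values are the example ones used in this post.

```shell
# Hypothetical variables; replace the values with the ones from your own
# fluent-bit-data-values.yaml file.
SPLUNK_HOST="192.168.110.10"
SPLUNK_PORT="8088"
SPLUNK_TOKEN="03d998ac-2a6f-4173-b445-a35bc693963d"

# Write the overlay file. The heredoc delimiter is unquoted so that the
# three variables above are expanded; nothing else in the body is touched.
cat > fluent-bit-overlay.yaml <<EOF
#@ load("@ytt:overlay", "overlay")
#@overlay/match by=overlay.subset({"metadata":{"name":"fluent-bit-config"}})
---
data:
  #@overlay/match missing_ok=True
  output-splunk.conf: |
    [OUTPUT]
        Name splunk
        Match *
        Host ${SPLUNK_HOST}
        Port ${SPLUNK_PORT}
        Splunk_Token ${SPLUNK_TOKEN}
        TLS Off
        TLS.Verify Off
EOF
```

The resulting fluent-bit-overlay.yaml should match the overlay shown above, with the indentation that YAML requires.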
With this overlay file created, we will now create a configmap from it that will be called by the extension.
kubectl create configmap fluent-bit-overlay -n tanzu-system-logging --from-file=fluent-bit-overlay.yaml
It’s worth noting that the names of the configmap and the overlay file are arbitrary, as long as you use the same names when modifying the extension manifest.
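If the overlay file is edited later, running kubectl create again will fail because the configmap already exists. A common workaround is to render the configmap client-side and pipe it to kubectl apply. The sketch below wraps that pattern in a function; the name refresh_overlay_configmap is hypothetical, and it assumes a kubectl version that supports --dry-run=client.

```shell
# Hypothetical helper: creates the configmap from a file, or updates it in
# place if it already exists. Arguments: configmap name, namespace, file.
refresh_overlay_configmap() {
  name="$1"
  namespace="$2"
  file="$3"
  # Render the configmap manifest locally without sending it to the
  # cluster, then apply it so an existing configmap is updated.
  kubectl create configmap "$name" -n "$namespace" \
    --from-file="$file" --dry-run=client -o yaml | kubectl apply -f -
}
```

With the names used in this post, the call would be refresh_overlay_configmap fluent-bit-overlay tanzu-system-logging fluent-bit-overlay.yaml.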
We can validate that the configmap was created as expected as well:
kubectl -n tanzu-system-logging get cm fluent-bit-overlay -o yaml
apiVersion: v1
data:
  fluent-bit-overlay.yaml: |+
    #@ load("@ytt:overlay", "overlay")
    #@overlay/match by=overlay.subset({"metadata":{"name":"fluent-bit-config"}})
    ---
    data:
      #@overlay/match missing_ok=True
      output-splunk.conf: |
        [OUTPUT]
            Name splunk
            Match *
            Host 192.168.110.10
            Port 8088
            Splunk_Token 03d998ac-2a6f-4173-b445-a35bc693963d
            TLS Off
            TLS.Verify Off
kind: ConfigMap
metadata:
  creationTimestamp: "2021-01-03T16:27:10Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:fluent-bit-overlay.yaml: {}
    manager: kubectl-create
    operation: Update
    time: "2021-01-03T16:27:10Z"
  name: fluent-bit-overlay
  namespace: tanzu-system-logging
  resourceVersion: "402569"
  selfLink: /api/v1/namespaces/tanzu-system-logging/configmaps/fluent-bit-overlay
  uid: b6f4c25c-e8bb-458e-b55d-927bed43428c
Before we can resume with the steps noted at Deploy the Fluent Bit Extension, we need to modify the fluent-bit-extension.yaml file so that it will use this newly created configmap. Below are the contents of this file with new/changed sections called out:
cat tkg-extensions-v1.2.0+vmware.1/extensions/logging/fluent-bit/fluent-bit-extension.yaml
apiVersion: clusters.tmc.cloud.vmware.com/v1alpha1
kind: Extension
metadata:
  name: fluent-bit
  namespace: tanzu-system-logging
  annotations:
    tmc.cloud.vmware.com/managed: "false"
spec:
  description: fluent-bit
  version: "v1.5.3_vmware.1"
  name: fluent-bit
  namespace: tanzu-system-logging
  deploymentStrategy:
    type: KUBERNETES_NATIVE
  objects: |
    apiVersion: kappctrl.k14s.io/v1alpha1
    kind: App
    metadata:
      name: fluent-bit
      annotations:
        tmc.cloud.vmware.com/orphan-resource: "true"
    spec:
      syncPeriod: 5m
      serviceAccountName: fluent-bit-extension-sa
      fetch:
        - image:
            url: registry.tkg.vmware.run/tkg-extensions-templates:v1.2.0_vmware.1
        - inline: # this is the start of a new section
            pathsFrom:
              - configMapRef:
                  name: fluent-bit-overlay # this is the end of the new section
      template:
        - ytt:
            ignoreUnknownComments: true
            paths:
              - 0/tkg-extensions/common # this line is changed, "0/" was added to the beginning of it
              - 0/tkg-extensions/logging/fluent-bit # this line is changed, "0/" was added to the beginning of it
              - 1/fluent-bit-overlay.yaml # this is a new line
            inline:
              pathsFrom:
                - secretRef:
                    name: fluent-bit-data-values
      deploy:
        - kapp:
            rawOptions: ["--wait-timeout=5m"]
To quickly summarize what happened in this file, we added a reference to the fluent-bit-overlay configmap that was created earlier as a second entry under fetch. We then modified the ytt paths so that the two original paths are called first (the 0 in front of them denotes the order and corresponds to the first fetch entry, the templates image) and the fluent-bit-overlay.yaml file is called second (the 1 in front of it corresponds to the second fetch entry, the fluent-bit-overlay configmap).
With this done, we can apply the fluent-bit-extension.yaml file as documented. We’ll see some normal activity in the logs for the fluent-bit pods:
[2021/01/03 16:27:45] [ info] inotify_fs_add(): inode=928549 watch_fd=24 name=/var/log/containers/antrea-agent-ml74s_kube-system_antrea-agent-689d754a1f0696f929a39ae858df8125ec82a54f26c2d34693b25cf7da33009b.log
[2021/01/03 16:27:46] [ info] inotify_fs_add(): inode=928360 watch_fd=25 name=/var/log/containers/kube-apiserver-tkg-cluster-control-plane-2p9mr_kube-system_kube-apiserver-eee20ab9d6dc22050ad78490b61ab34b31f921c089c2944817925a5254fbfec9.log
[2021/01/03 16:27:46] [ info] inotify_fs_add(): inode=929054 watch_fd=26 name=/var/log/containers/vsphere-csi-controller-66b875d646-m7mzs_vmware-system-csi_csi-attacher-ac08583d69abca7eb4c9b6b4854087d7b99185f0165fd840c95539748bc16928.log
[2021/01/03 16:27:46] [ info] inotify_fs_add(): inode=929249 watch_fd=27 name=/var/log/containers/vsphere-csi-controller-66b875d646-m7mzs_vmware-system-csi_csi-provisioner-9590d5364eaea3013938f7d9a6128ee1e27635882f028e1b89718d6d0003df64.log
[2021/01/03 16:27:46] [ info] inotify_fs_add(): inode=929227 watch_fd=28 name=/var/log/containers/vsphere-csi-controller-66b875d646-m7mzs_vmware-system-csi_csi-resizer-40716f949359aa51102bdc3a3b2e222dbb87b58e2c93b8ed88cbdfcf63345b1d.log
[2021/01/03 16:27:47] [ info] inotify_fs_add(): inode=928353 watch_fd=29 name=/var/log/containers/etcd-tkg-cluster-control-plane-2p9mr_kube-system_etcd-ab07059ea4bd235e08fa14df56b33ced3fa127c0e0d5bc92d51f78b9da5444b3.log
[2021/01/03 16:27:47] [ info] inotify_fs_add(): inode=928960 watch_fd=30 name=/var/log/containers/kube-controller-manager-tkg-cluster-control-plane-2p9mr_kube-system_kube-controller-manager-0cba068f85fafa8900fe51a7cb8ba844bf1630547e6578b58df30f27ab923861.log
By comparison, the following is seen when the default fluent-bit configuration is used against a Splunk instance with TLS disabled:
[2021/01/02 23:47:08] [error] [io_tls] flb_io_tls.c:356 SSL - An invalid SSL record was received
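A quick way to tell the two situations apart is to count the TLS errors in a pod’s log output. The helper below is a small sketch with a made-up name, count_tls_errors, and it only matches the "[error] [io_tls]" prefix seen in the line above.

```shell
# Hypothetical helper: reads fluent-bit log text on stdin and prints how
# many lines carry the "[error] [io_tls]" prefix. grep -c exits non-zero
# when the count is 0, so "|| true" keeps the function's status clean.
count_tls_errors() {
  grep -c '\[error\] \[io_tls\]' || true
}
```

It could be used as kubectl -n tanzu-system-logging logs POD_NAME | count_tls_errors, where POD_NAME is a placeholder for one of the fluent-bit pods. A result of 0 after the overlay is applied suggests the TLS mismatch is gone.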
And lastly, we’ll see data coming from our TKG cluster into Splunk.
All things considered, this was a relatively simple change to make to the fluent-bit configuration, but the same process could be used with any of the extensions included with TKG.