
Fix three issues with the fluentd-gcp liveness probe:

h1. STUCK_THRESHOLD_SECONDS was overridden by LIVENESS_THRESHOLD_SECONDS if defined

Probably a copy/paste issue introduced in edf1ffc074.
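For context, the buggy defaulting line (it also appears in the transcript further down) and a minimal sketch of the fix, assuming each threshold should simply default from its own variable:

```sh
# Buggy: STUCK_THRESHOLD_SECONDS takes LIVENESS_THRESHOLD_SECONDS' value
# (or 900), so a STUCK_THRESHOLD_SECONDS set on the container is ignored.
STUCK_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-900};

# Sketch of the fix: default each variable from itself.
LIVENESS_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-300};
STUCK_THRESHOLD_SECONDS=${STUCK_THRESHOLD_SECONDS:-900};
```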
h1. `[[` is [a bashism](https://stackoverflow.com/a/47576482), and will always fail when called with `/bin/sh`

Introduced by a844523c20.
Given that we call the liveness probe with `/bin/sh`, we cannot use the double-bracketed `[[` syntax for tests, as it is not POSIX-compliant and will throw an error. Annoyingly, even though it prints an error, `sh` returns with exit code 0 in this case:

```bash
root@fluentd-7mprs:/# sh liveness.sh
liveness.sh: 8: liveness.sh: [[: not found
liveness.sh: 15: liveness.sh: [[: not found
root@fluentd-7mprs:/# echo $?
0
```

Which means the liveness probe is considered successful by Kubernetes, despite failing to test things as it was intended. This is also probably the reason why this bug wasn't reported sooner :)

Thankfully, the test in this case can just as easily be written as POSIX-compliant, as it doesn't use any bash-specific features within the `[[` block (see the sketch at the end of this description).

h1. Buffers are transient and cannot be relied upon for monitoring

Finally, after fixing the above issue, we started seeing the fluentd containers being restarted very often, and found an issue with the underlying logic of the liveness probe.

The probe checks that the pod is still alive by running the following command:

`find /var/log/fluentd-buffers -type f -newer /tmp/marker-stuck -print -quit`

This checks if any _regular_ file exists under `/var/log/fluentd-buffers` that is more recent than a predetermined time, and returns an empty string otherwise. The issue is that these buffers are temporary and volatile: they get created and deleted constantly. Here is an example of running that check every second on a running fluentd:

```
root@fluentd-eks-playground-jdc8m:/# LIVENESS_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-300};
root@fluentd-eks-playground-jdc8m:/# STUCK_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-900};
root@fluentd-eks-playground-jdc8m:/# touch -d "${STUCK_THRESHOLD_SECONDS} seconds ago" /tmp/marker-stuck;
root@fluentd-eks-playground-jdc8m:/# touch -d "${LIVENESS_THRESHOLD_SECONDS} seconds ago" /tmp/marker-liveness;
root@fluentd-eks-playground-jdc8m:/# while true; do date ; find /var/log/fluentd-buffers -type f -newer /tmp/marker-stuck -print -quit ; sleep 1 ; done
Fri Feb 22 10:52:57 UTC 2019
Fri Feb 22 10:52:58 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827964ccf4c7004103c3fa7c8533f85.log
Fri Feb 22 10:52:59 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827964ccf4c7004103c3fa7c8533f85.log
Fri Feb 22 10:53:00 UTC 2019
Fri Feb 22 10:53:01 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827964fb8b2eedcccd2763ea7775cc2.log
Fri Feb 22 10:53:02 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827964fb8b2eedcccd2763ea7775cc2.log
Fri Feb 22 10:53:03 UTC 2019
Fri Feb 22 10:53:04 UTC 2019
Fri Feb 22 10:53:05 UTC 2019
Fri Feb 22 10:53:06 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827965564883997b673d703af54848b.log
Fri Feb 22 10:53:07 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827965564883997b673d703af54848b.log
Fri Feb 22 10:53:08 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827965564883997b673d703af54848b.log
Fri Feb 22 10:53:09 UTC 2019
Fri Feb 22 10:53:10 UTC 2019
Fri Feb 22 10:53:11 UTC 2019
Fri Feb 22 10:53:12 UTC 2019
Fri Feb 22 10:53:13 UTC 2019
Fri Feb 22 10:53:14 UTC 2019
Fri Feb 22 10:53:15 UTC 2019
Fri Feb 22 10:53:16 UTC 2019
```

We can see buffers being created, then disappearing. The LivenessProbe running under these conditions has a ~50% chance of failing, despite fluentd being perfectly happy.
I believe that check is probably OK for fluentd installs that use large amounts of buffers, in which case the liveness probe will be correct more often than not, but fluentd installs that use buffering less intensively will be negatively impacted by this.

My solution to fix this is to check the last-updated time of the buffering _folders_ within `/var/log/fluentd-buffers`. These _do_ get updated when buffers are created, and do not get deleted as buffers are emptied, making them the perfect candidate for our use. Here's an example using `-type d` to match directories instead:

```
root@fluentd-eks-playground-jdc8m:/# while true; do date ; find /var/log/fluentd-buffers -type d -newer /tmp/marker-stuck -print -quit ; sleep 1 ; done
Fri Feb 22 10:57:51 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:52 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:53 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:54 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:55 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:56 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:57 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:58 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:59 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:58:00 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:58:01 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:58:02 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:58:03 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
```

And an example of the directory being updated as new buffers come in:

```
root@fluentd-eks-playground-jdc8m:/# ls -lah /var/log/fluentd-buffers/kubernetes.system.buffer
total 0
drwxr-xr-x 2 root root   6 Feb 22 11:17 .
drwxr-xr-x 3 root root  38 Feb 22 11:14 ..
root@fluentd-eks-playground-jdc8m:/# ls -lah /var/log/fluentd-buffers/kubernetes.system.buffer
total 16K
drwxr-xr-x 2 root root  224 Feb 22 11:18 .
drwxr-xr-x 3 root root   38 Feb 22 11:14 ..
-rw-r--r-- 1 root root 1.8K Feb 22 11:18 buffer.b58279be6e21e8b29fc333a7d50096ed0.log
-rw-r--r-- 1 root root  215 Feb 22 11:18 buffer.b58279be6e21e8b29fc333a7d50096ed0.log.meta
-rw-r--r-- 1 root root  429 Feb 22 11:18 buffer.b58279be6f09bdfe047a96486a525ece2.log
-rw-r--r-- 1 root root  195 Feb 22 11:18 buffer.b58279be6f09bdfe047a96486a525ece2.log.meta
root@fluentd-eks-playground-jdc8m:/# ls -lah /var/log/fluentd-buffers/kubernetes.system.buffer
total 0
drwxr-xr-x 2 root root   6 Feb 22 11:18 .
drwxr-xr-x 3 root root  38 Feb 22 11:14 ..
```
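Putting the three fixes together, here is a minimal sketch of what the probe's shell test could look like. The marker files, thresholds, and `find` invocations are taken from the transcripts above; the actual script shipped in the DaemonSet manifest may be laid out differently and may do extra cleanup in the stuck case.

```sh
#!/bin/sh
# Sketch only: a POSIX-compliant liveness check for fluentd-gcp.

# Fix 1: default each threshold from its own variable.
LIVENESS_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-300}
STUCK_THRESHOLD_SECONDS=${STUCK_THRESHOLD_SECONDS:-900}

touch -d "${STUCK_THRESHOLD_SECONDS} seconds ago" /tmp/marker-stuck
touch -d "${LIVENESS_THRESHOLD_SECONDS} seconds ago" /tmp/marker-liveness

# Fix 2: POSIX single-bracket tests instead of bash's [[.
# Fix 3: check buffer *directories* (-type d), which stay around and have
# their mtime bumped as buffers are written, instead of transient buffer files.
if [ -z "$(find /var/log/fluentd-buffers -type d -newer /tmp/marker-stuck -print -quit)" ]; then
  # Nothing has touched the buffer directories for STUCK_THRESHOLD_SECONDS:
  # consider fluentd stuck and fail the probe.
  exit 1
fi
if [ -z "$(find /var/log/fluentd-buffers -type d -newer /tmp/marker-liveness -print -quit)" ]; then
  # No activity within LIVENESS_THRESHOLD_SECONDS either: fail the probe.
  exit 1
fi
exit 0
```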
Stackdriver Logging Agent
==============
Stackdriver Logging Agent is a DaemonSet which spawns a pod on each node that reads logs generated by the kubelet, the container runtime, and containers, and sends them to Stackdriver. Once exported to Stackdriver, logs can be searched, viewed, and analyzed.
Learn more at: https://kubernetes.io/docs/tasks/debug-application-cluster/logging-stackdriver
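To quickly check that the agent is running on every node, you can list its pods. This assumes the `k8s-app=fluentd-gcp` label that the troubleshooting commands below also rely on:

```
$ kubectl get pods -n kube-system -l k8s-app=fluentd-gcp -o wide
```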
Troubleshooting
---------------
In Kubernetes clusters running version 1.10.0 or later, the fluentd-gcp DaemonSet can be manually scaled. This is useful e.g. when applications running in the cluster are sending a large volume of logs (i.e. over 100kB/s), causing fluentd-gcp to fail with OutOfMemory errors. Conversely, if the applications aren't generating a lot of logs, it may be useful to reduce the amount of resources consumed by fluentd-gcp, making these resources available to other applications. To learn more about Kubernetes resource requests and limits, see the official documentation (CPU, memory).

The amount of resources requested by fluentd-gcp on every node in the cluster can be fetched by running the following command:
```
$ kubectl get ds -n kube-system -l k8s-app=fluentd-gcp \
-o custom-columns=NAME:.metadata.name,\
CPU_REQUEST:.spec.template.spec.containers[].resources.requests.cpu,\
MEMORY_REQUEST:.spec.template.spec.containers[].resources.requests.memory,\
MEMORY_LIMIT:.spec.template.spec.containers[].resources.limits.memory
```
This will display an output similar to the following:
```
NAME                  CPU_REQUEST   MEMORY_REQUEST   MEMORY_LIMIT
fluentd-gcp-v2.0.15   100m          200Mi            300Mi
```
In order to change those values, a ScalingPolicy needs to be defined. Currently, only base values are supported (no automatic scaling). The ScalingPolicy can be created using kubectl, e.g. to set the CPU request to 101m, the memory request to 150Mi, and the memory limit to 400Mi:
```
$ cat <<EOF | kubectl apply -f -
apiVersion: scalingpolicy.kope.io/v1alpha1
kind: ScalingPolicy
metadata:
  name: fluentd-gcp-scaling-policy
  namespace: kube-system
spec:
  containers:
  - name: fluentd-gcp
    resources:
      requests:
      - resource: cpu
        base: 101m
      - resource: memory
        base: 150Mi
      limits:
      - resource: memory
        base: 400Mi
EOF
```
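To verify that the policy was created, it can be listed by its fully-qualified resource name (the same name used by the delete command below):

```
$ kubectl get scalingpolicies.scalingpolicy.kope.io -n kube-system
```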
To remove the override and go back to GKE-provided defaults, it is enough to just remove the ScalingPolicy:
```
$ kubectl delete -n kube-system scalingpolicies.scalingpolicy.kope.io/fluentd-gcp-scaling-policy
```