In the past few days, our Kubernetes monitoring, using the monitoring plugin check_rancher2, kept informing us about a particular node running into a Disk Pressure situation, seemingly every couple of hours. When we lined up the most recent alerts, it was always the same cluster node running into this situation:
2021-12-15 20:02:28 CHECK_RANCHER2 CRITICAL - node3 in cluster c-fctn6 has Disk Pressure -
2021-12-15 10:02:28 CHECK_RANCHER2 CRITICAL - node3 in cluster c-fctn6 has Disk Pressure -
2021-12-15 04:02:28 CHECK_RANCHER2 CRITICAL - node3 in cluster c-fctn6 has Disk Pressure -
2021-12-14 16:02:28 CHECK_RANCHER2 CRITICAL - node3 in cluster c-fctn6 has Disk Pressure -
2021-12-14 12:02:28 CHECK_RANCHER2 CRITICAL - node3 in cluster c-fctn6 has Disk Pressure -
2021-12-14 04:02:28 CHECK_RANCHER2 CRITICAL - node3 in cluster c-fctn6 has Disk Pressure -
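To cross-check such an alert independently of the monitoring plugin, the node conditions can also be queried directly from the cluster. A minimal sketch, assuming kubectl access with the cluster's kubeconfig and using the node name from the alerts above:

kubectl describe node node3 | grep -A 8 "Conditions:"

The Conditions block lists DiskPressure along with MemoryPressure and PIDPressure, so a True value there confirms what the monitoring reported.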
But whenever we looked at the node, everything seemed in order.
The primary focus was of course on /var/lib/docker, a dedicated partition for the containers managed by Kubernetes. Yet nothing indicated that this partition was anywhere close to the 85% (default) threshold Kubernetes uses internally (for both file system and inode usage):
A pretty steady line of usage... A closer look was required: what else would actually trigger a Disk Pressure situation? Was it a false positive?
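Besides the graphs, the file system and inode usage of the dedicated /var/lib/docker partition can also be verified manually on the node, for example:

root@node3:~# df -h /var/lib/docker
root@node3:~# df -i /var/lib/docker

In our case both values were far away from any critical threshold, matching the steady graph above.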
Looking into the kubelet container logs on that particular node indeed showed a Disk Pressure condition, and the eviction manager kicked in:
root@node3:~# docker logs --tail 200 --follow kubelet
[...]
W1215 18:59:26.580229 1421 eviction_manager.go:335] eviction manager: attempting to reclaim ephemeral-storage
I1215 18:59:26.580324 1421 eviction_manager.go:346] eviction manager: must evict pod(s) to reclaim ephemeral-storage
I1215 18:59:26.581969 1421 eviction_manager.go:364] eviction manager: pods ranked for eviction: prometheus-cluster-monitoring-0_cattle-prometheus(5813f8d2-f2aa-4977-a5fe-9ea01c3bc26a), [...]
I1215 18:59:26.582314 1421 kuberuntime_container.go:600] Killing container "docker://24aa11c82131a170c906c7fed3d56d8acbfab010990e7d4298f7bf05d75e9e9f" with 0 second grace period
I1215 18:59:26.582365 1421 kuberuntime_container.go:600] Killing container "docker://fcf91f17d6c46f96092e75827a2145c5ed7b3b6bc07dc55d0e2425578ed33920" with 0 second grace period
I1215 18:59:26.582316 1421 kuberuntime_container.go:600] Killing container "docker://e28cd5ebd8d3d071c75412374d96db9b4af3f3e49349c59b2f6ee95303072b99" with 0 second grace period
I1215 18:59:26.582367 1421 kuberuntime_container.go:600] Killing container "docker://6f8c60182ff157c9a26f578bbe3548805daefd6af6732a7031935030114e31c7" with 0 second grace period
I1215 18:59:26.582427 1421 kuberuntime_container.go:600] Killing container "docker://85dd62619735870aa378ab864edc65b935fe809fd1fff7fda3c5f0cb14ccc9a5" with 0 second grace period
I1215 18:59:26.678176 1421 kubelet.go:1925] SyncLoop (DELETE, "api"): "prometheus-cluster-monitoring-0_cattle-prometheus(5813f8d2-f2aa-4977-a5fe-9ea01c3bc26a)"
I1215 18:59:26.683967 1421 kubelet.go:1919] SyncLoop (REMOVE, "api"): "prometheus-cluster-monitoring-0_cattle-prometheus(5813f8d2-f2aa-4977-a5fe-9ea01c3bc26a)"
I1215 18:59:26.684109 1421 kubelet_pods.go:1125] Killing unwanted pod "prometheus-cluster-monitoring-0"
I1215 18:59:26.684170 1421 kuberuntime_container.go:600] Killing container "docker://24aa11c82131a170c906c7fed3d56d8acbfab010990e7d4298f7bf05d75e9e9f" with 2 second grace period
I1215 18:59:26.684217 1421 kuberuntime_container.go:600] Killing container "docker://e28cd5ebd8d3d071c75412374d96db9b4af3f3e49349c59b2f6ee95303072b99" with 2 second grace period
I1215 18:59:26.684319 1421 kuberuntime_container.go:600] Killing container "docker://6f8c60182ff157c9a26f578bbe3548805daefd6af6732a7031935030114e31c7" with 2 second grace period
I1215 18:59:26.684220 1421 kuberuntime_container.go:600] Killing container "docker://85dd62619735870aa378ab864edc65b935fe809fd1fff7fda3c5f0cb14ccc9a5" with 2 second grace period
I1215 18:59:26.684367 1421 kuberuntime_container.go:600] Killing container "docker://fcf91f17d6c46f96092e75827a2145c5ed7b3b6bc07dc55d0e2425578ed33920" with 2 second grace period
2021-12-15 18:59:26.966 [WARNING][19230] k8s.go 444: WorkloadEndpoint does not exist in the datastore, moving forward with the clean up ContainerID="ba9e30a7ef98e5ab7ff0665d06ff407df62c3422b8c17f41fae392f7dd1a83f1" WorkloadEndpoint="node3-k8s-prometheus--cluster--monitoring--0-eth0"
2021-12-15 18:59:27.120 [WARNING][19320] k8s.go 444: WorkloadEndpoint does not exist in the datastore, moving forward with the clean up ContainerID="ba9e30a7ef98e5ab7ff0665d06ff407df62c3422b8c17f41fae392f7dd1a83f1" WorkloadEndpoint="node3-k8s-prometheus--cluster--monitoring--0-eth0"
I1215 18:59:27.234210 1421 eviction_manager.go:571] eviction manager: pod prometheus-cluster-monitoring-0_cattle-prometheus(5813f8d2-f2aa-4977-a5fe-9ea01c3bc26a) is evicted successfully
I1215 18:59:27.234327 1421 eviction_manager.go:195] eviction manager: pods prometheus-cluster-monitoring-0_cattle-prometheus(5813f8d2-f2aa-4977-a5fe-9ea01c3bc26a) evicted, waiting for pod to be cleaned up
I1215 18:59:27.526904 1421 logs.go:312] Finish parsing log file "/var/lib/docker/containers/fcf91f17d6c46f96092e75827a2145c5ed7b3b6bc07dc55d0e2425578ed33920/fcf91f17d6c46f96092e75827a2145c5ed7b3b6bc07dc55d0e2425578ed33920-json.log"
I1215 18:59:27.537104 1421 logs.go:312] Finish parsing log file "/var/lib/docker/containers/6f8c60182ff157c9a26f578bbe3548805daefd6af6732a7031935030114e31c7/6f8c60182ff157c9a26f578bbe3548805daefd6af6732a7031935030114e31c7-json.log"
I1215 18:59:27.540311 1421 logs.go:312] Finish parsing log file "/var/lib/docker/containers/24aa11c82131a170c906c7fed3d56d8acbfab010990e7d4298f7bf05d75e9e9f/24aa11c82131a170c906c7fed3d56d8acbfab010990e7d4298f7bf05d75e9e9f-json.log"
W1215 18:59:27.540394 1421 pod_container_deletor.go:75] Container "ba9e30a7ef98e5ab7ff0665d06ff407df62c3422b8c17f41fae392f7dd1a83f1" not found in pod's containers
I1215 18:59:27.713434 1421 kubelet.go:1909] SyncLoop (ADD, "api"): "prometheus-cluster-monitoring-0_cattle-prometheus(e87f5532-ddf5-4dc6-aad6-bfc6a128911a)"
W1215 18:59:27.714554 1421 eviction_manager.go:164] Failed to admit pod prometheus-cluster-monitoring-0_cattle-prometheus(e87f5532-ddf5-4dc6-aad6-bfc6a128911a) - node has conditions: [DiskPressure]
I1215 18:59:27.759613 1421 kubelet.go:1925] SyncLoop (DELETE, "api"): "prometheus-cluster-monitoring-0_cattle-prometheus(e87f5532-ddf5-4dc6-aad6-bfc6a128911a)"
I1215 18:59:27.769752 1421 kubelet.go:1919] SyncLoop (REMOVE, "api"): "prometheus-cluster-monitoring-0_cattle-prometheus(e87f5532-ddf5-4dc6-aad6-bfc6a128911a)"
I1215 18:59:27.769822 1421 kubelet.go:2118] Failed to delete pod "prometheus-cluster-monitoring-0_cattle-prometheus(e87f5532-ddf5-4dc6-aad6-bfc6a128911a)", err: pod not found
I1215 18:59:27.787262 1421 status_manager.go:562] Pod "prometheus-cluster-monitoring-0_cattle-prometheus(ed418fc2-6fdd-4f40-8a32-da80c24fbce6)" was deleted and then recreated, skipping status update; old UID "5813f8d2-f2aa-4977-a5fe-9ea01c3bc26a", new UID "ed418fc2-6fdd-4f40-8a32-da80c24fbce6"
I1215 18:59:27.836826 1421 kubelet.go:1909] SyncLoop (ADD, "api"): "prometheus-cluster-monitoring-0_cattle-prometheus(ed418fc2-6fdd-4f40-8a32-da80c24fbce6)"
W1215 18:59:27.837007 1421 eviction_manager.go:164] Failed to admit pod prometheus-cluster-monitoring-0_cattle-prometheus(ed418fc2-6fdd-4f40-8a32-da80c24fbce6) - node has conditions: [DiskPressure]
[...]
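The same evictions also show up as Kubernetes events in the affected namespace, which is a quick way to confirm what happened without logging in to the node. A sketch, assuming kubectl access to the cluster (the namespace is taken from the log output above):

kubectl -n cattle-prometheus get events --sort-by=.lastTimestamp | grep -i evict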
By now we had enough evidence that the node did indeed run into a DiskPressure condition and that the node's kubelet container started evicting (deleting) pods via its eviction manager. But let's take a step back: what causes the DiskPressure condition in the first place? We have already described the reasons for Kubernetes pod eviction in a previous article, but this is an excellent (real life) example to bring it up again:
Node-pressure eviction can be triggered for different reasons. The most common one is that the Kubernetes node ran into a resource limit, defined by (pre-defined) eviction thresholds. These thresholds are based on the following signals:

- memory.available: the node's available memory
- nodefs.available and nodefs.inodesFree: free space and free inodes on the node's own filesystem, by default the one holding /var/lib/kubelet
- imagefs.available and imagefs.inodesFree: free space and free inodes on the filesystem the container runtime uses for images and writable container layers, in our case /var/lib/docker
- pid.available: the number of process IDs still available on the node

The thresholds actually configured on a node can be checked as shown right below this list.
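In a Rancher (RKE) cluster the kubelet runs as a Docker container on each node, so a quick way to see which eviction thresholds this particular kubelet was started with is to inspect that container. A minimal sketch, run directly on the node (the grep simply looks for any eviction-related flags):

root@node3:~# docker inspect kubelet | grep -i eviction

If nothing shows up, no custom eviction flags are set and the kubelet runs with the Kubernetes defaults.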
Did we just mention /var/lib/kubelet? Yes. We did.
Kubernetes not only checks the container runtime's filesystem (imagefs), which is usually located under /var/lib/docker. It also regularly checks the node's own filesystem (nodefs), typically located under /var/lib/kubelet. As mentioned before, we have a dedicated partition for /var/lib/docker. But /var/lib/kubelet actually resides on the node's /var partition:
root@node3:~# df -hT /var/lib/kubelet
Filesystem                 Type  Size  Used Avail Use% Mounted on
/dev/mapper/vgsystem-lvvar ext4  3.7G  2.6G  1.1G  71% /var
The current usage is at 71%, which is already rather high. Let's take a look at the historical disk usage graphs of this /var partition:
By comparing the disk usage drops with the monitoring alerts at the beginning of this post, we quickly see a match.
As this partition is not only used by the kubelet container but also holds other (system) log files under /var/log/, it is easy to see how it could cross an 85% disk usage threshold after a while. A quick LV extend and file system grow later, the problem was solved for good.
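For reference, a quick sketch of how such a situation can be inspected and fixed on the node. The du command shows which directories below /var are consuming the space; the lvextend call grows the logical volume seen in the df output above and resizes the ext4 file system in one go. The +2G increment is just an example value and assumes the volume group still has free extents:

root@node3:~# du -xhd1 /var | sort -h
root@node3:~# lvextend -r -L +2G /dev/vgsystem/lvvar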
If one of your Kubernetes nodes runs into a DiskPressure condition, verify both the nodefs (default: /var/lib/kubelet) and the imagefs (default: /var/lib/docker) filesystems. The /var/lib/kubelet part in particular is easily forgotten (it just happened to us, too!) and can cause some head-scratching.
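As a quick takeaway, both filesystems (and their inodes) can be checked in one go on a node, for example:

root@node3:~# df -h /var/lib/kubelet /var/lib/docker
root@node3:~# df -i /var/lib/kubelet /var/lib/docker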