In the past few days, our Kubernetes monitoring, using the monitoring plugin check_rancher2, kept informing us about a particular node running into a Disk Pressure situation, seemingly every couple of hours. When we lined up the most recent alerts, it was always the same cluster node running into this situation:
2021-12-15 20:02:28 CHECK_RANCHER2 CRITICAL - node3 in cluster c-fctn6 has Disk Pressure -
2021-12-15 10:02:28 CHECK_RANCHER2 CRITICAL - node3 in cluster c-fctn6 has Disk Pressure -
2021-12-15 04:02:28 CHECK_RANCHER2 CRITICAL - node3 in cluster c-fctn6 has Disk Pressure -
2021-12-14 16:02:28 CHECK_RANCHER2 CRITICAL - node3 in cluster c-fctn6 has Disk Pressure -
2021-12-14 12:02:28 CHECK_RANCHER2 CRITICAL - node3 in cluster c-fctn6 has Disk Pressure -
2021-12-14 04:02:28 CHECK_RANCHER2 CRITICAL - node3 in cluster c-fctn6 has Disk Pressure -
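To cross-check such an alert independently of the monitoring plugin, the node conditions can also be queried directly from the cluster. A minimal sketch, assuming kubectl access with the cluster's kubeconfig and using the node name from the alerts above:

kubectl describe node node3 | grep -A 8 "Conditions:"

The Conditions block lists DiskPressure along with MemoryPressure and PIDPressure, so a True value there confirms what the monitoring reported.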
But whenever we looked at the node, everything seemed in order.
The primary focus was of course on /var/lib/docker, a dedicated partition for the containers managed by Kubernetes. Yet nothing indicated that this partition was anywhere close to the 85% (default) threshold Kubernetes uses internally (for both file system and inode usage):
A pretty steady line of usage... A closer look was required: what else would actually trigger a Disk Pressure situation? Was it a false positive?
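Besides the graphs, the file system and inode usage of the dedicated /var/lib/docker partition can also be verified manually on the node, for example:

root@node3:~# df -h /var/lib/docker
root@node3:~# df -i /var/lib/docker

In our case both values were far away from any critical threshold, matching the steady graph above.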
Looking into the kubelet container logs on that particular node indeed showed a Disk Pressure condition, and the eviction manager kicked in:
root@node3:~# docker logs --tail 200 --follow kubelet
[...]
W1215 18:59:26.580229 1421 eviction_manager.go:335] eviction manager: attempting to reclaim ephemeral-storage
I1215 18:59:26.580324 1421 eviction_manager.go:346] eviction manager: must evict pod(s) to reclaim ephemeral-storage
I1215 18:59:26.581969 1421 eviction_manager.go:364] eviction manager: pods ranked for eviction: prometheus-cluster-monitoring-0_cattle-prometheus(5813f8d2-f2aa-4977-a5fe-9ea01c3bc26a), [...]
I1215 18:59:26.582314 1421 kuberuntime_container.go:600] Killing container "docker://24aa11c82131a170c906c7fed3d56d8acbfab010990e7d4298f7bf05d75e9e9f" with 0 second grace period
I1215 18:59:26.582365 1421 kuberuntime_container.go:600] Killing container "docker://fcf91f17d6c46f96092e75827a2145c5ed7b3b6bc07dc55d0e2425578ed33920" with 0 second grace period
I1215 18:59:26.582316 1421 kuberuntime_container.go:600] Killing container "docker://e28cd5ebd8d3d071c75412374d96db9b4af3f3e49349c59b2f6ee95303072b99" with 0 second grace period
I1215 18:59:26.582367 1421 kuberuntime_container.go:600] Killing container "docker://6f8c60182ff157c9a26f578bbe3548805daefd6af6732a7031935030114e31c7" with 0 second grace period
I1215 18:59:26.582427 1421 kuberuntime_container.go:600] Killing container "docker://85dd62619735870aa378ab864edc65b935fe809fd1fff7fda3c5f0cb14ccc9a5" with 0 second grace period
I1215 18:59:26.678176 1421 kubelet.go:1925] SyncLoop (DELETE, "api"): "prometheus-cluster-monitoring-0_cattle-prometheus(5813f8d2-f2aa-4977-a5fe-9ea01c3bc26a)"
I1215 18:59:26.683967 1421 kubelet.go:1919] SyncLoop (REMOVE, "api"): "prometheus-cluster-monitoring-0_cattle-prometheus(5813f8d2-f2aa-4977-a5fe-9ea01c3bc26a)"
I1215 18:59:26.684109 1421 kubelet_pods.go:1125] Killing unwanted pod "prometheus-cluster-monitoring-0"
I1215 18:59:26.684170 1421 kuberuntime_container.go:600] Killing container "docker://24aa11c82131a170c906c7fed3d56d8acbfab010990e7d4298f7bf05d75e9e9f" with 2 second grace period
I1215 18:59:26.684217 1421 kuberuntime_container.go:600] Killing container "docker://e28cd5ebd8d3d071c75412374d96db9b4af3f3e49349c59b2f6ee95303072b99" with 2 second grace period
I1215 18:59:26.684319 1421 kuberuntime_container.go:600] Killing container "docker://6f8c60182ff157c9a26f578bbe3548805daefd6af6732a7031935030114e31c7" with 2 second grace period
I1215 18:59:26.684220 1421 kuberuntime_container.go:600] Killing container "docker://85dd62619735870aa378ab864edc65b935fe809fd1fff7fda3c5f0cb14ccc9a5" with 2 second grace period
I1215 18:59:26.684367 1421 kuberuntime_container.go:600] Killing container "docker://fcf91f17d6c46f96092e75827a2145c5ed7b3b6bc07dc55d0e2425578ed33920" with 2 second grace period
2021-12-15 18:59:26.966 [WARNING][19230] k8s.go 444: WorkloadEndpoint does not exist in the datastore, moving forward with the clean up ContainerID="ba9e30a7ef98e5ab7ff0665d06ff407df62c3422b8c17f41fae392f7dd1a83f1" WorkloadEndpoint="node3-k8s-prometheus--cluster--monitoring--0-eth0"
2021-12-15 18:59:27.120 [WARNING][19320] k8s.go 444: WorkloadEndpoint does not exist in the datastore, moving forward with the clean up ContainerID="ba9e30a7ef98e5ab7ff0665d06ff407df62c3422b8c17f41fae392f7dd1a83f1" WorkloadEndpoint="node3-k8s-prometheus--cluster--monitoring--0-eth0"
I1215 18:59:27.234210 1421 eviction_manager.go:571] eviction manager: pod prometheus-cluster-monitoring-0_cattle-prometheus(5813f8d2-f2aa-4977-a5fe-9ea01c3bc26a) is evicted successfully
I1215 18:59:27.234327 1421 eviction_manager.go:195] eviction manager: pods prometheus-cluster-monitoring-0_cattle-prometheus(5813f8d2-f2aa-4977-a5fe-9ea01c3bc26a) evicted, waiting for pod to be cleaned up
I1215 18:59:27.526904 1421 logs.go:312] Finish parsing log file "/var/lib/docker/containers/fcf91f17d6c46f96092e75827a2145c5ed7b3b6bc07dc55d0e2425578ed33920/fcf91f17d6c46f96092e75827a2145c5ed7b3b6bc07dc55d0e2425578ed33920-json.log"
I1215 18:59:27.537104 1421 logs.go:312] Finish parsing log file "/var/lib/docker/containers/6f8c60182ff157c9a26f578bbe3548805daefd6af6732a7031935030114e31c7/6f8c60182ff157c9a26f578bbe3548805daefd6af6732a7031935030114e31c7-json.log"
I1215 18:59:27.540311 1421 logs.go:312] Finish parsing log file "/var/lib/docker/containers/24aa11c82131a170c906c7fed3d56d8acbfab010990e7d4298f7bf05d75e9e9f/24aa11c82131a170c906c7fed3d56d8acbfab010990e7d4298f7bf05d75e9e9f-json.log"
W1215 18:59:27.540394 1421 pod_container_deletor.go:75] Container "ba9e30a7ef98e5ab7ff0665d06ff407df62c3422b8c17f41fae392f7dd1a83f1" not found in pod's containers
I1215 18:59:27.713434 1421 kubelet.go:1909] SyncLoop (ADD, "api"): "prometheus-cluster-monitoring-0_cattle-prometheus(e87f5532-ddf5-4dc6-aad6-bfc6a128911a)"
W1215 18:59:27.714554 1421 eviction_manager.go:164] Failed to admit pod prometheus-cluster-monitoring-0_cattle-prometheus(e87f5532-ddf5-4dc6-aad6-bfc6a128911a) - node has conditions: [DiskPressure]
I1215 18:59:27.759613 1421 kubelet.go:1925] SyncLoop (DELETE, "api"): "prometheus-cluster-monitoring-0_cattle-prometheus(e87f5532-ddf5-4dc6-aad6-bfc6a128911a)"
I1215 18:59:27.769752 1421 kubelet.go:1919] SyncLoop (REMOVE, "api"): "prometheus-cluster-monitoring-0_cattle-prometheus(e87f5532-ddf5-4dc6-aad6-bfc6a128911a)"
I1215 18:59:27.769822 1421 kubelet.go:2118] Failed to delete pod "prometheus-cluster-monitoring-0_cattle-prometheus(e87f5532-ddf5-4dc6-aad6-bfc6a128911a)", err: pod not found
I1215 18:59:27.787262 1421 status_manager.go:562] Pod "prometheus-cluster-monitoring-0_cattle-prometheus(ed418fc2-6fdd-4f40-8a32-da80c24fbce6)" was deleted and then recreated, skipping status update; old UID "5813f8d2-f2aa-4977-a5fe-9ea01c3bc26a", new UID "ed418fc2-6fdd-4f40-8a32-da80c24fbce6"
I1215 18:59:27.836826 1421 kubelet.go:1909] SyncLoop (ADD, "api"): "prometheus-cluster-monitoring-0_cattle-prometheus(ed418fc2-6fdd-4f40-8a32-da80c24fbce6)"
W1215 18:59:27.837007 1421 eviction_manager.go:164] Failed to admit pod prometheus-cluster-monitoring-0_cattle-prometheus(ed418fc2-6fdd-4f40-8a32-da80c24fbce6) - node has conditions: [DiskPressure]
[...]
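The same evictions also show up as Kubernetes events in the affected namespace, which is a quick way to confirm what happened without logging in to the node. A sketch, assuming kubectl access to the cluster (the namespace is taken from the log output above):

kubectl -n cattle-prometheus get events --sort-by=.lastTimestamp | grep -i evict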
By now we had enough evidence that the node did indeed run into a DiskPressure condition and that the node's kubelet container started evicting (deleting) pods via its eviction manager. But let's take a step back: what causes the DiskPressure condition in the first place? We have already described the reasons for Kubernetes pod eviction in a previous article, but this is an excellent (real life) example to bring it up again:
Node-pressure eviction can be triggered for different reasons. The most common one is that the Kubernetes node ran into a resource limit, defined by (pre-defined) eviction thresholds. These thresholds are based on the following signals:

- memory.available: the node's available memory
- nodefs.available and nodefs.inodesFree: free space and free inodes on the node's own filesystem, by default the one holding /var/lib/kubelet
- imagefs.available and imagefs.inodesFree: free space and free inodes on the filesystem the container runtime uses for images and writable container layers, in our case /var/lib/docker
- pid.available: the number of process IDs still available on the node

The thresholds actually configured on a node can be checked as shown right below this list.
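In a Rancher (RKE) cluster the kubelet runs as a Docker container on each node, so a quick way to see which eviction thresholds this particular kubelet was started with is to inspect that container. A minimal sketch, run directly on the node (the grep simply looks for any eviction-related flags):

root@node3:~# docker inspect kubelet | grep -i eviction

If nothing shows up, no custom eviction flags are set and the kubelet runs with the Kubernetes defaults.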
Did we just mention /var/lib/kubelet? Yes. We did.
Kubernetes not only checks the container runtime's filesystem (imagefs), which is usually located under /var/lib/docker. It also regularly checks the node's own filesystem (nodefs), typically located under /var/lib/kubelet. As mentioned before, we have a dedicated partition for /var/lib/docker. But /var/lib/kubelet actually resides on the node's /var partition:
root@node3:~# df -hT /var/lib/kubelet
Filesystem                 Type  Size  Used Avail Use% Mounted on
/dev/mapper/vgsystem-lvvar ext4  3.7G  2.6G  1.1G  71% /var
The current usage is at 71%, which is already rather high. Let's take a look at the historical disk usage graphs of this /var partition:
By comparing the disk usage drops with the monitoring alerts at the beginning of this post, we quickly see a match.
As this partition is not only used by the kubelet container but also holds other (system) log files under /var/log/, it is easy to see how it could cross an 85% disk usage threshold after a while. A quick LV extend and file system grow later, the problem was solved for good.
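For reference, a quick sketch of how such a situation can be inspected and fixed on the node. The du command shows which directories below /var are consuming the space; the lvextend call grows the logical volume seen in the df output above and resizes the ext4 file system in one go. The +2G increment is just an example value and assumes the volume group still has free extents:

root@node3:~# du -xhd1 /var | sort -h
root@node3:~# lvextend -r -L +2G /dev/vgsystem/lvvar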
If one of your Kubernetes nodes runs into a DiskPressure condition, verify both the nodefs (default: /var/lib/kubelet) and the imagefs (default: /var/lib/docker) filesystems. The /var/lib/kubelet part in particular is easily forgotten (it just happened to us, too!) and can cause some head-scratching.
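As a quick takeaway, both filesystems (and their inodes) can be checked in one go on a node, for example:

root@node3:~# df -h /var/lib/kubelet /var/lib/docker
root@node3:~# df -i /var/lib/kubelet /var/lib/docker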