Our monitoring informed me about a full filesystem on a Ubuntu server which was recently (four days ago) upgraded from 14.04 Trusty to 16.04 Xenial.
root@xenial:/# df -h /
Filesystem Type Size Used Avail Use% Mounted on
/dev/xvda1 ext4 30G 29G 0 100% /
Obviously the first question in that matter. I usually suspect large log or tmp files but checking with du nothing actually added up to the 29GB used disk space shown in df before:
root@xenial:/# du -ksh /*
14M /bin
67M /boot
0 /dev
9.3M /etc
239M /home
0 /initrd.img
0 /initrd.img.old
169M /lib
4.0K /lib64
16K /lost+found
4.0K /media
4.0K /mnt
4.0K /opt
du: cannot access '/proc/9279': No such file or directory
du: cannot access '/proc/9284/task/9284/fd/4': No such file or directory
du: cannot access '/proc/9284/task/9284/fdinfo/4': No such file or directory
du: cannot access '/proc/9284/fd/4': No such file or directory
du: cannot access '/proc/9284/fdinfo/4': No such file or directory
0 /proc
84M /root
41M /run
9.9M /sbin
4.0K /srv
0 /sys
68K /tmp
2.4G /usr
1.6G /var
0 /vmlinuz
0 /vmlinuz.old
I've come across such problems in the past (for example see Filesystem full because of Filebeat still hanging on to rotate log) so I suspected some deleted files still blocking an inode:
root@xenial:/# lsof | grep -i deleted
check-ici 1169 root 0u CHR 136,0 0t0 3 /dev/pts/0 (deleted)
check-ici 1169 root 1u CHR 136,0 0t0 3 /dev/pts/0 (deleted)
check-ici 1169 root 2u CHR 136,0 0t0 3 /dev/pts/0 (deleted)
sleep 8127 root 0u CHR 136,0 0t0 3 /dev/pts/0 (deleted)
sleep 8127 root 1u CHR 136,0 0t0 3 /dev/pts/0 (deleted)
sleep 8127 root 2u CHR 136,0 0t0 3 /dev/pts/0 (deleted)
rsyslogd 13726 syslog 13w REG 202,1 25379045376 136093 /var/log/syslog.1 (deleted)
in:imuxso 13726 13728 syslog 13w REG 202,1 25379045376 136093 /var/log/syslog.1 (deleted)
in:imklog 13726 13729 syslog 13w REG 202,1 25379045376 136093 /var/log/syslog.1 (deleted)
in:imudp 13726 13730 syslog 13w REG 202,1 25379045376 136093 /var/log/syslog.1 (deleted)
in:imtcp 13726 13731 syslog 13w REG 202,1 25379045376 136093 /var/log/syslog.1 (deleted)
rs:main 13726 13732 syslog 13w REG 202,1 25379045376 136093 /var/log/syslog.1 (deleted)
in:imtcp 13726 13733 syslog 13w REG 202,1 25379045376 136093 /var/log/syslog.1 (deleted)
in:imtcp 13726 13734 syslog 13w REG 202,1 25379045376 136093 /var/log/syslog.1 (deleted)
in:imtcp 13726 13735 syslog 13w REG 202,1 25379045376 136093 /var/log/syslog.1 (deleted)
in:imtcp 13726 13736 syslog 13w REG 202,1 25379045376 136093 /var/log/syslog.1 (deleted)
Indeed, /var/log/syslog.1 was deleted, but a process (obviously rsyslogd) is still keeping the inode busy. When this happens, the filesystem cannot reclaim the now available space.
Of course! There's logrotate in place, which should exactly manage this situation. Let's check out the logrotate config for rsyslog:
root@xenial:/# cat /etc/logrotate.d/rsyslog
/var/log/syslog
{
rotate 7
daily
missingok
notifempty
#delaycompress
compress
postrotate
reload rsyslog >/dev/null 2>&1 || true
endscript
}
[...]
Interesting... logrotate is supposed to run reload rsyslog right after rotating a logfile. Does this command actually work?
root@xenial:/# reload rsyslog
reload: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused
Of course not. Because upstart was replaced by systemd when Ubuntu was updated from 14.04 to 16.04 four days ago. It looks like the logrotate config was not adjusted when the package was updated. According to dpkg, the responsible package is rsyslog itself:
root@xenial:/# dpkg -S /etc/logrotate.d/rsyslog
rsyslog: /etc/logrotate.d/rsyslog
In good old SysVinit times, I would have used /etc/init.d/rsyslog reload, but these times are over (*shedding a tear*). Unfortunately systemd's service definition for rsyslog does not allow a reload:
root@xenial:/# systemctl reload rsyslog
Failed to reload rsyslog.service: Job type reload is not applicable for unit rsyslog.service.
See system logs and 'systemctl status rsyslog.service' for details.
There is no ExecReload command defined in the rsyslog service unit file:
root@xenial:/# cat /lib/systemd/system/rsyslog.service
[Unit]
Description=System Logging Service
Requires=syslog.socket
Documentation=man:rsyslogd(8)
Documentation=http://www.rsyslog.com/doc/
[Service]
Type=notify
ExecStart=/usr/sbin/rsyslogd -n
StandardOutput=null
Restart=on-failure
[Install]
WantedBy=multi-user.target
Alias=syslog.service
Let's try a restart in this case. Sure, not as pretty as a reload but this also releases the process from holding on to deleted files:
root@xenial:/# systemctl restart rsyslog
root@xenial:/# systemctl status rsyslog
? rsyslog.service - System Logging Service
Loaded: loaded (/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2019-05-31 10:37:26 CEST; 7s ago
Docs: man:rsyslogd(8)
http://www.rsyslog.com/doc/
Main PID: 12528 (rsyslogd)
CGroup: /system.slice/rsyslog.service
??12528 /usr/sbin/rsyslogd -n
May 31 10:37:26 xenial.nzzmg.ch systemd[1]: Stopped System Logging Service.
May 31 10:37:26 xenial.nzzmg.ch systemd[1]: Starting System Logging Service...
May 31 10:37:26 xenial.nzzmg.ch systemd[1]: Started System Logging Service.
What about the file system?
root@xenial:/# df -h /
Filesystem Type Size Used Avail Use% Mounted on
/dev/xvda1 ext4 30G 4.5G 24G 16% /
And we're back to the usage as it should be.
At the end, the logrotate config for rsyslog needs to be adjusted and run systemctl restart rsyslog as postrotate command:
root@xenial:/# cat /etc/logrotate.d/rsyslog
/var/log/syslog
{
rotate 7
daily
missingok
notifempty
#delaycompress
compress
postrotate
systemctl restart rsyslog >/dev/null 2>&1 || true
endscript
}
[...]
I'm still quite surprised that this file (/etc/logrotate.d/rsyslog) was not replaced by the update...
No comments yet.
AWS Android Ansible Apache Apple Atlassian BSD Backup Bash Bluecoat CMS Chef Cloud Coding Consul Containers CouchDB DB DNS Database Databases Docker ELK Elasticsearch Filebeat FreeBSD Galera Git GlusterFS Grafana Graphics HAProxy HTML Hacks Hardware Icinga Influx Internet Java KVM Kibana Kodi Kubernetes LVM LXC Linux Logstash Mac Macintosh Mail MariaDB Minio MongoDB Monitoring Multimedia MySQL NFS Nagios Network Nginx OSSEC OTRS Office PGSQL PHP Perl Personal PostgreSQL Postgres PowerDNS Proxmox Proxy Python Rancher Rant Redis Roundcube SSL Samba Seafile Security Shell SmartOS Solaris Surveillance Systemd TLS Tomcat Ubuntu Unix VMWare VMware Varnish Virtualization Windows Wireless Wordpress Wyse ZFS Zoneminder