After Ubuntu update from Trusty to Xenial, disk filled up by rsyslog log due to wrong logrotate config

Written by - 0 comments

Published on - Listed in Linux


Our monitoring informed me about a full filesystem on a Ubuntu server which was recently (four days ago) upgraded from 14.04 Trusty to 16.04 Xenial.

root@xenial:/# df -h /
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/xvda1     ext4   30G   29G     0 100% /

What is using up the space?

Obviously the first question in that matter.  I usually suspect large log or tmp files but checking with du nothing actually added up to the 29GB used disk space shown in df before:

root@xenial:/# du -ksh /*
14M    /bin
67M    /boot
0    /dev
9.3M    /etc
239M    /home
0    /initrd.img
0    /initrd.img.old
169M    /lib
4.0K    /lib64
16K    /lost+found
4.0K    /media
4.0K    /mnt
4.0K    /opt
du: cannot access '/proc/9279': No such file or directory
du: cannot access '/proc/9284/task/9284/fd/4': No such file or directory
du: cannot access '/proc/9284/task/9284/fdinfo/4': No such file or directory
du: cannot access '/proc/9284/fd/4': No such file or directory
du: cannot access '/proc/9284/fdinfo/4': No such file or directory
0    /proc
84M    /root
41M    /run
9.9M    /sbin
4.0K    /srv
0    /sys
68K    /tmp
2.4G    /usr
1.6G    /var
0    /vmlinuz
0    /vmlinuz.old

I've come across such problems in the past (for example see Filesystem full because of Filebeat still hanging on to rotate log) so I suspected some deleted files still blocking an inode:

root@xenial:/# lsof | grep -i deleted
check-ici  1169             root    0u      CHR              136,0         0t0          3 /dev/pts/0 (deleted)
check-ici  1169             root    1u      CHR              136,0         0t0          3 /dev/pts/0 (deleted)
check-ici  1169             root    2u      CHR              136,0         0t0          3 /dev/pts/0 (deleted)
sleep      8127             root    0u      CHR              136,0         0t0          3 /dev/pts/0 (deleted)
sleep      8127             root    1u      CHR              136,0         0t0          3 /dev/pts/0 (deleted)
sleep      8127             root    2u      CHR              136,0         0t0          3 /dev/pts/0 (deleted)
rsyslogd  13726           syslog   13w      REG              202,1 25379045376     136093 /var/log/syslog.1 (deleted)
in:imuxso 13726 13728     syslog   13w      REG              202,1 25379045376     136093 /var/log/syslog.1 (deleted)
in:imklog 13726 13729     syslog   13w      REG              202,1 25379045376     136093 /var/log/syslog.1 (deleted)
in:imudp  13726 13730     syslog   13w      REG              202,1 25379045376     136093 /var/log/syslog.1 (deleted)
in:imtcp  13726 13731     syslog   13w      REG              202,1 25379045376     136093 /var/log/syslog.1 (deleted)
rs:main   13726 13732     syslog   13w      REG              202,1 25379045376     136093 /var/log/syslog.1 (deleted)
in:imtcp  13726 13733     syslog   13w      REG              202,1 25379045376     136093 /var/log/syslog.1 (deleted)
in:imtcp  13726 13734     syslog   13w      REG              202,1 25379045376     136093 /var/log/syslog.1 (deleted)
in:imtcp  13726 13735     syslog   13w      REG              202,1 25379045376     136093 /var/log/syslog.1 (deleted)
in:imtcp  13726 13736     syslog   13w      REG              202,1 25379045376     136093 /var/log/syslog.1 (deleted)

Indeed, /var/log/syslog.1 was deleted, but a process (obviously rsyslogd) is still keeping the inode busy. When this happens, the filesystem cannot reclaim the now available space.

Don't you use logrotate?

Of course! There's logrotate in place, which should exactly manage this situation. Let's check out the logrotate config for rsyslog:

root@xenial:/# cat /etc/logrotate.d/rsyslog
/var/log/syslog
{
    rotate 7
    daily
    missingok
    notifempty
    #delaycompress
    compress
    postrotate
        reload rsyslog >/dev/null 2>&1 || true
    endscript
}
[...]

Interesting... logrotate is supposed to run reload rsyslog right after rotating a logfile. Does this command actually work?

root@xenial:/# reload rsyslog
reload: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused

Of course not. Because upstart was replaced by systemd when Ubuntu was updated from 14.04 to 16.04 four days ago. It looks like the logrotate config was not adjusted when the package was updated. According to dpkg, the responsible package is rsyslog itself:

root@xenial:/# dpkg -S /etc/logrotate.d/rsyslog
rsyslog: /etc/logrotate.d/rsyslog

Rsyslog is controlled by Systemd now

In good old SysVinit times, I would have used /etc/init.d/rsyslog reload, but these times are over (*shedding a tear*). Unfortunately systemd's service definition for rsyslog does not allow a reload:

root@xenial:/# systemctl reload rsyslog
Failed to reload rsyslog.service: Job type reload is not applicable for unit rsyslog.service.
See system logs and 'systemctl status rsyslog.service' for details.

There is no ExecReload command defined in the rsyslog service unit file:

root@xenial:/# cat /lib/systemd/system/rsyslog.service
[Unit]
Description=System Logging Service
Requires=syslog.socket
Documentation=man:rsyslogd(8)
Documentation=http://www.rsyslog.com/doc/

[Service]
Type=notify
ExecStart=/usr/sbin/rsyslogd -n
StandardOutput=null
Restart=on-failure

[Install]
WantedBy=multi-user.target
Alias=syslog.service

Let's try a restart in this case. Sure, not as pretty as a reload but this also releases the process from holding on to deleted files:

root@xenial:/# systemctl restart rsyslog
root@xenial:/# systemctl status rsyslog
? rsyslog.service - System Logging Service
   Loaded: loaded (/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2019-05-31 10:37:26 CEST; 7s ago
     Docs: man:rsyslogd(8)
           http://www.rsyslog.com/doc/
 Main PID: 12528 (rsyslogd)
   CGroup: /system.slice/rsyslog.service
           ??12528 /usr/sbin/rsyslogd -n

May 31 10:37:26 xenial.nzzmg.ch systemd[1]: Stopped System Logging Service.
May 31 10:37:26 xenial.nzzmg.ch systemd[1]: Starting System Logging Service...
May 31 10:37:26 xenial.nzzmg.ch systemd[1]: Started System Logging Service.

What about the file system?

root@xenial:/# df -h /
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/xvda1     ext4   30G  4.5G   24G  16% /

And we're back to the usage as it should be.

Adapting the rsyslog logrotate config for systemd

At the end, the logrotate config for rsyslog needs to be adjusted and run systemctl restart rsyslog as postrotate command:

root@xenial:/# cat /etc/logrotate.d/rsyslog
/var/log/syslog
{
    rotate 7
    daily
    missingok
    notifempty
    #delaycompress
    compress
    postrotate
        systemctl restart rsyslog >/dev/null 2>&1 || true
    endscript
}
[...]

I'm still quite surprised that this file (/etc/logrotate.d/rsyslog) was not replaced by the update...


Add a comment

Show form to leave a comment

Comments (newest first)

No comments yet.

RSS feed

Blog Tags:

  AWS   Android   Ansible   Apache   Apple   Atlassian   BSD   Backup   Bash   Bluecoat   CMS   Chef   Cloud   Coding   Consul   Containers   CouchDB   DB   DNS   Database   Databases   Docker   ELK   Elasticsearch   Filebeat   FreeBSD   Galera   Git   GlusterFS   Grafana   Graphics   HAProxy   HTML   Hacks   Hardware   Icinga   Influx   Internet   Java   KVM   Kibana   Kodi   Kubernetes   LVM   LXC   Linux   Logstash   Mac   Macintosh   Mail   MariaDB   Minio   MongoDB   Monitoring   Multimedia   MySQL   NFS   Nagios   Network   Nginx   OSSEC   OTRS   Office   PGSQL   PHP   Perl   Personal   PostgreSQL   Postgres   PowerDNS   Proxmox   Proxy   Python   Rancher   Rant   Redis   Roundcube   SSL   Samba   Seafile   Security   Shell   SmartOS   Solaris   Surveillance   Systemd   TLS   Tomcat   Ubuntu   Unix   VMWare   VMware   Varnish   Virtualization   Windows   Wireless   Wordpress   Wyse   ZFS   Zoneminder