Over the past few days I experienced a lot of problems with newly installed CentOS 7 machines. Besides having to use a ridiculous installer and systemd (but I'll keep that for another post), I came across problems with the new default file system CentOS 7 has chosen for its installations: xfs.
A few months ago I had already become aware of xfs problems when an online resize didn't work as it should have - so I've been an xfs skeptic ever since.
Now a couple of days ago I had an application file system running full on two machines which are constantly being synchronized. Due to weird circumstances (call it internal communication issues before I took over the machines) both machines, although intended to be installed the same way, were set up with different file system types: machine1 running with xfs, machine2 running with ext4.
When the application, running on both machine1 and machine2, had an issue, there were a lot of inodes which needed to be cleared and the file system repaired. On machine2 (running ext4) this happened automatically: dmesg showed entries for orphaned inodes being cleared, and afterwards the system rebooted itself. Not great (because of the reboot), but 30 seconds later the machine was up and running again.
On machine1, however, the inodes were never freed, nor did dmesg or any log show issues with the file system. Instead the file system kept filling up until it was 100% used. Remember, machine1 and machine2 are synchronized on the application file system, so the usage should be exactly the same. Yet while ext4 showed a usage of around 40%, xfs was full at 100%.
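For reference, this is roughly how the two machines can be compared (a sketch; /app stands in for the actual application mount point):

# ext4 on machine2 logged the orphan cleanup in the kernel log
dmesg | grep -i orphan
# usage of the synchronized application file system, blocks and inodes
df -h /app
df -i /app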
I decided to run xfs_repair to "force" xfs to free the occupied inodes. So I removed machine1 from the load balancer, rebooted into grml and launched xfs_repair (first with the dry-run option -n):
root@grml ~ # xfs_repair -n /dev/vgapp/lvapp
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- scan filesystem freespace and inode maps...
agi unlinked bucket 0 is 863680 in ag 3 (inode=51195328)
agi unlinked bucket 1 is 3085889 in ag 3 (inode=53417537)
agi unlinked bucket 2 is 3085890 in ag 3 (inode=53417538)
agi unlinked bucket 3 is 3085891 in ag 3 (inode=53417539)
agi unlinked bucket 4 is 3719876 in ag 3 (inode=54051524)
[...]
The list went on and on... I was kind of "relieved" to see that xfs_repair had found some "unlinked" buckets, which means inodes that have been deleted (unlinked) but are still referenced, according to the xfs documentation:
The di_next_unlinked value in the inode is used to track inodes that have been unlinked (deleted) but which are still referenced.
[...]
The only time the unlinked fields can be seen to be used on disk is either on an active filesystem or a crashed system. A cleanly unmounted or recovered filesystem will not have any inodes in these unlink hash chains.
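These unlinked buckets can also be inspected without repairing anything: xfs_db can print the AGI header of an allocation group in read-only mode. A minimal sketch (the allocation group number 3 is just an example); any unlinked[] entry in the output that is not null is the head of such a chain of unlinked but still referenced inodes:

root@grml ~ # xfs_db -r -c "agi 3" -c "print" /dev/vgapp/lvapp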
So I ran xfs_repair without the dry-run option:
root@grml ~ # xfs_repair /dev/vgapp/lvapp
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- scan filesystem freespace and inode maps...
agi unlinked bucket 0 is 11247360 in ag 1 (inode=28024576)
agi unlinked bucket 1 is 2556417 in ag 1 (inode=19333633)
agi unlinked bucket 2 is 11247042 in ag 1 (inode=28024258)
agi unlinked bucket 3 is 11247043 in ag 1 (inode=28024259)
agi unlinked bucket 4 is 11980292 in ag 1 (inode=28757508)
agi unlinked bucket 5 is 11247365 in ag 1 (inode=28024581)
[...]
It ran through to the end and it seemed to me that the file system should now be working again, meaning that the occupied inodes had been freed. I rebooted into CentOS but found that the application file system was still 100% used.
Back in grml I started another xfs_repair:
root@grml ~ # xfs_repair /dev/vgapp/lvapp
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- process newly discovered inodes
Phase 4 - check for duplicated blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
All looks good, right? At least I thought so. Back in a booted CentOS, the file system was still 100% full.
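In hindsight, a quick plausibility check on the live system might have narrowed this down earlier: compare what the visible files actually occupy with what the file system reports, and look for deleted but still open files (again a sketch with /app as a placeholder; lsof +L1 only helps before a reboot, while the offending processes are still running):

du -sh /app     # space used by the visible files
df -h /app      # space the file system reports as used
lsof +L1        # open files with link count 0 (deleted but still held open)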
Besides the issue with the full file system, I also found that xfs is much slower at accessing the application files. In a cloned environment where both machines are in a healthy state, I timed ls -la.
Remember, the application file system is synchronized... The same size and the same number of files (4905) are in the application folder. Yet xfs is almost 3x slower:
Machine 1 with xfs:
[root@machine1 backup]# time ls -la
[...]
real 0m0.679s
user 0m0.526s
sys 0m0.137s
Machine 2 with ext4:
[root@machine2 backup]# time ls -la
[...]
real 0m0.240s
user 0m0.198s
sys 0m0.039s
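One caveat with such timings: a second ls run is usually served from the dentry/inode cache, so for a fair cold-cache comparison both machines should drop their caches first. A sketch (run as root; whether this changes the roughly 3x difference here is untested):

[root@machine1 backup]# sync
[root@machine1 backup]# echo 3 > /proc/sys/vm/drop_caches
[root@machine1 backup]# time ls -la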
After having already spent too much time on this and not seeing any reason (performance-wise) to continue using xfs, I finally decided to reinstall machine1 with ext4. Since then I have had no problems.
ck from Wil, Switzerland wrote on Oct 20th, 2015:
Good point, NoName. I didn't look into the lost+found directory. Maybe this would have explained the full disk space. But shouldn't xfs_repair discover all this?
NoName wrote on Oct 20th, 2015:
Did you look into the lost+found directory? I expect many files there with inode numbers as names. From the man page:
"Orphaned files and directories (allocated, in-use but unreferenced) are reconnected by placing them in the lost+found directory. The name assigned is the inode number."