I'm currently creating a script to automatically back up the databases of an InfluxDB instance. After the initial successful run of the backup script (which uses influxd backup in the background), I stumbled across the following errors when I ran the script a second time, and again on every run after that:
2019/03/14 09:48:56 Download shard 151 failed copy backup to file: err=<nil>
2019/03/14 09:48:58 Download shard 151 failed copy backup to file: err=
2019/03/14 09:49:00 Download shard 151 failed copy backup to file: err=
2019/03/14 09:49:02 Download shard 151 failed copy backup to file: err=
2019/03/14 09:49:04 Download shard 151 failed copy backup to file: err=
2019/03/14 09:49:06 Download shard 151 failed copy backup to file: err=
2019/03/14 09:49:08 Download shard 151 failed copy backup to file: err=
2019/03/14 09:49:11 Download shard 151 failed copy backup to file: err=
2019/03/14 09:49:23 Download shard 151 failed copy backup to file: err=
2019/03/14 09:50:06 Download shard 151 failed copy backup to file: err=
2019/03/14 09:52:51 backup failed: copy backup to file: err=
backup: copy backup to file: err=
The syslog entries from the influxd daemon revealed a little bit more:
Mar 14 09:49:34 inf-monix02-p influxd[5153]: ts=2019-03-14T08:49:34.924850Z lvl=info msg="Write failed" log_id=0BkKXKwW000 service=write shard=151 error="engine: error writing WAL entry: write /var/lib/influxdb/wal/icinga/autogen/151/_02519.wal: no space left on device"
Mar 14 09:49:35 inf-monix02-p influxd[5153]: ts=2019-03-14T08:49:35.703478Z lvl=info msg="Cache snapshot (start)" log_id=0BkKXKwW000 engine=tsm1 trace_id=0EAr7Rql000 op_name=tsm1_cache_snapshot op_event=start
Mar 14 09:49:35 inf-monix02-p influxd[5153]: ts=2019-03-14T08:49:35.703533Z lvl=info msg="Cache snapshot (end)" log_id=0BkKXKwW000 engine=tsm1 trace_id=0EAr7Rql000 op_name=tsm1_cache_snapshot op_event=end op_elapsed=0.068ms
Mar 14 09:49:35 inf-monix02-p influxd[5153]: ts=2019-03-14T08:49:35.703550Z lvl=info msg="Error writing snapshot" log_id=0BkKXKwW000 engine=tsm1 error="error opening new segment file for wal (1): write /var/lib/influxdb/wal/icinga/autogen/151/_02519.wal: no space left on device"
Mar 14 09:49:36 inf-monix02-p influxd[5153]: ts=2019-03-14T08:49:36.305281Z lvl=info msg="Write failed" log_id=0BkKXKwW000 service=write shard=151 error="engine: error writing WAL entry: write /var/lib/influxdb/wal/icinga/autogen/151/_02519.wal: no space left on device"
The interesting thing, however, is that there was still enough disk space available:
$ df -h /var/lib/influxdb/
Filesystem Type Size Used Avail Use% Mounted on
/dev/vglxc/inf-monix02-p ext4 50G 37G 13G 75% /
The initial size of that partition was 30GB and it was later increased to 50GB (this is an LXC container, so I was able to resize the root partition online). Maybe InfluxDB still had the original disk size in memory? Let's test this theory and restart InfluxDB:
# systemctl restart influxdb
And try the backup script again:
# ./backup-influxdb.sh
Clearing /backup
Thu Mar 14 09:57:32 CET 2019: Starting Dump of all databases
2019/03/14 09:57:32 backing up metastore to /backup/meta.00
2019/03/14 09:57:32 No database, retention policy or shard ID given. Full meta store backed up.
2019/03/14 09:57:32 Backing up all databases in portable format
2019/03/14 09:57:32 backing up db=
2019/03/14 09:57:32 backing up db=_internal rp=monitor shard=146 to /backup/_internal.monitor.00146.00 since 0001-01-01T00:00:00Z
[...]
2019/03/14 09:58:37 backing up db=icinga rp=autogen shard=133 to /backup/icinga.autogen.00133.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:41 backing up db=icinga rp=autogen shard=142 to /backup/icinga.autogen.00142.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:46 backing up db=icinga rp=autogen shard=151 to /backup/icinga.autogen.00151.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:48 backing up db=mtr rp=autogen shard=37 to /backup/mtr.autogen.00037.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:48 backing up db=mtr rp=autogen shard=44 to /backup/mtr.autogen.00044.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:48 backing up db=mtr rp=autogen shard=53 to /backup/mtr.autogen.00053.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=62 to /backup/mtr.autogen.00062.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=71 to /backup/mtr.autogen.00071.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=80 to /backup/mtr.autogen.00080.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=89 to /backup/mtr.autogen.00089.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=98 to /backup/mtr.autogen.00098.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=107 to /backup/mtr.autogen.00107.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=116 to /backup/mtr.autogen.00116.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=125 to /backup/mtr.autogen.00125.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:50 backing up db=mtr rp=autogen shard=134 to /backup/mtr.autogen.00134.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:50 backing up db=mtr rp=autogen shard=143 to /backup/mtr.autogen.00143.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:50 backing up db=mtr rp=autogen shard=152 to /backup/mtr.autogen.00152.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:50 backup complete:
2019/03/14 09:58:50 /backup/20190314T085732Z.meta
2019/03/14 09:58:50 /backup/20190314T085732Z.s146.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s147.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s148.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s149.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s150.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s153.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s154.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s155.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s2.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s10.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s18.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s26.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s34.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s43.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s52.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s61.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s70.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s79.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s88.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s97.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s106.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s115.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s124.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s133.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s142.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s151.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s37.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s44.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s53.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s62.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s71.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s80.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s89.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s98.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s107.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s116.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s125.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s134.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s143.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.s152.tar.gz
2019/03/14 09:58:50 /backup/20190314T085732Z.manifest
real 1m18.625s
user 8m43.804s
sys 0m48.936s
Thu Mar 14 09:58:50 CET 2019: Finished Dump of all databases
Thu Mar 14 09:58:50 CET 2019: Backup script finished.
This time it worked!
At the end of the backup the following disk space was used:
$ df -h /var/lib/influxdb/
Filesystem Type Size Used Avail Use% Mounted on
/dev/vglxc/inf-monix02-p ext4 50G 38G 13G 76% /
I ran the backup script a couple more times and the error mentioned at the beginning didn't show up anymore. So InfluxDB seems to read the disk capacity when the daemon starts and keep that value in memory. Unfortunately, I was not able to find proof for this theory, neither in the documentation nor by using "show stats" or "show diagnostics".
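To compare what the operating system itself reports before and after a resize or a daemon restart, a small helper like the following can be handy. It is only a sketch: the function names fs_avail_kb, fs_avail_inodes and fs_report are made up for this example, and the -i option assumes GNU df. It prints free inodes as well as free space, because "no space left on device" can be reported when either of the two runs out.

```shell
#!/bin/sh
# Sketch only: fs_avail_kb, fs_avail_inodes and fs_report are hypothetical
# helper names, not part of InfluxDB or of the backup script.

# Available space in 1024-byte blocks for the filesystem holding $1.
fs_avail_kb() {
    # -P forces the one-line-per-filesystem POSIX format, -k selects KiB units.
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Free inodes for the filesystem holding $1.
# Assumes GNU df (-i); some filesystems report "-" or 0 here.
fs_avail_inodes() {
    df -Pi "$1" | awk 'NR==2 {print $4}'
}

# Print both values on one line, prefixed with a timestamp.
fs_report() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') path=$1 avail_kb=$(fs_avail_kb "$1") free_inodes=$(fs_avail_inodes "$1")"
}

# Report on the path given as first argument, defaulting to the root filesystem.
fs_report "${1:-/}"
```

Calling fs_report /var/lib/influxdb before and after the restart would show whether the numbers seen by the OS changed at all. If they did not, that would at least point towards the daemon rather than the filesystem.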
The InfluxDB backup script can be found in my public scripts repository on GitHub.
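The script in the repository is the reference. Purely as an illustration of the flow visible in the output further up (clear the target directory, run a portable backup of all databases, log start and end), a minimal sketch could look like this. The function name run_backup and its directory argument are assumptions made for this example, and it expects influxd to be found in the PATH.

```shell
#!/bin/sh
# Minimal sketch, NOT the script from the repository.
# run_backup is a hypothetical function name; $1 is the backup target directory.

run_backup() {
    dir="$1"

    # Refuse to continue without an existing target directory.
    if [ ! -d "$dir" ]; then
        echo "Backup directory '$dir' does not exist" >&2
        return 1
    fi

    # Remove the previous backup; ${dir:?} aborts if dir is empty,
    # so this can never expand to "rm -rf /*".
    echo "Clearing $dir"
    rm -rf "${dir:?}"/*

    echo "$(date): Starting Dump of all databases"
    # -portable selects the portable backup format; without -database,
    # all databases are backed up.
    if ! influxd backup -portable "$dir"; then
        echo "$(date): Backup failed" >&2
        return 1
    fi
    echo "$(date): Finished Dump of all databases"
}
```

It would be called as run_backup /backup, for example from cron. Returning a non-zero status when influxd backup fails makes it possible to alert on errors like the one shown at the top of this post.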