How to delete (old) monitoring data in InfluxDB with WHERE condition

Written by - 0 comments

Published on - last updated on February 20th 2023 - Listed in Icinga Influx Database


In most of my monitoring setups I am using Icinga 2 as monitoring core and InfluxDB as metrics / time series database. Over the years a lot of data is accumulated. Almost every service on each host collects (performance) data and this uses disk space.

Here's an example of such a InfluxDB server with a pretty steady data growth:

Of course over time new servers come and old servers go. Whilst they are removed from Icinga, their data is still kept in InfluxDB. Time to do a clean up. Let's look at two possibilities.

DELETE FROM measurement WHERE condition

The more "manual" way to delete data is to use the DELETE FROM statement directly on the measurement (table). With a WHERE condition we can delete all data matching a specific hostname:

> use icinga
Using database icinga

> SELECT * FROM check_mssql WHERE "hostname" = 'mssqlserver001' LIMIT 10
name: check_mssql
time                crit hostname      max metric          min service                                 unit value   warn
----                ---- --------      --- ------          --- -------                                 ---- -----   ----
1542102730000000000      mssqlserver001     select              MSSQL sqlprod001 Page Life Expectancy        819     
1542102738000000000 5    mssqlserver001     connection_time     MSSQL sqlprod003 Connection                  0.02    1
1542102744000000000 5    mssqlserver001     connection_time     MSSQL sqlprod002 Connection                  0.01    1
1542102746000000000 5    mssqlserver001     connection_time     MSSQL sqlprod004 Connection                  0.04    1
1542102751000000000      mssqlserver001     select              MSSQL sqlprod004 Page Writes per Second      1698104
1542102757000000000      mssqlserver001     select              MSSQL sqlprod001 Page Writes per Second      5971021
1542102761000000000      mssqlserver001     select              MSSQL sqlprod001 Buffer Cache Hit            100     
1542102772000000000 5    mssqlserver001     connection_time     MSSQL sqlprod001 Connection                  0.02    1
1542102782000000000      mssqlserver001     select              MSSQL sqlprod004 Page Life Expectancy        819     
1542102785000000000      mssqlserver001     select              MSSQL sqlprod002 Page Writes per Second      2401   

> DELETE FROM check_mssql WHERE "hostname" =~ /^mssqlserver001$/

> SELECT * FROM check_mssql WHERE "hostname" = 'mssqlserver001' LIMIT 10
> [empty result]

This works fine, but involves running through each measurement (table). Wouldn't it be nice to discover where this particular hostname is in use across all measurements? Ha! This exists is called a SERIES!

DROP SERIES WHERE condition

By using SERIES, the same hostname can be found across all measurements inside the same database. The SHOW SERIES query nicely represents this. Let's take a look at another host (oldserver001):

> use icinga
Using database icinga

> SHOW SERIES WHERE "hostname" = 'oldserver001'
key
---
check_win_disk_usage,hostname=oldserver001,metric=bytes_read,service=Disk\ IO\ C
check_win_disk_usage,hostname=oldserver001,metric=bytes_read,service=Disk\ IO\ E
check_win_disk_usage,hostname=oldserver001,metric=bytes_write,service=Disk\ IO\ C
check_win_disk_usage,hostname=oldserver001,metric=bytes_write,service=Disk\ IO\ E
check_win_net_usage,hostname=oldserver001,metric=bytes_in,service=Network\ IO\ vmxnet3\ Ethernet\ Adapter
check_win_net_usage,hostname=oldserver001,metric=bytes_out,service=Network\ IO\ vmxnet3\ Ethernet\ Adapter
hostalive,hostname=oldserver001,metric=pl
hostalive,hostname=oldserver001,metric=rta
nscp,hostname=oldserver001,metric=uptime,service=Uptime
nscp_CPULOAD,hostname=oldserver001,metric=15\ min\ avg\ Load,service=CPU\ Usage
nscp_CPULOAD,hostname=oldserver001,metric=60\ min\ avg\ Load,service=CPU\ Usage
nscp_MEMUSE,hostname=oldserver001,metric=Memory\ usage,service=Memory\ Usage
nscp_USEDDISKSPACE,hostname=oldserver001,metric=C:\\ Used\ Space,service=Diskspace\ C
nscp_USEDDISKSPACE,hostname=oldserver001,metric=E:\\ Used\ Space,service=Diskspace\ E

The SHOW SERIES query can also be further limited to a specific measurement:

> SHOW SERIES FROM check_win_disk_usage WHERE "hostname" = 'oldserver001'
key
---
check_win_disk_usage,hostname=oldserver001,metric=bytes_read,service=Disk\ IO\ C
check_win_disk_usage,hostname=oldserver001,metric=bytes_read,service=Disk\ IO\ E
check_win_disk_usage,hostname=oldserver001,metric=bytes_write,service=Disk\ IO\ C
check_win_disk_usage,hostname=oldserver001,metric=bytes_write,service=Disk\ IO\ E

Let's drop all the series across all measurements used by this specific hostname:

> DROP SERIES WHERE "hostname" = 'oldserver001'

Are they gone now?

> SHOW SERIES WHERE "hostname" = 'oldserver001'
> [empty result]

Yep, series and the data related to the series is now gone.

Using a for loop to bulk delete series

By using a for loop in Bash, a list of hostnames can be parsed and series, related to these hostnames, are deleted. 

Example:

root@influx:~# for entry in oldserver001 oldserver002 oldserver055 ; do echo "drop series where hostname = '${entry}'" | influx -username influxuser -password secret -database icinga ; sleep 5; done

Of course you can also read the entries from a file, e.g.:

root@influx:~# for entry in $(cat /tmp/hoststodelete) ; do echo "drop series where hostname = '${entry}'" | influx -username influxuser -password secret -database icinga ; sleep 5; done

The key here is to use the Influx Query (IQL) inside the echo function and pipe it to influx itself. If you use authentication, the influx command needs to be appended with the relevant authentication information (see example above). The database selection (-database) is also a must.

Data deletion is not replicated

Watch out if you are using an InfluxDB subscription (replication): the data deletion (using DELETE or DROP SERIES) is not replicated to the slave server(s). You will have to run the same command on the slave server(s), too.

Temporary disk usage increase while deleting data

Make sure you have enough free disk space available before you launch the DELETE or DROP SERIES commands. The disk usage will temporarily increase during these operations and disk space will be freed once the commands are finished.

Temporary disk usage increase while running DROP SERIES on InfluxDB


Add a comment

Show form to leave a comment

Comments (newest first)

No comments yet.

RSS feed

Blog Tags:

  AWS   Android   Ansible   Apache   Apple   Atlassian   BSD   Backup   Bash   Bluecoat   CMS   Chef   Cloud   Coding   Consul   Containers   CouchDB   DB   DNS   Database   Databases   Docker   ELK   Elasticsearch   Filebeat   FreeBSD   Galera   Git   GlusterFS   Grafana   Graphics   HAProxy   HTML   Hacks   Hardware   Icinga   Influx   Internet   Java   KVM   Kibana   Kodi   Kubernetes   LVM   LXC   Linux   Logstash   Mac   Macintosh   Mail   MariaDB   Minio   MongoDB   Monitoring   Multimedia   MySQL   NFS   Nagios   Network   Nginx   OSSEC   OTRS   Observability   Office   OpenSearch   PGSQL   PHP   Perl   Personal   PostgreSQL   Postgres   PowerDNS   Proxmox   Proxy   Python   Rancher   Rant   Redis   Roundcube   SSL   Samba   Seafile   Security   Shell   SmartOS   Solaris   Surveillance   Systemd   TLS   Tomcat   Ubuntu   Unix   VMWare   VMware   Varnish   Virtualization   Windows   Wireless   Wordpress   Wyse   ZFS   Zoneminder