Creating an InfluxDB asynchronous replication using subscription service

Written by - 1 comments

Published on - last updated on February 20th 2023 - Listed in Database Influx Monitoring Icinga


Today a full disk corrupted my InfluxDB 0.8 metrics database for Icinga2 monitoring and I was unable to recover the data.
I found quite some issues with the same errors I found in the logs but all of them were fixed in more recent versions. Luckily this database is not in production yet so this is probably (and forcibly) the day to use a newer InfluxDB version.

InfluxDB 0.8 came from the Ubuntu repositories itself. And it featured a cluster setup! Unfortunately newer versions do not support a cluster setup anymore, unless you buy the license for the Enterprise Edition of InfluxDB. That was the reason why I thought I'd just stay on 0.8. But, as mentioned, a lot of bugfixes and improvements happened since then. 

However I didn't want to give up on the cluster or to have at least a standby InfluxDB that the graphs can still be shown, even if the primary monitoring server is down. This is when I came across subscriptions.

According to the documentation, the receiving InfluxDB copies data to known subscribers. Imagine this like a mail server sending a newsletter to registered subscribers (this comparison really helps, doesn't it? You're welcome!). In order to do that, the subscriber service needs to be enabled in the InfluxDB config (usually /etc/influxdb/influxdb.conf):

[subscriber]
  # Determines whether the subscriber service is enabled.
  enabled = true

Restart InfluxDB after this setting change.

On the master server you need to define the known subscribers:

root@influx01:/# influx
Connected to http://localhost:8086 version 1.6.4
InfluxDB shell version: 1.6.4
> SHOW SUBSCRIPTIONS
> CREATE SUBSCRIPTION "icinga-replication" ON "icinga"."autogen" DESTINATIONS ALL 'http://10.10.1.2:8086'

So what does it do?

Obviously a new subscription with the unique ID "icinga-replication" is created. It covers the database "icinga" and sets a retention policy of "autogen".
As destination the transport over http was chosen and endpoint is 10.10.1.2:8086, which is InfluxDB running on host influx02.

By showing the subscriptions (SHOW SUBSCRIPTIONS), this can be confirmed:

> SHOW SUBSCRIPTIONS
name: icinga
retention_policy name               mode destinations
---------------- ----               ---- ------------
autogen          icinga-replication ALL  [http://10.10.1.2:8086]

From now on every record written into the influx01 instance is copied to to influx02 and will show up with such entries on influx02:

influx02 influxd[5045]: [httpd] 10.10.1.1 - icinga [13/Nov/2018:14:23:41 +0100] "POST /write?consistency=&db=icinga&precision=ns&rp=autogen HTTP/1.1" 204 0 "-" "InfluxDBClient" 564b790a-e747-11e8-8b0c-000000000000 13708

But what about authentication?

When authentication is enabled (which should always be the case on a database), the following error message appears in the log file on the master:

influx01 influxd[7232]: ts=2018-11-13T10:18:11.142719Z lvl=info msg="{\"error\":\"unable to parse authentication credentials\"}\n" log_id=0Bk8uDzG000 service=subscriber

In this case, the subscription needs to be added with credentials in the URL string:

> CREATE SUBSCRIPTION "icinga-replication" ON "icinga"."autogen" DESTINATIONS ALL 'http://dbuser:password@10.10.1.2:8086'

Note: I used the same subscription ID again (icinga-replication). In order to do so, the existing subscription must be removed:

> DROP SUBSCRIPTION "icinga-replication" ON "icinga"."autogen"

And what about data prior to the subscription "replication"?

The subscription service only copies new incoming data. This means that older data is not copied over to the subscribers. In this case you need to transfer the data using dump/restore or even by syncing the data (/var/lib/influxdb/data) and wal (/var/lib/influxdb/wal) directories.

In my case I stopped InfluxDB on both influx01 and influx02, rsynced the content of /var/lib/influxdb/data and /var/lib/influxdb/wal from influx01 to influx02 and then started InfluxDB again. First on host influx02, then on influx01. Now data is in sync (until a network issue or similar happens).

As this graphing database is not production critical I can live with that situation.


Add a comment

Show form to leave a comment

Comments (newest first)

droidrider from France wrote on Dec 22nd, 2018:

Thank you very much for this article.
It work smoothly with InfluxDB 1.7.1.


RSS feed

Blog Tags:

  AWS   Android   Ansible   Apache   Apple   Atlassian   BSD   Backup   Bash   Bluecoat   CMS   Chef   Cloud   Coding   Consul   Containers   CouchDB   DB   DNS   Database   Databases   Docker   ELK   Elasticsearch   Filebeat   FreeBSD   Galera   Git   GlusterFS   Grafana   Graphics   HAProxy   HTML   Hacks   Hardware   Icinga   Influx   Internet   Java   KVM   Kibana   Kodi   Kubernetes   LVM   LXC   Linux   Logstash   Mac   Macintosh   Mail   MariaDB   Minio   MongoDB   Monitoring   Multimedia   MySQL   NFS   Nagios   Network   Nginx   OSSEC   OTRS   Observability   Office   OpenSearch   PGSQL   PHP   Perl   Personal   PostgreSQL   Postgres   PowerDNS   Proxmox   Proxy   Python   Rancher   Rant   Redis   Roundcube   SSL   Samba   Seafile   Security   Shell   SmartOS   Solaris   Surveillance   Systemd   TLS   Tomcat   Ubuntu   Unix   VMWare   VMware   Varnish   Virtualization   Windows   Wireless   Wordpress   Wyse   ZFS   Zoneminder