How to change check source in an Icinga 2 distributed master-master setup

Written by - 0 comments

Published on - last updated on January 23rd 2024 - Listed in Icinga Monitoring Network


In my new Icinga 2 architecture I run a distributed setup using a master-master configuration. Both master nodes reside in two different data centers but are connected through internal LAN. Almost all host and service objects are within the "master" zone. And both master nodes (called icinga1 and icinga2) are used as endpoints for this master zone.

root@icinga1:~# cat /etc/icinga2/zones.conf
object Endpoint "icinga1" {
  host = "icinga1"
}

object Endpoint "icinga2" {
  host = "icinga2"
}

object Zone "master" {
    endpoints = [ "icinga1", "icinga2" ]
}

object Zone "global-templates" {
    global = true
}

object Zone "director-global" {
    global = true
}

Icinga automatically distributes checks across both both endpoints, therefore balancing the checks. Sometimes the checks are executed on icinga1, sometimes on icinga2. For most of the checks, this turned out to be ok.
But I came across certain checks where I needed to specifically tell Icinga from where/on which node the check must be executed. In this scenario I needed to ping the interface of the central firewall to determine differences in latency between the two locations.

Icinga 2 master-master setup

In my previous Icinga setup I used a master-satellite setup to "balance" the checks based on the physical location of the servers to achieve a "different view" of both locations. But in the master-master setup, this is balanced and the graphs contain mixed results over both locations.

So the question is: How can I force a check to be executed on a certain node?

First I tried to create two additional zones called "locationa" and "locationb" and assigned endpoint "icinga1" to "locationa" and endpoint "icinga2" to "locationb" in zones.conf:

object Zone "locationa" {
    endpoints = [ "icinga1" ]
}

object Zone "locationb" {
    endpoints = [ "icinga2" ]
}

And then moved the two service objects into the new zone folders (/etc/icinga2/zones.d/locationa and /etc/icinga2/zones.d/locationb).
But a check config showed that this didn't work and resulted in the following error:

# /etc/init.d/icinga2 checkconfig
 * checking Icinga2 configuration                                                                                     
information/cli: Icinga application loader (version: r2.8.2-1)
information/cli: Loading configuration file(s).
information/ConfigItem: Committing config item(s).
information/ApiListener: My API identity: icinga1
critical/config: Error: Endpoint 'icinga2' is in more than one zone.
Location: in /etc/icinga2/zones.conf: 5:1-5:30
/etc/icinga2/zones.conf(3): }
/etc/icinga2/zones.conf(4):
/etc/icinga2/zones.conf(5): object Endpoint "icinga2" {
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/icinga2/zones.conf(6):   host = "icinga2"
/etc/icinga2/zones.conf(7): }

critical/config: Error: Endpoint 'icinga1' is in more than one zone.
Location: in /etc/icinga2/zones.conf: 1:0-1:29
/etc/icinga2/zones.conf(1): object Endpoint "icinga1" {
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/etc/icinga2/zones.conf(2):   host = "icinga1"
/etc/icinga2/zones.conf(3): }

critical/config: 2 errors
 * checking Icinga2 configuration. Check '/var/log/icinga2/startup.log' for details.

So back to step one and I started from scratch: RTFM. And indeed, I came across this: "Pin Checks in a Zone".

In case you want to pin specific checks to their endpoints in a given zone you’ll need to use the command_endpoint attribute. This is reasonable if you want to execute a local disk check in the master Zone on a specific endpoint then.

Wow. That sounds exactly what I need. So I added the "command_endpoint" in the two config files:

# cat /etc/icinga2/zones.d/master/network/FW/firewall-locationa.conf
object Host "firewall-locationa" {
  import "dummy-host"
}

# check ping
object Service "PING FW Interface VLAN X" {
  command_endpoint = "icinga1"
  import "generic-service"
  host_name = "firewall-locationa"
  check_command = "ping"
  vars.ping_address = "192.168.99.1"
}

# cat /etc/icinga2/zones.d/master/network/FW/firewall-locationb.conf
object Host "firewall-locationb" {
  import "dummy-host"
}

# check ping
object Service "PING FW Interface VLAN X" {
  command_endpoint = "icinga2"
  import "generic-service"
  host_name = "firewall-locationb"
  check_command = "ping"
  vars.ping_address = "192.168.99.1"
}

Check config didn't report any errors, so I went ahead.

The check "PING FW Interface VLAN X" on host "firewall-locationb" worked immediately and I could see "check source" was set to "icinga2" in the UI.
But the same check on "firewall-locationa" resulted in an UNKNOWN state and output: Endpoint does not accept commands.

But this is actually quite easy to fix. The "command_endpoint" uses the Icinga 2 API in the background. Because the node icinga2 is actually a slave (although called master-master setup, the second master is setup like a satellite, simply receiving all configs), it is already configured to accept commands in the API feature:

root@icinga2:~# cat /etc/icinga2/features-enabled/api.conf
/**
 * The API listener is used for distributed monitoring setups.
 */
object ApiListener "api" {
  accept_config = true
  accept_commands = true
}

But this line (accept_commands) was missing on node icinga1. Once I added this and restarted Icinga 2, the check for for host "firewall-locationa" was working too.

With these configs I have now the same ping check running to the same destination but from two different sources. Thanks to the graphs of the ping checks I can now see the differences of RTA and Packet Losses.


Add a comment

Show form to leave a comment

Comments (newest first)

No comments yet.

RSS feed

Blog Tags:

  AWS   Android   Ansible   Apache   Apple   Atlassian   BSD   Backup   Bash   Bluecoat   CMS   Chef   Cloud   Coding   Consul   Containers   CouchDB   DB   DNS   Database   Databases   Docker   ELK   Elasticsearch   Filebeat   FreeBSD   Galera   Git   GlusterFS   Grafana   Graphics   HAProxy   HTML   Hacks   Hardware   Icinga   Influx   Internet   Java   KVM   Kibana   Kodi   Kubernetes   LVM   LXC   Linux   Logstash   Mac   Macintosh   Mail   MariaDB   Minio   MongoDB   Monitoring   Multimedia   MySQL   NFS   Nagios   Network   Nginx   OSSEC   OTRS   Observability   Office   OpenSearch   PGSQL   PHP   Perl   Personal   PostgreSQL   Postgres   PowerDNS   Proxmox   Proxy   Python   Rancher   Rant   Redis   Roundcube   SSL   Samba   Seafile   Security   Shell   SmartOS   Solaris   Surveillance   Systemd   TLS   Tomcat   Ubuntu   Unix   VMWare   VMware   Varnish   Virtualization   Windows   Wireless   Wordpress   Wyse   ZFS   Zoneminder