Host object in Nagios marked as down, check_ping shows network unreachable

Published on - Listed in Nagios Monitoring Network


On a Nagios 4.x installation, a particular host was marked as DOWN.

Yet the remote host was up and working perfectly. So why would Nagios think the host is down?

Taking a closer look at the host definition showed the following configuration:

define host{
        use                     remote-host
        hostgroups              group1,group2
        host_name               www.example.com
}

Note: www.example.com is obviously an anonymized placeholder.

If you've been a long-time Nagios user, the first thing that catches your eye is the missing "address" directive in this host definition. The official Nagios object definition documentation clearly marks the address field as a required directive:

[Image: Nagios host definition]

However, there's a catch with the address field: it can actually be omitted. In that case, the host_name is used as the address. The same documentation mentions:

Note: If you do not specify an address directive in a host definition, the name of the host will be used as its address. A word of caution about doing this, however - if DNS fails, most of your service checks will fail because the plugins will be unable to resolve the host name.

That means that www.example.com will be used as the host's address. To manually check what happens in the background of a host check, we need to look at the "remote-host" template:

# remote-host template
define host{
  name                  remote-host    ; The name of this host template
  use                   generic-host    ; This template inherits other values from the generic-host template
  check_period          24x7            ; By default, Linux hosts are checked round the clock
  check_interval        3               ; Actively check the host every 3 minutes (at the default interval_length of 60)
  retry_interval        1               ; Schedule host check retries at 1 minute intervals
  max_check_attempts    2               ; Check each Linux host 2 times (max)
  check_command         check-host-alive ; Default command to check Linux hosts
  notification_period   24x7            ; Notifications are sent around the clock
[...]
  register              0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}

The check_command of this host template is check-host-alive. Let's find out what this command does by looking at its command definition:

# 'check-host-alive' command definition
define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
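
As a rough sketch of how this command_line turns into a real command: $USER1$ is a user macro that is conventionally defined in the resource file, and $HOSTADDRESS$ is replaced by the host's address (here the host_name, because no address is set). The path below is an assumption based on the plugin location used for the manual check in this article; other installations may differ.

```
# resource.cfg (assumed location of the $USER1$ definition)
$USER1$=/usr/lib/nagios/plugins
```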

The check-host-alive command uses the check_ping monitoring plugin in the background to determine whether the remote host is up or down. Executing the plugin manually should yield the same result that Nagios shows in the user interface:

root@nagios:~# /usr/lib/nagios/plugins/check_ping -H www.example.com -w 3000.0,80% -c 5000.0,100% -p 5
CRITICAL - Network Unreachable (www.example.com)

Indeed, the same "Network Unreachable" error is shown. But why does this happen? Resolving the target name shows why: www.example.com resolves to both an IPv4 and an IPv6 address. Today's systems often prefer IPv6 by default unless they are configured otherwise or IPv6 is disabled completely. If the monitoring server then has no working IPv6 route to the target, the ping fails with "Network Unreachable".
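
The DNS lookup described above can be sketched as a small shell script. This is only an illustration: it assumes dig is installed on the monitoring server, and the function names (a_records, aaaa_records, classify) are made up for this example.

```shell
#!/bin/sh
# Sketch: show which address families a name resolves to.
# Assumes dig (dnsutils / bind-utils) is installed; function names are illustrative.

# Print only the IPv4 addresses from a dig answer (skips CNAME targets).
a_records()    { dig +short "$1" A | grep -E '^[0-9]+(\.[0-9]+){3}$'; }
# Print only the IPv6 addresses from a dig answer (skips CNAME targets).
aaaa_records() { dig +short "$1" AAAA | grep ':'; }

# Classify a name from its two record lists (either may be empty).
classify() {
    if [ -n "$1" ] && [ -n "$2" ]; then echo "dual-stack"
    elif [ -n "$2" ]; then echo "ipv6-only"
    elif [ -n "$1" ]; then echo "ipv4-only"
    else echo "unresolvable"
    fi
}

# Only run the lookups when a host name is passed as an argument.
if [ $# -gt 0 ]; then
    v4=$(a_records "$1" || true)
    v6=$(aaaa_records "$1" || true)
    echo "A:    ${v4:-none}"
    echo "AAAA: ${v6:-none}"
    classify "$v4" "$v6"
fi
```

Saved under a name such as resolve-check.sh (hypothetical), it could be run as "sh resolve-check.sh www.example.com". A "dual-stack" result combined with a failing default check_ping points towards the IPv6 path.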

To force communication over IPv4, the check_ping plugin can be given the -4 parameter:

root@nagios:~# /usr/lib/nagios/plugins/check_ping -H www.example.com -w 3000.0,80% -c 5000.0,100% -p 5 -4
PING OK - Packet loss = 0%, RTA = 1.16 ms|rta=1.160000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0

With IPv4 enforced, the ping now succeeds.
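
To confirm that the IPv6 path is the one failing, the same check can be repeated with each address family forced in turn. The sketch below relies on check_ping also accepting -6 as the IPv6 counterpart of -4; the probe function and the PLUGIN variable are illustrative names, and the plugin path is the one used in this article.

```shell
#!/bin/sh
# Sketch: run check_ping once per address family and report the exit code.
# PLUGIN defaults to the path used in this article; override it if yours differs.
PLUGIN="${PLUGIN:-/usr/lib/nagios/plugins/check_ping}"

# probe HOST FLAG: FLAG is -4 (force IPv4) or -6 (force IPv6).
probe() {
    rc=0
    "$PLUGIN" -H "$1" -w 3000.0,80% -c 5000.0,100% -p 5 "$2" || rc=$?
    # Plugin exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
    echo "exit code with $2: $rc"
}

# Only probe when a host name is passed as an argument.
if [ $# -gt 0 ]; then
    probe "$1" -4
    probe "$1" -6
fi
```

If the -4 run returns 0 while the -6 run returns 2, the host itself is fine and only IPv6 connectivity from the monitoring server is broken.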

The solution for Nagios in this situation? Either add the host's IPv4 address to the host definition (address directive) or adjust the check-host-alive command by appending -4 to its command_line.
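
Both options could look roughly like this. The definitions are the ones shown earlier with one line changed each; 192.0.2.10 is a documentation placeholder and not the host's real address.

```
# Option 1: give the host an explicit IPv4 address
# (192.0.2.10 is a placeholder, replace it with the real address)
define host{
        use                     remote-host
        hostgroups              group1,group2
        host_name               www.example.com
        address                 192.0.2.10
}

# Option 2: force IPv4 in the host check command
define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5 -4
}
```

Keep in mind that option 2 affects every host using check-host-alive, so an IPv6-only host would then be reported as down. Option 1 only touches this one host and also removes the DNS dependency for checks that use $HOSTADDRESS$.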

