When Nagios notification mails show wrong host alias



Over the last few days I received several Nagios notifications with a wrong host alias. The bad thing: the host alias is also used in the subject. So at first sight it looks like there is a problem on a possibly business-critical machine, when it's actually only a service on a test server. This creates confusion and leads to errors.

The affected service, which was in a warning state, was "Disk Space /" on SERVER21. The host alias for SERVER21 is SERVER21-DEVL and the IP is 192.168.1.21. Instead, the notification looked like this:

Subject: ** PROBLEM Service Alert: SERVER31-UAT/Disk Space / is WARNING **
Service: Disk Space /
Host: SERVER31-UAT
Address: 192.168.1.21
State: WARNING

As you can see, the notification mail uses the host alias rather than the real server name in the subject and in the mail body, and that alias is wrong (SERVER31-UAT instead of SERVER21-DEVL). Only the IP address is correct. But where does this wrong entry come from? The host definition of SERVER21 is correct:

define host{
        host_name               SERVER21
        alias                   SERVER21-DEVL
        address                 192.168.1.21
        }

After some research with grep, I came across the file retention.dat in the Nagios var folder. There I found that multiple hosts have the wrong alias:

host {
host_name=SERVER21
alias=SERVER31-UAT
display_name=SERVER21

...

host {
host_name=SERVER33
alias=SERVER31-UAT
display_name=SERVER33
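
To find every affected host at once instead of grepping by hand, the comparison can be scripted. The following is only a minimal sketch: the function names are made up for this example, it assumes each host block in retention.dat is closed by a line containing only "}" (the excerpt above does not show the closing line), and it ignores aliases inherited from templates via "use".

```python
import re


def parse_retention_hosts(text):
    """Return {host_name: alias} from the 'host { ... }' blocks of retention.dat-style text."""
    aliases = {}
    current = None  # key/value pairs of the host block being read, None outside a block
    for raw in text.splitlines():
        line = raw.strip()
        if line == "host {":
            current = {}
        elif line == "}":
            if current is not None and "host_name" in current:
                aliases[current["host_name"]] = current.get("alias", "")
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return aliases


def parse_config_hosts(text):
    """Return {host_name: alias} from 'define host{ ... }' blocks of object config text."""
    aliases = {}
    for block in re.findall(r"define\s+host\s*\{(.*?)\}", text, flags=re.S):
        fields = {}
        for line in block.splitlines():
            line = line.split(";", 1)[0].strip()  # drop inline ';' comments
            parts = line.split(None, 1)           # directive name, then its value
            if len(parts) == 2:
                fields[parts[0]] = parts[1].strip()
        # Hosts without an explicit alias (e.g. inherited ones) are skipped.
        if "host_name" in fields and "alias" in fields:
            aliases[fields["host_name"]] = fields["alias"]
    return aliases


def stale_aliases(config_text, retention_text):
    """Return {host_name: (configured_alias, retained_alias)} for every mismatch."""
    configured = parse_config_hosts(config_text)
    retained = parse_retention_hosts(retention_text)
    return {
        host: (alias, retained[host])
        for host, alias in configured.items()
        if host in retained and retained[host] != alias
    }
```

Feed it the contents of the host configuration file and of retention.dat (both paths depend on the installation). For the example in this post, the result would contain SERVER21 with the pair ("SERVER21-DEVL", "SERVER31-UAT").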

That's the source! Nagios notifications take the current state of a host or service from this file (retention.dat) and also use the other values stored in it. I deleted retention.dat completely and restarted Nagios. A new retention.dat is created, but Nagios re-checks all hosts and services, and scheduled downtimes, comments, etc. are lost.
It may also work to stop Nagios, correct the entries in retention.dat manually and then start Nagios again, but I haven't tested that.
A Nagios restart is necessary in any case. If you only change retention.dat while Nagios is running, Nagios will overwrite the values again, as they seem to be held in RAM.
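
For the manual correction variant, the rewrite itself could look roughly like the sketch below. As said, this approach is untested against a real Nagios. The function name and the mapping of correct aliases are hypothetical, and the code assumes that host_name appears before alias inside each host block (as in the excerpt above) and that each block ends with a line containing only "}". Nagios has to be stopped before the file is changed.

```python
def fix_retention_aliases(text, correct_aliases):
    """Return retention.dat-style text with the alias= line of each listed host replaced.

    correct_aliases maps host_name -> alias as defined in the host configuration.
    Hosts not in the mapping, and all non-host blocks, are left untouched.
    """
    out = []
    in_host = False
    host = None
    for raw in text.splitlines(keepends=True):
        line = raw.strip()
        if line == "host {":
            in_host, host = True, None
        elif line == "}":
            in_host = False
        elif in_host and line.startswith("host_name="):
            host = line[len("host_name="):]
        elif in_host and line.startswith("alias=") and host in correct_aliases:
            # Keep the original indentation and line ending of the replaced line.
            indent = raw[: len(raw) - len(raw.lstrip())]
            ending = raw[len(raw.rstrip("\r\n")):]
            raw = f"{indent}alias={correct_aliases[host]}{ending}"
        out.append(raw)
    return "".join(out)
```

With {"SERVER21": "SERVER21-DEVL"} as the mapping, only the alias line in the SERVER21 block changes. Keep a backup copy of retention.dat before writing the result back.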

This problem is described in more detail in this Nagios user mailing list thread: Macro values don't seem to be consistent.




Comments (newest first)

David Fowler from Winter Garden, FL wrote on Jun 28th, 2012:

Thanks! Obsolete aliases were driving me bug-nuts.


Anders from Norway wrote on May 15th, 2012:

Great, thanks for sharing. That was getting annoying ;)


Claudio from Switzerland wrote on Jul 27th, 2011:

Hi DelGurth,
Yes, this could really be it. I will have to test with the newer version. The affected version is running 3.2.3.
Thanks for the link!


DelGurth wrote on Jul 27th, 2011:

Perhaps this is the cause of this problem?

http://www.nagios.org/projects/nagioscore/history/core-3x

- Reverted 'Fix for retaining host display name and alias, as well as service display name' as configuration information stored incorrectly over a reload

Seems like it if you read

http://comments.gmane.org/gmane.network.nagios.user/70695

