I've been using check_ilo2_health.pl successfully for a couple of months now to monitor HP server hardware. It cannot be used with ILO1 (who still uses that?) and many hardware checks are missing on ILO2, but it works very well with the newer ILO3 and ILO4 servers. Yes, I'd prefer to use check_hpasm, but that would require installing additional HP software on the servers, which is not wanted or possible at my employer. So check_ilo2_health is a great alternative.
Yesterday I came across a newer ILO4 firmware version (1.20), which I installed on two Gen8 servers. As soon as that was done, the Icinga monitoring, which runs the hardware checks with check_ilo2_health, sent a warning notification:
ILO2_HEALTH WARNING - Power_Supply_1: Good, In Use, Power_Supply_2: Unknown
I was aware that this particular server only has one power supply. That's why I had left out the "-o" parameter, so that power redundancy is not checked. But the plugin still decided to exit with a warning. The cause had to be the ILO firmware upgrade. check_ilo2_health queries an XML file retrieved via HTTPS from the ILO IP. Maybe the output has changed in the new version...
To be sure, I contacted the maintainer of the plugin, Alexander Greiner-Bär, and sent him the verbose output. He quickly replied that the output had indeed changed. Here's the relevant output, comparing version 1.10 with 1.20:
1.10:
<POWER_SUPPLIES>
<SUPPLY>
<LABEL VALUE = "Power Supply 1"/>
<STATUS VALUE = "OK"/>
</SUPPLY>
<SUPPLY>
<LABEL VALUE = "Power Supply 2"/>
<STATUS VALUE = "Not Installed"/>
</SUPPLY>
</POWER_SUPPLIES>
1.20:
<POWER_SUPPLIES>
<POWER_SUPPLY_SUMMARY>
<PRESENT_POWER_READING VALUE = "100 Watts"/>
<POWER_MANAGEMENT_CONTROLLER_FIRMWARE_VERSION VALUE = "3.0"/>
<POWER_SYSTEM_REDUNDANCY VALUE = "Not Redundant"/>
<HP_POWER_DISCOVERY_SERVICES_REDUNDANCY_STATUS VALUE = "N/A"/>
<HIGH_EFFICIENCY_MODE VALUE = "N/A"/>
</POWER_SUPPLY_SUMMARY>
<SUPPLY>
<LABEL VALUE = "Power Supply 1"/>
<PRESENT VALUE = "Yes"/>
<STATUS VALUE = "Good, In Use"/>
<PDS VALUE = "Yes"/>
<HOTPLUG_CAPABLE VALUE = "Yes"/>
<MODEL VALUE = "656362-B21"/>
<SPARE VALUE = "660184-001"/>
<SERIAL_NUMBER VALUE = "XXXXXXXXXXXXX"/>
<CAPACITY VALUE = "460 Watts"/>
<FIRMWARE_VERSION VALUE = "1.03"/>
</SUPPLY>
<SUPPLY>
<LABEL VALUE = "Power Supply 2"/>
<PRESENT VALUE = "No"/>
<STATUS VALUE = "Unknown"/>
<PDS VALUE = "Other"/>
<HOTPLUG_CAPABLE VALUE = "Yes"/>
<MODEL VALUE = "N/A"/>
<SPARE VALUE = "N/A"/>
<SERIAL_NUMBER VALUE = "N/A"/>
<CAPACITY VALUE = "N/A Watts"/>
<FIRMWARE_VERSION VALUE = "N/A"/>
</SUPPLY>
</POWER_SUPPLIES>
Not only does version 1.20 contain a lot more information about the power supplies, the STATUS value of the missing power supply also changed from "Not Installed" to "Unknown", which caused the plugin to exit with a warning.
Alexander immediately released a fix in the newest version 1.56 of check_ilo2_health, which takes the new status values into account (thanks!!). And now it works:
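To illustrate why a status check written for the 1.10 output trips over the 1.20 output, here is a minimal Python sketch. It is not the plugin's code (check_ilo2_health is written in Perl, and I don't know how it evaluates the XML internally). The function names, the sets of "good" and "absent" status strings, and the idea of looking at the PRESENT element are my own assumptions for illustration. The XML is a trimmed copy of the two outputs above.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the firmware 1.10 and 1.20 outputs shown above.
XML_110 = """
<POWER_SUPPLIES>
  <SUPPLY>
    <LABEL VALUE = "Power Supply 1"/>
    <STATUS VALUE = "OK"/>
  </SUPPLY>
  <SUPPLY>
    <LABEL VALUE = "Power Supply 2"/>
    <STATUS VALUE = "Not Installed"/>
  </SUPPLY>
</POWER_SUPPLIES>
"""

XML_120 = """
<POWER_SUPPLIES>
  <POWER_SUPPLY_SUMMARY>
    <POWER_SYSTEM_REDUNDANCY VALUE = "Not Redundant"/>
  </POWER_SUPPLY_SUMMARY>
  <SUPPLY>
    <LABEL VALUE = "Power Supply 1"/>
    <PRESENT VALUE = "Yes"/>
    <STATUS VALUE = "Good, In Use"/>
  </SUPPLY>
  <SUPPLY>
    <LABEL VALUE = "Power Supply 2"/>
    <PRESENT VALUE = "No"/>
    <STATUS VALUE = "Unknown"/>
  </SUPPLY>
</POWER_SUPPLIES>
"""

# Assumption: these sets only cover the strings seen in the two outputs above.
OK_STATUSES = {"OK", "Good, In Use"}
ABSENT_STATUSES = {"Not Installed"}


def value_of(supply, tag):
    """Return the VALUE attribute of a child element, or None if it is missing."""
    node = supply.find(tag)
    return node.get("VALUE") if node is not None else None


def classify_by_status_only(supply):
    """Hypothetical 'old' logic: judge a supply by its STATUS string alone."""
    status = value_of(supply, "STATUS")
    if status in ABSENT_STATUSES:
        return "absent"
    return "ok" if status in OK_STATUSES else "warning"


def classify_with_present(supply):
    """Hypothetical 'new' logic: an empty bay (PRESENT = No) is not a fault."""
    if value_of(supply, "PRESENT") == "No":
        return "absent"
    return classify_by_status_only(supply)


def check_supplies(xml_text, classify):
    """Map each power supply label to the result of the given classifier."""
    root = ET.fromstring(xml_text)
    return {value_of(s, "LABEL"): classify(s) for s in root.findall("SUPPLY")}


if __name__ == "__main__":
    for name, xml_text in (("1.10", XML_110), ("1.20", XML_120)):
        print(name, "status only: ", check_supplies(xml_text, classify_by_status_only))
        print(name, "with PRESENT:", check_supplies(xml_text, classify_with_present))
```

With the status-only logic, the empty second bay is reported as "absent" on firmware 1.10 but as "warning" on 1.20, which matches the notification I received. Taking the PRESENT element into account gives the same result for both firmware versions.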
./check_ilo2_health-156.pl -H ILOIP -u USER -p XXX -3 -a -c -n -t 120
ILO2_HEALTH-156 OK - No faults detected
check_ilo2_health.pl version 1.56 can be downloaded here.
Rainer E. wrote on Apr 24th, 2013:
Thanks, this blog solved the iLO issue that caused me sleepless nights for weeks already ;-)
Claudio from Switzerland wrote on Mar 21st, 2013:
I know it's not easy to find new versions or an official download website for check_ilo2_health. Besides monitoringexchange.org there seems to be no other place to download it. I'll add a download link at the bottom of my post.
Osvaldo from Chile wrote on Mar 20th, 2013:
Where can I download version 1.56?