check_smart 6.3 can now completely ignore SMART attributes


Published on - Listed in Monitoring Hardware


The monitoring plugin check_smart, a Perl script to monitor the S.M.A.R.T. table and health status of hard and solid state drives, is out with a new version.

Version 6.3 adds a new parameter -E / --exclude-all, which complements the existing -e / --exclude parameter. However, there is one major difference:

The existing -e parameter accepts a comma-separated list of SMART attributes to exclude from the checks, so these attributes never alert. Their values are still (silently) parsed and added to the performance data, which makes it possible to keep drawing historical graphs.

The new -E parameter removes the listed attribute(s) from both the check/alert logic and the performance data. This parameter will probably be needed only rarely (more historical data to compare against is generally good), but it can help in some problematic scenarios. One such scenario is described in issue 41 on the GitHub repository of check_smart.
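To make the difference concrete, here is a minimal Python sketch that models the behaviour described above. It is not the plugin's actual Perl code; the function name apply_exclusions, its arguments and the sample values are all made up for illustration.

```python
def apply_exclusions(attributes, alerting, exclude=(), exclude_all=()):
    """Model the -e / -E semantics described in the text (illustration only).

    attributes:  dict of attribute name -> raw value
    alerting:    set of attribute names that would raise an alert
    exclude:     names passed to -e (no alert, still in performance data)
    exclude_all: names passed to -E (no alert, not in performance data)
    """
    exclude = set(exclude)
    exclude_all = set(exclude_all)

    # -E drops the attribute from the performance data; -e does not.
    perfdata = {name: value for name, value in attributes.items()
                if name not in exclude_all}

    # Both -e and -E keep the attribute from alerting.
    alerts = [name for name in attributes
              if name in alerting
              and name not in exclude
              and name not in exclude_all]
    return alerts, perfdata


# Hypothetical sample values, not taken from a real drive.
sample = {"Reallocated_Sector_Ct": 5, "Temperature_Celsius": 41, "UDMA_CRC_Error_Count": 0}
would_alert = {"Reallocated_Sector_Ct"}

print(apply_exclusions(sample, would_alert))
print(apply_exclusions(sample, would_alert, exclude=["Reallocated_Sector_Ct"]))
print(apply_exclusions(sample, would_alert, exclude_all=["Reallocated_Sector_Ct"]))
```

In this model the second call returns no alerts but keeps all three values, while the third call returns no alerts and drops Reallocated_Sector_Ct from the performance data as well.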

Credits go to Michael Krahe - thanks for the contribution!

Here is a live example comparing -e and -E. First, the plain check without any exclusions:

# ./check_smart -i ata -d /dev/sdb
OK: Drive  WDC WDS240G2G0A-00JH30 S/N XXXXXXXXXXX: no SMART errors detected. |Reallocated_Sector_Ct=0 Power_On_Hours=2801 Power_Cycle_Count=5 End-to-End_Error=0 Reported_Uncorrect=0 Command_Timeout=0 Temperature_Celsius=41 UDMA_CRC_Error_Count=0 Unknown_SSD_Attribute=16505765891843 Available_Reservd_Space=100 Media_Wearout_Indicator=15969 Total_LBAs_Written=18521 Total_LBAs_Read=940

With -e, the listed attributes are ignored even if they would alert, but they still show up in the performance data:

# ./check_smart -i ata -d /dev/sdb -e "Reallocated_Sector_Ct,Power_On_Hours,Power_Cycle_Count,End-to-End_Error,Reported_Uncorrect,Command_Timeout,Temperature_Celsius,Unknown_SSD_Attribute,Available_Reservd_Space,Media_Wearout_Indicator"
OK: Drive  WDC WDS240G2G0A-00JH30 S/N XXXXXXXXXXX: no SMART errors detected. |Reallocated_Sector_Ct=0 Power_On_Hours=2801 Power_Cycle_Count=5 End-to-End_Error=0 Reported_Uncorrect=0 Command_Timeout=0 Temperature_Celsius=40 UDMA_CRC_Error_Count=0 Unknown_SSD_Attribute=16505765891843 Available_Reservd_Space=100 Media_Wearout_Indicator=15969 Total_LBAs_Written=18522 Total_LBAs_Read=940

With -E (alone or combined with -e, although the combination doesn't make a lot of sense), these attributes disappear completely:

# ./check_smart -i ata -d /dev/sdb -E "Reallocated_Sector_Ct,Power_On_Hours,Power_Cycle_Count,End-to-End_Error,Reported_Uncorrect,Command_Timeout,Temperature_Celsius,Unknown_SSD_Attribute,Available_Reservd_Space,Media_Wearout_Indicator"
OK: Drive  WDC WDS240G2G0A-00JH30 S/N XXXXXXXXXXX: no SMART errors detected. |UDMA_CRC_Error_Count=0 Total_LBAs_Written=18522 Total_LBAs_Read=940
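To verify which attributes vanished between two runs, the performance data labels can be extracted and compared. The following Python sketch assumes the usual monitoring plugin output convention, where everything after the first "|" is performance data in space-separated label=value pairs. The helper name perfdata_labels is made up for this example, and the two output strings are abbreviated from the outputs above.

```python
def perfdata_labels(plugin_output):
    """Return the performance data labels from one line of plugin output."""
    # Everything after the first "|" is treated as performance data.
    _, _, perf = plugin_output.partition("|")
    # Each item looks like label=value; keep only the label.
    return [item.split("=", 1)[0] for item in perf.split()]


# Abbreviated versions of the -e and -E outputs shown above.
with_e = ("OK: Drive  WDC WDS240G2G0A-00JH30 S/N XXXXXXXXXXX: no SMART errors detected. "
          "|Reallocated_Sector_Ct=0 Power_On_Hours=2801 UDMA_CRC_Error_Count=0 "
          "Total_LBAs_Written=18522 Total_LBAs_Read=940")
with_E = ("OK: Drive  WDC WDS240G2G0A-00JH30 S/N XXXXXXXXXXX: no SMART errors detected. "
          "|UDMA_CRC_Error_Count=0 Total_LBAs_Written=18522 Total_LBAs_Read=940")

kept = perfdata_labels(with_E)
removed = [label for label in perfdata_labels(with_e) if label not in kept]
print("removed by -E:", removed)
```

This simple split is enough for the labels shown here. Labels that contain spaces or quotes would need a more careful parser.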


