The monitoring plugin check_smart, a Perl script to monitor the S.M.A.R.T. table and health status of hard and solid state drives, is out with a new version.
Version 6.3 adds a new parameter -E / --exclude-all which "kind of" supplements the already existing -e / --exclude parameter. However, there's one major difference:
The already existing -e parameter accepts a comma-separated list of SMART attributes to exclude from the checks, so these attributes never alert. But the values of these attributes are still (silently) parsed and added to the performance data, which means historical graphs can still be created from them.
The additional -E parameter removes the listed attribute(s) completely, from both the check/alert logic and the performance data. Using this parameter is probably a rare case (more historical data to compare against is generally good), but it can help in some problematic scenarios. Such a scenario is described in issue 41 on the GitHub repository of check_smart.
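The difference between the two parameters can be sketched in a few lines of Python. This is only a simplified model of the behaviour described above, not the plugin's actual Perl code: the function name `filter_attributes` and the data layout are assumptions made for this illustration, and the model treats every attribute as a candidate for alerting.

```python
def filter_attributes(attributes, exclude=(), exclude_all=()):
    """Illustrative model of -e / -E; not the plugin's real implementation.

    attributes:   dict mapping SMART attribute name -> raw value
    exclude:      names passed to -e (no alert, but kept in perfdata)
    exclude_all:  names passed to -E (no alert and no perfdata)
    Returns (names still checked for alerts, perfdata dict).
    """
    exclude = set(exclude)
    exclude_all = set(exclude_all)

    # Both parameters take the attribute out of the alert checks.
    checked = [name for name in attributes
               if name not in exclude and name not in exclude_all]
    # Only -E also removes the attribute from the performance data.
    perfdata = {name: value for name, value in attributes.items()
                if name not in exclude_all}
    return checked, perfdata


# Hypothetical sample values, loosely based on the example drive.
smart = {"Power_On_Hours": 2801, "Temperature_Celsius": 41, "UDMA_CRC_Error_Count": 0}

checked_e, perf_e = filter_attributes(smart, exclude=["Temperature_Celsius"])
checked_E, perf_E = filter_attributes(smart, exclude_all=["Temperature_Celsius"])

print(checked_e, perf_e)  # Temperature_Celsius not checked, still in perfdata
print(checked_E, perf_E)  # Temperature_Celsius gone from both
```

In both calls the temperature attribute is no longer checked; only the second call also drops it from the performance data.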
Credits go to Michael Krahe - thanks for the contribution!
And here is a live example to compare -e and -E, starting with a check without any exclusions:
# ./check_smart -i ata -d /dev/sdb
OK: Drive WDC WDS240G2G0A-00JH30 S/N XXXXXXXXXXX: no SMART errors detected. |Reallocated_Sector_Ct=0 Power_On_Hours=2801 Power_Cycle_Count=5 End-to-End_Error=0 Reported_Uncorrect=0 Command_Timeout=0 Temperature_Celsius=41 UDMA_CRC_Error_Count=0 Unknown_SSD_Attribute=16505765891843 Available_Reservd_Space=100 Media_Wearout_Indicator=15969 Total_LBAs_Written=18521 Total_LBAs_Read=940
By adding -e, the listed attributes are ignored if they would alert, but they still show up in the performance data:
# ./check_smart -i ata -d /dev/sdb -e "Reallocated_Sector_Ct,Power_On_Hours,Power_Cycle_Count,End-to-End_Error,Reported_Uncorrect,Command_Timeout,Temperature_Celsius,Unknown_SSD_Attribute,Available_Reservd_Space,Media_Wearout_Indicator"
OK: Drive WDC WDS240G2G0A-00JH30 S/N XXXXXXXXXXX: no SMART errors detected. |Reallocated_Sector_Ct=0 Power_On_Hours=2801 Power_Cycle_Count=5 End-to-End_Error=0 Reported_Uncorrect=0 Command_Timeout=0 Temperature_Celsius=40 UDMA_CRC_Error_Count=0 Unknown_SSD_Attribute=16505765891843 Available_Reservd_Space=100 Media_Wearout_Indicator=15969 Total_LBAs_Written=18522 Total_LBAs_Read=940
By using -E (or even -E together with -e, although that doesn't make a lot of sense), these attributes disappear completely:
# ./check_smart -i ata -d /dev/sdb -E "Reallocated_Sector_Ct,Power_On_Hours,Power_Cycle_Count,End-to-End_Error,Reported_Uncorrect,Command_Timeout,Temperature_Celsius,Unknown_SSD_Attribute,Available_Reservd_Space,Media_Wearout_Indicator"
OK: Drive WDC WDS240G2G0A-00JH30 S/N XXXXXXXXXXX: no SMART errors detected. |UDMA_CRC_Error_Count=0 Total_LBAs_Written=18522 Total_LBAs_Read=940
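To verify which attributes were dropped from the performance data, the plugin output can be split at the pipe character and the labels compared. The following is a minimal sketch, assuming perfdata labels without spaces or quotes (as in the examples above); the helper name `parse_perfdata` is hypothetical and the sample strings are shortened versions of the outputs shown here.

```python
def parse_perfdata(plugin_output):
    """Return the label=value pairs after the '|' as a dict of strings."""
    # Everything after the first '|' is performance data.
    _, _, perf = plugin_output.partition("|")
    pairs = (item.split("=", 1) for item in perf.split() if "=" in item)
    return {label: value for label, value in pairs}


# Shortened sample outputs (assumption: abbreviated from the runs above).
with_e = "OK: no SMART errors detected. |Power_On_Hours=2801 Temperature_Celsius=40 UDMA_CRC_Error_Count=0"
with_E = "OK: no SMART errors detected. |UDMA_CRC_Error_Count=0"

# Labels present with -e but missing with -E.
removed = sorted(set(parse_perfdata(with_e)) - set(parse_perfdata(with_E)))
print(removed)
```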