A new version of check_netio, a monitoring plugin to check network interfaces and their statistics (such as input/output and errors) on Linux, is available!
The newest release, 1.6, introduces a new parameter -r, which can be used in combination with the existing -t parameter (which collects additional TCP statistics).
The problem with the -t parameter was that the performance data output could become very large, given all the different TCP statistics found in /proc/net/netstat:
$ /usr/lib/nagios/plugins/check_netio.sh -i eth0 -t
NETIO OK - eth0: Receive 23284649421956 Bytes, Transmit 24036117884475
Bytes|NET_eth0_RX=23284649421956B;;;; NET_eth0_TX=24036117884475B;;;;
NET_eth0_ERR_RX=0;;;; NET_eth0_ERR_TX=0;;;; NET_eth0_DROP_RX=236;;;;
NET_eth0_DROP_TX=0;;;; SyncookiesSent=0;;;; SyncookiesRecv=0;;;;
SyncookiesFailed=10708;;;; EmbryonicRsts=11;;;; PruneCalled=1132207;;;;
RcvPruned=57163;;;; OfoPruned=210;;;; OutOfWindowIcmps=1756;;;;
LockDroppedIcmps=0;;;; ArpFilter=0;;;; TW=8844650;;;; TWRecycled=0;;;;
TWKilled=3966390849;;;; PAWSPassive=265;;;; PAWSActive=0;;;;
PAWSEstab=69;;;; DelayedACKs=145526473;;;; DelayedACKLocked=86600;;;;
DelayedACKLost=15639;;;; ListenOverflows=0;;;; ListenDrops=0;;;;
TCPPrequeued=2257462558;;;; TCPDirectCopyFromBacklog=1634542275;;;;
TCPDirectCopyFromPrequeue=910314455124;;;; TCPPrequeueDropped=0;;;;
TCPHPHits=6748083892;;;; TCPHPHitsToUser=77529122;;;;
TCPPureAcks=18521820705;;;; TCPHPAcks=2042634393;;;;
TCPRenoRecovery=0;;;; TCPSackRecovery=23122;;;; TCPSACKReneging=157;;;;
TCPFACKReorder=9886;;;; TCPSACKReorder=5568;;;; TCPRenoReorder=0;;;;
TCPTSReorder=18825;;;; TCPFullUndo=19886;;;; TCPPartialUndo=95443;;;;
TCPDSACKUndo=94;;;; TCPLossUndo=532376;;;; TCPLoss=6206;;;;
TCPLostRetransmit=117;;;; TCPRenoFailures=0;;;; TCPSackFailures=629;;;;
TCPLossFailures=241;;;; TCPFastRetrans=78667;;;;
TCPForwardRetrans=3195;;;; TCPSlowStartRetrans=11752;;;;
TCPTimeouts=795630;;;; TCPRenoRecoveryFail=0;;;;
TCPSackRecoveryFail=105;;;; TCPSchedulerFailed=19;;;;
TCPRcvCollapsed=11304426;;;; TCPDSACKOldSent=15653;;;;
TCPDSACKOfoSent=4;;;; TCPDSACKRecv=30684;;;; TCPDSACKOfoRecv=2;;;;
TCPAbortOnData=1363337089;;;; TCPAbortOnClose=104580;;;;
TCPAbortOnMemory=0;;;; TCPAbortOnTimeout=11042;;;;
TCPAbortOnLinger=0;;;; TCPAbortFailed=0;;;; TCPMemoryPressures=0;;;;
TCPSACKDiscard=0;;;; TCPDSACKIgnoredOld=1;;;;
TCPDSACKIgnoredNoUndo=23213;;;; TCPSpuriousRTOs=34;;;;
TCPMD5NotFound=0;;;; TCPMD5Unexpected=0;;;; TCPSackShifted=39551;;;;
TCPSackMerged=28792;;;; TCPSackShiftFallback=197072;;;;
TCPBacklogDrop=1682;;;; TCPMinTTLDrop=0;;;; TCPOFOQueue=209658;;;;
TCPOFODrop=22478;;;; TCPOFOMerge=4;;;; TCPChallengeACK=50266;;;;
TCPSYNChallenge=49967;;;; BusyPollRxPackets=0;;;;
TCPFromZeroWindowAdv=138540;;;; TCPToZeroWindowAdv=138540;;;;
TCPWantZeroWindowAdv=3576590;;;; TCPACKSkippedSynRecv=0;;;;
TCPACKSkippedPAWS=1;;;; TCPACKSkippedSeq=98514;;;;
TCPACKSkippedFinWait2=0;;;; TCPACKSkippedTimeWait=0;;;;
TCPACKSkippedChallenge=14;;;;
If check_netio is executed through NRPE, the output is truncated (due to an output limit in NRPE). Depending on where the output is cut, this can lead to performance data errors. See issue 11 for more details.
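To get a feeling for whether a given plugin output is at risk, its size can be compared against the limit. The following is only a rough sketch: fits_limit is a hypothetical helper (not part of check_netio), and the 1024-byte default is an assumption that applies to some older NRPE versions; the real limit depends on the NRPE version and how it was built.

```bash
#!/usr/bin/env bash
# Sketch only: fits_limit is a hypothetical helper, not part of check_netio.
# It reads plugin output from stdin and compares its size in bytes
# against a limit (assumed default: 1024 bytes, check your NRPE build).
fits_limit() {
  local limit="${1:-1024}"
  local size
  size=$(wc -c)        # count the bytes arriving on stdin
  size=$((size))       # normalize: some wc implementations pad with spaces
  if [ "$size" -le "$limit" ]; then
    echo "OK: $size bytes (limit $limit)"
  else
    echo "TOO LONG: $size bytes (limit $limit)"
    return 1
  fi
}

# Real-world usage would look like this (requires the plugin to be installed):
#   /usr/lib/nagios/plugins/check_netio.sh -i eth0 -t | fits_limit 1024
# Self-contained demonstration with a short sample line:
printf 'NETIO OK - eth0: sample|NET_eth0_RX=1B;;;;\n' | fits_limit 1024
```

The function returns a non-zero exit code when the output is too long, so it can also be used in a quick shell test before rolling out a check definition.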
The new -r parameter was added to address this truncation problem. It expects a comma-separated list of strings. Each of these strings is matched against the TCP statistics as a regular expression. If a statistic matches, it is added to the performance data; if not, it is skipped.
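The filtering idea can be sketched in a few lines of Bash. This is not the plugin's actual code: filter_perfdata is a hypothetical helper, and two details are assumptions on my part, namely that the patterns are matched against the statistic name only and that matching ignores case (the lowercase patterns in the example that follows do match names such as TCPLossUndo).

```bash
#!/usr/bin/env bash
# Sketch only: filter_perfdata is a hypothetical helper, not code from check_netio.sh.
# Arguments: $1 = comma-separated patterns, remaining arguments = "Name=value" statistics.
# The function body runs in a subshell, so the shopt change stays local.
filter_perfdata() (
  shopt -s nocasematch          # assumption: matching is case-insensitive
  regex="${1//,/|}"             # "loss,drop" becomes the regex "loss|drop"
  shift
  out=""
  for stat in "$@"; do
    name="${stat%%=*}"          # assumption: match against the name, not the value
    if [[ "$name" =~ $regex ]]; then
      out+="${out:+ }${stat};;;;"   # append in performance data notation
    fi
  done
  printf '%s\n' "$out"
)

filter_perfdata "loss,drop" TCPLossUndo=119 TCPTimeouts=42 ListenDrops=0 DelayedACKs=7
# prints: TCPLossUndo=119;;;; ListenDrops=0;;;;
```

Joining the list with "|" turns it into a single alternation, so each statistic needs only one regex test regardless of how many patterns were given.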
Here's a practical example where only statistics matching "loss" or "drop" should show up in the performance data:
$ ./check_netio.sh -i enp5s0 -t -r "loss,drop"
NETIO OK - enp5s0: Receive 27690651157 Bytes, Transmit 13173160148 Bytes|NET_enp5s0_RX=27690651157B;;;; NET_enp5s0_TX=13173160148B;;;; NET_enp5s0_ERR_RX=0;;;; NET_enp5s0_ERR_TX=0;;;; NET_enp5s0_DROP_RX=0;;;; NET_enp5s0_DROP_TX=0;;;; LockDroppedIcmps=0;;;; ListenDrops=0;;;; TCPLossUndo=119;;;; TCPLossFailures=12;;;; TCPLossProbes=5241;;;; TCPLossProbeRecovery=2666;;;; TCPBacklogDrop=0;;;; PFMemallocDrop=0;;;; TCPMinTTLDrop=0;;;; TCPDeferAcceptDrop=0;;;; TCPReqQFullDrop=0;;;; TCPOFODrop=0;;;;
This output is now short enough for the NRPE server to send it in full to the remote NRPE check plugin.
Besides helping to cope with the NRPE output limit, this also lets users define for themselves which statistics are relevant and should be graphed over the long term.