Monitoring plugin check_netio 1.6 released: Limit tcp statistic output in performance data



A new version of check_netio, a monitoring plugin to check network interfaces and their statistics (such as input/output and errors) on Linux, is available!

Release 1.6 introduces a new parameter, -r, which is used in combination with the existing -t parameter (which collects additional TCP statistics).

The problem with the -t parameter was that the performance data output could become very large, because every TCP statistic found in /proc/net/netstat was added to it:

/usr/lib/nagios/plugins/check_netio.sh -i eth0 -t
NETIO OK - eth0: Receive 23284649421956 Bytes, Transmit 24036117884475 Bytes|NET_eth0_RX=23284649421956B;;;; NET_eth0_TX=24036117884475B;;;; NET_eth0_ERR_RX=0;;;; NET_eth0_ERR_TX=0;;;; NET_eth0_DROP_RX=236;;;; NET_eth0_DROP_TX=0;;;; SyncookiesSent=0;;;; SyncookiesRecv=0;;;; SyncookiesFailed=10708;;;; EmbryonicRsts=11;;;; PruneCalled=1132207;;;; RcvPruned=57163;;;; OfoPruned=210;;;; OutOfWindowIcmps=1756;;;; LockDroppedIcmps=0;;;; ArpFilter=0;;;; TW=8844650;;;; TWRecycled=0;;;; TWKilled=3966390849;;;; PAWSPassive=265;;;; PAWSActive=0;;;; PAWSEstab=69;;;; DelayedACKs=145526473;;;; DelayedACKLocked=86600;;;; DelayedACKLost=15639;;;; ListenOverflows=0;;;; ListenDrops=0;;;; TCPPrequeued=2257462558;;;; TCPDirectCopyFromBacklog=1634542275;;;; TCPDirectCopyFromPrequeue=910314455124;;;; TCPPrequeueDropped=0;;;; TCPHPHits=6748083892;;;; TCPHPHitsToUser=77529122;;;; TCPPureAcks=18521820705;;;; TCPHPAcks=2042634393;;;; TCPRenoRecovery=0;;;; TCPSackRecovery=23122;;;; TCPSACKReneging=157;;;; TCPFACKReorder=9886;;;; TCPSACKReorder=5568;;;; TCPRenoReorder=0;;;; TCPTSReorder=18825;;;; TCPFullUndo=19886;;;; TCPPartialUndo=95443;;;; TCPDSACKUndo=94;;;; TCPLossUndo=532376;;;; TCPLoss=6206;;;; TCPLostRetransmit=117;;;; TCPRenoFailures=0;;;; TCPSackFailures=629;;;; TCPLossFailures=241;;;; TCPFastRetrans=78667;;;; TCPForwardRetrans=3195;;;; TCPSlowStartRetrans=11752;;;; TCPTimeouts=795630;;;; TCPRenoRecoveryFail=0;;;; TCPSackRecoveryFail=105;;;; TCPSchedulerFailed=19;;;; TCPRcvCollapsed=11304426;;;; TCPDSACKOldSent=15653;;;; TCPDSACKOfoSent=4;;;; TCPDSACKRecv=30684;;;; TCPDSACKOfoRecv=2;;;; TCPAbortOnData=1363337089;;;; TCPAbortOnClose=104580;;;; TCPAbortOnMemory=0;;;; TCPAbortOnTimeout=11042;;;; TCPAbortOnLinger=0;;;; TCPAbortFailed=0;;;; TCPMemoryPressures=0;;;; TCPSACKDiscard=0;;;; TCPDSACKIgnoredOld=1;;;; TCPDSACKIgnoredNoUndo=23213;;;; TCPSpuriousRTOs=34;;;; TCPMD5NotFound=0;;;; TCPMD5Unexpected=0;;;; TCPSackShifted=39551;;;; TCPSackMerged=28792;;;; TCPSackShiftFallback=197072;;;; 
TCPBacklogDrop=1682;;;; TCPMinTTLDrop=0;;;; TCPOFOQueue=209658;;;; TCPOFODrop=22478;;;; TCPOFOMerge=4;;;; TCPChallengeACK=50266;;;; TCPSYNChallenge=49967;;;; BusyPollRxPackets=0;;;; TCPFromZeroWindowAdv=138540;;;; TCPToZeroWindowAdv=138540;;;; TCPWantZeroWindowAdv=3576590;;;; TCPACKSkippedSynRecv=0;;;; TCPACKSkippedPAWS=1;;;; TCPACKSkippedSeq=98514;;;; TCPACKSkippedFinWait2=0;;;; TCPACKSkippedTimeWait=0;;;; TCPACKSkippedChallenge=14;;;;
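For context on where all these counters come from: /proc/net/netstat stores each group of statistics as a pair of lines with the same prefix (for example "TcpExt:"), the first listing the counter names and the second the values. The following is a minimal sketch of how such a pair can be turned into name=value lines. The function name tcpext_pairs is hypothetical, and this is not necessarily how check_netio itself reads the file.

```shell
# Hypothetical helper, not taken from check_netio: print the TcpExt
# counters of a netstat-style file as name=value lines.
tcpext_pairs() {
  local file="${1:-/proc/net/netstat}"
  awk '
    $1 == "TcpExt:" && !have_names {   # first TcpExt line: counter names
      for (i = 2; i <= NF; i++) names[i] = $i
      have_names = 1
      next
    }
    $1 == "TcpExt:" && have_names {    # second TcpExt line: counter values
      for (i = 2; i <= NF; i++) print names[i] "=" $i
      have_names = 0
    }
  ' "$file"
}
```

Called without an argument, the function reads /proc/net/netstat; passing a file name makes it easy to try on a saved copy.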

If check_netio is executed through NRPE, the output is truncated (due to an output limit in NRPE). Depending on where the output is cut, this can lead to performance data errors. See issue 11 for more details.
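To illustrate why a cut in the middle of the output is a problem, here is a small sketch that checks whether a plugin output fits into a given limit and shows what a cut at that limit looks like. The 1024 used as the default is only an assumed value for illustration; the actual limit depends on the NRPE version and how it was built. The names MAX_OUTPUT, fits_limit and cut_output are hypothetical.

```shell
# Assumed limit, for illustration only; check your NRPE version for the
# real value.
MAX_OUTPUT=1024

# Return 0 if the output fits into the limit, 1 otherwise.
# ${#output} counts characters, which equals bytes for plain ASCII output.
fits_limit() {
  local output="$1"
  local limit="${2:-$MAX_OUTPUT}"
  [ "${#output}" -le "$limit" ]
}

# Print only the first N characters of the output, as a hard cut would.
cut_output() {
  printf '%s\n' "$1" | cut -c "1-$2"
}
```

For example, cut_output 'OK|a=1;;;; b=22;;;;' 12 prints "OK|a=1;;;; b": the last performance data item has lost its value, which is exactly the kind of fragment a monitoring core cannot parse.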

The new -r parameter was added to address this. It expects a comma-separated list of strings. Each of these strings is matched against the TCP statistics as a regular expression. A statistic that matches at least one of them is added to the performance data; all other statistics are skipped.
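The matching described above can be sketched as follows. This is an illustration under assumptions, not the plugin's actual code: the function name filter_stats is hypothetical, the input is assumed to be one name=value item per line, and matching is done case-insensitively because the example in this post uses lowercase "loss" to match statistics such as TCPLossUndo.

```shell
# Hypothetical sketch, not taken from check_netio: keep only the
# name=value lines whose name matches one of the comma-separated patterns.
filter_stats() {
  local patterns="$1"
  awk -v list="$patterns" '
    BEGIN { n = split(tolower(list), pat, ",") }   # one regex per list entry
    {
      split($0, kv, "=")                           # kv[1] is the statistic name
      name = tolower(kv[1])                        # assumed: case-insensitive match
      for (i = 1; i <= n; i++)
        if (pat[i] != "" && name ~ pat[i]) { print; next }
    }
  '
}
```

A quick way to try it:

```shell
printf 'ListenDrops=0\nTCPLossUndo=119\nTCPTimeouts=5\n' | filter_stats "loss,drop"
```

This keeps ListenDrops and TCPLossUndo and skips TCPTimeouts.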

Here's a practical example where only statistics matching "loss" or "drop" should show up in the performance data:

$ ./check_netio.sh -i enp5s0 -t -r "loss,drop"
NETIO OK - enp5s0: Receive 27690651157 Bytes, Transmit 13173160148 Bytes|NET_enp5s0_RX=27690651157B;;;; NET_enp5s0_TX=13173160148B;;;; NET_enp5s0_ERR_RX=0;;;; NET_enp5s0_ERR_TX=0;;;; NET_enp5s0_DROP_RX=0;;;; NET_enp5s0_DROP_TX=0;;;; LockDroppedIcmps=0;;;; ListenDrops=0;;;; TCPLossUndo=119;;;; TCPLossFailures=12;;;; TCPLossProbes=5241;;;; TCPLossProbeRecovery=2666;;;; TCPBacklogDrop=0;;;; PFMemallocDrop=0;;;; TCPMinTTLDrop=0;;;; TCPDeferAcceptDrop=0;;;; TCPReqQFullDrop=0;;;; TCPOFODrop=0;;;;

This output is now short enough for the NRPE server to send it in full to the remote NRPE check plugin.

Besides working around the NRPE output limit, this also lets users decide for themselves which statistics are relevant and should be graphed over the long term.

