To use a simple monitoring of a MySQL server (whether this is MySQL itself, MariaDB, Percona or other fancy forks), the standard plugin is - obviously - check_mysql, which is part of the monitoring-plugins package. However check_mysql only makes a read query as it retrieves some basic statistics from the MySQL server.
To monitor write operations as well (e.g. to catch a full file system) I've been using a simple shell script in the past few years: check_mysql_write.sh .
This shell script (from now on mentioned as plugin) is executed from the monitoring server and establishes a remote connection to the given database server:
$ ./check_mysql_write.sh -H dbhost -P port -u dbuser -p dbpass -d database
Note: Of course this can also be executed locally on the DB server.
The plugin relies on a table monitoring in a dedicated database. Here are the statements to prepare this monitoring table:
MariaDB [(none)]> CREATE TABLE mymonitoring.monitoring ( id INT(1), mytime INT(13) );
MariaDB [(none)]> INSERT INTO mymonitoring.monitoring (id, mytime) VALUES (1, 1421421409);
The plugin check_mysql_write would then update the table with the current timestamp, with the following UPDATE query:
UPDATE monitoring SET mytime=$curtime WHERE id=1
This results in the following table with content:
MariaDB [mymonitoring]> select * from monitoring;
+------+------------+
| id | mytime |
+------+------------+
| 1 | 1558676432 |
+------+------------+
1 row in set (0.00 sec)
Unless there was an error during the UPDATE query, the plugin returns OK and - hurray - you have a working MySQL server.
However there's a problem when running multi master nodes, or a Galera cluster. As the plugin is executed on multiple database hosts of the same cluster, this table is the same across the cluster. And because there's only one row with a unique ID (1) in this table, this causes problems.
On a three node Galera cluster with 4 monitoring servers and a check interval of 10 seconds, this single row is updated at least 12 times within 10 seconds. Forcibly the write operations overlap and this creates a deadlock. The UPDATE query would then get the following error:
ERROR 1213 (40001) at line 1: Deadlock found when trying to get lock; try restarting transaction
And the plugin would return "CRITICAL: There was an error trying to write into mymonitoring.monitoring. Do a manual check.".
To overcome that deadlock situation, I adapted the structure of the monitoring table and the plugin itself. Instead of using a hard-coded unique ID, the table now has columns host and mytime. The host column is a VARCHAR type which has the values of the hostname on which the plugin runs. The new table is created like this:
MariaDB [(none)]> CREATE TABLE mymonitoring.monitoring ( host VARCHAR(100), mytime INT(13) );
The plugin now tries to update the timestamp of the row where the local hostname matches the hostname in the table:
UPDATE monitoring SET mytime=$curtime WHERE host = '$(hostname)'
Obviously if the plugin runs for the first time, there is no row with the local hostname and the plugin would fail. This is why the plugin itself creates the row at the very first run, if it can't find it in the table:
INSERT INTO monitoring (host, mytime) VALUES ('$(hostname)', $curtime)
Running the plugin is straightforward:
$ /usr/lib/nagios/plugins/check_mysql_write.sh -H galeranode1 -u monitoring -p secret -d mymonitoring
OK: Write query successful (UPDATE monitoring SET mytime=1558683558 WHERE host='icinga1')
Hurray, the write operation was successful. The plugins which run all over different monitoring hosts created the following entries in the monitoring table:
MariaDB [mymonitoring]> select * from monitoring;
+---------------------------+------------+
| host | mytime |
+---------------------------+------------+
| icinga1 | 1558680145 |
| icinga2 | 1558680107 |
| icinga3 | 1558680109 |
| icinga4 | 1558680131 |
| haproxy1 | 1558680149 |
| haproxy2 | 1558680148 |
+---------------------------+------------+
6 rows in set (0.00 sec)
The plugin check_mysql_write can now be used by everyone interested. You can find it on GitHub: https://github.com/Napsty/check_mysql_write .
Problems in Galera Clusters are not always easy to spot. Need help troubleshooting a Galera cluster? Contact us on Infiniroot.com.
No comments yet.
AWS Android Ansible Apache Apple Atlassian BSD Backup Bash Bluecoat CMS Chef Cloud Coding Consul Containers CouchDB DB DNS Database Databases Docker ELK Elasticsearch Filebeat FreeBSD Galera Git GlusterFS Grafana Graphics HAProxy HTML Hacks Hardware Icinga Influx Internet Java KVM Kibana Kodi Kubernetes LVM LXC Linux Logstash Mac Macintosh Mail MariaDB Minio MongoDB Monitoring Multimedia MySQL NFS Nagios Network Nginx OSSEC OTRS Observability Office OpenSearch PGSQL PHP Perl Personal PostgreSQL Postgres PowerDNS Proxmox Proxy Python Rancher Rant Redis Roundcube SSL Samba Seafile Security Shell SmartOS Solaris Surveillance Systemd TLS Tomcat Ubuntu Unix VMWare VMware Varnish Virtualization Windows Wireless Wordpress Wyse ZFS Zoneminder