MySQL replication out of sync: How to do a full database restore and solve error 1032 (could not execute update_rows)

Written by - 7 comments

Published on - last updated on January 31st 2022 - Listed in MySQL Database Database


After data corruption happened on a MySQL replication replica (slave) server, the databases needed to be fully synced again. In order to do this, the following steps are required:

  1. Lock the source (master) database server (applications writing to this MySQL server will fail during the lock)
  2. Retrieve the current source (master) status information
  3. Dump database(s) on (master) database server
  4. Unlock source (master) database server
  5. Transfer dump(s) to replica (slave) server
  6. Stop slave process on replica (slave) server
  7. Import database dump(s) on replica (slave) server
  8. Change replication settings by using the information retrieved in step 2 on replica (slave) server
  9. Start slave process on replica (slave) server

Note: The old replication terms were "master-slave replication", consisting of a master and one or more slave servers. The new terminology uses "source-replica replication", consisting of a source and one or more replica servers.

Monitoring replication status

The information, that the replica was out of sync, was retrieved by our monitoring, using the moniotring plugin check_mysql_slavestatus in the background:

***** Icinga  *****

 Notification Type: PROBLEM

 Service: MySQL Replication Status
 Host: mysql02
 Address: 10.10.50.49
 State: CRITICAL

 Date/Time: 2021-04-22 07:54:34 +0200

 Additional Info: CRITICAL: -h 10.10.50.49:-P 3306 Slave_SQL_Running: No

 Comment: []

After an analysis what happened, it was decided to "clear" the replica and do a full data-sync from the source.

Full replication (re-)sync

Note: A very good guide can be found on Remi Bergsma's article fixing an out-of-sync or corrupt MySQL slave.

Steps to do on the source server

On the source server, the replication state is reset and the tables must be locked to get a consistent state.

mysql> RESET MASTER;
Query OK, 0 rows affected (0.52 sec)

mysql> FLUSH TABLES WITH READ LOCK;
Query OK, 0 rows affected (0.02 sec)

mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      154 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

Save the information retrieved from the SHOW MASTER STATUS command. You'll need this later.

Then the database(s) can be saved using mysqldump:

root@mysql01:/backup# mysqldump --max_allowed_packet=512M --all-databases > mysqldump.sql

If a database contains large data fields, the --max_allowed_packet parameter should be used.

The dump should contain the CREATE DATABASE command:

root@mysql01:/backup# more mysqldump.sql
-- MySQL dump 10.13  Distrib 5.7.31, for Linux (x86_64)
--
-- Host: localhost    Database:
-- ------------------------------------------------------
-- Server version       5.7.31-log

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Current Database: `Jira`
--

CREATE DATABASE /*!32312 IF NOT EXISTS*/ `Jira` /*!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin */;

Now that the dump was successfully written, the tables can be unlocked:

mysql> UNLOCK TABLES;
Query OK, 0 rows affected (0.00 sec)

Now transfer the dump file to the replica server. If the dump is big, you should consider zipping it first:

root@mysql01:/backup# gzip mysqldump.sql

Steps to do on the replica server

Now with the dump transferred to the replica server, the slave process needs to be stopped and reset:

mysql> STOP SLAVE;
Query OK, 0 rows affected (0.00 sec)

mysql> RESET SLAVE;
Query OK, 0 rows affected (0.00 sec)

To get a clean state and get rid of still remaining data corruption, I recommend to DROP all databases (except mysql internal databases information_schema, mysql, performance_schema and sys).

mysql> DROP DATABASE Jira;
Query OK, 0 rows affected (0.00 sec)

The database dump can now be loaded into the MySQL server:

root@mysql02:/backup# gunzip < mysqldump.sql.gz | mysql --max_allowed_packet=512M -u root

Once the dump was restored (without error), the replication settings need to be adjusted. Use the information retrieved from the master earlier:

mysql> CHANGE MASTER TO master_host='mysql01', master_user='repl', master_password='secret', master_log_file='mysql-bin.000001', master_log_pos=154;
Query OK, 0 rows affected, 2 warnings (0.02 sec)

mysql> START SLAVE;
Query OK, 0 rows affected (0.00 sec)

In most cases, the replication should catch up since the master's database was unlocked.

Error 1032 Could not execute Update_rows event on table

In some cases the replication can have a couple of hiccups. In this case an error 1032 showed up in the SHOW SLAVE STATUS command:

mysql> SHOW SLAVE STATUS\G;
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: mysql01
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 40254923
               Relay_Log_File: mysql02-relay-bin.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1032
                   Last_Error: Could not execute Update_rows event on table monitoring.monitoring; Can't find record in 'monitoring', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000001, end_log_pos 441

                 Skip_Counter: 0
          Exec_Master_Log_Pos: 154
              Relay_Log_Space: 40255304
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1032
               Last_SQL_Error: Could not execute Update_rows event on table monitoring.monitoring; Can't find record in 'monitoring', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000001, end_log_pos 441
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: e1e2a203-eb81-11ea-8f82-0050568d9dca
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 210423 07:27:58
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

If you search for that particular error, most results will simply tell you to skip the error and start the slave again. Although this is technically correct, you should first make sure that you understand the SQL error. In this case the error is caused by yet another monitoring plugin check_mysql_write, which updates a record in the monitoring table in the monitoring database every 10 seconds. 

As this is only monitoring related and temporary data, this error can be skipped and the slave process started again:

mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
Query OK, 0 rows affected (0.00 sec)

mysql> START SLAVE;
Query OK, 0 rows affected (0.00 sec)

But as I mentioned, check_mysql_write updates the table every 10 seconds, so there are a ton of these errors to skip. This can be solved doing a loop, by making sure the error is related to the monitoring database:

root@mysql02:~# while true; do if [[ $(mysql -e "SHOW SLAVE STATUS\G" | grep "Last_SQL_Error:" | grep -c "monitoring") -gt 0 ]]; then mysql -e "STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; START SLAVE;"; else break; fi; sleep 1; done

This took several seconds until finally all errors related to the monitoring database were skipped and finally the replication was running again.

***** Icinga  *****

Notification Type: RECOVERY

Service: MySQL Replication Status
Host: mysql02
Address: 10.10.50.49
State: OK

Date/Time: 2021-04-23 07:37:48 +0200

Additional Info: OK: Slave SQL running: Yes Slave IO running: Yes / master: mysql01 / slave is 0 seconds behind master

Comment: []

Exclude databases from replication

Certain databases don't need to be replicated. Such databases can be excluded from the replication by using the binlog_ignore_db configuration parameter. The monitoring database is a perfect example:

root@mysql01:~# grep "binlog_ignore_db" /etc/mysql/mysql.conf.d/mysqld.cnf
binlog_ignore_db        = monitoring

Multiple databases can be specified by defining binlog_ignore_db several times:

root@mysql01:~# grep "binlog_ignore_db" /etc/mysql/mysql.conf.d/mysqld.cnf
binlog_ignore_db        = monitoring
binlog_ignore_db        = anotherdatabase

Note: The other way around (do not replicate anything, except this database) is also possible! By using the binlog_do_db parameter.



Add a comment

Show form to leave a comment

Comments (newest first)

Amsvrid from France wrote on Jun 4th, 2021:

Thank you for the solution, this helped me :)


Claudio Kuenzler from Switzerland wrote on Jun 1st, 2021:

saurabh, there is no MariaDB 10.23 version. The newest as of today is 10.5. As mentioned before, you need to adjust your query to SHOW ALL SLAVES STATUS. See MariaDB documentation for more.


saurabh from wrote on Jun 1st, 2021:

Dear Sir,
I am using mariadb 10.23 version.How to use this command in terminal.Please guide me.

# while true; do if [[ $(mysql -e "SHOW SLAVE STATUS\G" | grep "Last_SQL_Error:" | grep -c "monitoring") -gt 0 ]]; then mysql -e "STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; START SLAVE;"; else break; fi; sleep 1; done


Claudio Kuenzler from Switzerland wrote on May 31st, 2021:

saurabh, it looks like you are using a more frequent MariaDB version so SHOW SLAVE STATUS will not work. See the article MySQL and MariaDB multi source replication and how to monitor them with check_mysql_slavestatus for more information. To make it short, you need to adjust your commands to your MariaDB version, which would most likely be SHOW ALL SLAVES STATUS.


saurabh from wrote on May 29th, 2021:

dear sir i have maria db installed at starting all my replication work perfectly but after audit this error shows, i tried many process but still get error , i saw your blog post i think it will work for me but i don't know how to use that command i simply paste all (and shows error that i pasted below)..sir please help me how to use that command.



Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Update_rows_v1 event on table template.ap_status; Can't find record in 'ap_status', Error_code: 1032; handle
r error HA_ERR_KEY_NOT_FOUND; the event's master log masterdb-bin.000008, end_log_pos 3263585
when i enter while loop command this error show below
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '[
[ $(mysql -e "SHOW SLAVE STATUS\G" | grep "Last_SQL_Error:" | grep -c "ap_st...' at line 1
MariaDB [(none)]>


Claudio Kuenzler from Switzerland wrote on May 29th, 2021:

saurabh, you need the mysql command. Usually this command comes from the mysql-client (or mariadb-client) package.


saurabh from wrote on May 29th, 2021:

sir as you mentioned in this post please guide me how to run this query i mentioned below .. i am getting-no command found ,please guide me sir
root@mysql02:~# while true; do if [[ $(mysql -e "SHOW SLAVE STATUS\G" | grep "Last_SQL_Error:" | grep -c "monitoring") -gt 0 ]]; then mysql -e "STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; START SLAVE;"; else break; fi; sleep 1; done


RSS feed

Blog Tags:

  AWS   Android   Ansible   Apache   Apple   Atlassian   BSD   Backup   Bash   Bluecoat   CMS   Chef   Cloud   Coding   Consul   Containers   CouchDB   DB   DNS   Database   Databases   Docker   ELK   Elasticsearch   Filebeat   FreeBSD   Galera   Git   GlusterFS   Grafana   Graphics   HAProxy   HTML   Hacks   Hardware   Icinga   Influx   Internet   Java   KVM   Kibana   Kodi   Kubernetes   LVM   LXC   Linux   Logstash   Mac   Macintosh   Mail   MariaDB   Minio   MongoDB   Monitoring   Multimedia   MySQL   NFS   Nagios   Network   Nginx   OSSEC   OTRS   Observability   Office   OpenSearch   PGSQL   PHP   Perl   Personal   PostgreSQL   Postgres   PowerDNS   Proxmox   Proxy   Python   Rancher   Rant   Redis   Roundcube   SSL   Samba   Seafile   Security   Shell   SmartOS   Solaris   Surveillance   Systemd   TLS   Tomcat   Ubuntu   Unix   VMWare   VMware   Varnish   Virtualization   Windows   Wireless   Wordpress   Wyse   ZFS   Zoneminder