I was working on a tool that checks the slave status and, in some cases, starts the slave automatically. Here is a strange MySQL (Percona Server in my case) behaviour that I stumbled upon.
The topology is Master -> Slave, with multi-threaded replication.
Important slave configuration: Master_Info_File: mysql.slave_master_info (i.e. master_info_repository = TABLE).
Script logic is pretty simple:
- Connect to DB
- Get Slave status
- Check if it's my case
- Start slave
- Get slave status and return the result (a rough code sketch follows right after this list).
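Roughly, in code it looks like the sketch below. This is a minimal sketch assuming MySQL-python (MySQLdb); the connection parameters are placeholders and check_my_case() is a hypothetical helper standing in for the real condition:

import MySQLdb


def get_slave_status(cursor):
    # SHOW SLAVE STATUS returns a single row (or nothing if replication is not configured)
    cursor.execute("SHOW SLAVE STATUS")
    row = cursor.fetchone()
    if row is None:
        return None
    columns = [desc[0] for desc in cursor.description]
    return dict(zip(columns, row))


def check_my_case(status):
    # Hypothetical condition: SQL thread stopped while the IO thread is still running
    return status["Slave_SQL_Running"] == "No" and status["Slave_IO_Running"] == "Yes"


def main():
    conn = MySQLdb.connect(host="127.0.0.1", user="repl_tool", passwd="secret")
    cursor = conn.cursor()

    status = get_slave_status(cursor)      # get slave status
    if status and check_my_case(status):   # check if it's my case
        cursor.execute("START SLAVE")      # start slave
    return get_slave_status(cursor)        # get slave status again and return the result


if __name__ == "__main__":
    main()

Note that nothing here commits or touches autocommit, which is exactly what matters later.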
At first glance it looked correct, but when I started testing it I got unexpected results: START SLAVE got blocked and the script kept waiting. At that point it was not clear why, so I started to investigate the issue.
Here is part of the SHOW ENGINE INNODB STATUS output:
------- TRX HAS BEEN WAITING 20 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 5 page no 3 n bits 88 index `PRIMARY` of table `mysql`.`slave_worker_info` trx id 1687457 lock mode S waiting
------------------
---TRANSACTION 1687450, ACTIVE 262 sec
4 lock struct(s), heap size 1184, 10 row lock(s)
MySQL thread id 55, OS thread handle 0x7f4748d43700, query id 1260 XX.XX.XX.XX [username] Waiting for slave thread to start
text unavailable>
Thread #55 is a connection from my tool; the first one (trx 1687457, waiting for the lock) is a system slave thread. It looks like a deadlock, so let me explain my thoughts:
By default, at least Python (MySQL-python) works with autocommit=OFF (I think most MySQL libraries for various scripting languages behave the same way). When the script gets the slave status, it keeps holding locks, because master_info_repository=TABLE and, if you check the engine of these tables (mysql.slave_master_info, mysql.slave_worker_info), they are InnoDB.
Connection #55 was trying to start the slave, while the system slave thread was waiting for the lock on the `mysql`.`slave_worker_info` table to be released, and that lock was held by #55.
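The same picture can be seen from a separate session. Below is a sketch of a small diagnostic, assuming MySQL/Percona Server 5.6+ where information_schema.innodb_trx is available (connection parameters are placeholders): the tool's connection shows up as an open transaction with an old trx_started value, and the waiting slave worker shows up in the LOCK WAIT state.

import MySQLdb

conn = MySQLdb.connect(host="127.0.0.1", user="monitor", passwd="secret")
cursor = conn.cursor()
# List open InnoDB transactions with their state and the thread they belong to
cursor.execute(
    "SELECT trx_id, trx_state, trx_mysql_thread_id, trx_started, trx_query"
    " FROM information_schema.innodb_trx"
    " ORDER BY trx_started"
)
for trx_id, state, thread_id, started, query in cursor.fetchall():
    print("%s  %-9s  thread=%s  started=%s  %s" % (trx_id, state, thread_id, started, query))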
The issue was fixed by enabling autocommit for the connection.
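In MySQL-python terms the fix is roughly the following (connection parameters are placeholders again):

import MySQLdb

conn = MySQLdb.connect(host="127.0.0.1", user="repl_tool", passwd="secret")
conn.autocommit(True)  # MySQL-python defaults to autocommit=OFF
cursor = conn.cursor()

# With autocommit enabled, reading the slave status no longer leaves an open
# transaction holding locks on the InnoDB replication repository tables,
# so the subsequent START SLAVE is not blocked.
cursor.execute("SHOW SLAVE STATUS")

An explicit conn.commit() (or rollback()) right after reading the slave status would also release the locks before START SLAVE.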
#MySQL #Percona #Python