很多时候我们在mysql数据库中会经常出现事务之间阻塞的问题,也就是阻塞lock,oracle的话实际上之前提供的脚本已经可以很直观看出阻塞的问题,那么对于mysql数据库我们应如何快速查找定位问题根源?
之前分享了innotop工具和show engine innodb status都不能很好的解决我们的需求,所以今天主要基于几张事务表来写sql看能不能定位到。
数据库:mysql5.7.24 操作系统:centos7.3
1、数据准备
create database t default charset utf8 collate utf8_general_ci;use t;create table test_blocking(id int primary key, name varchar(12));insert into test_blocking select 1, 'hwb' from dual;insert into test_blocking select 2, 'hwb2' from dual;insert into test_blocking select 3, 'hwb3' from dual;
2、参数设置
为了实验效果,我们先将参数innodb_lock_wait_timeout设置为100,否则很快就会提示error 1205 (hy000): lock wait timeout exceeded; try restarting transaction
show variables like 'innodb_lock_wait_timeout';set global innodb_lock_wait_timeout=100 ;
1、第一个会话
select connection_id() from dual;set session autocommit=0;select * from test_blocking where id=1 for update;
2、第二个会话
在第二个连接会话中执行更新脚本
select connection_id() from dual;update test_blocking set name='kk' where id=1;
3、第三个会话–通过查询information_schema数据库下与事务相关的几个系统表
--查看哪个线程被哪个堵塞,waiting_thread_id代表等待线程,blocking_thread_id代表堵塞线程selectr.trx_id waiting_trx_id,r.trx_mysql_thread_id waiting_thread_id,r.trx_query waiting_query,b.trx_id blocking_trx_id,b.trx_mysql_thread_id blocking_thread_id,b.trx_query blocking_query,now( ) - r.trx_started blocking_time frominformation_schema.innodb_lock_waits winner join information_schema.innodb_trx b on b.trx_id = w.blocking_trx_idinner join information_schema.innodb_trx r on r.trx_id = w.requesting_trx_id; --查看源头sqlselecta.sql_text,c.id,d.trx_started,b.processlist_user,b.processlist_host fromperformance_schema.events_statements_current ajoin performance_schema.threads b on a.thread_id = b.thread_idjoin information_schema.processlist c on b.processlist_id = c.idjoin information_schema.innodb_trx d on c.id = d.trx_mysql_thread_id wherec.id = 304192 order byd.trx_started;
如下截图所示,第一个sql能够查到线程304193被线程304192阻塞了, 被阻塞的sql语句为“update test_blocking set name=’kk’ where id=1;”, 能够查到被阻塞了多长时间,但是无法查到源头sql语句。此时就需要第二个sql语句登场,找到源头语句。
ps:附一段查看阻塞线程更多信息的sql
selectp2.host blockedhost,p2.user blockeduser,r.trx_id blockedtrxid,r.trx_mysql_thread_id blockedthreadid,timestampdiff( second, r.trx_wait_started, current_timestamp ) waittime,r.trx_query blockedquery,l.lock_table blockedtable,m.lock_mode blockedlockmode,m.lock_type blockedlocktype,m.lock_index blockedlockindex,m.lock_space blockedlockspace,m.lock_page blockedlockpage,m.lock_rec blockedlockrec,m.lock_data blockedlockdata,p.host blocking_host,p.user blocking_user,b.trx_id blockingtrxid,b.trx_mysql_thread_id blockingthreadid,b.trx_query blockingquery,l.lock_mode blockinglockmode,l.lock_type blockinglocktype,l.lock_index blockinglockindex,l.lock_space blockinglockspace,l.lock_page blockinglockpage,l.lock_rec blockinglockrec,l.lock_data blockinglockdata,if( p.command = 'sleep', concat( p.time, ' seconds' ), 0 ) idel_in_trx frominformation_schema.innodb_lock_waits winner join information_schema.innodb_trx b on b.trx_id = w.blocking_trx_idinner join information_schema.innodb_trx r on r.trx_id = w.requesting_trx_idinner join information_schema.innodb_locks l on w.blocking_lock_id = l.lock_id and l.lock_trx_id = b.trx_idinner join information_schema.innodb_locks m on m.lock_id = w.requested_lock_id and m.lock_trx_id = r.trx_idinner join information_schema.processlist p on p.id = b.trx_mysql_thread_idinner join information_schema.processlist p2 on p2.id = r.trx_mysql_thread_id order bywaittime desc g;
这里不要太天真的认为第二个sql语句能够获取所有场景下的阻塞源头sql语句,实际业务场景,会话可能在执行一个存储过程或复杂的业务,有可能它执行完阻塞源头sql后,继续在执行其它sql语句,此时,你抓取的是这个连接会话最后执行的sql语句。