InnoDB的MySQL死锁问题 [英] MySQL deadlocking issue with InnoDB

查看:96
本文介绍了InnoDB的MySQL死锁问题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个中央数据库服务器和几个工作者"服务器,它们同时执行以下查询:

I have a central database server and several "worker" servers which are executing queries like this concurrently:

UPDATE job_queue 
SET
  worker = '108.166.81.112',
  attempts = attempts + 1,
  started = '2014-01-14 10:34:03',
  token = '13eb3e6a8c3e1becb34051e08f19fd62'
WHERE completed = '0000-00-00 00:00:00'
  AND (started = '0000-00-00 00:00:00' OR started < '2014-01-14 10:29:03')
  AND attempts < 2
ORDER BY priority DESC, inserted
LIMIT 1

有时,我的job_queue表被锁定,并且如果我运行"SHOW ENGINE INNODB STATUS",我会得到类似的东西:

Occasionally my job_queue table locks up and if I run "SHOW ENGINE INNODB STATUS", I get something like this:

------------------------
LATEST DETECTED DEADLOCK
------------------------
140114 10:34:15
*** (1) TRANSACTION:
TRANSACTION 0 46984514, ACTIVE 0 sec, process no 590, OS thread id 140366633146112 fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 20 lock struct(s), heap size 3024, 545 row lock(s)
MySQL thread id 677401, query id 19385205 10.179.103.110 root init
UPDATE job_queue SET worker='108.166.81.112', attempts=attempts+1, started='2014-01-14 10:34:03', token='13eb3e6a8c3e1becb34051e08f19fd62' WHERE completed='0000-00-00 00:00:00' AND (started='0000-00-00 00:00:00' OR started<'2014-01-14 10:29:03') AND attempts<2 ORDER BY priority DESC, inserted LIMIT 1
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 245767 n bits 128 index `PRIMARY` of table `database`.`job_queue` trx id 0 46984514 lock_mode X waiting
Record lock, heap no 34 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
 0: len 3; hex 800210; asc    ;; 1: len 6; hex 000002cced25; asc      %;; 2: len 7; hex 000003c00f1970; asc       p;; 3: len 30; hex 4f3a31343a2243425343616368654170704a6f62223a363a7b733a31393a; asc O:14:"CBSCacheAppJob":6:{s:19:;...(truncated); 4: len 1; hex 80; asc  ;; 5: len 8; hex 800012513c58bf24; asc    Q<X $;; 6: len 8; hex 800012513c58cc17; asc    Q<X  ;; 7: len 14; hex 31302e3137392e3130332e313333; asc 10.179.103.133;; 8: len 1; hex 81; asc  ;; 9: len 8; hex 800012513c58cc32; asc    Q<X 2;; 10: len 0; hex ; asc ;; 11: len 30; hex 353264393033616162656634346239626536306463346438666432303066; asc 52d903aabef44b9be60dc4d8fd200f;...(truncated);

*** (2) TRANSACTION:
TRANSACTION 0 46984485, ACTIVE 17 sec, process no 590, OS thread id 140366633547520 starting index read, thread declared inside InnoDB 500
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1216, 2 row lock(s), undo log entries 1
MySQL thread id 676723, query id 19385209 10.179.103.133 root init
UPDATE job_queue SET worker='10.179.103.133', attempts=attempts+1, started='2014-01-14 10:34:03', token='efd21d0d34f44badbc30386db4dd252e' WHERE completed='0000-00-00 00:00:00' AND (started='0000-00-00 00:00:00' OR started<'2014-01-14 10:29:03') AND attempts<2 ORDER BY priority DESC, inserted LIMIT 1
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 245767 n bits 128 index `PRIMARY` of table `database`.`job_queue` trx id 0 46984485 lock_mode X locks rec but not gap
Record lock, heap no 34 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
 0: len 3; hex 800210; asc    ;; 1: len 6; hex 000002cced25; asc      %;; 2: len 7; hex 000003c00f1970; asc       p;; 3: len 30; hex 4f3a31343a2243425343616368654170704a6f62223a363a7b733a31393a; asc O:14:"CBSCacheAppJob":6:{s:19:;...(truncated); 4: len 1; hex 80; asc  ;; 5: len 8; hex 800012513c58bf24; asc    Q<X $;; 6: len 8; hex 800012513c58cc17; asc    Q<X  ;; 7: len 14; hex 31302e3137392e3130332e313333; asc 10.179.103.133;; 8: len 1; hex 81; asc  ;; 9: len 8; hex 800012513c58cc32; asc    Q<X 2;; 10: len 0; hex ; asc ;; 11: len 30; hex 353264393033616162656634346239626536306463346438666432303066; asc 52d903aabef44b9be60dc4d8fd200f;...(truncated);

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 57 n bits 120 index `PRIMARY` of table `database`.`job_queue` trx id 0 46984485 lock_mode X waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
 0: len 3; hex 800001; asc    ;; 1: len 6; hex 000002ccdab1; asc       ;; 2: len 7; hex 000003c0352b3f; asc     5+?;; 3: len 30; hex 4f3a31323a224175746f50696c6f744a6f62223a363a7b733a31383a2200; asc O:12:"AutoPilotJob":6:{s:18:" ;...(truncated); 4: len 1; hex 82; asc  ;; 5: len 8; hex 800012513c58af57; asc    Q<X W;; 6: len 8; hex 800012513c58bf22; asc    Q<X ";; 7: len 14; hex 3130382e3136362e38312e313132; asc 108.166.81.112;; 8: len 1; hex 81; asc  ;; 9: len 8; hex 800012513c58bf23; asc    Q<X #;; 10: len 0; hex ; asc ;; 11: len 30; hex 616331376430346339326163613366323330646164323239363764336266; asc ac17d04c92aca3f230dad22967d3bf;...(truncated);

*** WE ROLL BACK TRANSACTION (1)
------------
TRANSACTIONS
------------
Trx id counter 0 46989905
Purge done for trx's n:o < 0 46986227 undo n:o < 0 0
History list length 24
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0 0, not started, process no 590, OS thread id 140366628529920
MySQL thread id 703864, query id 20047015 localhost root
SHOW ENGINE INNODB STATUS
---TRANSACTION 0 46989894, not started, process no 590, OS thread id 140366636758784
MySQL thread id 702822, query id 20046897 10.179.1.63 root
---TRANSACTION 0 46986223, ACTIVE 39782 sec, process no 590, OS thread id 140366626322176
25 lock struct(s), heap size 3024, 710 row lock(s), undo log entries 9
MySQL thread id 677706, query id 19994505 10.179.103.114 root
Trx read view will not see trx with id >= 0 46986224, sees < 0 46986224

任何人进一步写入表,然后对我超时,直到我重新启动MySQL服务器(或手动杀死死锁的作业):

Any further writes to the table then time out on me until I restart my MySQL server (or manually kill the deadlocked jobs):

PHP Fatal error:  Lock wait timeout exceeded; try restarting transaction(Query: "UPDATE job_queue SET worker='108.166.81.250', attempts=attempts+1, started='2014-01-14 21:27:45', token='369eae55a7f0eacad3b678a3410de8e4' WHERE completed='0000-00-00 00:00:00' AND (started='0000-00-00 00:00:00' OR started<'2014-01-14 21:22:45') AND attempts<2 ORDER BY priority DESC, inserted LIMIT 1") in /utilities/Database.php on line 53

有人可以向我解释为什么该查询导致死锁吗?我的印象是,InnoDB表上的所有查询都是原子发生的.有任何想法吗?

Can anyone explain to me why this query causes a deadlock? I was under the impression that all queries on an InnoDB table happened atomically. Any ideas?

推荐答案

这导致死锁,因为UPDATE查询正在锁定表中的所有行,并且根据所使用的索引(或缺少索引),两个不同会话可能会将它们锁定在略有不同的顺序中.请记住,UPDATEDELETESELECT ... FOR UPDATE将锁定它们遇到的所有行,无论这些行是否符合所有WHERE条件.因此,在使用它们时,应通过使用索引(最好是主键)并避免模糊或广泛选择的条件来努力确保它们遇到的行尽可能少.

This is causing a deadlock because the UPDATE query is locking all rows in the table, and depending on the indexes used (or lack thereof), two different sessions will potentially lock them in slightly different order. Remember that UPDATE, DELETE, and SELECT ... FOR UPDATE will lock all rows they encounter, whether those rows match all WHERE conditions or not. So when using them, you should try hard to be sure that they encounter as few rows as possible, by using indexes (ideally the primary key) and avoiding vague or wide-selecting conditions.

我对工作队列的建议非常普遍:尽可能少地锁定并始终以确定的顺序锁定.因此,通常:

My suggestion for work queues is pretty much universal: Lock as little as possible as rarely as possible and always in a deterministic order. So, generally:

  1. 使用非锁定读取(常规SELECT)通过查找您的工人知道该怎么做且目前无人认领的东西(lease_owner IS NULL AND lease_expiry IS NULL或类似的东西)来查找要完成的工作.
  2. 选择一个工作项(如果您愿意,也可以选择几个,但是一个要简单得多,通常可以实现完全令人满意的性能).
  3. 更新您的工作项(要声明它,但无论如何也需要更新):
  1. Use non-locking reads (regular SELECT) to find work to do by looking for stuff your worker knows how to do and is currently unclaimed (lease_owner IS NULL AND lease_expiry IS NULL -- or similar).
  2. Choose one work item (or a few if you dare, but one is far simpler and normally allows for perfectly acceptable performance).
  3. Update your work item (to claim it, but for any case it needs updating as well):
  1. 打开交易.
  2. 使用SELECT ... FOR UPDATE锁定您选择的工作项-如果不再无人认领,请中止并选择另一个.
  3. 使用您的工作人员ID和您的租约到期时间来更新您选择的工作项目.
  4. 立即提交交易.
  1. Open a transaction.
  2. Lock your chosen work item with SELECT ... FOR UPDATE -- If it is no longer unclaimed, abort and choose another.
  3. Update your chosen work item with your worker id and an expiry time for your lease on it.
  4. Commit your transaction immediately.

  • 开始处理您租用的工作项.
  • 在其他过程中,另一位民意调查者寻找废弃的工作并将其取消认领(通过与上述相同的更新过程).
  • 使用此设计,您可以轻松获得非常高的吞吐量(每秒数千个作业),并且基本上没有争用,也没有订购问题.选择不太可能与其他民意测验者冲突的工作的优化是简单有效的(例如,选择工作ID或类似的模数,以避免工作饿死).关键是要记住工作选择上的冲突是-只需中止然后重试,一切都会很快进行.

    You can easily get very high throughput with this design (thousands of jobs per second), and with essentially no contention and no ordering issues. Optimizations to choose work that is less likely to conflict with other pollers are simple and effective (e.g. modulus on Job ID or similar, chosen to avoid starvation of jobs). The key is to remember that conflict on job selection is okay -- just abort and try again and everything moves along very quickly.

    对工作队列项/作业的所有锁定写操作只能在单行上并且只能通过主键 完成.

    All locking writes for work queue items/jobs should be done only on single rows and by primary key only.

    这篇关于InnoDB的MySQL死锁问题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆