解释莫名其妙的僵局 [英] Explain inexplicable deadlock

查看:82
本文介绍了解释莫名其妙的僵局的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

首先,我根本看不到如何得到任何死锁,因为我没有使用显式锁定,所以只涉及一个表,每个表都有一个单独的过程可插入,选择和更新行,一次只能插入或更新一行,并且每个进程几乎很少(也许每分钟一次)运行.

First of all, I don't see how I could be getting any deadlock at all, since I am using no explicit locking, there's only one table involved, there's a separate process each to insert, select, and update rows, only one row is inserted or updated at a time, and each process only rarely (perhaps once a minute) runs at all.

这是一个电子邮件队列:

It's an email queue:

CREATE TABLE `emails_queue` (
  `id` varchar(40) NOT NULL,
  `email_address` varchar(128) DEFAULT NULL,
  `body` text,
  `status_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `status` enum('pending','inprocess','sent','discarded','failed') DEFAULT NULL,
  KEY `status` (`status`),
  KEY `status_time` (`status`,`status_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 

响应某些用户操作(大约每90秒一次),生成过程会对表进行插入,将状态设置为待处理".

The generating process, in response to some user action but roughly every 90 seconds, does an insert to the table, setting the status to "pending".

有一个监视过程,每分钟检查一次,以确保待处理"和失败"电子邮件的数量不多.只需不到一秒钟的运行时间,就不会给我带来任何麻烦.

There's a monitoring process that every minute checks that the number of "pending" and "failed" emails is not excessive. It takes less than a second to run and has never given me any trouble.

每分钟,发送过程会抓取所有待处理的电子邮件.它一次循环遍历一封电子邮件,将其状态设置为处理中",尝试发送,最后将其状态相应地设置为已发送",已丢弃"(它有决定不发送电子邮件的原因) )或失败"(由SMTP系统拒绝).

Every minute, the sending process grabs all the pending emails. It loops through and one email at a time, sets its status to "inprocess", tries to send it, and finally sets its status accordingly to "sent", "discarded" (it has reasons for deciding an email shouldn't go out), or "failed" (rejected by the SMTP system).

设置状态的声明不正常.

The statement for setting the status is unusual.

UPDATE emails_queue SET status=?, status_time=NOW() WHERE id=? AND status = ?

也就是说,仅当当前状态已经达到我认为的状态时,我才会更新状态.在使用这种机制之前,我不小心启动了两个发送过程,它们分别尝试发送相同的电子邮件.现在,如果发生这种情况,一个过程将成功地将电子邮件从待处理"移到处理中",而第二个过程将更新零行,发现有问题,然后跳过该电子邮件.

That is, I only update the status if the current status it already what I believe it to be. Before this mechanism, I accidentally kicked off two sending processes and they would each try to send the same email. Now, if that were to happen, one process would successfully move the email from "pending" to "inprocess", but the second one would update zero rows, realize there's a problem, and skip that email.

问题是,大约每100倍就有一次更新完全失败!我得到com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction

The problem is, about one time in 100, the update fails altogether! I get com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction

是什么?

这是唯一发生的表,也是唯一的查询,它仅在生产中发生(以最大程度地调查它).

This is the only table and only query that this happens to and it only happens in production (to maximize difficulty in investigating it).

看上去唯一不寻常的两件事是(1)更新参与WHERE子句的列,以及(2)status_time的(未使用的)自动更新.

The only two things that seem at all unusual are (1) updating a column that participates in the WHERE clause, and (2) the (unused) automatic updating of the status_time.

我正在寻找任何建议或诊断技术.

I'm looking for any suggestions or diagnostic techniques.

推荐答案

首先,死锁不依赖于显式锁定. MySQL的LOCK TABLE或使用非默认事务隔离模式都不需要具有死锁.如果您从不使用显式事务,则仍然会出现死锁.

Firstly, deadlocks do not depend on explicit locking. MySQL's LOCK TABLE or using non-default transaction isolation modes are NOT required to have a deadlock. You can still have deadlocks if you never use an explicit transaction.

死锁可以很容易地在单个表上发生.最常见的是来自单个热表.

Deadlocks can happen on a single table, quite easily. Most commonly it's from a single hot table.

如果您的所有事务仅执行一次单行插入,就可能甚至发生死锁.

Deadlocks can even happen if all your transactions just do a single row insert.

如果有的话,可能会发生死锁

A deadlock can happen if you have

  • 与数据库的多个连接(显然)
  • 任何内部涉及多个锁的操作.

不明显的是,在大多数情况下,单行插入或更新涉及多个锁.这样做的原因是在插入/更新期间还需要锁定辅助索引.

What is not obvious, is that most of the time, a single row insert or update involves more than one lock. The reason for this is that secondary indexes also need to be locked during inserts / updates.

SELECT不会锁定(假设您使用的是默认隔离模式,并且未使用FOR UPDATE),所以它们不会成为原因.

SELECTs won't lock (assuming you're using the default isolation mode, and aren't using FOR UPDATE) so they can't be the cause.

SHOW ENGINE INNODB STATUS是您的朋友.它将为您提供有关死锁的一堆(令人困惑的信息),尤其是最新的信息.

SHOW ENGINE INNODB STATUS is your friend. It will give you a bunch of (admittedly very confusing) information about deadlocks, specifically, the most recent one.

  • 您无法完全消除死锁,死锁将继续在生产中发生(即使在测试系统上,如果适当地施加压力也是如此)
  • 旨在实现极低的死锁.如果您的交易中有1%陷入僵局,则可能太多.
  • 如果您完全理解其含义,请考虑将事务的事务隔离级别更改为已提交提交
  • 确保您的软件可以正确处理死锁.

这篇关于解释莫名其妙的僵局的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆