MySQL producer consumer with multiple select threads


Question

I have the following situation:

There is a table containing different "jobs" to process, and several worker threads consume those jobs. As I don't want to delete the jobs once they are finished, I just set a "complete" flag on the record.

So in fact I have the following workflow (for each processing thread):

  1. Select the first record that is not complete
  2. Process the job
  3. Set the "complete" flag

How do I prevent other threads from consuming the same job (setting it to "complete" will take a while)? Also, updating the "complete" flag right after the select, in the 2nd step, will still cause some jobs to be processed twice, as there might be a large number of threads processing a small amount of jobs.

The easiest way would be to lock just the record (yes, I'm using InnoDB), so that other threads aren't able to read that single record. This is easily doable via "FOR UPDATE", but that will lock the entire table for future selects, and all other "FOR UPDATE" selects have to wait until the 1st one has completed.

Can anyone tell me how to solve such a situation without delaying all other threads? In effect, a select (which can also be restricted with LIMIT 1) should only "see" non-locked rows.

A sample table structure looks like this:

JobID | completed
123   | 0
124   | 1
125   | 0

with around 10-50 threads doing a simple

SELECT JobID from jobs WHERE completed = 0;
UPDATE jobs SET completed = 1 WHERE JobID = ?;

Thanks for any tips and tricks!

Recommended answer

I have a practical solution for you, one that I have seen implemented in a project at my workplace. Instead of using just 0 and 1 for incomplete and completed, expand your set to include more cases.

Let's call that column status. Here are the different values of that column and the corresponding states of the job.

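  1. When status is 0, the job has not been picked up by any worker thread yet.
  2. When status is 1, the job has been picked up by a worker thread and is being processed.
  3. When status is 2, the job has failed. (You should account for the possibility of processing failures.)
  4. When status is 3, the job has been completed.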

Your threads should contain logic such that they only pick up jobs whose status is 0 and change that status to 1. This prevents other threads from picking up jobs that are already being processed. When the job completes, the status is set to 3; if the job fails, the status is set to 2. The thread can then move on and look for another job that is still to be completed.
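The status workflow above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own, and the helper names (`claim_job`, `finish_job`, `demo`) are made up for the example. The part that matters is the `UPDATE ... WHERE JobID = ? AND status = 0` statement combined with a check of the affected row count, which works the same way with a MySQL connector.

```python
import sqlite3

# Status values as described in the answer.
STATUS_NEW, STATUS_RUNNING, STATUS_FAILED, STATUS_DONE = 0, 1, 2, 3


def claim_job(conn):
    """Try to claim one unclaimed job; return its JobID, or None if none is left."""
    while True:
        row = conn.execute(
            "SELECT JobID FROM jobs WHERE status = ? LIMIT 1", (STATUS_NEW,)
        ).fetchone()
        if row is None:
            return None  # nothing left to claim
        # Re-checking the status in the WHERE clause makes the claim safe:
        # only one worker's UPDATE can match the row while it is still 0.
        cur = conn.execute(
            "UPDATE jobs SET status = ? WHERE JobID = ? AND status = ?",
            (STATUS_RUNNING, row[0], STATUS_NEW),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]
        # Another worker claimed this job first; look for a different one.


def finish_job(conn, job_id, ok):
    """Mark a claimed job as completed (3) or failed (2)."""
    conn.execute(
        "UPDATE jobs SET status = ? WHERE JobID = ?",
        (STATUS_DONE if ok else STATUS_FAILED, job_id),
    )
    conn.commit()


def demo():
    """Single-worker walk-through on an in-memory table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (JobID INTEGER PRIMARY KEY,"
        " status INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany("INSERT INTO jobs (JobID) VALUES (?)", [(123,), (124,), (125,)])
    conn.commit()
    claimed = []
    while (job_id := claim_job(conn)) is not None:
        claimed.append(job_id)
        # Pretend job 124 fails so that the failure state is exercised too.
        finish_job(conn, job_id, ok=(job_id != 124))
    return claimed, dict(conn.execute("SELECT JobID, status FROM jobs"))
```

Because the claim is a single short UPDATE, no row lock has to be held while the job itself is being processed, which is what keeps the other workers from waiting.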

You could also have the threads consider picking up jobs with status 2, but then you will have to define logic that limits the number of retries.
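One way to bound the retries is to count attempts on the job row itself. The sketch below assumes an extra `attempts` column and a `MAX_ATTEMPTS` limit, neither of which appears in the original table, and again uses Python's built-in sqlite3 as a stand-in for MySQL; the function names are hypothetical.

```python
import sqlite3

MAX_ATTEMPTS = 3  # assumed retry limit, not part of the original answer


def claim_with_retry(conn):
    """Claim a new (0) or failed (2) job that still has attempts left.

    Returns the JobID, or None if nothing is claimable or the race was lost
    (a real worker would simply call this again in the latter case).
    """
    row = conn.execute(
        "SELECT JobID, status FROM jobs"
        " WHERE status IN (0, 2) AND attempts < ? LIMIT 1",
        (MAX_ATTEMPTS,),
    ).fetchone()
    if row is None:
        return None
    job_id, seen_status = row
    # Claim the job and count the attempt in one statement, but only if the
    # row is still in the state we just read.
    cur = conn.execute(
        "UPDATE jobs SET status = 1, attempts = attempts + 1"
        " WHERE JobID = ? AND status = ? AND attempts < ?",
        (job_id, seen_status, MAX_ATTEMPTS),
    )
    conn.commit()
    return job_id if cur.rowcount == 1 else None


def demo_retries():
    """Let a single job fail repeatedly until the retry limit is reached."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (JobID INTEGER PRIMARY KEY,"
        " status INTEGER NOT NULL DEFAULT 0,"
        " attempts INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO jobs (JobID) VALUES (123)")
    conn.commit()
    tries = 0
    while claim_with_retry(conn) is not None:
        tries += 1
        # Simulate a processing failure: put the job into status 2.
        conn.execute("UPDATE jobs SET status = 2 WHERE JobID = 123")
        conn.commit()
    return tries, conn.execute("SELECT status, attempts FROM jobs").fetchone()
```

After `MAX_ATTEMPTS` failures the job stays in status 2 and is no longer selected, so it can be inspected manually instead of being retried forever.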

After a long discussion, we arrived at the solution together. My answer above works well in the more general case, where the "job" is a process that takes some time to complete. But that wasn't the case in the OP's problem.

So the solution that eventually worked was this:

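BEGIN;
-- Pick one incomplete job and lock its row; the scalar subquery must return a single column.
SELECT * FROM Jobs WHERE JobID = (SELECT JobID FROM Jobs WHERE completed = 0 LIMIT 1) LOCK IN SHARE MODE;
-- ? is the JobID returned by the SELECT above.
UPDATE Jobs SET completed = 1 WHERE JobID = ?;
COMMIT;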
