原子标记并返回数据库中的一组行 [英] Atomically mark and return a group of rows in database

查看:138
本文介绍了原子标记并返回数据库中的一组行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在编写一个后台服务,需要处理一系列作业,作为sqlserver表中的记录存储。该服务需要找到需要工作的最旧的20个作业( where status ='new'),标记它们( set status ='processing '),运行它们,然后更新作业。

I'm writing a background service that needs to process a series of jobs, stored as records in a sqlserver table. The service needs to find the oldest 20 jobs that need to be worked (where status = 'new'), mark them (set status = 'processing'), run them, and update the jobs afterward.

这是我需要帮助的第一部分。可能有多个线程同时访问数据库,我想确保mark& return查询以原子方式运行,或几乎是这样。

It's the first part I need help with. There could be multiple threads accessing the database at the same time, and I want to make sure that the "mark & return" query runs atomically, or nearly so.

这个服务将花费相对较少的时间访问数据库,如果一个作业运行两次,这不是世界末日,所以我可能能够接受一个小概率的作业运行不止一次,以增加代码的简单性。

This service will be spending comparatively little time accessing the database, and it's not the end of the world if a job gets run twice, so I might be able to accept a small probability of jobs running more than once for increased simplicity in the code.

这是最好的方法是什么?我对我的数据层使用linq-to-sql,但我认为我必须下降到t-sql为此。

What is the best way to do this? I'm using linq-to-sql for my data layer, but I assume I'll have to drop down into t-sql for this.

推荐答案

您的作业表是一个队列。写用户表备份队列是一个众所周知的错误倾向,因为它导致死锁和concurency问题。

Your table of jobs is a queue. Writing user tables backed up queues is a notoriously error prone as it leads to deadlocks and concurency issues.

最简单的事情是删除用户表,并使用一个真< a href =http://msdn.microsoft.com/en-us/library/ms190495.aspx =nofollow noreferrer>队列。这将给你的系统测试和验证的代码库的死锁免费concurency自由队列。问题是队列周围的整个范式从INSERT和DELETE / UPDATE更改为 SEND / RECEIVE 。另一方面使用内置队列,你会得到一些非常强大的免费的好东西,即激活相关项目锁定

The simplest thing would be to drop the user table and use a true queue instead. This will give you deadlock free concurency free queue on system tested and validated code base. The problem is that the whole paradigm around queues changes from INSERT and DELETE/UPDATE to SEND/RECEIVE. On the other hand with built-in queue you get some very powerfull free goodies, namely Activation and correlated items locking.

如果你想继续沿用户表支持的队列的路径,那么写入用户表队列的第二最重要的技巧是使用UPDATE ... OUTPUT :

If you want to continue down the path of user table backed queues then the second most important trick in writing user tables queues is to use UPDATE ... OUTPUT:

WITH cte AS (
  SELECT TOP(20) status, id, ...
  FROM table WITH (ROWLOCK, READPAST, UPDLOCK)
  WHERE status = 'new'
  ORDER BY enqueue_time)
UPDATE cte
  SET status = 'processing'
OUTPUT
  INSERTED.id, ...

CTE语法只是为了方便放置TOP和ORDER BY正确地,查询可以像esily一样使用派生表写入。你不能使用直接UPDATE ... TOP,因为UPDATE不支持ORDER BY,你需要这个来满足你的需求的最老的部分。需要锁定提示来促进并行处理线程之间的高完整性。

The CTE syntax is just for convenience of placing the TOP and ORDER BY properly, the query can be written using derived tables just as esily. You cannot use straight UPDATE ... TOP because UPDATE does not support an ORDER BY and you require this to satisfy the 'oldest' part of your requirement. The lock hints are needed to facilitate high concurency between parallel processing threads.

我说这是第二个最重要的技巧。最重要的是如何组织表。对于队列必须(status,enqueue_time)进行聚类。如果你没有正确地组织表,你会得到死锁。优先评论:在这种情况下,分割是相关的。

I said this is the second most important trick. The most important is how you organize the table. For a queue it must be clustered by (status, enqueue_time). If you don't organize the table properly you'll end up with deadlocks. Pre-emptive comment: fragmentation is irelevant in this scenario.

这篇关于原子标记并返回数据库中的一组行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆