Using SQL Server as a DB queue with multiple clients

Question

Given a table that is acting as a queue, how can I best configure the table/queries so that multiple clients process from the queue concurrently?

For example, the table below indicates a command that a worker must process. When the worker is done, it will set the processed value to true.

| ID | COMMAND | PROCESSED |
|  1 | ...     | true      |
|  2 | ...     | false     |
|  3 | ...     | false     |

The clients might obtain one command to work on like so:

select top 1 COMMAND
from EXAMPLE_TABLE
with (UPDLOCK, ROWLOCK)
where PROCESSED = 0;  -- T-SQL has no boolean literal; a BIT column is compared to 0 or 1

However, if there are multiple workers, each tries to get the row with ID=2. Only the first will get the pessimistic lock, the rest will wait. Then one of them will get row 3, etc.

What query/configuration would allow each worker client to get a different row and work on them concurrently?

Edit:

Several answers suggest variations on using the table itself to record an in-process state. I thought that this would not be possible within a single transaction. (i.e., what's the point of updating the state if no other worker will see it until the txn is committed?) Perhaps the suggestion is:

# start transaction
update to 'processing'
# end transaction
# start transaction
process the command
update to 'processed'
# end transaction

Is this the way people usually approach this problem? It seems to me that the problem would be better handled by the DB, if possible.

Recommended answer

I recommend you go over Using Tables as Queues. Properly implemented queues can handle thousands of concurrent users and service as many as half a million enqueue/dequeue operations per minute. Until SQL Server 2005 the solution was cumbersome: it involved mixing a SELECT and an UPDATE in a single transaction with just the right mix of lock hints, as in the article linked by gbn. Luckily, since SQL Server 2005 introduced the OUTPUT clause, a much more elegant solution is available, and MSDN now recommends using the OUTPUT clause:


You can use OUTPUT in applications that use tables as queues, or to hold intermediate result sets. That is, the application is constantly adding or removing rows from the table
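
As a minimal sketch of the "removing rows" case the documentation mentions, a destructive dequeue can delete one row and return it in the same statement. EXAMPLE_TABLE and its ID, COMMAND and PROCESSED columns are taken from the question; the ROWLOCK and READPAST hints are added here on the assumption that several workers run the statement at once.

```sql
-- Remove one unprocessed row and return it to the caller in a single statement.
-- READPAST skips rows that other workers currently hold locked.
DELETE TOP (1) FROM EXAMPLE_TABLE WITH (ROWLOCK, READPAST)
OUTPUT DELETED.ID, DELETED.COMMAND
WHERE PROCESSED = 0;
```

TOP without ORDER BY gives no ordering guarantee, so this form suits queues that do not need strict first-in, first-out delivery.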

Basically there are 3 parts of the puzzle you need to get right in order for this to work in a highly concurrent manner:

1) You need to dequeue atomically. You have to find the row, skip any locked rows, and mark it as 'dequeued' in a single, atomic operation. This is where the OUTPUT clause comes into play:

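-- EXAMPLE_TABLE is the queue table from the question (TABLE itself is a reserved word)
with CTE as (
  -- pick one unprocessed row, skipping rows locked by other workers
  SELECT TOP(1) COMMAND, PROCESSED
  FROM EXAMPLE_TABLE WITH (READPAST)
  WHERE PROCESSED = 0)
UPDATE CTE
  SET PROCESSED = 1
  OUTPUT INSERTED.*;  -- return the dequeued row to the caller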
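
One possible way to use this statement, sketched under assumptions: the dequeue runs inside an explicit transaction, so the row stays locked (and is skipped by other workers through READPAST) until the work commits, and a failure rolls the row back to PROCESSED = 0. The table variable @work, the column types and the TRY/CATCH wrapper are illustrative and not part of the original answer; THROW requires SQL Server 2012 or later.

```sql
SET XACT_ABORT ON;

BEGIN TRY
    BEGIN TRANSACTION;

    -- Hypothetical holder for the dequeued row; column types are assumed.
    DECLARE @work TABLE (ID INT, COMMAND NVARCHAR(400));

    WITH CTE AS (
        SELECT TOP (1) ID, COMMAND, PROCESSED
        FROM EXAMPLE_TABLE WITH (ROWLOCK, READPAST)
        WHERE PROCESSED = 0
    )
    UPDATE CTE
    SET PROCESSED = 1
    OUTPUT INSERTED.ID, INSERTED.COMMAND INTO @work (ID, COMMAND);

    -- Process the command held in @work here. The updated row stays
    -- exclusively locked until COMMIT, so other workers skip it.

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- On failure the update is undone and the row becomes available again.
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    THROW;
END CATCH;
```

In a real worker the processing usually happens in the client application, which would hold the transaction open between the dequeue and the commit.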

2) You must structure your table with the PROCESSED column as the leftmost clustered index key. If ID was used as a primary key, move it to the second column of the clustered key. Whether to keep a non-clustered key on the ID column is open to debate, but I strongly favor not having any secondary non-clustered indexes on queues:

CREATE CLUSTERED INDEX cdxTable ON EXAMPLE_TABLE (PROCESSED, ID);
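
For context, a self-contained sketch of a queue table that fits this index might look as follows. The column types, the IDENTITY property, the default constraint name and the sample commands are assumptions for illustration; only the table and column names and the index definition come from the thread. The index statement is repeated so the script runs on its own.

```sql
-- Assumed shape of the queue table from the question.
CREATE TABLE EXAMPLE_TABLE (
    ID        INT IDENTITY(1,1) NOT NULL,
    COMMAND   NVARCHAR(400)     NOT NULL,
    PROCESSED BIT               NOT NULL
        CONSTRAINT DF_EXAMPLE_TABLE_PROCESSED DEFAULT (0)
);

-- PROCESSED first, so unprocessed rows sit together at the head of the index.
CREATE CLUSTERED INDEX cdxTable ON EXAMPLE_TABLE (PROCESSED, ID);

-- Enqueue: new rows start with PROCESSED = 0 through the default.
INSERT INTO EXAMPLE_TABLE (COMMAND)
VALUES (N'first command'), (N'second command');
```

No primary key or secondary index is declared on ID here, in line with the preference stated above.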

3) You must not query this table by any other means but by Dequeue. Trying to do Peek operations or trying to use the table both as a Queue and as a store will very likely lead to deadlocks and will slow down throughput dramatically.

The combination of an atomic dequeue, the READPAST hint when searching for elements to dequeue, and a leftmost clustered index key based on the processing bit ensures very high throughput under a highly concurrent load.
