为什么在原子语句上需要锁提示? [英] Why are lock hints needed on an atomic statement?

查看:83
本文介绍了为什么在原子语句上需要锁提示?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

将锁应用于以下语句有什么好处?

What is the benefit of applying locks to the below statement?

类似地,如果不包含这些提示,我们会​​看到什么问题?也就是说,它们是否可以防止出现比赛情况,提高性能或其他方面的作用?询问他们是否被包括在内是为了防止某些我尚未考虑的问题,而不是我认为的比赛条件.

Similarly, what issue would we see if we didn't include these hints? i.e. Do they prevent a race condition, improve performance, or maybe something else? Asking as perhaps they're included to prevent some issue I've not considered rather than the race condition I'd assumed.

NB:这是在这里提出的一个问题的溢出: SQL FIFO队列的线程安全UPDATE TOP 1

NB: This is an overflow from a question asked here: SQL Threadsafe UPDATE TOP 1 for FIFO Queue

WITH nextRecordToProcess AS
(
    SELECT TOP(1) Id, StatusId
    FROM    DemoQueue
    WHERE   StatusId = 1 --Ready for processing
    ORDER BY DateSubmitted, Id 
)
UPDATE nextRecordToProcess
SET StatusId = 2 --Processing
OUTPUT Inserted.Id 

要求

  • SQL用于从队列中检索未处理的记录.
  • 要获取的记录应该是队列中状态为Ready(StatusId = 1)的第一条记录.
  • 可能有多个工作程序/会话正在处理此队列中的消息.
  • 我们要确保队列中的每个记录仅被提取一次(即由单个工作程序处理),并且确保每个工作程序按消息在队列中出现的顺序进行处理.
  • 一个工人比另一个工人更快地工作是可以的(即,如果工人A拿起记录1然后工人B拿起记录2,如果工人B在工人A完成处理记录1之前完成了记录2的处理就可以了).我们仅在获取记录的范围内关注.
  • 没有正在进行的交易;即我们只想从队列中提取记录;我们不需要将其保持锁定状态,直到我们将状态从Processing升级到Processed.
  • Requirement

    • The SQL is used to retrieve an unprocessed record from a queue.
    • The record to be obtained should be the first record in the queue with status Ready (StatusId = 1).
    • There may be multiple workers/sessions processing messages from this queue.
    • We want to ensure that each record in the queue is only picked up once (i.e. by a single worker), and that each worker processes messages in the order in which they appear in the queue.
    • It's OK for one worker to work faster than another (i.e. if Worker A picks up record 1 then Worker B picks up record 2 it's OK if worker B completes the processing of record 2 before Worker A has finished processing record 1). We're only concerned within the context of picking up the record.
    • There's no ongoing transaction; i.e. we just want to pick up the record from the queue; we don't need to keep it locked until we come back to progress the status from Processing to Processed.
    • CREATE TABLE Statuses
      (
          Id SMALLINT NOT NULL PRIMARY KEY CLUSTERED
          , Name NVARCHAR(32) NOT NULL UNIQUE
      )
      GO
      INSERT Statuses (Id, Name)
      VALUES (0,'Draft')
      , (1,'Ready')
      , (2,'Processing')
      , (3,'Processed')
      , (4,'Error')
      GO
      CREATE TABLE DemoQueue
      (
          Id BIGINT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED
          , StatusId SMALLINT NOT NULL FOREIGN KEY REFERENCES Statuses(Id)
          , DateSubmitted DATETIME --will be null for all records with status 'Draft'
      )
      GO
      

      建议声明

      在讨论队列的各个博客中以及引起该讨论的问题中,建议将上述声明更改为包含如下所示的锁定提示:

      Suggested Statement

      In the various blogs discussing queues, and in the question which caused this discussion, it's suggested that the above statement be changed to include lock hints as below:

      WITH nextRecordToProcess AS
      (
          SELECT TOP(1) Id, StatusId
          FROM    DemoQueue WITH (UPDLOCK, ROWLOCK, READPAST)
          WHERE   StatusId = 1 --Ready for processing
          ORDER BY DateSubmitted, Id 
      )
      UPDATE nextRecordToProcess
      SET StatusId = 2 --Processing
      OUTPUT Inserted.Id 
      

      我的理解

      我知道锁定是需要这些提示的好处的:

      My Understanding

      I understand that were locking required the benefits of these hints would be:

      • UPDLOCK:因为我们正在选择记录以更新其状态,所以我们需要确保在读取该记录之后但在更新之前,其他任何读取该记录的会话都无法通过以下方式读取该记录:进行更新的意图(或者,这样的语句必须等到我们执行更新并释放锁后,其他会话才能看到具有新值的记录).
      • ROWLOCK:在锁定记录的同时,我们要确保锁定仅影响所锁定的行;即因为我们不需要锁定很多资源/我们不想影响其他进程/我们希望其他会话能够读取队列中的下一个可用项目,即使该项目与锁定记录在同一页面中
      • READPAST:如果另一个会话已经在从队列中读取项目,而不是等待该会话释放它的锁,则我们的会话应选择队列中的下一个可用(未锁定)记录.

      即我们运行下面的代码,我认为这是有道理的:

      i.e. Were we running the below code I think this would make sense:

      DECLARE @nextRecordToProcess BIGINT
      
      BEGIN TRANSACTION
      
      SELECT TOP (1) @nextRecordToProcess = Id
      FROM    DemoQueue WITH (UPDLOCK, ROWLOCK, READPAST)
      WHERE   StatusId = 1 --Ready for processing
      ORDER BY DateSubmitted, Id 
      
      --and then in a separate statement
      
      UPDATE DemoQueue
      SET StatusId = 2 --Processing
      WHERE Id = @nextRecordToProcess
      
      COMMIT TRANSACTION
      
      --@nextRecordToProcess is then returned either as an out parameter or by including a `select @nextRecordToProcess Id`
      

      然而,当选择和更新发生在同一份声明中我会假设没有其他会话可以阅读我们的会话的读取和放大器之间的相同的记录;更新;因此不需要显式的锁定提示.

      However when the select and update occur in the same statement I'd have assumed that no other session could read the same record between our session's read & update; so there'd be no need for explicit lock hints.

      从根本上我对锁的工作方式有误解吗?还是这些提示的建议与其他一些相似但不同的用例有关?

      Have I misunderstood something fundamentally with how locks work; or is the suggestion for these hints related to some other similar but different use case?

      推荐答案

      John是正确的,因为它们是优化,但是在SQL世界中,这些优化可能意味着快速"与难以忍受的数据大小"之间的差异和/或作品"与无法使用的死锁混乱"之间的区别.

      John is right in as these are optimizations, but in SQL world these optimizations can mean the difference between 'fast' vs. 'unbearable size-of-data slow' and/or the difference between 'works' vs. 'unusable deadlock mess'.

      readpast提示很清楚.对于其他两个,我觉得我需要添加更多上下文:

      The readpast hint is clear. For the other two, I feel I need to add a bit more context:

      • ROWLOCK提示用于防止页面锁定粒度扫描.锁定粒度(行与页面)是在查询开始时预先确定的,并且基于查询将扫描的页数的估计值(第三粒度,表,仅在特殊情况下使用,不适用于此处) ).通常,出队操作永远不必扫描这么多页面,以便引擎考虑页面粒度.但是当引擎决定使用页面锁定粒度时,我已经见过疯狂"的情况,这会导致出队的阻塞和死锁
      • UPDLOCK是必需的,以防止出现升级锁死锁的情况.逻辑上将UPDATE语句拆分为搜索需要更新的行,然后更新这些行.搜索需要锁定它评估的行.如果该行符合条件(满足WHERE条件),则该行将被更新,并且update始终是互斥锁.因此,问题是如何在搜索过程中锁定行?如果您使用共享锁,则两个UPDATE将查看同一行(由于共享锁允许它们,它们可以进行查看),都决定该行合格,并且都尝试将锁升级为独占->死锁.如果您在搜索过程中使用排他锁,则不会发生死锁,但是即使该行不符合条件,UPDATE也会在所有其他读取的行上发生UPDATE冲突(更不用说不能在不破坏<的情况下提早释放排他锁两阶段锁定).这就是为什么存在一个U模式锁的原因,该模式与Shared兼容(以便对候选行的UPDATE求值不会阻止读取),但与另一个U模式不兼容(从而使两个UPDATE不会死锁).典型的基于CTE的出队需要此提示有两个原因:

      • ROWLOCK hint is to prevent page lock granularity scans. The lock granularity (row vs. page) is decided upfront when the query starts and is based on an estimate of the number pages that the query will scan (the third granularity, table, will only be used in special cases and does not apply here). Normally dequeue operations should never have to scan so many pages so that page granularity is considered by the engine. But I've seen 'in the wild' cases when the engine decided to use page lock granularity, and this leads to blocking and deadlocks in dequeue
      • UPDLOCK is needed to prevent the upgrade lock deadlock scenario. The UPDATE statement is logically split into a search for the rows that need to be updated and then update the rows. The search needs to lock the rows it evaluates. If the row qualifies (meets the WHERE condition) then the row is updated, and update is always an exclusive lock. So the question is how do you lock the rows during the search? If you use a shared lock then two UPDATE will look at the same row (they can, since the shared lock allows them), both decide the row qualifies and both try to upgrade the lock to exclusive -> deadlock. If you use exclusive locks during the search the deadlock cannot happen, but then UPDATE will conflict on all rows evaluated with any other read, even if the row does not qualifies (not to mention that Exclusive locks cannot be released early w/o breaking two-phase-locking). This is why there is an U mode lock, one that is compatible with Shared (so that UPDATE evaluation of candidate rows does not block reads) but is incompatible with another U (so that two UPDATEs do not deadlock). There are two reasons why the typical CTE based dequeue needs this hint:

      1. 由于是CTE,查询处理并不总是了解CTE中的SELECT是UPDATE的目标,因此应使用U模式锁和
      2. 出队操作将始终在相同的行之后进行更新(这些行已出队"),因此死锁频繁发生.

      这篇关于为什么在原子语句上需要锁提示?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆