查询不同索引时数据不匹配 [英] Data mismatch when querying with different indexes

查看:189
本文介绍了查询不同索引时数据不匹配的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我偶然发现了一个非常好奇的案子。我们有一个SQL Server 2012数据库和这样的表

  CREATE TABLE [dbo]。[ActiveTransactions] 

[Id] [BIGINT] IDENTITY(1,1)NOT NULL,
[Amount] [DECIMAL](12,4)NOT NULL,
[TypeId] [SMALLINT] NOT NULL,
[GameProviderId] [SMALLINT] NULL,
[UserId] [INT] NOT NULL,
[校验和] [NVARCHAR](150)NOT NULL,
[Date] [DATETIME2] (7)NOT NULL,
[ExternalKey] [VARCHAR](60)NULL,
[ExternalDescription] [NVARCHAR](1000)NULL,
[OperatorId] [SMALLINT] NULL,
[GameId] [NVARCHAR](50)NULL

此表有多个索引,但我想在这里讨论的两个是 PK_ActiveTransactions (主键,聚集):

  ALTER TABLE [dbo]。[ActiveTransactions] 
ADD CONSTRAINT [PK_ActiveTransactions]
PRIMARY KEY CLUSTERED([Id] DESC)

IX_Active Transactions_UserIdAmount (非聚集,非唯一):

  CREATE NONCLUSTERED INDEX [IX_ActiveTransactions_UserIdAmount] 
ON [dbo]。[ActiveTransactions]([UserId] ASC,[Id] DESC)
INCLUDE([Amount])

有一个查询依赖于我的解决方案的主要部分,并在某个进程启动时被调用。基本上每次在我的代码端调用 SomeMethod 时,它会启动SQL事务,然后执行程序(如下所示),从而锁定它选择的条目,然后计算一些东西和在该表中插入新行并提交事务。锁定过程执行此SQL语句

  SELECT TOP 1 
id,
金额,
TypeId ,
GameProviderId,
UserId,
[校验和],
[日期],
ExternalKey
来自ActiveTransactions WITH(UPDLOCK)
WHERE @ UserId = UserId
ORDER BY Id DESC

现在就是这种情况。当我查看此表中的某些条目时,似乎有多个(同时请求)条目为相同的 @UserId 选择了相同的条目。确切地说,有5个新条目(当时要求,因为它们具有相同的 [日期] 值,这是在代码侧计算的),所有这些都选择了相同的条目然后重新计算一些东西(所有5个计算相同的东西)并同时插入5个新行,而不是一个接一个地执行(这应该由 WITH(UPDLOCK) SELECT查询结束时的语句,我相信)。



然后我尝试做了这样的事情,我打开了三个新的查询窗口,我用 BEGIN TRAN 开始交易命令在一个窗口中然后在SELECT语句上面执行,在其他两个窗口中我做了同样的事情,当我提交第一个语句时,第二个查询在它之后立即获取它,在提交第二个语句之后,第三个获取它。 (一切都按预期工作),添加 WITH INDEX(INDEX_NAME) UPDLOCK 仍然存在)后,事情开始变得怪异在查询结束时。到第一个选择我指定 WITH INDEX(PK_ActiveTransactions)(主键)和另外两个我指定 WITH INDEX(IX_ActiveTransactions_UserIdAmount)。运行所有这三个命令后,加上第一个命令在同一个表中的 INSERT ,(第二个和第三个仍在等待第一个完成)但是当我提交第一个命令,第二个获取旧条目,第三个同时获取新条目。我认为这种行为可能导致上面解释的错误,但这怎么可能?



SQL Server会同时为同一个查询使用两个不同的执行计划(因此使用不同的索引)吗?这个表在一天结束时达到了大约10-15百万个条目,但每天早上大约在早上6点执行作业,这使得表只有1-2百万行。这会导致SQL Server意外切换索引吗?但无论如何我认为这是一个系列问题意味着即使提交后,索引也可能不包含其中的已提交数据。



上述问题仅发生了几次,我能够识别它们发生两次

解决方案

你需要检查你的表和索引上正在获取什么锁(见下方链接)。 SQL Server能够对索引和数据进行单独锁定。默认情况下,它不会锁定所有索引。

注意:下面是一个猜测。

查询#1永远不会获得对IX_ActiveTransactions_UserIdAmount的锁定,因此查询#2能够搜索索引并获取锁定并且然后等待释放行数据锁以完成其操作。释放此锁后,查询#2会抓取并保留它,同时执行其他代码。同时,查询#3仍在等待数据和索引锁定。一旦查询#2完成并且所有锁被释放,那么只有查询#3能够使用索引进行搜索并因此搜索最新数据。



摘要:

查询#1和查询#2都可以并行搜索表并返回相同的表行。查询#2必须等待查询#1完成才能获得更新锁。由于查询#1实际上并未修改最后一行,而是插入新行,因此不会为查询#2更改索引。


参见 https://www.mssqltips.com/sqlservertip/1485/using-sql-server-indexes-to-bypass-locks/ ,讨论问题的反转。



其他评论:

我认为它会更可靠,并可能为您的目的提供更好的性能来锁定用户表(如果存在)用于特定用户ID,而不是依赖索引锁定正常工作。


I stumbled upon with a very curious case. We have a SQL Server 2012 database and such a table

CREATE TABLE [dbo].[ActiveTransactions]
(
    [Id] [BIGINT] IDENTITY(1,1) NOT NULL,
    [Amount] [DECIMAL](12, 4) NOT NULL,
    [TypeId] [SMALLINT] NOT NULL,
    [GameProviderId] [SMALLINT] NULL,
    [UserId] [INT] NOT NULL,
    [Checksum] [NVARCHAR](150) NOT NULL,
    [Date] [DATETIME2](7) NOT NULL,
    [ExternalKey] [VARCHAR](60) NULL,
    [ExternalDescription] [NVARCHAR](1000) NULL,
    [OperatorId] [SMALLINT] NULL,
    [GameId] [NVARCHAR](50) NULL
)

This table has multiple indexes but the two which I want to talk about here are PK_ActiveTransactions (primary key, clustered) which goes:

ALTER TABLE [dbo].[ActiveTransactions] 
    ADD CONSTRAINT [PK_ActiveTransactions] 
    PRIMARY KEY CLUSTERED ([Id] DESC)

And IX_ActiveTransactions_UserIdAmount (nonclustered, non unique):

CREATE NONCLUSTERED INDEX [IX_ActiveTransactions_UserIdAmount] 
ON [dbo].[ActiveTransactions] ([UserId] ASC, [Id] DESC)
INCLUDE ([Amount])

There is a query on which depend major parts of my solution and is called when a certain process is started. Basically every time SomeMethod is called at my code side, it starts SQL transaction, then executes the procedure (shown below), thus locking the entry it selects, then calculates some stuff and inserts new row(s) in that table and commits the transaction. Locking procedure executes this SQL statement

SELECT TOP 1
        id ,
        Amount ,
        TypeId ,
        GameProviderId ,
        UserId ,
        [Checksum] ,
        [Date] ,
        ExternalKey
FROM    ActiveTransactions WITH ( UPDLOCK )
WHERE   @UserId = UserId
ORDER BY Id DESC

Now here is the case. When I was looking over some entries in this table it seemed so that there were multiple (requested at the same time) entries which had selected the same entry for the same @UserId. To be exact, there were 5 new entries (requested at the time, as they were having the same exact [Date] value which is calculated at code side) which all selected the same entry then recalculated some stuff (all 5 of them calculated the same thing) and inserted 5 new rows at the same time, instead of doing that one by one (which should be caused by WITH(UPDLOCK) statement at the end of the SELECT query, as I believe).

Then I tried doing such a thing, I opened three new query windows, I started a transaction with BEGIN TRAN command in one window and then executed above SELECT statement, in other two windows I did the same, When I committed the first statement the second query acquired it right after that, after committing the second statement third one acquired it. (Everything worked as expected), things started looking weird after adding WITH INDEX(INDEX_NAME) (UPDLOCK was still there) at the end of the query. To the first select I specified WITH INDEX(PK_ActiveTransactions) (primary key) and to the other two I specified WITH INDEX(IX_ActiveTransactions_UserIdAmount). After running all 3 of those commands, plus the INSERT in the same table with the first command,(second and third were still waiting for the first to be completed) but when I commited first command, the second acquired the old entry and the third acquired the new entry at the same time. I think that this behavior could have caused the bug explained above, but how is this possible ?

Would SQL Server use two different execution plans (thus using different indexes) for the same query at the same time? This table reaches somewhere 10-15 millions entries at the end of the day, but every morning approximately at 6 AM the jobs executes which leaves the table with 1-2 million rows only. could this cause SQL Server to switch indexes unexpectedly? But anyways I think that this is a series issue meaning that even after committing, indexes may not contain the committed data in them.

The above problem has happened just a few times, I was able to identify them happening twice

解决方案

You need to inspect what locks are being acquired on your table AND indexes (see link below). SQL Server is able to take separate locks on indexes as well as data. It does not lock all indexes by default.
Note: below is a guess.
Query #1 never acquires a lock on IX_ActiveTransactions_UserIdAmount, therefore query #2 is able to search the index and grab a lock on it and then wait for a row data lock to be released to complete its operation. Once this lock is released query #2 grabs it and holds it, while executing your other code.
Query #3, meantime, is still waiting for both data and index locks. Once query #2 is finished and all locks are released, ONLY then is query #3 able to use the index to do its search and therefore search up-to-date data.

In Summary:
Both query #1 and query #2 are able to search the table in parallel and return the same row. Query #2 does have to wait for query #1 to finish to get the update lock. Since query #1 does not actually modify the last row but rather insert a new one the index is not changed for the purposes of query #2.
See https://www.mssqltips.com/sqlservertip/1485/using-sql-server-indexes-to-bypass-locks/ for discussion about the inverse of your problem.

Additional Comments:
I think it would be more reliable and possibly give better performance for your purposes to lock on "Users" table (if such exists) for a specific user ID rather than rely on index locking working correctly.

这篇关于查询不同索引时数据不匹配的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆