Running large queries in the background in MS SQL


Problem description

I am using MS SQL Server 2008. I have a table that is constantly in use (data is always being changed and inserted). It now contains about 70 million rows. I am trying to run a simple query over the table with a stored procedure that will probably take a few days.

I need the table to stay usable. Right now, once I execute the stored procedure, after a while every simple select-by-identity query I run against the table either stops responding or runs so long that I cancel it.

What should I do? Here is what my stored procedure looks like:

SET NOCOUNT ON;
UPDATE SOMETABLE
SET [some_col] = dbo.ufn_SomeFunction(CONVERT(NVARCHAR(500), another_column))
WHERE [some_col] = 243

Even if I add this to the WHERE clause (combined with AND):

ID_COL > 57000000 and ID_COL < 60000000 and

it still doesn't work.

BTW, SomeFunction does some simple arithmetic and looks up rows in another table that contains about 300k rows but never changes.

I would be glad to hear any suggestions.

Recommended answer

From my perspective, your server has a serious performance problem. Even if we assume that none of the records in the query

select some_col from SOMETABLE with (nolock) where id_col between 57000000 and 57001000

was in memory, it shouldn't take 21 seconds to read those few pages sequentially from disk. (Your clustered index on id_col should not be fragmented if it is an auto-identity column and you didn't do something stupid like adding a "desc" to the index definition.)
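One way to test the fragmentation assumption is to ask the server directly. The following is a minimal sketch, assuming the table lives in the dbo schema of the current database (the question does not say); it uses the sys.dm_db_index_physical_stats function in LIMITED mode, which is the cheapest scan mode but can still take a while on a table of this size.

```sql
-- Assumption: SOMETABLE is in the dbo schema of the current database.
DECLARE @db_id SMALLINT, @object_id INT
SET @db_id = DB_ID()
SET @object_id = OBJECT_ID(N'dbo.SOMETABLE')

SELECT
    i.name AS index_name,
    s.index_type_desc,                 -- CLUSTERED INDEX, NONCLUSTERED INDEX, ...
    s.avg_fragmentation_in_percent,    -- logical fragmentation of the leaf level
    s.page_count
FROM sys.dm_db_index_physical_stats(@db_id, @object_id, NULL, NULL, 'LIMITED') AS s
JOIN sys.indexes AS i
    ON i.object_id = s.object_id
   AND i.index_id = s.index_id
ORDER BY s.avg_fragmentation_in_percent DESC
```

If the clustered index on id_col shows low fragmentation, the slow range read is more likely caused by blocking or I/O pressure than by the index layout.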

But if you can't or won't fix that, my advice is to run the update in small batches of about 100-1000 records at a time (depending on how much time the lookup function consumes). A single update/transaction should take no more than 30 seconds.

You see, each update keeps an exclusive lock on every record it modified until the transaction completes. If you don't use an explicit transaction, each statement runs in its own automatic transaction, so the locks are released when the update statement finishes.
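The locks described here can be observed while the update is running. This is a rough sketch, meant to be run from a second session in the same database; it reads the sys.dm_tran_locks view, which requires the VIEW SERVER STATE permission.

```sql
-- Run from a second session while the UPDATE is still executing.
SELECT
    l.request_session_id,
    l.resource_type,      -- e.g. KEY, PAGE, OBJECT
    l.request_mode,       -- e.g. X, U, IX
    l.request_status,     -- GRANT, WAIT or CONVERT
    COUNT(*) AS lock_count
FROM sys.dm_tran_locks AS l
WHERE l.resource_database_id = DB_ID()
GROUP BY
    l.request_session_id,
    l.resource_type,
    l.request_mode,
    l.request_status
ORDER BY lock_count DESC
```

A session holding a very large number of granted KEY or PAGE locks in X mode, next to other sessions in WAIT status, matches the blocking pattern described in the question.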

But you can still run into deadlocks that way, depending on what the other processes do. If they also modify more than one record at a time, or even if they acquire and hold read locks on several rows, you can get deadlocks.
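When a batch is chosen as the deadlock victim, SQL Server raises error 1205 and rolls its transaction back. The answer does not describe a retry, so the following is only a sketch of one possible way to handle that case: the retry budget of 3 and the literal id range are hypothetical values chosen for illustration.

```sql
DECLARE @retries INT, @msg NVARCHAR(2048)
SET @retries = 3   -- hypothetical retry budget

WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION
            UPDATE SOMETABLE
                SET [some_col] = dbo.ufn_SomeFunction(CONVERT(NVARCHAR(500), another_column))
                WHERE [some_col] = 243
                  AND id_col BETWEEN 57000000 AND 57000499   -- example range only
        COMMIT
        SET @retries = 0   -- success: leave the loop
    END TRY
    BEGIN CATCH
        -- The failed transaction must be rolled back before retrying.
        IF @@TRANCOUNT > 0 ROLLBACK

        IF ERROR_NUMBER() = 1205 AND @retries > 1
            SET @retries = @retries - 1   -- deadlock victim: try again
        ELSE
        BEGIN
            -- Any other error, or retries exhausted: report it and stop.
            SET @msg = ERROR_MESSAGE()
            RAISERROR('%s', 16, 1, @msg)
            SET @retries = 0
        END
    END CATCH
END
```

Because each batch is small, repeating it after a deadlock costs little compared with restarting a single multi-day update.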

To avoid deadlocks, your update statement needs to lock all the records it will modify at once. The way to do this is to place the single update statement (limited to only a few rows by id_col) in a serializable transaction, like this:

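IF @@TRANCOUNT > 0
BEGIN
  -- Error: you are in a transaction context already
  RAISERROR('This batch must not run inside an open transaction.', 16, 1)
  RETURN
END

SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

-- Loop variables (assumption: id_col is an integer key; BIGINT fits INT too)
DECLARE @x BIGINT, @max BIGINT, @step INT
SET @step = 500  -- or whatever keeps each update in the small time range
SELECT @x = MIN(id_col), @max = MAX(id_col) FROM SOMETABLE

-- Work "x" through the id range, one short transaction per batch
WHILE @x <= @max
BEGIN
  BEGIN TRANSACTION
    UPDATE SOMETABLE
      SET [some_col] = dbo.ufn_SomeFunction(CONVERT(NVARCHAR(500), another_column))
      WHERE [some_col] = 243 AND id_col BETWEEN @x AND @x + @step - 1
  COMMIT
  SET @x = @x + @step
END

-- Get all new records inserted while you were running the loop.
-- If these are too many, you may have to paginate this as well:
BEGIN TRANSACTION
  UPDATE SOMETABLE
    SET [some_col] = dbo.ufn_SomeFunction(CONVERT(NVARCHAR(500), another_column))
    WHERE [some_col] = 243 AND id_col >= @x
COMMIT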

For each update, this takes an update/exclusive key-range lock on the given records (and only those, because you limit the update through the clustered index key). It waits for any other updates on the same records to finish, then gets its lock (blocking all other transactions, but still only for the given records), then updates the records and releases the lock.

The last extra statement is important because it takes a key-range lock up to "infinity" and thus prevents even inserts at the end of the range while the update statement runs.
