IF EXISTS before INSERT, UPDATE, DELETE for optimization

Question

Quite often you need to execute an INSERT, UPDATE or DELETE statement based on some condition. My question is whether adding IF EXISTS before the command affects the performance of the query.

Example

IF EXISTS(SELECT 1 FROM Contacs WHERE [Type] = 1)
    UPDATE Contacs SET [Deleted] = 1 WHERE [Type] = 1

What about INSERTs or DELETEs?

Recommended answer

I'm not completely sure, but I get the impression that this question is really about upsert, which is the following atomic operation:

  • If the row exists in both the source and target, UPDATE the target;
  • If the row only exists in the source, INSERT the row into the target;
  • (Optionally) If the row exists in the target but not the source, DELETE the row from the target.

Developers-turned-DBAs often naïvely write it row-by-row, like this:

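-- For each row in source
IF EXISTS(SELECT 1 FROM target WHERE <target_expression>)
BEGIN
    -- The row exists: either remove it or bring it up to date
    IF @delete_flag = 1
        DELETE FROM target
        WHERE <target_expression>
    ELSE
        UPDATE target
        SET <target_columns> = <source_values>
        WHERE <target_expression>
END
ELSE
    -- The row does not exist yet: add it
    INSERT target (<target_columns>)
    VALUES (<source_values>)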
This is just about the worst thing you can do, for several reasons:

  • It has a race condition. The row can disappear between IF EXISTS and the subsequent DELETE or UPDATE.

  • It's wasteful. For every transaction you have an extra operation being performed; maybe it's trivial, but that depends entirely on how well you've indexed.

  • Worst of all, it's following an iterative model, thinking about these problems at the level of a single row. This will have the largest (worst) impact of all on overall performance.

One very minor (and I emphasize minor) optimization is to just attempt the UPDATE anyway; if the row doesn't exist, @@ROWCOUNT will be 0 and you can then "safely" insert:

-- For each row in source
BEGIN TRAN

UPDATE target
SET <target_columns> = <source_values>
WHERE <target_expression>

IF (@@ROWCOUNT = 0)
    INSERT target (<target_columns>)
    VALUES (<source_values>)

COMMIT

Worst-case, this will still perform two operations for every transaction, but at least there's a chance of only performing one, and it also eliminates the race condition (kind of).
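As a rough illustration, the pattern above might look like this for a single row. The table variable @target, its columns, and the @id and @name parameters are hypothetical names chosen for the demo, not part of the original answer.

```sql
-- Hypothetical demo data: one existing row with id = 1.
DECLARE @target TABLE (id int PRIMARY KEY, name varchar(50) NOT NULL);
INSERT INTO @target (id, name) VALUES (1, 'old name');

-- The incoming "source" row; id = 2 does not exist in @target yet.
DECLARE @id int = 2, @name varchar(50) = 'new row';

BEGIN TRAN;

-- Attempt the UPDATE unconditionally.
UPDATE @target
SET name = @name
WHERE id = @id;

-- @@ROWCOUNT still reflects the UPDATE here; 0 means no row matched,
-- so fall back to an INSERT.
IF (@@ROWCOUNT = 0)
    INSERT INTO @target (id, name)
    VALUES (@id, @name);

COMMIT;

SELECT id, name FROM @target ORDER BY id;
```

When the row already exists (for example @id = 1), only the UPDATE does any work and the INSERT is skipped.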

But the real issue is that this is still being done for each row in the source.

Before SQL Server 2008, you had to use an awkward 3-stage model to deal with this at the set level (still better than row-by-row):

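BEGIN TRAN

-- Stage 1: add rows that exist only in the source
INSERT target (<target_columns>)
SELECT <source_columns> FROM source s
WHERE s.id NOT IN (SELECT id FROM target)

-- Stage 2: refresh rows that exist in both
UPDATE t SET <target_columns> = <source_columns>
FROM target t
INNER JOIN source s ON t.id = s.id

-- Stage 3: remove rows that exist only in the target
DELETE t
FROM target t
WHERE t.id NOT IN (SELECT id FROM source)

COMMIT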
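A concrete version of the three stages could look roughly like the sketch below. The table variables @source and @target and their id and name columns are assumptions made for the demo; the stage order follows the template above.

```sql
-- Hypothetical demo data: id 1 is in both tables, 2 only in source, 3 only in target.
DECLARE @source TABLE (id int PRIMARY KEY, name varchar(50) NOT NULL);
DECLARE @target TABLE (id int PRIMARY KEY, name varchar(50) NOT NULL);
INSERT INTO @source (id, name) VALUES (1, 'Alice v2'), (2, 'Bob');
INSERT INTO @target (id, name) VALUES (1, 'Alice'), (3, 'Carol');

BEGIN TRAN;

-- Stage 1: insert rows that exist only in the source (id 2).
INSERT INTO @target (id, name)
SELECT s.id, s.name
FROM @source s
WHERE s.id NOT IN (SELECT id FROM @target);

-- Stage 2: update rows that exist in both (id 1, and the just-inserted id 2).
UPDATE t
SET t.name = s.name
FROM @target t
INNER JOIN @source s ON t.id = s.id;

-- Stage 3: delete rows that exist only in the target (id 3).
DELETE t
FROM @target t
WHERE t.id NOT IN (SELECT id FROM @source);

COMMIT;

SELECT id, name FROM @target ORDER BY id;
```

Each stage is a single set-based statement, so the cost grows with three scans or joins rather than with one round of statements per source row.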

As I said, performance was pretty lousy on this, but still a lot better than the one-row-at-a-time approach. SQL Server 2008, however, finally introduced MERGE syntax, so now all you have to do is this:

MERGE target
USING source ON target.id = source.id
WHEN MATCHED THEN UPDATE SET <target_columns> = <source_columns>
WHEN NOT MATCHED THEN INSERT (<target_columns>) VALUES (<source_columns>)
WHEN NOT MATCHED BY SOURCE THEN DELETE;

That's it. One statement. If you're using SQL Server 2008 and need to perform any sequence of INSERT, UPDATE and DELETE depending on whether or not the row already exists - even if it's just one row - there is no excuse not to be using MERGE.

You can even OUTPUT the rows affected by a MERGE into a table variable if you need to find out afterward what was done. Simple, fast, and risk-free. Do it.
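To make this concrete, here is a minimal sketch of a MERGE that records what it did. The table variables @source, @target and @changes and their columns are hypothetical names for the demo; $action and the inserted/deleted pseudo-tables are standard parts of the OUTPUT clause.

```sql
-- Hypothetical demo data: id 1 is in both tables, 2 only in source, 3 only in target.
DECLARE @source TABLE (id int PRIMARY KEY, name varchar(50) NOT NULL);
DECLARE @target TABLE (id int PRIMARY KEY, name varchar(50) NOT NULL);
INSERT INTO @source (id, name) VALUES (1, 'Alice v2'), (2, 'Bob');
INSERT INTO @target (id, name) VALUES (1, 'Alice'), (3, 'Carol');

-- Receives one row per affected target row.
DECLARE @changes TABLE (merge_action nvarchar(10) NOT NULL, id int NOT NULL);

MERGE @target AS t
USING @source AS s ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET t.name = s.name
WHEN NOT MATCHED THEN
    INSERT (id, name) VALUES (s.id, s.name)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
-- $action is 'INSERT', 'UPDATE' or 'DELETE'; deleted.id is NULL for inserts
-- and inserted.id is NULL for deletes, hence the COALESCE.
OUTPUT $action, COALESCE(inserted.id, deleted.id)
INTO @changes (merge_action, id);

SELECT merge_action, id FROM @changes ORDER BY id;
SELECT id, name FROM @target ORDER BY id;
```

With this demo data, @changes should end up holding an UPDATE for id 1, an INSERT for id 2 and a DELETE for id 3.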
