当将DISTINCT与参数一起使用时,选择语句性能下降 [英] select statment performance degradation when using DISTINCT with parameters

查看:123
本文介绍了当将DISTINCT与参数一起使用时,选择语句性能下降的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

赏金注意事项-开始:

参数嗅探(这是赏金前问题中唯一报告的想法)这不是问题所在,因为您可以在问题末尾的更新部分中阅读。问题确实与在使用distinct时sql server如何为参数化查询创建执行计划有关。
我上传了一个非常简单的数据库备份(适用于sql server 2008 R2)此处(您必须等待20秒才能下载)。针对此数据库,您可以尝试运行以下查询:

PARAMETERS SNIFFING (that is the only "idea" that was reported in pre-bounty questions) is not the issue here, as you can read in the "update" section at the end of the question. The problem is really related to how sql server creates execution plans for a parametrized query when distinct is used. I uploaded a very simple database backup (it works with sql server 2008 R2) here (you must wait 20 seconds before downloading). Against this DB you can try to run the following queries:

-- PARAMETRIZED QUERY

declare @IS_ADMINISTRATOR int
declare @User_ID int
set @IS_ADMINISTRATOR = 1 -- 1 for administrator 0 for normal
set @User_ID = 50

SELECT DISTINCT -- PLEASE REMEMBER DISTINCT MAKES THE DIFFERENCE!!!
  DOC.DOCUMENT_ID
FROM
  DOCUMENTS DOC LEFT OUTER JOIN
  FOLDERS FOL ON FOL.FOLDER_ID = DOC.FOLDER_ID LEFT OUTER JOIN
  ROLES ROL ON (FOL.FOLDER_ID = ROL.FOLDER_ID)   
WHERE
  1 = @IS_ADMINISTRATOR OR  ROL.USER_ID = @USER_ID

-- NON PARAMETRIZED QUERY

SELECT DISTINCT -- PLEASE REMEMBER DISTINCT MAKES THE DIFFERENCE!!! 
  DOC.DOCUMENT_ID
FROM
  DOCUMENTS DOC LEFT OUTER JOIN
  FOLDERS FOL ON FOL.FOLDER_ID = DOC.FOLDER_ID LEFT OUTER JOIN
  ROLES ROL ON (FOL.FOLDER_ID = ROL.FOLDER_ID)   
WHERE
  1 = 1 OR  ROL.USER_ID = 50

最后说明:我注意到DSTINCT是问题所在,我的目标是在两个查询中都达到相同的速度(或至少几乎相同的速度)。

Final note: I noticed DSTINCT is the problem, my goal is to achieve the same speed (or at least almost the same speed) in both queries.

赏金注意事项-END:

原始问题:

我注意到,两者之间的性能差异很大

I noticed that there is an heavy difference in performance between

-- Case A
select distinct * from table where id > 1

相比(这是我的Delphi应用程序生成的sql)

compared to (this is the sql generated by my Delphi application)

-- Case B1
exec sp_executesql N'select distinct * from table where id > @P1',N'@P1 int',1

等价于

-- Case B2
declare @P1 int
set @P1 = 1
select distinct * from table where id > @P1

A的执行速度比B1和B2快得多。在我删除DISTINCT的情况下,性能也将保持不变。

A performs much faster than B1 and B2. The performance becomes the same in case I remove DISTINCT.

您可以对此发表评论吗?

May you comment on this?

在这里我发布了一个简单的查询,我在3 INNER JOIN的查询中注意到了这一点。无论如何,这都不是一个复杂的查询。

Here i posted a trivial query, I noticed this on a query with 3 INNER JOIN. Anyway not a complex query.

注意:在A和B1 / B2的情况下,我期望具有完全相同的性能。

Note: I was expecting to have THE EXACT SAME PERFORMANCE, in cases A and B1/B2.

使用DISTINCT有一些警告吗?

So are there some caveats in using DISTINCT?

更新

我尝试使用 DBCC TRACEON(4136,-1)(禁用参数嗅探的标志)来禁用参数嗅探,但是没有任何变化。因此,在这种情况下,问题不与参数嗅探有关。知道吗?

I tried to disable parameter sniffing using DBCC TRACEON (4136, -1) (the flag to disable parameter sniffing) but nothing changes. So in this case the problem is NOT LINKED TO PARAMETERS SNIFFING. Any idea?

推荐答案

问题不是不是DISTINCT会导致参数的性能下降,而是查询的其余部分不会在参数化查询中被优化,因为优化器不会像使用1 = 1那样仅使用1 = @ IS_ADMINISTRATOR来优化所有联接。它不会优化没有 的连接,因为它需要根据连接的结果返回重复项。

The problem isn't that DISTINCT is causing a performance degradation with parameters, it's that the rest of the query isn't being optimized away in the parameterized query because the optimizer won't just optimize away all of the joins using 1=@IS_ADMINISTRATOR like it will with just 1=1. It won't optimize the joins away without distinct because it needs to return duplicates based on the result of the joins.

为什么?因为抛弃所有联接的执行计划对于@IS_ADMINISTRATOR = 1以外的任何值都是无效的,无论您是否正在缓存计划,它都不会生成该计划。

Why? Because the execution plan tossing out all of the joins would be invalid for any value other than @IS_ADMINISTRATOR = 1. It will never generate that plan regardless of whether you are caching plans or not.

这与我的2008服务器上的非参数化查询一样好:

This performs as well as the non parameterized query on my 2008 server:

-- PARAMETRIZED QUERY

declare @IS_ADMINISTRATOR int
declare @User_ID int
set @IS_ADMINISTRATOR = 1 -- 1 for administrator 0 for normal
set @User_ID = 50

IF 1 = @IS_ADMINISTRATOR 
BEGIN
SELECT DISTINCT -- PLEASE REMEMBER DISTINCT MAKES THE DIFFERENCE!!!
  DOC.DOCUMENT_ID
FROM
  DOCUMENTS DOC LEFT OUTER JOIN
  FOLDERS FOL ON FOL.FOLDER_ID = DOC.FOLDER_ID LEFT OUTER JOIN
  ROLES ROL ON (FOL.FOLDER_ID = ROL.FOLDER_ID)   
WHERE
  1 = 1
END
ELSE 
BEGIN
SELECT DISTINCT -- PLEASE REMEMBER DISTINCT MAKES THE DIFFERENCE!!!
  DOC.DOCUMENT_ID
FROM
  DOCUMENTS DOC LEFT OUTER JOIN
  FOLDERS FOL ON FOL.FOLDER_ID = DOC.FOLDER_ID LEFT OUTER JOIN
  ROLES ROL ON (FOL.FOLDER_ID = ROL.FOLDER_ID)   
WHERE
  ROL.USER_ID = @USER_ID
END

从查询计划中可以清楚地看到,在运行您的示例时, @IS_ADMINISTRATOR = 1 并未像 1 = 1那样得到优化。 。在您的非参数化示例中,JOINS已完全优化,它只返回DOCUMENTS表中的每个id(非常简单)。

What's clear from the query plan I see running your example is that @IS_ADMINISTRATOR = 1 does not get optimized out the same as 1=1. In your non-parameterized example, the JOINS are completely optimized out, and it just returns every id in the DOCUMENTS table (very simple).

还缺少其他优化方法当 @IS_ADMINISTRATOR<> 1 。例如, LEFT OUTER JOIN S自动更改为 INNER JOIN s 而没有 OR 子句,但它们与 那个或子句保持原样。

There are also different optimizations missing when @IS_ADMINISTRATOR <> 1. For instance, the LEFT OUTER JOINS are automatically changed to INNER JOINs without that OR clause, but they are left as-is with that or clause.

另请参见以下答案:对于集成SQL的喜欢的%用于动态SQL的替代方案。

See also this answer: SQL LIKE % FOR INTEGERS for a dynamic SQL alternative.

当然,这并不能真正解释您原始问题中的性能差异,因为您那里没有OR。我认为这是一个疏忽。

Of course, this doesn't really explain the performance difference in your original question, since you don't have the OR in there. I assume that was an oversight.

这篇关于当将DISTINCT与参数一起使用时,选择语句性能下降的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆