SQL Server为什么索引不与OR一起使用 [英] SQL Server why is index not used with OR

查看:73
本文介绍了SQL Server为什么索引不与OR一起使用的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在研究索引,试图理解它们的工作原理以及如何使用它们来提高性能,但是我却缺少一些东西.

I've been studying indexes and trying to understand how they work and how I can use them to boost performance, but I'm missing something.

我有下表:

人员:

| Id | Name | Email | Phone |
| 1  | John |  E1   |  P1   |
| 2  | Max  |  E2   |  P2   |

考虑到查询(大多数情况下)采用这种形式,我正在尝试找到索引 Email Phone 列的最佳方法

I'm trying to find the best way to index the columns Email and Phone considering that the queries will (most of the time) be of the form

[1] SELECT * FROM Person WHERE Email = '...' OR Phone = '...'
[2] SELECT * FROM Person WHERE Email = ...
[3] SELECT * FROM Person WHERE Phone = ...

我认为最好的方法是使用两列创建单个索引:

I thought the best approach would be to create a single index using both columns:

CREATE NONCLUSTERED INDEX [IX_EmailPhone]
ON [dbo].[Person]([Email], [PhoneNumber]);

但是,对于上面的索引,只有查询[2]受益于索引查找,其他查询则使用索引扫描.

However, with the index above, only the query [2] benefits from an index seek, the others use index scan.

我还尝试创建多个索引:一个包含两列,一个包含电子邮件,一个包含电子邮件.在这种情况下,[2]和[3]使用搜寻,但是[1]继续使用扫描.

I also tried to create multiple index: one with both columns, one for email, and one for email. In this case, [2] and [3] use seek, but [1] continues to use scan.

为什么数据库不能使用带有or的索引?考虑到查询,此表的最佳索引方法是什么?

Why can't the database use index with an or? What would be the best indexing approach for this table considering the queries?

推荐答案

为每列创建一个单独的索引.
通过使用提示,我们可以强制优化器使用/不使用索引,因此您可以检查执行计划,了解所涉及的性能并了解每个路径的含义.

Create a separate index for each column.
By using hints we can force the optimizer to use/not use the indexes, so you can check the execution plan, get a feeling of the performance involved and understand the meaning of each path.

在以下示例中,通过我的演示并考虑每个路径中涉及的工作-

Go through my demo and consider the work involved in each path for the following scenarios -

  1. 只有少数几行满足条件j = 123.
    只有几行满足条件k = 456.

  1. Only few rows satisfy the condition j=123.
    Only few rows satisfy the condition k=456.

大多数行都满足条件j = 123.
大多数行满足条件k = 456.

Most of the rows satisfy the condition j=123.
Most of the rows satisfy the condition k=456.

只有少数几行满足条件j = 123.
大多数行满足条件k = 456.

Only few rows satisfy the condition j=123.
Most of the rows satisfy the condition k=456.

尝试考虑为每种情况选择的路径.
请随时提问.

Try to think what path you would have chosen for each scenario.
Please feel free to ask questions.

演示

;with t(n) as (select 0 union all select n+1 from t where n < 999)

select      1+t0.n+1000*t1.n                                as i
           ,floor(rand(cast (newid() as varbinary))*1000)   as j
           ,floor(rand(cast (newid() as varbinary))*1000)   as k 

into        t

from        t t0,t t1 

option       (maxrecursion 0)
;

create index t_j on t (j);
create index t_k on t (k);

update statistics t (t_j)
update statistics t (t_k)


扫描

select      *
from        t (forcescan)
where       j = 123
        or  k = 456

  • 这很简单.
  • select      *
    from        t (forceseek)
    where       j = 123
            or  k = 456
    

    • 索引搜索" :正在为每个索引寻找相关值(123和456)
    • 合并联接" :正在将结果(行ID)连接起来(如在UNION ALL中一样)
    • 流聚合" :正在消除重复的行ID
    • "Rid查找" + 嵌套循环" :正在使用行ID从表(t)中检索行
      • "Index Seek": Each index is being seeked for the relevant values (123 and 456)
      • "Merge Join": The results (row IDs) are being concatenated (as in UNION ALL)
      • "Stream Aggregate": Duplicate row IDs are being eliminated
      • "Rid Lookup" + "Nested Loops": The row IDs are being used to retrieve the rows from the table (t)
      • 这篇关于SQL Server为什么索引不与OR一起使用的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆