Index over multiple lookup tables in SQL Server


Problem description

In SQL Server 2012, let's have three tables: Foos, Lookup1 and Lookup2 created with the following SQL:

CREATE TABLE Foos (
    Id int NOT NULL,
    L1 int NOT NULL,
    L2 int NOT NULL,
    Value int NOT NULL,
    CONSTRAINT PK_Foos PRIMARY KEY CLUSTERED (Id ASC)
);

CREATE TABLE Lookup1 (
    Id int NOT NULL,
    Name nvarchar(50) NOT NULL,
    CONSTRAINT PK_Lookup1 PRIMARY KEY CLUSTERED (Id ASC),
    CONSTRAINT IX_Lookup1 UNIQUE NONCLUSTERED (Name ASC)
);

CREATE TABLE Lookup2 (
    Id int NOT NULL,
    Name nvarchar(50) NOT NULL,
    CONSTRAINT PK_Lookup2 PRIMARY KEY CLUSTERED (Id ASC),
    CONSTRAINT IX_Lookup2 UNIQUE NONCLUSTERED (Name ASC)
);

CREATE NONCLUSTERED INDEX IX_Foos ON Foos (
    L1 ASC,
    L2 ASC,
    Value ASC
);

ALTER TABLE Foos WITH CHECK ADD CONSTRAINT FK_Foos_Lookup1 
    FOREIGN KEY(L1) REFERENCES Lookup1 (Id);

ALTER TABLE Foos CHECK CONSTRAINT FK_Foos_Lookup1;

ALTER TABLE Foos WITH CHECK ADD CONSTRAINT FK_Foos_Lookup2 
    FOREIGN KEY(L2) REFERENCES Lookup2 (Id);

ALTER TABLE Foos CHECK CONSTRAINT FK_Foos_Lookup2;


BAD PLAN:

The following SQL query to get Foos by the lookup tables:

select top(1) f.* from Foos f
join Lookup1 l1 on f.L1 = l1.Id
join Lookup2 l2 on f.L2 = l2.Id
where l1.Name = 'a' and l2.Name = 'b' 
order by f.Value

does not fully utilize the IX_Foos index, see http://sqlfiddle.com/#!6/cd5c1/1/0 and the plan with data. (It just chooses one of the lookup tables.)


GOOD PLAN:

However if I rewrite the query:

declare @l1Id int = (select Id from Lookup1 where Name = 'a');
declare @l2Id int = (select Id from Lookup2 where Name = 'b');

select top(1) f.* from Foos f
where f.L1 = @l1Id and f.L2 = @l2Id 
order by f.Value

it works as expected. It first looks up both lookup tables and then uses the resulting ids to seek the IX_Foos index.

Is it possible to use a hint to force SQL Server, in the first query (with joins), to look up the ids first and then use them to seek IX_Foos?

I ask because, if the Foos table is quite large, the first query (with joins) locks the whole table :(

NOTE: The inner join query comes from LINQ. Alternatively, is it possible to force LINQ in Entity Framework to rewrite the queries using declare? Doing the lookups in multiple requests could add round-trip delay in more complex queries.

NOTE2: In Oracle it works fine, so this seems to be a SQL Server problem.

NOTE3: The locking issue is more apparent when adding TOP(1) to the select f.* from Foos .... (For instance, when you need to get only the min or max value.)


UPDATE: According to the @Hoots hint, I have changed IX_Lookup1 and IX_Lookup2:

CONSTRAINT IX_Lookup1 UNIQUE NONCLUSTERED (Name ASC, Id ASC)
CONSTRAINT IX_Lookup2 UNIQUE NONCLUSTERED (Name ASC, Id ASC)

It helps, but the plan is still sorting all results.

Why is it taking all 10,000 rows from Foos that match f.L1 and f.L2, instead of just taking the first row? (IX_Foos contains Value ASC, so it could find the first row without processing all 10,000 rows and sorting them.) The previous plan with declared variables seeks IX_Foos, so it does not do the sort.

Solution

Looking at the query plans, SQL Server is using the same indexes in both versions of the SQL you've put down. The difference is that the second version executes 3 separate pieces of SQL rather than 1, and so evaluates the indexes at different times.

I have checked and I think the solution is to change the indexes as below...

CONSTRAINT IX_Lookup1 UNIQUE NONCLUSTERED (Name ASC, ID ASC)

and

CONSTRAINT IX_Lookup2 UNIQUE NONCLUSTERED (Name ASC, ID ASC)

With these indexes, evaluating the index no longer requires going off to the table data to get the ID, because the index already contains it. This changes the plan to be what you want, hopefully preventing the locking you're seeing, but I'm not going to guarantee that side of it, as the locking isn't something I'm able to reproduce.
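Because IX_Lookup1 and IX_Lookup2 are declared as UNIQUE constraints rather than plain indexes, changing their columns means dropping and re-adding the constraints. A minimal sketch, assuming the tables from the question and that nothing else references these constraints. Note one side effect: UNIQUE (Name, Id) no longer enforces that Name is unique on its own.

```sql
-- Sketch only: re-create the lookup constraints with Id as a second key column.
-- Assumes the Lookup1/Lookup2 definitions from the question.
-- Caveat: after this change, duplicate Name values are allowed
-- as long as their Id values differ.
ALTER TABLE Lookup1 DROP CONSTRAINT IX_Lookup1;
ALTER TABLE Lookup1 ADD CONSTRAINT IX_Lookup1
    UNIQUE NONCLUSTERED (Name ASC, Id ASC);

ALTER TABLE Lookup2 DROP CONSTRAINT IX_Lookup2;
ALTER TABLE Lookup2 ADD CONSTRAINT IX_Lookup2
    UNIQUE NONCLUSTERED (Name ASC, Id ASC);
```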

UPDATE: I now see the issue...

The second piece of SQL is effectively not using set-based operations. Simplified, what you're doing is...

select f.*
from Foos f
where f.L1 = 1
  and f.L2 = 1
order by f.Value desc

This only has to seek on a single index to get results that are already ordered.

In the first bit of SQL (as shown below) you're combining different data sets that have indexes only on the individual tables. The next two bits of SQL do the same thing with the same query plan...

select f.* -- cost 0.7099
from Foos f
join Lookup1 l1 on f.L1 = l1.Id
join Lookup2 l2 on f.L2 = l2.Id
where l1.Name = 'a' and l2.Name = 'b' 
order by f.Value

select f.* -- cost 0.7099
from Foos f
inner join (SELECT l1.id l1Id, l2.id l2Id
            from Lookup1 l1, Lookup2 l2
            where l1.Name = 'a' and l2.Name='b') lookups on (f.L1 = lookups.l1Id and f.L2=lookups.l2Id)
order by f.Value desc

The reason I've put both down is that in the second version you can quite easily hint that the lookup is not set-based but returns a single row, by writing it like this...

select f.* -- cost 0.095
from Foos f
inner join (SELECT TOP 1 l1.id l1Id, l2.id l2Id
            from Lookup1 l1, Lookup2 l2
            where l1.Name = 'a' and l2.Name='b') lookups on (f.L1 = lookups.l1Id and f.L2=lookups.l2Id)
order by f.Value desc

Of course, you can only do this if you know that the subquery brings back a single record whether the TOP 1 is there or not. This brings the cost down from 0.7099 to 0.095. I can only surmise that, now that there is explicitly a single input record, the optimiser knows the ordering can be handled by the index rather than having to order the rows 'manually'.
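Another way to express the same "single row per lookup" idea in one statement, and so in one round trip, is to use scalar subqueries in the WHERE clause. This is only a sketch against the question's schema, written in the question's TOP(1) / ascending form. Whether the optimiser actually picks the ordered seek on IX_Foos for it is an assumption that should be checked against the actual plan.

```sql
-- Sketch only: each scalar subquery must return at most one row.
-- That holds while Name is unique in each lookup table; if a subquery
-- returned more than one row, SQL Server would raise an error at run time.
select top(1) f.*
from Foos f
where f.L1 = (select l1.Id from Lookup1 l1 where l1.Name = 'a')
  and f.L2 = (select l2.Id from Lookup2 l2 where l2.Name = 'b')
order by f.Value;
```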

Note: 0.7099 isn't very large for a query that runs on its own, i.e. you'll hardly notice it, but if it's part of a larger set of executions you can get the cost down if you like. I suspect the question is more about the reason why, which I believe comes down to set-based operations versus singular seeks.
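The cost figures quoted above are presumably estimated subtree costs taken from the query plan. One way to read such estimates without the graphical plan is SET SHOWPLAN_ALL, whose output includes a TotalSubtreeCost column. The following is a sketch, assuming a client such as SSMS or sqlcmd (GO is a batch separator of those tools, not a T-SQL statement); the numbers you get depend on your data and statistics.

```sql
-- Sketch only: show the estimated plan instead of executing the query.
-- SET SHOWPLAN_ALL must be the only statement in its batch.
SET SHOWPLAN_ALL ON;
GO

-- Not executed while SHOWPLAN_ALL is ON; one row per plan operator is
-- returned, and TotalSubtreeCost on the first row is the statement's cost.
select f.*
from Foos f
join Lookup1 l1 on f.L1 = l1.Id
join Lookup2 l2 on f.L2 = l2.Id
where l1.Name = 'a' and l2.Name = 'b'
order by f.Value;
GO

SET SHOWPLAN_ALL OFF;
GO
```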
