基于字典的查询的查询索引 [英] Query Index for Dictionary Based Queries
问题描述
查询和索引以下内容的最有效方法是什么:
What would be the most efficient way to query and index for the following:
SELECT * Persons.LastName A-D
SELECT * Persons.LastName E-L
SELECT * Persons.LastName M-R
SELECT * Persons.LastName S-Z
我正在使用以下非常低效且难以索引的内容:
I'm using the following which is terribly inefficient and difficult to index:
WHERE LastName LIKE '[a-d]%'
有没有更好的方法来解决这个问题?我认为这对于 Filtered Index 来说可能是一个很好的场景,但是 where 子句需要更加灵活.
Any ideas on a better way to go about this ? I think this might be a good scenario for Filtered Index, however the where clause would need to be more sargable.
谢谢
推荐答案
正如 Sam 所说,LIKE '[a-d]%'
是 SARGable(几乎).几乎是因为没有优化的Predicate
(更多信息见下文).
As Sam says, LIKE '[a-d]%'
is SARGable (well almost). Almost because of a not optimized Predicate
(see below for more info).
示例 1:如果您在 AdventureWorks2008R2 数据库中运行此查询
Example #1: if you run this query in AdventureWorks2008R2 database
SET STATISTICS IO ON;
SET NOCOUNT ON;
PRINT 'Example #1:';
SELECT p.BusinessEntityID, p.LastName
FROM Person.Person p
WHERE p.LastName LIKE '[a-a]%'
然后,您将获得基于Index Seek
运算符的执行计划(优化谓词:绿色矩形,非优化谓词:红色矩形):SET STATISTICS IO
的输出是
then, you will get an execution plan based on Index Seek
operator (optimized predicate: green rectangle, non-optimized predicate:red rectangle):
The output for SET STATISTICS IO
is
Example #1:
Table 'Person'. Scan count 1, logical reads 7
这意味着服务器必须从缓冲池中读取 7 页.此外,在这种情况下,索引 IX_Person_LastName_FirstName_MiddleName
包括 SELECT
、FROM
和 WHERE
子句所需的所有列:LastName和业务实体 ID.如果表有聚集索引,则所有非聚集索引都将包含来自聚集索引键的列(BusinessEntityID 是 PK_Person_BusinessEntityID 聚集索引的键).
This means the server have to read 7 pages from buffer pool. Also, in this case, the index IX_Person_LastName_FirstName_MiddleName
includes all columns required by SELECT
, FROM
and WHERE
clauses: LastName and BusinessEntityID. If table has a clustered index then all non clustered indices will include the columns from clustered index key (BusinessEntityID is the key for PK_Person_BusinessEntityID clustered index).
但是:
1) 您的查询必须显示所有列,因为 SELECT *
(这是一种不好的 做法):BusinessEntityID、LastName、FirstName、MiddleName、PersonType,..., 修改日期.
1) Your query have to show all columns because of SELECT *
(it's a bad practice): BusinessEntityID, LastName, FirstName, MiddleName, PersonType, ..., ModifiedDate.
2) 索引(前面示例中的IX_Person_LastName_FirstName_MiddleName
)不包括所有必需的列.这就是为什么,对于这个查询,这个索引是一个非覆盖索引.
2) The index (IX_Person_LastName_FirstName_MiddleName
in previous example) doesn't includes all required columns. This is the reason why, for this query, this index is a non-covering index.
现在,如果您执行下一个查询,那么您将获得差异.[实际] 执行计划(SSMS,Ctrl + M):
Now, if you execute the next queries then you will get diff. [actual] execution plans (SSMS, Ctrl + M):
SET STATISTICS IO ON;
SET NOCOUNT ON;
PRINT 'Example #2:';
SELECT p.*
FROM Person.Person p
WHERE p.LastName LIKE '[a-a]%';
PRINT @@ROWCOUNT;
PRINT 'Example #3:';
SELECT p.*
FROM Person.Person p
WHERE p.LastName LIKE '[a-z]%';
PRINT @@ROWCOUNT;
PRINT 'Example #4:';
SELECT p.*
FROM Person.Person p WITH(FORCESEEK)
WHERE p.LastName LIKE '[a-z]%';
PRINT @@ROWCOUNT;
结果:
Example #2:
Table 'Person'. Scan count 1, logical reads 2805, lob logical reads 0
911
Example #3:
Table 'Person'. Scan count 1, logical reads 3817, lob logical reads 0
19972
Example #4:
Table 'Person'. Scan count 1, logical reads 61278, lob logical reads 0
19972
执行计划:
另外:查询将为您提供在Person.Person"上创建的每个索引的页数:
Plus: the query will give you the number of pages for every index created on 'Person.Person':
SELECT i.name, i.type_desc,f.alloc_unit_type_desc, f.page_count, f.index_level FROM sys.dm_db_index_physical_stats(
DB_ID(), OBJECT_ID('Person.Person'),
DEFAULT, DEFAULT, 'DETAILED' ) f
INNER JOIN sys.indexes i ON f.object_id = i.object_id AND f.index_id = i.index_id
ORDER BY i.type
name type_desc alloc_unit_type_desc page_count index_level
--------------------------------------- ------------ -------------------- ---------- -----------
PK_Person_BusinessEntityID CLUSTERED IN_ROW_DATA 3808 0
PK_Person_BusinessEntityID CLUSTERED IN_ROW_DATA 7 1
PK_Person_BusinessEntityID CLUSTERED IN_ROW_DATA 1 2
PK_Person_BusinessEntityID CLUSTERED ROW_OVERFLOW_DATA 1 0
PK_Person_BusinessEntityID CLUSTERED LOB_DATA 1 0
IX_Person_LastName_FirstName_MiddleName NONCLUSTERED IN_ROW_DATA 103 0
IX_Person_LastName_FirstName_MiddleName NONCLUSTERED IN_ROW_DATA 1 1
...
现在,如果你比较 Example #1
和 Example #2
(都返回 911 行)
Now, if you compare Example #1
and Example #2
(both returns 911 rows)
`SELECT p.BusinessEntityID, p.LastName ... p.LastName LIKE '[a-a]%'`
vs.
`SELECT * ... p.LastName LIKE '[a-a]%'`
然后你会看到两个差异:
then you will see two diff.:
a) 7 次逻辑读取与 2805 次逻辑读取和
a) 7 logical reads vs. 2805 logical reads and
b) Index Seek
(#1) vs. Index Seek
+ Key Lookup
(#2).
b) Index Seek
(#1) vs. Index Seek
+ Key Lookup
(#2).
您可以看到 SELECT *
(#2) 查询的性能最差(7 页与 2805 页).
You can see that the performance for SELECT *
(#2) query is far worst (7 pages vs. 2805 pages).
现在,如果你比较 Example #3
和 Example #4
(都返回 19972 行)
Now, if you compare Example #3
and Example #4
(both returns 19972 rows)
`SELECT * ... LIKE '[a-z]%`
vs.
`SELECT * ... WITH(FORCESEEK) LIKE '[a-z]%`
然后你会看到两个差异:
then you will see two diff.:
a) 3817 次逻辑读取(#3)与 61278 次逻辑读取(#4)和
a) 3817 logical reads (#3) vs. 61278 logical reads (#4) and
b) 聚集索引扫描
(PK_Person_BusinessEntityID 有 3808 + 7 + 1 + 1 + 1 = 3818 页)对比 Index Seek
+ Key Lookup代码>.
b) Clustered Index Scan
(PK_Person_BusinessEntityID has 3808 + 7 + 1 + 1 + 1 = 3818 pages) vs. Index Seek
+ Key Lookup
.
您可以看到 Index Seek
+ Key Lookup
(#4) 查询的性能最差(3817 页与 61278 页). 在这种情况下,您可以看到 IX_Person_LastName_FirstName_MiddleName
上的 Index Seek
以及 PK_Person_BusinessEntityID
上的 Key Lookup
(clustered index) 会给你一个比clustered index scan"更低的性能.
You can see that the performance for Index Seek
+ Key Lookup
(#4) query is far worst (3817 pages vs. 61278 pages). In this case, you can see that and Index Seek
on IX_Person_LastName_FirstName_MiddleName
plus a Key Lookup
on PK_Person_BusinessEntityID
(clustered index) will give you a lower performance than a 'Clustered Index Scan'.
由于SELECT *
,所有这些糟糕的执行计划都是可能的.
And all these bad execution plans are possible because of SELECT *
.
这篇关于基于字典的查询的查询索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!