为什么 SQL Server 查询优化器有时会忽略明显的聚集主键? [英] Why does SQL Server query optimizer sometimes overlook obvious clustered primary key?

查看:15
本文介绍了为什么 SQL Server 查询优化器有时会忽略明显的聚集主键?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在挠头.

我以 id 作为聚集整数主键在一个表上运行简单的 select count(id) 并且 SQL 优化器完全忽略了查询执行计划中的主键,支持日期字段上的索引...... ???

I run as simple select count(id) on a table with id as clustered integer primary key and the SQL Optimizer totally ignores the primary key in it's query execution plan, in favor of an index on a date field.... ???

实际表格:

CREATE TABLE [dbo].[msgr](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [dt] [datetime2](3) NOT NULL CONSTRAINT [DF_msgr_dt]  DEFAULT (sysdatetime()),
    [uid] [int] NOT NULL,
    [msg] [varchar](7000) NOT NULL CONSTRAINT [DF_msgr_msg]  DEFAULT (''),
    [type] [tinyint] NOT NULL,
    [cid] [int] NOT NULL CONSTRAINT [DF_msgr_cid]  DEFAULT ((0)),
    [via] [tinyint] NOT NULL,
    [msg_id] [bigint] NOT NULL,
 CONSTRAINT [PK_msgr] PRIMARY KEY CLUSTERED 
(
    [id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

请问这是什么原因?

推荐答案

1) 在我看来,这里的关键点在于对于聚簇表(具有聚簇索引的表 = 主数据结构=是存储表数据的数据结构=聚簇索引是表本身)每个非聚簇索引也包括聚簇索引的键.这意味着

1) In my opinion, the key point here is that for clustered tables (tables which have a clustered index = the main data structure = is that data structure that store table data = clustered index is the table itself) the every non-clustered index include also the the key of clustered index. This means that

CREATE [UNIQUE] NONCLUSTERED INDEX bla 
ON [dbo].[msgr] (uid) 

CREATE [UNIQUE] NONCLUSTERED INDEX bla 
ON [dbo].[msgr] (uid) 
INCLUDE (id) -- id = key of clustered index

所以,对于这样的表,叶子页上非聚集索引的每条记录也包括聚集索引的键.这样,在每个非聚集索引和每个叶记录中,SQL Server 也存储了某种指向主数据结构的指针.

So, for such tables, every record from non-clustered indexes on the leaf pages includes also the key of clustered index. This way, within every non-clustered index and for every leaf record SQL Server store also some kind of pointer to the main data structure.

2) 这意味着 SELECT COUNT(id) FROM dbo.msgr 既可以使用 CI 执行,也可以使用 NCI 执行,因为两个索引都包含 id(聚集的键index) 列.

2) This means that SELECT COUNT(id) FROM dbo.msgr can be executed using CI but also using NCI because both indexes include the id (key of clustered index) column.

作为本主题中的次要说明,因为 IDENTITY 属性(对于 id 列)表示必填列(NOT NULL),COUNT(id)COUNT(*) 是一样的.此外,这意味着 COUNT(msg_id)(也是强制性的/NOT NULL)列与 COUNT(*) 相同.因此,SELECT COUNT(msg_id) FROM dbo.msgr 的执行计划很可能会使用相同的 NCI(例如 bla).

As a secondary note within this topic, because IDENTITY property (for id column) means a mandatory column (NOT NULL), COUNT(id) is the same thing as COUNT(*). Also, this means that COUNT(msg_id) (also a mandatory / NOT NULL) column is the same thing as COUNT(*). So, it's very likely that execution plan for SELECT COUNT(msg_id) FROM dbo.msgr will use the same NCI (for example bla).

3) 非聚集索引的大小小于聚集索引.这也意味着更少的 IO => 从性能的角度来看,使用 NCI 比使用 CI 更好.

3) Non-clustered indexes have smaller size than clustered index. This means also less IO => It's better from performance point of view to use the NCI than CI.

我会做以下简单的测试:

I would do following simple test:

SET STATISTICS IO ON;
GO

SELECT COUNT(id)
FROM dbo.[dbo].[msgr] WITH(INDEX=[bla]) -- It forces usage of NCI
GO

SELECT COUNT(id)
FROM dbo.[dbo].[msgr] WITH(INDEX=[PK_msgr]) -- It forces usage of CI
GO

SET STATISTICS IO OFF;
GO

如果msgr表中有很多数据,那么STATISTICS IO会显示不同的LIO(逻辑IO),减少 NCI 查询的 LIO.

If there is a lot of data within msgr table then STATISTICS IO will show different LIO (logical IO), with less LIO for NCI query.

这篇关于为什么 SQL Server 查询优化器有时会忽略明显的聚集主键?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆