为什么这个 Sql 语句(有 2 个表连接)需要 5 分钟才能完成? [英] Why does this Sql Statement (with 2 table joins) takes 5 mins to complete?

查看:44
本文介绍了为什么这个 Sql 语句(有 2 个表连接)需要 5 分钟才能完成?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

以下 sql 语句需要 5 分钟才能完成.我只是.别.得到.它 :(第一个表中有 6861534 行.第二个表少了一点..第三个表(包含 4 个地理场)与第一个相同.

The following sql statement takes 5 mins to complete. I. Just. Don't. Get. It :( First table has 6861534 rows in it. Second table has a little bit less .. and third table (which contains 4 GEOGRAPHY FIELDS) has the same as the first.

第三个表中的那些 GEOGRAPHY 字段......他们不应该与 sql 语句混淆......应该吗?可能是因为表太大(由于 GEOGRAPHY 字段),它有很大的页面大小或其他东西......从而破坏了 COUNT 所做的表扫描?

Those GEOGRAPHY fields in the 3rd table .. they shouldn't be messin' with the sql statement ... should it? Could it be because the table is so large (due to the GEOGRAPHY fields) that it has huge page sizes or something .. thus destroying the table scan a COUNT does?

SELECT COUNT(*)
FROM [dbo].[Locations] a
    inner join [dbo].[MyUSALocations] b on a.LocationId = b.LocationId
    inner join [dbo].[GeographyBoundaries] c on a.locationid = c.LocationId

根据要求,这里有一些关于 GeographyBoundaries 表的更多信息...

As requested, here's some more info about the GeographyBoundaries table...

/****** Object:  Index [PK_GeographyBoundaries]    Script Date: 11/16/2010 12:42:36 ******/
ALTER TABLE [dbo].[GeographyBoundaries] ADD  CONSTRAINT [PK_GeographyBoundaries] PRIMARY KEY CLUSTERED 
(
    [LocationId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

更新 #2 - 添加非聚集索引后

添加非聚集索引后,现在降到了 4 秒!这太棒了.但是为什么?

什么 Zee Frak?

What Zee Frak?

现在,当我只执行 ONE join 并强制 INDEX .. 它返回到 5 分钟.我这样做是为了

Now, when i just do ONE join and force the INDEX .. it goes back to 5 mins. I did this to

  • 确保 MyUSALocations 表没有与 Joins 混淆.
  • 确保 PK 做一些奇怪的事情.

.

SELECT COUNT(*)
FROM [dbo].[Locations] a 
        INNER JOIN [dbo].[GeographyBoundaries] c
            WITH (INDEX(PK_GeographyBoundaries)) ON a.locationid = c.LocationId

推荐答案

这不对.

我有两种可能:

1) 表格上的统计数据已过时.重建索引并更新统计信息.

1) Statistics is out of date on the tables. Rebuild indexes and update stats.

2)如您所说,Geography 表记录很大,跨越多页(不是一条记录跨越多页,因为它不能,但记录接近 8K 标记).在这种情况下,有趣的是,在聚集索引上创建另一个非聚集索引可能会有所帮助.

2) As you said, Geography table records are big spanning many pages (not that one record spanning multiple pages since it can't but the record is close to the 8K mark). In this case, funny enough, creating another non-clustered index on the clustered index might help.

更新

我很高兴它奏效了.现在解释一下.

I am pleased that it has worked. Now some explanation.

首先,如果有什么地方不对并且执行计划看起来很奇怪,请始终查看统计信息并重建索引.

First of all, if something is not really right and execution plan looks weird, always looks at statistics and rebuild indexes.

为聚集索引创建非聚集索引通常不会提供任何好处,但是当表有很多记录并且记录接近其 8K 限制时,它会有所帮助.如您所知,SQL 到磁盘加载记录时,会加载 8K 页.以类似的方式访问索引,它将加载一个 8K 页面.现在索引是一个 4 字节的整数,这意味着加载 2000 条记录的 ID,而如果它使用聚集索引,它将加载少量记录(请记住,我们只需要 JOIN 位的 ID).现在这是一个二分搜索,我不希望它有很大帮助.所以也许其他地方不太正确,但很难猜到没有看过系统.

Creating a non-clustered index for the clustered index usually should not provide any benefit but when the table has many records and the record is close to its 8K limit, it is helpful. As you know, SQL when it goes to the disk to load a record, it loads an 8K page. In a similar way going to indexes it will load an 8K page. Now with index being a 4-byte integer this means loading ID for 2000 records while it is going to load handful records if it uses clustered index (bear in mind all we need is the ID for the JOIN bit). Now with this being a binary search, I don't expect it to hugely help only a bit. So perhaps something else is not quite right, but difficult to guess not having seen the system.

这篇关于为什么这个 Sql 语句(有 2 个表连接)需要 5 分钟才能完成?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆