order by slows query down massively

Problem description

Using SQL Server 2014 (SP1-CU3) (KB3094221), Oct 10 2015, x64.

I have the following query

    SELECT *
    FROM dbo.table1 t1
    LEFT JOIN dbo.table2 t2 ON t2.trade_id = t1.tradeNo
    LEFT JOIN dbo.table3 t3 ON t3.TradeReportID = t1.tradeNo
    ORDER BY t1.tradeNo

There are ~70k, 35k and 73k rows in t1, t2 and t3 respectively.

When I omit the order by, this query executes in 3 seconds and returns 73k rows.

As written, the query had returned ~50k rows after 8.5 minutes, at which point I stopped it.

Switching the order of the LEFT JOINs makes a difference:

    SELECT *
    FROM dbo.table1 t1
    LEFT JOIN dbo.table3 t3 ON t3.TradeReportID = t1.tradeNo
    LEFT JOIN dbo.table2 t2 ON t2.trade_id = t1.tradeNo
    ORDER BY t1.tradeNo

This now runs in 3 seconds.

I don't have any indexes on the tables. Adding indexes on t1.tradeNo, t2.trade_id and t3.TradeReportID has no effect. Running the query with only one LEFT JOIN (either of them) in combination with the order by is fast.

It's fine for me to swap the order of the LEFT JOINs, but that doesn't go far towards explaining why this happens and under what scenarios it may happen again.

The estimated execution plan is (slow):

[execution plan screenshot; the exclamation-mark warning details are quoted in UPDATE 3]

vs. switching the order of the left joins (fast):

[execution plan screenshot]

I note these are markedly different, but I cannot interpret them to explain the performance difference.

UPDATE

It appears the addition of the order by clause results in the execution plan using a Table Spool (Lazy Spool), which the fast query's plan does not use. If I turn off the table spool via DBCC RULEOFF ('BuildSpool'); this 'fixes' the speed, but according to this post that isn't recommended, and it can't be applied per query anyway.
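
For reference, the experiment looked like this (a sketch; DBCC RULEOFF is undocumented and instance-wide, and DBCC RULEON as the way to restore the default is an assumption worth verifying):

    -- NOT recommended outside a test instance: undocumented, affects the
    -- whole instance, and cannot be scoped to a single query.
    DBCC RULEOFF ('BuildSpool');  -- stop the optimizer considering spool builds
    -- ... run and time the query here ...
    DBCC RULEON ('BuildSpool');   -- assumed counterpart to restore the default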

UPDATE 2

One of the columns returned (table3.Text) has type varchar(max). If this is changed to nvarchar(512) then the original (slow) query is now fast - i.e. the execution plan now decides not to use the Table Spool. Also note that even though the type is varchar(max), the field value is NULL in every one of the rows. This is now fixable, but I am none the wiser.
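
For reference, the type change was along these lines (a sketch; since every value in the column is NULL there is no truncation risk here, but check real data before narrowing a type):

    ALTER TABLE dbo.table3 ALTER COLUMN [Text] nvarchar(512) NULL;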

UPDATE 3

Warnings in the execution plan stated

Type conversion in expression (CONVERT_IMPLICIT(nvarchar(50),[t2].[trade_id],0)) may affect "CardinalityEstimate" in query plan choice, ...

t1.tradeNo is nvarchar(21) - the other two are varchar(50). After altering the latter two to the same type as the first, the problem disappears! (This leaves the varchar(max) column from UPDATE 2 unchanged.)

Given this problem goes away when either UPDATE 2 or UPDATE 3 is rectified, I would guess it's down to the query optimizer using a temp table (table spool) for a column that has an unbounded size - very interesting given that the varchar(max) column holds no data at all.

Solution

Your likely best fix is to make sure both sides of your joins have the same datatype. There's no need for one to be varchar and the other nvarchar.
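
A minimal sketch of that fix, using the names from the question (nvarchar(21) mirrors what UPDATE 3 did; widen it if real IDs can be longer, and re-state NOT NULL if the columns are declared that way):

    -- Match both join columns to t1.tradeNo (nvarchar(21)) so the join
    -- needs no CONVERT_IMPLICIT and cardinality estimation can work.
    ALTER TABLE dbo.table2 ALTER COLUMN trade_id nvarchar(21);
    ALTER TABLE dbo.table3 ALTER COLUMN TradeReportID nvarchar(21);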

This is a class of problem that comes up quite frequently in DBs: the database assumes the wrong thing about the composition of the data it's about to deal with, so the costs shown in your estimated execution plan can be a long way from what you'd get in the actual plan. We all make mistakes, and it would be good if SQL Server learned from its own, but currently it doesn't: it will keep estimating a 2-second return time despite being immediately proven wrong again and again. To be fair, I don't know of any DBMS that has machine learning to do better.

Where your query is fast

The hardest part is done up front by sorting table3. That means it can do an efficient merge join, which in turn means it has no reason to be lazy about spooling.
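
If you want to see that plan shape without rewriting the query, a join hint can force it for comparison (a diagnostic sketch only; OPTION (MERGE JOIN) applies to every join in the statement and isn't something to leave in production code):

    SELECT *
    FROM dbo.table1 t1
    LEFT JOIN dbo.table2 t2 ON t2.trade_id = t1.tradeNo
    LEFT JOIN dbo.table3 t3 ON t3.TradeReportID = t1.tradeNo
    ORDER BY t1.tradeNo
    OPTION (MERGE JOIN);  -- compare this plan and timing against the original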

Where it's slow

Having an ID that refers to the same thing stored as two different types and data lengths shouldn't ever be necessary, and it's never a good idea for an ID; in this case it's nvarchar in one place and varchar in another. When that makes it fail to get a cardinality estimate, that's the key flaw, and here's why:

The optimizer is hoping to need only a few unique rows from table3: just a handful of options like "Male", "Female", "Other". That is what is known as "low cardinality". So imagine that, for some weird reason, tradeNo actually contained IDs for genders. Remember, it's you, with your human skills of contextualisation, who knows that's very unlikely; the DB is blind to that. So here is what it expects to happen: as it executes the query, the first time it encounters the ID for "Male" it lazily fetches the associated data (like the word "Male") and puts it in the spool. Next, because the input is sorted, it expects just a lot more males, so it can simply re-use what it has already put in the spool.

Basically, it plans to fetch the data from tables 1 and 2 in a few big chunks, stopping once or twice to fetch new details from table 3. In practice the stopping isn't occasional; it may in fact be stopping for every single row, because there are lots of different IDs here. The lazy spool is like going upstairs to get one small thing at a time: good if you think you just need your wallet, not so good if you're moving house, in which case you'll want a big box (the eager spool).

The likely reason that shrinking the field in table3 helped is that it lowered the estimated benefit of a lazy spool relative to a full sort up front. With varchar it doesn't know how much data there is, only how much there could potentially be, and the bigger the potential chunks of data that need shuffling, the more physical work it expects to do.

What you can do to avoid in future

Make your table schema and indexes reflect the real shape of the data.

  • If an ID can be varchar in one table then it's very unlikely to need the extra characters available in nvarchar in another table. Avoid the need for conversions on IDs, and use integers instead of characters where possible.
  • Ask yourself if any of these tables needs tradeNo to be filled in for all rows. If so, make it not nullable on that table. Next, ask if the ID should be unique for any of these tables, and set it up as such in the appropriate index; unique is the definition of maximum cardinality, so it won't make that mistake again. Both changes are sketched below.
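
A minimal sketch of those two changes (assuming tradeNo really is required and unique in table1; the index name is illustrative):

    -- Make the column required, then declare uniqueness so the optimizer
    -- knows tradeNo has maximum cardinality on this table.
    ALTER TABLE dbo.table1 ALTER COLUMN tradeNo nvarchar(21) NOT NULL;
    CREATE UNIQUE INDEX IX_table1_tradeNo ON dbo.table1 (tradeNo);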

Nudge in the right direction with join order.

  • The order you have your joins in the SQL is a signal to the database about how powerful/difficult you expect each join to be. (Sometimes, as a human, you know more: if querying for 50-year-old astronauts you know that filtering for astronauts should be the first filter applied, but you might begin with the age when searching for 50-year-old office workers.) The heavy stuff should come first. The optimizer will ignore you if it thinks it has the information to know better, but in this case it's relying on your knowledge.

If all else fails

  • A possible fix would be to INCLUDE all the fields you'll need from table3 in the index on TradeReportID. The reason the indexes couldn't help much on their own is that they make it easy to identify how to re-sort, but the re-sort still hasn't physically been done. That is the work it was hoping to optimize away with a lazy spool; if the data were included in the index it would already be available, so there would be nothing to optimize. A sketch follows.
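
A sketch of such a covering index (the INCLUDE list is hypothetical; include whichever table3 columns the query actually returns - LOB types such as varchar(max) are allowed in an INCLUDE list, though they make the index large):

    CREATE INDEX IX_table3_TradeReportID
        ON dbo.table3 (TradeReportID)
        INCLUDE ([Text]);  -- add any other table3 columns the query needs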
