SQL Server 中的 INNER JOIN 与 LEFT JOIN 性能 [英] INNER JOIN vs LEFT JOIN performance in SQL Server

查看:44
本文介绍了SQL Server 中的 INNER JOIN 与 LEFT JOIN 性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经创建了在 9 个表上使用 INNER JOIN 的 SQL 命令,无论如何这个命令需要很长时间(超过五分钟).所以我的朋友建议我将 INNER JOIN 改为 LEFT JOIN 因为 LEFT JOIN 的性能更好,尽管我知道.改了之后,查询速度明显提高了.

I've created SQL command that uses INNER JOIN on 9 tables, anyway this command takes a very long time (more than five minutes). So my folk suggested me to change INNER JOIN to LEFT JOIN because the performance of LEFT JOIN is better, despite what I know. After I changed it, the speed of query got significantly improved.

我想知道为什么 LEFT JOIN 比 INNER JOIN 快?

I would like to know why LEFT JOIN is faster than INNER JOIN?

我的 SQL 命令如下所示:SELECT * FROM A INNER JOIN B ON ... INNER JOIN C ON ... INNER JOIN D 依此类推

My SQL command look like below: SELECT * FROM A INNER JOIN B ON ... INNER JOIN C ON ... INNER JOIN D and so on

更新:这是我的架构简介.

FROM sidisaleshdrmly a -- NOT HAVE PK AND FK
    INNER JOIN sidisalesdetmly b -- THIS TABLE ALSO HAVE NO PK AND FK
        ON a.CompanyCd = b.CompanyCd 
           AND a.SPRNo = b.SPRNo 
           AND a.SuffixNo = b.SuffixNo 
           AND a.dnno = b.dnno
    INNER JOIN exFSlipDet h -- PK = CompanyCd, FSlipNo, FSlipSuffix, FSlipLine
        ON a.CompanyCd = h.CompanyCd
           AND a.sprno = h.AcctSPRNo
    INNER JOIN exFSlipHdr c -- PK = CompanyCd, FSlipNo, FSlipSuffix
        ON c.CompanyCd = h.CompanyCd
           AND c.FSlipNo = h.FSlipNo 
           AND c.FSlipSuffix = h.FSlipSuffix 
    INNER JOIN coMappingExpParty d -- NO PK AND FK
        ON c.CompanyCd = d.CompanyCd
           AND c.CountryCd = d.CountryCd 
    INNER JOIN coProduct e -- PK = CompanyCd, ProductSalesCd
        ON b.CompanyCd = e.CompanyCd
           AND b.ProductSalesCd = e.ProductSalesCd 
    LEFT JOIN coUOM i -- PK = UOMId
        ON h.UOMId = i.UOMId 
    INNER JOIN coProductOldInformation j -- PK = CompanyCd, BFStatus, SpecCd
        ON a.CompanyCd = j.CompanyCd
            AND b.BFStatus = j.BFStatus
            AND b.ProductSalesCd = j.ProductSalesCd
    INNER JOIN coProductGroup1 g1 -- PK = CompanyCd, ProductCategoryCd, UsedDepartment, ProductGroup1Cd
        ON e.ProductGroup1Cd  = g1.ProductGroup1Cd
    INNER JOIN coProductGroup2 g2 -- PK = CompanyCd, ProductCategoryCd, UsedDepartment, ProductGroup2Cd
        ON e.ProductGroup1Cd  = g2.ProductGroup1Cd

推荐答案

LEFT JOIN 绝对不比 INNER JOIN 快.事实上,它更慢;根据定义,外连接(LEFT JOINRIGHT JOIN)必须完成 INNER JOIN 的所有工作以及 null 的额外工作- 扩展结果.预计还会返回更多行,从而进一步增加总执行时间,因为结果集较大.

A LEFT JOIN is absolutely not faster than an INNER JOIN. In fact, it's slower; by definition, an outer join (LEFT JOIN or RIGHT JOIN) has to do all the work of an INNER JOIN plus the extra work of null-extending the results. It would also be expected to return more rows, further increasing the total execution time simply due to the larger size of the result set.

(即使 LEFT JOIN 特定情况下由于一些难以想象的因素汇合而更快,在功能上不等同于 INNER JOIN,所以你不能简单地用另一个替换一个的所有实例!)

(And even if a LEFT JOIN were faster in specific situations due to some difficult-to-imagine confluence of factors, it is not functionally equivalent to an INNER JOIN, so you cannot simply go replacing all instances of one with the other!)

很可能您的性能问题出在其他地方,例如没有正确索引候选键或外键.9 张桌子需要加入很多,所以几乎可以在任何地方放慢速度.如果您发布自己的架构,我们或许可以提供更多详细信息.

Most likely your performance problems lie elsewhere, such as not having a candidate key or foreign key indexed properly. 9 tables is quite a lot to be joining so the slowdown could literally be almost anywhere. If you post your schema, we might be able to provide more details.

进一步思考这一点,我可以想到一种情况,在这种情况下,LEFT JOIN 可能比 INNER JOIN 更快,那就是:

Reflecting further on this, I could think of one circumstance under which a LEFT JOIN might be faster than an INNER JOIN, and that is when:

  • 有些表格非常很小(比如不到 10 行);
  • 表没有足够的索引来覆盖查询.
  • Some of the tables are very small (say, under 10 rows);
  • The tables do not have sufficient indexes to cover the query.

考虑这个例子:

CREATE TABLE #Test1
(
    ID int NOT NULL PRIMARY KEY,
    Name varchar(50) NOT NULL
)
INSERT #Test1 (ID, Name) VALUES (1, 'One')
INSERT #Test1 (ID, Name) VALUES (2, 'Two')
INSERT #Test1 (ID, Name) VALUES (3, 'Three')
INSERT #Test1 (ID, Name) VALUES (4, 'Four')
INSERT #Test1 (ID, Name) VALUES (5, 'Five')

CREATE TABLE #Test2
(
    ID int NOT NULL PRIMARY KEY,
    Name varchar(50) NOT NULL
)
INSERT #Test2 (ID, Name) VALUES (1, 'One')
INSERT #Test2 (ID, Name) VALUES (2, 'Two')
INSERT #Test2 (ID, Name) VALUES (3, 'Three')
INSERT #Test2 (ID, Name) VALUES (4, 'Four')
INSERT #Test2 (ID, Name) VALUES (5, 'Five')

SELECT *
FROM #Test1 t1
INNER JOIN #Test2 t2
ON t2.Name = t1.Name

SELECT *
FROM #Test1 t1
LEFT JOIN #Test2 t2
ON t2.Name = t1.Name

DROP TABLE #Test1
DROP TABLE #Test2

如果你运行这个并查看执行计划,你会看到INNER JOIN查询确实比LEFT JOIN花费更多,因为它满足两个以上标准.这是因为 SQL Server 想要对 INNER JOIN 进行哈希匹配,但对 LEFT JOIN 进行嵌套循环;前者通常要快得多,但是由于行数太少而且没有索引可以使用,因此哈希运算是整个过程中开销最大的部分.查询.

If you run this and view the execution plan, you'll see that the INNER JOIN query does indeed cost more than the LEFT JOIN, because it satisfies the two criteria above. It's because SQL Server wants to do a hash match for the INNER JOIN, but does nested loops for the LEFT JOIN; the former is normally much faster, but since the number of rows is so tiny and there's no index to use, the hashing operation turns out to be the most expensive part of the query.

您可以通过使用您最喜欢的编程语言编写程序来对具有 5 个元素的列表与具有 5 个元素的哈希表执行大量查找,从而看到相同的效果.由于大小,哈希表版本实际上更慢.但是将其增加到 50 个元素,或 5000 个元素,列表版本会慢到爬行,因为哈希表的 O(N) 与 O(1).

You can see the same effect by writing a program in your favourite programming language to perform a large number of lookups on a list with 5 elements, vs. a hash table with 5 elements. Because of the size, the hash table version is actually slower. But increase it to 50 elements, or 5000 elements, and the list version slows to a crawl, because it's O(N) vs. O(1) for the hashtable.

但是将此查询更改为在 ID 列而不是 Name 列上,您会看到一个非常不同的故事.在这种情况下,它会为两个查询执行嵌套循环,但是 INNER JOIN 版本能够用搜索替换聚集索引扫描之一 - 这意味着这实际上是 大量行数更快.

But change this query to be on the ID column instead of Name and you'll see a very different story. In that case, it does nested loops for both queries, but the INNER JOIN version is able to replace one of the clustered index scans with a seek - meaning that this will literally be an order of magnitude faster with a large number of rows.

所以结论或多或少是我上面几段提到的;这几乎可以肯定是索引或索引覆盖问题,可能与一个或多个非常小的表相结合.在这些情况下,SQL Server 可能有时会为 INNER JOIN 选择比 LEFT JOIN 更差的执行计划.

So the conclusion is more or less what I mentioned several paragraphs above; this is almost certainly an indexing or index coverage problem, possibly combined with one or more very small tables. Those are the only circumstances under which SQL Server might sometimes choose a worse execution plan for an INNER JOIN than a LEFT JOIN.

这篇关于SQL Server 中的 INNER JOIN 与 LEFT JOIN 性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆