Why does a query with an IN (subquery) take longer than a query with IN (discrete list)?


Question

This has always bugged me: why does this query

SELECT 
  * 
FROM
  `TABLE` 
WHERE `value` IN 
  (SELECT 
    val 
  FROM
    OTHER_TABLE 
  WHERE `date` < '2014-01-01')

run orders of magnitude slower than running both of the following queries sequentially, first this query

SELECT 
  `val` 
FROM
  OTHER_TABLE 
WHERE `date` < '2014-01-01' 

Result:
+-----+
| val |
+-----+
| v1  |
| v2  |
| v3  |
| v7  |
| v12 |
+-----+

and then this query:

SELECT 
  * 
FROM
  `TABLE` 
WHERE `value` IN ('v1', 'v2', 'v3', 'v7', 'v12')

Recommended answer

From the docs (emphasis added by me):

Subquery optimization for IN is not as effective as for the = operator or for the IN(value_list) operator.

A typical case for poor IN subquery performance is when the subquery returns a small number of rows but the outer query returns a large number of rows to be compared to the subquery result.

The problem is that, for a statement that uses an IN subquery, the optimizer rewrites it as a correlated subquery. Consider the following statement that uses an uncorrelated subquery:

SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);

The optimizer rewrites the statement to a correlated subquery:

SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);

If the inner and outer queries return M and N rows, respectively, the execution time becomes on the order of O(M×N), rather than O(M+N) as it would be for an uncorrelated subquery.
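To make the O(M×N) versus O(M+N) difference concrete, here is a rough toy model in Python. It is not how MySQL executes anything internally; it only counts steps under the assumption that the correlated form re-scans the inner rows for every outer row with no index, while the uncorrelated form evaluates the inner rows once into a hash set. The function names and the row values are made up for illustration.

```python
def correlated_scan(outer_rows, inner_rows):
    """Re-scan the inner rows for every outer row (toy model of the EXISTS rewrite)."""
    comparisons = 0
    matches = []
    for a in outer_rows:
        for b in inner_rows:
            comparisons += 1
            if a == b:
                matches.append(a)
                break  # EXISTS can stop at the first matching inner row
    return matches, comparisons


def materialized_lookup(outer_rows, inner_rows):
    """Read the inner rows once into a set, then probe it once per outer row."""
    steps = 0
    lookup = set()
    for b in inner_rows:
        steps += 1
        lookup.add(b)
    matches = []
    for a in outer_rows:
        steps += 1
        if a in lookup:
            matches.append(a)
    return matches, steps


# Hypothetical data: N = 1000 outer rows, M = 5 inner rows.
outer = [f"v{i}" for i in range(1, 1001)]
inner = ["v1", "v2", "v3", "v7", "v12"]

slow_matches, slow_cost = correlated_scan(outer, inner)
fast_matches, fast_cost = materialized_lookup(outer, inner)
print(slow_cost, fast_cost)  # roughly M*N versus exactly M+N
```

Both functions return the same matching rows; only the amount of work differs, and the gap widens as the outer table grows.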

An implication is that an IN subquery can be much slower than a query written using an IN(value_list) operator that lists the same values that the subquery would return.
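One way to act on this from application code is the two-step approach described in the question: run the inner query first, then pass its results to the outer query as a literal IN list. The sketch below shows the pattern in Python. It uses the standard-library sqlite3 module with an in-memory database purely as a runnable stand-in, so it demonstrates the query shape, not MySQL's performance. The table names main_table and other_table, the sample rows, and the helper rows_matching_in_list are all hypothetical. MySQL drivers for Python typically use %s rather than ? as the placeholder.

```python
import sqlite3

# In-memory stand-in database with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE other_table (val TEXT, date TEXT);
    CREATE TABLE main_table (id INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO other_table VALUES
        ('v1', '2013-05-01'), ('v2', '2013-06-01'), ('v9', '2014-03-01');
    INSERT INTO main_table (value) VALUES ('v1'), ('v2'), ('v9'), ('v5');
""")


def rows_matching_in_list(conn, cutoff):
    """Run the former subquery first, then feed its values to IN (value_list)."""
    # Step 1: the inner query, executed once on its own.
    vals = [row[0] for row in conn.execute(
        "SELECT DISTINCT val FROM other_table WHERE date < ?", (cutoff,))]
    if not vals:
        return []  # an empty IN () list is a syntax error in MySQL, so stop here

    # Step 2: the outer query with one bound parameter per value.
    placeholders = ", ".join("?" for _ in vals)
    sql = (f"SELECT id, value FROM main_table "
           f"WHERE value IN ({placeholders}) ORDER BY id")
    return conn.execute(sql, vals).fetchall()


result = rows_matching_in_list(conn, "2014-01-01")
print(result)
```

Binding the values as parameters instead of concatenating them into the SQL string keeps the two-step version safe against injection. The two statements are not atomic, so the result can differ from the single-statement form if other_table changes between them.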

http://dev.mysql.com/doc/refman/5.7/en/subquery-restrictions.html

Hope this helps anyone else who might have been curious.
