How does the IN predicate work in SQL?

Problem description

After preparing an answer for this question, I found I couldn't verify my answer.

In my first programming job I was told that a query within the IN () predicate gets executed for every row contained in the parent query, and therefore using IN should be avoided.

For example, given the query:

SELECT count(*) FROM Table1 WHERE Table1Id NOT IN (
SELECT Table1Id FROM Table2 WHERE id_user = 1)

Table1 Rows | # of "IN" executions
----------------------------------
      10    |       10
     100    |      100
    1000    |     1000
   10000    |    10000

Is this correct? How does the IN predicate actually work?

Solution

The warning you got about subqueries executing for each row is true -- for correlated subqueries.

SELECT COUNT(*) FROM Table1 a 
WHERE a.Table1id NOT IN (
  SELECT b.Table1Id FROM Table2 b WHERE b.id_user = a.id_user
);

Note that the subquery references the id_user column of the outer query. The value of id_user on each row of Table1 may be different. So the subquery's result will likely be different, depending on the current row in the outer query. The RDBMS must execute the subquery many times, once for each row in the outer query.
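To make the "once per outer row" idea concrete, here is a rough sketch in Python with SQLite. The table contents are made up for illustration, and `count_by_hand` is a hypothetical helper that imitates naive per-row evaluation; it is not how any particular engine is implemented internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative data only; the column layout is an assumption based on the queries above.
conn.executescript("""
    CREATE TABLE Table1 (Table1Id INTEGER, id_user INTEGER);
    CREATE TABLE Table2 (Table1Id INTEGER, id_user INTEGER);
    INSERT INTO Table1 VALUES (1, 1), (2, 1), (3, 2), (4, 2);
    INSERT INTO Table2 VALUES (1, 1), (3, 1), (4, 2);
""")

def count_by_hand(conn):
    """Imitate naive evaluation: rerun the inner query for every outer row."""
    executions = 0
    matches = 0
    outer_rows = conn.execute("SELECT Table1Id, id_user FROM Table1").fetchall()
    for table1_id, id_user in outer_rows:
        # The inner query depends on the current outer row's id_user.
        inner = conn.execute(
            "SELECT Table1Id FROM Table2 WHERE id_user = ?", (id_user,)
        ).fetchall()
        executions += 1
        if table1_id not in {row[0] for row in inner}:
            matches += 1
    return matches, executions

by_hand, executions = count_by_hand(conn)

# The same question asked as a single correlated NOT IN query.
by_sql = conn.execute("""
    SELECT COUNT(*) FROM Table1 a
    WHERE a.Table1Id NOT IN (
        SELECT b.Table1Id FROM Table2 b WHERE b.id_user = a.id_user
    )
""").fetchone()[0]

print(by_hand, by_sql, executions)
```

With four rows in Table1, the hand-written loop runs the inner query four times, which is the growth pattern the table in the question describes.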

The example you tested is a non-correlated subquery. Most modern RDBMS optimizers worth their salt should be able to tell when the subquery's result doesn't depend on the values in each row of the outer query. In that case, the RDBMS runs the subquery a single time, caches its result, and uses it repeatedly for the predicate in the outer query.
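The "run once, cache, reuse" behaviour can be imitated by hand. The sketch below uses Python with SQLite and made-up table contents; the `cached_ids` set stands in for whatever internal structure a real optimizer might use, which is an assumption and not a description of any specific engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative data only.
conn.executescript("""
    CREATE TABLE Table1 (Table1Id INTEGER);
    CREATE TABLE Table2 (Table1Id INTEGER, id_user INTEGER);
    INSERT INTO Table1 VALUES (1), (2), (3), (4);
    INSERT INTO Table2 VALUES (1, 1), (3, 1), (4, 2);
""")

# Run the non-correlated subquery a single time and keep its result.
cached_ids = {
    row[0]
    for row in conn.execute("SELECT Table1Id FROM Table2 WHERE id_user = 1")
}

# Reuse the cached result for every outer row; no further inner queries run.
outer_rows = conn.execute("SELECT Table1Id FROM Table1").fetchall()
count_cached = sum(1 for (table1_id,) in outer_rows if table1_id not in cached_ids)

# The original query from the question, for comparison.
count_sql = conn.execute("""
    SELECT COUNT(*) FROM Table1 WHERE Table1Id NOT IN (
        SELECT Table1Id FROM Table2 WHERE id_user = 1)
""").fetchone()[0]

print(count_cached, count_sql)
```

Both counts agree, and the inner query is issued only once no matter how many rows Table1 has.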

PS: In SQL, IN() is called a "predicate," not a statement. A predicate is a part of the language that evaluates to either true or false, but cannot necessarily be executed independently as a statement. That is, you can't just run this as an SQL query: "2 IN (1,2,3);" Although this is a valid predicate, it's not a valid statement.
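A quick way to see the difference is to try both forms. This sketch uses Python's sqlite3 module, so the behaviour shown is SQLite's; wrapping the predicate in a SELECT happens to work there, but not every engine accepts a bare boolean expression in the select list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A predicate on its own is not a statement, so the parser rejects it.
try:
    conn.execute("2 IN (1,2,3);")
    bare_predicate_ran = True
except sqlite3.OperationalError:
    bare_predicate_ran = False

# Inside a SELECT statement, SQLite evaluates the predicate to 1 (true).
value = conn.execute("SELECT 2 IN (1,2,3);").fetchone()[0]

print(bare_predicate_ran, value)
```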
