没有LIMIT的MySQL 5.7 RAND()和IF()会导致意外结果 [英] MySQL 5.7 RAND() and IF() without LIMIT leads to unexpected results
问题描述
我有以下查询
SELECT t.res, IF(t.res=0, "zero", "more than zero")
FROM (
SELECT table.*, IF (RAND()<=0.2,1, IF (RAND()<=0.4,2, IF (RAND()<=0.6,3,0))) AS res
FROM table LIMIT 20) t
返回如下内容:
这正是您所期望的.但是,一旦删除LIMIT 20
,我会收到非常意外的结果(返回的行多于20,因此将其切断以使其更易于阅读):
That's exactly what you would expect. However, as soon as I remove the LIMIT 20
I receive highly unexpected results (there are more rows returned than 20, I cut it off to make it easier to read):
SELECT t.res, IF(t.res=0, "zero", "more than zero")
FROM (
SELECT table.*, IF (RAND()<=0.2,1, IF (RAND()<=0.4,2, IF (RAND()<=0.6,3,0))) AS res
FROM table) t
旁注:
我正在使用MySQL 5.7.18-15-log,这是一个高度抽象的示例(实际查询要困难得多).
我试图了解正在发生的事情.我不需要提供解决方法的答案,而无需任何解释为什么原始版本无法正常工作.谢谢.
Side notes:
I'm using MySQL 5.7.18-15-log and this is a highly abstracted example (real query is much more difficult).
I'm trying to understand what is happening. I do not need answers that offer work arounds without any explanations why the original version is not working. Thank you.
更新:
除了使用LIMIT
之外,GROUP BY id
在第一种情况下也可以使用.
Update:
Instead of using LIMIT
, GROUP BY id
also works in the first case.
更新2:
根据zerkms的要求,我在第二个示例中添加了t.res = 0
和t.res + 1
Update 2:
As requested by zerkms, I added t.res = 0
and t.res + 1
to the second example
推荐答案
问题是由MySQL 5.7中引入的一项更改引起的,该更改涉及如何处理(子)查询中的派生表.
基本上,为了优化性能,当子查询返回不确定的结果时(例如在我的RAND()
中),某些子查询会在不同的时间和/或多次执行,从而导致意外的结果.
有两种简单(也很丑陋)的变通方法来使MySQL实现"(又称确定性结果)以下子查询:使用LIMIT <high number>
或GROUP BY id
都强制MySQL实现子查询并返回期望的结果. br>
最后一个选项是关闭optimizer_switch
变量中的derived_merge
:derived_merge=off
(确保所有其他参数保持不变).
The problem is caused by a change introduced in MySQL 5.7 on how derived tables in (sub)queries are treated.
Basically, in order to optimize performance, some subqueries are executed at different times and / or multiple times leading to unexpected results when your subquery returns non-deterministic results (like in my case with RAND()
).
There are two easy (and likewise ugly) workarounds to get MySQL to "materialize" (aka return deterministic results) these subqueries: Use LIMIT <high number>
or GROUP BY id
both of which force MySQL to materialize the subquery and return the expected results.
The last option is turn off derived_merge
in the optimizer_switch
variable: derived_merge=off
(make sure to leave all the other parameters as they are).
更多读数:
https://mysqlserverteam.com/derived-tables-in-mysql-5 -7/
子查询的rand()列re -针对MySQL 5.7/8.0与MySQL 5.6中的每个重复选择进行评估
Further readings:
https://mysqlserverteam.com/derived-tables-in-mysql-5-7/
Subquery's rand() column re-evaluated for every repeated selection in MySQL 5.7/8.0 vs MySQL 5.6
这篇关于没有LIMIT的MySQL 5.7 RAND()和IF()会导致意外结果的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!