没有LIMIT的MySQL 5.7 RAND()和IF()会导致意外结果 [英] MySQL 5.7 RAND() and IF() without LIMIT leads to unexpected results

查看:168
本文介绍了没有LIMIT的MySQL 5.7 RAND()和IF()会导致意外结果的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下查询

SELECT t.res, IF(t.res=0, "zero", "more than zero")
FROM (
SELECT table.*, IF (RAND()<=0.2,1, IF (RAND()<=0.4,2, IF (RAND()<=0.6,3,0))) AS res
FROM table LIMIT 20) t

返回如下内容:

这正是您所期望的.但是,一旦删除LIMIT 20,我会收到非常意外的结果(返回的行多于20,因此将其切断以使其更易于阅读):

That's exactly what you would expect. However, as soon as I remove the LIMIT 20 I receive highly unexpected results (there are more rows returned than 20, I cut it off to make it easier to read):

SELECT t.res, IF(t.res=0, "zero", "more than zero")
FROM (
SELECT table.*, IF (RAND()<=0.2,1, IF (RAND()<=0.4,2, IF (RAND()<=0.6,3,0))) AS res
FROM table) t

旁注:
我正在使用MySQL 5.7.18-15-log,这是一个高度抽象的示例(实际查询要困难得多).
我试图了解正在发生的事情.我不需要提供解决方法的答案,而无需任何解释为什么原始版本无法正常工作.谢谢.

Side notes:
I'm using MySQL 5.7.18-15-log and this is a highly abstracted example (real query is much more difficult).
I'm trying to understand what is happening. I do not need answers that offer work arounds without any explanations why the original version is not working. Thank you.

更新: 除了使用LIMIT之外,GROUP BY id在第一种情况下也可以使用.

Update: Instead of using LIMIT, GROUP BY id also works in the first case.

更新2: 根据zerkms的要求,我在第二个示例中添加了t.res = 0t.res + 1

Update 2: As requested by zerkms, I added t.res = 0 and t.res + 1 to the second example

推荐答案

问题是由MySQL 5.7中引入的一项更改引起的,该更改涉及如何处理(子)查询中的派生表.
基本上,为了优化性能,当子查询返回不确定的结果时(例如在我的RAND()中),某些子查询会在不同的时间和/或多次执行,从而导致意外的结果. 有两种简单(也很丑陋)的变通方法来使MySQL实现"(又称确定性结果)以下子查询:使用LIMIT <high number>GROUP BY id都强制MySQL实现子查询并返回期望的结果. br> 最后一个选项是关闭optimizer_switch变量中的derived_merge:derived_merge=off(确保所有其他参数保持不变).

The problem is caused by a change introduced in MySQL 5.7 on how derived tables in (sub)queries are treated.
Basically, in order to optimize performance, some subqueries are executed at different times and / or multiple times leading to unexpected results when your subquery returns non-deterministic results (like in my case with RAND()).
There are two easy (and likewise ugly) workarounds to get MySQL to "materialize" (aka return deterministic results) these subqueries: Use LIMIT <high number> or GROUP BY id both of which force MySQL to materialize the subquery and return the expected results.
The last option is turn off derived_merge in the optimizer_switch variable: derived_merge=off (make sure to leave all the other parameters as they are).

更多读数:
https://mysqlserverteam.com/derived-tables-in-mysql-5 -7/
子查询的rand()列re -针对MySQL 5.7/8.0与MySQL 5.6中的每个重复选择进行评估

Further readings:
https://mysqlserverteam.com/derived-tables-in-mysql-5-7/
Subquery's rand() column re-evaluated for every repeated selection in MySQL 5.7/8.0 vs MySQL 5.6

这篇关于没有LIMIT的MySQL 5.7 RAND()和IF()会导致意外结果的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆