MYSQL连接更新内部步骤 [英] MYSQL join update internal steps
问题描述
使用表和数据:
CREATE TABLE test.tem(a INT,b INT,INDEX(a), INDEX(b));
INSERT INTO test.tem VALUES(1,2),(1,1),(1,NULL),(2,3);
现在数据应该是:
+ ------ + ------ +
| a | b |
+ ------ + ------ +
| 1 | 2 |
| 1 | 1 |
| 1 | NULL |
| 2 | 3 |
+ ------ + ------ +
我想要将列b更新为min(b)组a列。
我已经知道一个正确的SQL是:
pre $ code> UPDATE TEM AS
JOIN(SELECT a,MIN(b)AS m FROM tem GROUP BY a)AS t2
USING(a)
SET t1.b = t2。米;
产生正确的结果是:
+ ------ + ------ +
| a | b |
+ ------ + ------ +
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 2 | 3 |
+ ------ + ------ +
然而,大约需要5分钟时间用一个SQL查询更新一个表中的450万条记录。
所以,我有一个自己的SQL: UPDATE test.tem t1
JOIN test.tem t2
ON t1.a = t2.a
SET t1。 b = t2.b
WHERE t1.b> t2.b
或t1.b IS NULL;
但它会得到不正确的结果:
+ ------ + ------ +
| a | b |
+ ------ + ------ +
| 1 | 1 |
| 1 | 1 |
| 1 | 2 |
| 2 | 3 |
+ ------ + ------ +
我认为原因是关于MYSQL如何在更新时工作。任何人都可以告诉我 不正确的结果出来了吗?如果有人可以修复我的 SQL,那也会有所帮助。 为解决方案
行:
您想要将 a
的行使用以下 JOIN
来做到这一点:
UPDATE test.tem t1
JOIN test.tem t2
ON t1.a = t2.a
SET t1.b = t2.b
WHERE t1.b> t2.b
或t1.b IS NULL;
与您可能认为的相反, JOIN
不会执行1-1 JOIN
。这实际上是自从正如我昨天所说的 JOIN / b>你的join子句中不使用主键(也不是非空的唯一键)。
实际上,将该查询重写为 SELECT
可能会帮助你理解问题:
SELECT t1.a as t1a, t1.b as t1b,t2.a as t2a,t2.b as t2b FROM tem t1 JOIN tem t2
ON t1.a = t2.a
WHERE t1.b> t2.b
或t1.b IS NULL;
+ ------ + --------- + ------ + -------- +
| T1A | T1B | T2A | T2B |
+ ------ + --------- + ------ + -------- +
| 1 | (null)| 1 | 2 |
| 1 | 2 | 1 | 1 |
| 1 | (null)| 1 | 1 |
| 1 | (null)| 1 | (null)|
+ ------ + --------- + ------ + -------- +
http:// sqlfiddle .com /#!2 / 856a7 / 8
现在您会看到,(1,null) code> match
(1,1)
,(1,2)和
(1,null)
。根据查询执行的(非确定性)顺序,这可能为 b
('m不确定,但是甚至可以更新它几次 )。在某种程度上,你很幸运地在测试中发现错误的结果!
我希望这解释了为什么你的查询不会产生预期的结果。由于多表 UPDATE
声明不允许 ORDER BY
和 GROUP BY
子句来查找好的结果,我没有看到许多其他的选择,通过子查询找到最低的第一个 ...
Use the table and data:
CREATE TABLE test.tem(a INT,b INT,INDEX (a),INDEX (b));
INSERT INTO test.tem VALUES(1,2),(1,1),(1,NULL),(2,3);
Now the data should be:
+------+------+
| a | b |
+------+------+
| 1 | 2 |
| 1 | 1 |
| 1 | NULL |
| 2 | 3 |
+------+------+
I want to update column b to the min(b) group by column a.
I have known one correct SQL is:
UPDATE tem AS t1
JOIN (SELECT a,MIN(b) AS m FROM tem GROUP BY a) AS t2
USING (a)
SET t1.b = t2.m;
Producing the correct result is:
+------+------+
| a | b |
+------+------+
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 2 | 3 |
+------+------+
However, it takes about 5 minutes to update in one table with 4.5 million records using this SQL query.
So, I have one SQL of myself:
UPDATE test.tem t1
JOIN test.tem t2
ON t1.a = t2.a
SET t1.b = t2.b
WHERE t1.b > t2.b
OR t1.b IS NULL;
But it gets the incorrect result:
+------+------+
| a | b |
+------+------+
| 1 | 1 |
| 1 | 1 |
| 1 | 2 |
| 2 | 3 |
+------+------+
I think the reason is about how MYSQL work when update. Can anyone tell me how the incorrect result come out? If someone could fix my SQL that would help too.
For the "query not updating correctly the rows":
You want to update column b
to the minimum of b
for all rows having the same a
You proposed to use the following JOIN
to do that:
UPDATE test.tem t1
JOIN test.tem t2
ON t1.a = t2.a
SET t1.b = t2.b
WHERE t1.b > t2.b
OR t1.b IS NULL;
Contrary to what you may think, that JOIN
will not perform a 1-1 JOIN
. It is in fact a many-to-many JOIN
since as I said yesterday you don't use primary key (nor non-null unique key) in your join clause.
In fact, rewriting that query as as SELECT
will probably help you to understand the problem:
SELECT t1.a as t1a, t1.b as t1b, t2.a as t2a,t2.b as t2b FROM tem t1 JOIN tem t2
ON t1.a = t2.a
WHERE t1.b > t2.b
OR t1.b IS NULL;
+------+---------+------+--------+
| T1A | T1B | T2A | T2B |
+------+---------+------+--------+
| 1 | (null) | 1 | 2 |
| 1 | 2 | 1 | 1 |
| 1 | (null) | 1 | 1 |
| 1 | (null) | 1 | (null) |
+------+---------+------+--------+
http://sqlfiddle.com/#!2/856a7/8
As you will see now, the row (1, null)
match (1, 1)
, (1, 2)
and (1, null)
. Depending the (non-deterministic) order of execution of the query, this might assign any of the three possible values for b
('m not sure about that, but maybe even updating it several times). To some extend, you have been lucky to find the "wrong" result while testing!
I hope this explain a little bit more why your query does not produce the expected result. Since multi-table UPDATE
statements don't allow ORDER BY
nor GROUP BY
clauses, as of myself, to find the "good" result, I don't see many other options than finding the minimum first through a sub-query...
这篇关于MYSQL连接更新内部步骤的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!