如何在同一张表上更新具有不同值的多个重复项? [英] How to update multiple duplicates with different values on the same table?

查看:116
本文介绍了如何在同一张表上更新具有不同值的多个重复项?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在处理的表具有多个行,它们对于 lat lon 具有相同的值。该示例显示 1 3 5 相同的位置,但名称属性不同。 hash 名称 lat lon ,因此不同。

The table I am dealing with has multiple rows which have the same values for lat and lon. The example shows that 1, 3, 5 have the same location but the name attribute differs. The hash is built from name, lat and lon and differs therefore.

BEFORE:

 id | name | lat | lon | flag | hash
----+------+-----+-----+------+------
 1  | aaa  | 16  | 48  |  0   | 2cd       <-- duplicate
 2  | bbb  | 10  | 22  |  0   | 3fc
 3  | ccc  | 16  | 48  |  0   | 8ba       <-- duplicate
 4  | ddd  | 10  | 23  |  0   | c33
 5  | eee  | 16  | 48  |  0   | 751       <-- duplicate

我需要在此表中标识重复,并要分配其中一个标记 1 primary )和标志 2 次要)给其他人。

I need to identify "duplicates" within this table and want to assign the flag 1 (primary) to one of them and the flag 2 (secondary) to the others. It is not important which "duplicate" is flagged as primary.

 AFTER:

 id | name | lat | lon | flag | hash
----+------+-----+-----+------+------
 1  | aaa  | 16  | 48  |  1   | 2cd       <-- updated
 2  | bbb  | 10  | 22  |  0   | 3fc
 3  | ccc  | 16  | 48  |  2   | 8ba       <-- updated
 4  | ddd  | 10  | 23  |  0   | c33
 5  | eee  | 16  | 48  |  2   | 751       <-- updated

我开始尝试使用 INNER JOIN 灵感来自这篇文章,这个视觉描述。有了这个,我可以为所有的副本分配相同的标志。

I started experimenting with INNER JOIN inspired by this post and this visual description. With this I am able to assign the same flag to all duplicates.

UPDATE table t1
INNER JOIN table_name t2
ON
  t1.lat = t2.lat
  AND t1.lon = t2.lon
  AND t1.hash != t2.hash
SET
  t1.flag = 2;

我还测试了 LEFT OUTER JOIN code> WHERE t2.id IS NULL 当只有两行时,它可以工作。但是,我无法想象我的头脑如何一个 JOIN 应该使用两个重复的。马克·哈里森也假定您正在加入不重复的列 在他的帖子开头这听起来好像不是一个好主意。

I also tested LEFT OUTER JOIN with WHERE t2.id IS NULL which could work when there are only two rows. However, I cannot think my head off how a JOIN should work with more then two duplicates. Mark Harrison also assumes "that you're joining on columns with no duplicates" at the beginning of his post which sound as if this is not a good idea.

如果感兴趣,我使用MySQL。

I am using MySQL if this is of interest.

推荐答案

不知道这是非常有效的,但它只需一个查询

Not sure this is very efficient, but it works in just one query:

UPDATE t
JOIN (
  SELECT MAX(t.id) AS maxid, lat, lon
  FROM t
  JOIN t AS duplicates
  USING (lat, lon)
  GROUP BY lat, lon
  HAVING COUNT(*) > 1
) AS maxima USING (lat, lon)
SET flag = IF(id = maxid, 1, 2);

这篇关于如何在同一张表上更新具有不同值的多个重复项?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆