使用连接从表中删除重复行 [英] Remove duplicate rows from table with join

查看:47
本文介绍了使用连接从表中删除重复行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有两个表来包含国家的州(state_table)和城市(city_table)

I have two table to contain state (state_table) and city (city_table) of countries

city 表具有 state_id 以将其与 state_table 关联

The city table is having state_id to relate it with state_table

两个表都已经有数据了.

Both the tables are already having data in it.

现在的问题

City 表包含一个州内一个城市的多个条目.其他城市可能有也可能没有相同的城市名称

City table contains multiple entries of a city within one state. And another cities may or may not have the same city name as well

例如:cityone 将在城市表中与 stateone 出现 5 次,在 statetwo 中出现 2 次

e.g.: cityone will have 5 occurrence in the city table with stateone and 2 occurrence with statetwo

那么我将如何编写查询来为每个州保留一个城市并删除其余的?

So how will I write a query to keep one city for each state and delete the rest?

架构遵循

CREATE TABLE IF NOT EXISTS `city_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `state_id` int(11) NOT NULL,
  `city` varchar(25) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;


CREATE TABLE IF NOT EXISTS `state_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `state` varchar(15) NOT NULL,
  `country_id` smallint(5) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;

这是样本数据

id   state_id   city
1   1   city_one
2   1   city_two
3   1   city_one
4   1   city_two
5   2   city_one
6   3   city_three
7   3   city_one
8   3   city_three
9   4   city_four
10  4   city_five

原始表有 152,451 行

Original table has 152,451 rows

推荐答案

如果要删除具有相同 state_id 的重复城市(重复记录),可以通过按 分组来实现citystate_id 并使用 MINMAX 函数:

If you want to remove duplicate city with same state_id (duplicate records), you can do that by grouping them by city and state_id and using MIN or MAX function:

在删除查询之前,您的表看起来像

Before delete query your table was looking like

| ID | STATE_ID |       CITY |
------------------------------
|  1 |        1 |   city_one |
|  2 |        1 |   city_two |
|  3 |        1 |   city_one |
|  4 |        1 |   city_two |
|  5 |        2 |   city_one |
|  6 |        3 | city_three |
|  7 |        3 |   city_one |
|  8 |        3 | city_three |
|  9 |        4 |  city_four |
| 10 |        4 |  city_five |

您可以使用以下查询来删除重复记录:

You can use the following query to remove duplicate records:

DELETE city_table 
  FROM city_table
  LEFT JOIN 
  (SELECT MIN(id) AS IDs FROM city_table
   GROUP BY city,state_id
  )A
  ON city_table.ID = A.IDs
  WHERE A.ids IS NULL;

应用上述查询后,您的表格将如下所示:

After applying the above query your table will look like:

| ID | STATE_ID |       CITY |
------------------------------
|  1 |        1 |   city_one |
|  2 |        1 |   city_two |
|  5 |        2 |   city_one |
|  6 |        3 | city_three |
|  7 |        3 |   city_one |
|  9 |        4 |  city_four |
| 10 |        4 |  city_five |

查看这个 SQLFiddle

更多信息请参见 DELETE MySQL 语法.

See this SQLFiddle

For more see DELETE Syntax of MySQL.

这篇关于使用连接从表中删除重复行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆