Deleting Duplicate Rows in a MySQL Database

Question

I have a table my_table [id, name, address, phone] with a lot of entries, and I would like to delete the duplicate data: any row whose phone duplicates another row's phone should be deleted.

Here is my attempt, but it does not give the result I want.

Inside my SQL file:

CREATE TABLE `my_table` (
  `id` int(10) NOT NULL default '0',
  `name` varchar(255) NOT NULL default '',
  `address` varchar(255) NOT NULL default '',
  `phone` varchar(255) NOT NULL default '',
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO `my_table` VALUES (1, 'Albert', 'EGYPT', '202020');
INSERT INTO `my_table` VALUES (2, 'John', 'USA', '984731');
INSERT INTO `my_table` VALUES (3, 'Steve', 'Romabia', '202020');
INSERT INTO `my_table` VALUES (4, 'Albert', 'EGYPT', '343354');

The rows with id 1 and 3 clearly have the same phone number, so the duplicate should be removed and the result should be:

INSERT INTO `my_table` VALUES (1, 'Albert', 'EGYPT', '202020');
INSERT INTO `my_table` VALUES (2, 'John', 'USA', '984731');
INSERT INTO `my_table` VALUES (3, 'Albert', 'EGYPT', '343354');

My attempt

I added the following to the SQL file:

1- Created a new table to hold the distinct rows

CREATE TABLE my_temp(id VARCHAR(10), name VARCHAR(255), address VARCHAR(255), phone VARCHAR(255));
INSERT INTO my_temp(id,name,address,phone) SELECT DISTINCT id,name,address,phone FROM my_table;

2- Deleted the entries from the real table my_table

DELETE FROM my_table;

3- Copied the entries back from my_temp to the real my_table

INSERT INTO my_table(id,name,address,phone) SELECT id,name,address,phone FROM my_temp;

4- Dropped the no longer needed table my_temp

DROP TABLE my_temp;

My problem

It still gives me the same rows:

INSERT INTO `my_table` VALUES (1, 'Albert', 'EGYPT', '202020');
INSERT INTO `my_table` VALUES (2, 'John', 'USA', '984731');
INSERT INTO `my_table` VALUES (3, 'Steve', 'Romabia', '202020');
INSERT INTO `my_table` VALUES (4, 'Albert', 'EGYPT', '343354');

This is because DISTINCT does not treat these rows as duplicates, since they differ in id, name, and address.

So how can I adjust my approach so that it deletes a row whenever its phone is duplicated, regardless of whether id, name, and address differ?

Note

I have tried adjusting this part:

INSERT INTO my_temp(id,name,address,phone) SELECT DISTINCT phone FROM my_table;

but it inserts the following into the my_temp table:

INSERT INTO `my_table` VALUES (1, 'null', 'null', '202020');
INSERT INTO `my_table` VALUES (2, 'null', 'null', '984731');
INSERT INTO `my_table` VALUES (3, 'null', 'null', '343354');

so I won't be able to get the data back into my_table.

Recommended answer

I would do it this way:

  1. Create a temporary table from your existing table:

    CREATE TEMPORARY TABLE data_to_keep LIKE table_with_dupes_in_it;

  2. Populate the temp table with just the records you want:

    INSERT INTO data_to_keep
    SELECT DISTINCT * FROM table_with_dupes_in_it;
    

  3. Empty the table:

    TRUNCATE TABLE table_with_dupes_in_it;
    

  4. Return the data from the temp table to the original table:

    INSERT INTO table_with_dupes_in_it
    SELECT * FROM data_to_keep;
    

  5. Clean up:

    DROP TEMPORARY TABLE data_to_keep;
    

Be advised that this can eat up a huge amount of memory and/or storage if the table in question is a big one. If it's a big table, I'd be inclined to use a real table instead of a temporary table so as not to eat up excessive amounts of memory on your DB server.
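As a rough sketch of the real-table variant mentioned above, the copy-back step can be avoided by swapping table names with RENAME TABLE. This assumes the question's my_table; the name my_table_old is hypothetical and only used for the swap. It still deduplicates whole rows with DISTINCT, exactly as the steps above do.

```sql
-- Real (non-temporary) table with the same structure as the original.
CREATE TABLE data_to_keep LIKE my_table;

-- Copy only the distinct rows.
INSERT INTO data_to_keep
SELECT DISTINCT * FROM my_table;

-- Swap the two tables in one statement, then drop the old data.
-- my_table_old is a hypothetical name chosen for this sketch.
RENAME TABLE my_table TO my_table_old,
             data_to_keep TO my_table;
DROP TABLE my_table_old;
```

Keeping my_table_old around until the result has been checked is a reasonable precaution before running the final DROP.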

Edited to add:

If you're just worried about partial dupes (rows where only some of the data is identical to previously entered data), then you will want to use GROUP BY. When you use GROUP BY, you can limit MySQL to return only one row that contains the given data instead of all of them.

    SELECT *
    FROM table
    GROUP BY column_name
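Applied to the question, the GROUP BY idea might look like the sketch below. It assumes the question's my_table with id as the primary key, and it assumes that keeping the row with the smallest id for each phone is acceptable; the alias keep_id is made up for illustration. Selecting MIN(id) per phone and joining back also avoids relying on SELECT * with GROUP BY, which MySQL rejects when the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5).

```sql
-- Assumes the question's my_table(id, name, address, phone).
CREATE TEMPORARY TABLE data_to_keep LIKE my_table;

-- Pick one id per phone (the smallest), then copy only those rows.
INSERT INTO data_to_keep (id, name, address, phone)
SELECT t.id, t.name, t.address, t.phone
FROM my_table AS t
JOIN (
    SELECT MIN(id) AS keep_id   -- keep_id is a hypothetical alias
    FROM my_table
    GROUP BY phone
) AS k ON k.keep_id = t.id;

-- Replace the original contents with the de-duplicated rows.
TRUNCATE TABLE my_table;
INSERT INTO my_table SELECT * FROM data_to_keep;
DROP TEMPORARY TABLE data_to_keep;
```

With the four sample rows from the question, this keeps ids 1, 2, and 4 and drops id 3, which shares phone 202020 with id 1.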
    

You should also consider using UNIQUE indexes on the columns that should not hold duplicate data; this will prevent users from inserting duplicate data in the first place.
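A minimal sketch of that suggestion for the question's table follows. The index name uniq_phone and the inserted row are hypothetical, and the ALTER TABLE only succeeds once the existing duplicate phones have been removed.

```sql
-- Run this after de-duplicating; it fails with a duplicate-entry error
-- if two existing rows still share a phone.
ALTER TABLE my_table ADD UNIQUE INDEX uniq_phone (phone);

-- A plain INSERT of an existing phone now raises a duplicate-key error.
-- INSERT IGNORE skips the offending row and reports a warning instead.
INSERT IGNORE INTO my_table (id, name, address, phone)
VALUES (5, 'Maria', 'SPAIN', '202020');  -- hypothetical row, not from the question
```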
