Deleting Duplicate Rows in a MySQL Database
Question
I have the following table:
my_table [id, name, address, phone]
It has a lot of entries, and I would like to delete the duplicate data: any row whose phone duplicates another row's phone should be deleted.
Here is my attempt, but it does not work.
My SQL file contains:
CREATE TABLE `my_table` (
`id` int(10) NOT NULL default '0',
`name` varchar(255) NOT NULL default '',
`address` varchar(255) NOT NULL default '',
`phone` varchar(255) NOT NULL default '',
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `my_table` VALUES (1, 'Albert', 'EGYPT', '202020');
INSERT INTO `my_table` VALUES (2, 'John', 'USA', '984731');
INSERT INTO `my_table` VALUES (3, 'Steve', 'Romabia', '202020');
INSERT INTO `my_table` VALUES (4, 'Albert', 'EGYPT', '343354');
Clearly, the rows with id 1 and 3 have the same phone number, so the duplicate should be removed, leaving:
INSERT INTO `my_table` VALUES (1, 'Albert', 'EGYPT', '202020');
INSERT INTO `my_table` VALUES (2, 'John', 'USA', '984731');
INSERT INTO `my_table` VALUES (3, 'Albert', 'EGYPT', '343354');
My attempt
I added the following to the SQL file:
1- Create a new table to hold the distinct rows
CREATE TABLE my_temp(id VARCHAR(10), name VARCHAR(255), address VARCHAR(255), phone VARCHAR(255));
INSERT INTO my_temp(id,name,address,phone) SELECT DISTINCT id,name,address,phone FROM my_table;
2- Delete the entries from the real table my_table
DELETE FROM my_table;
3- Copy the entries back from my_temp to the real my_table
INSERT INTO my_table(id,name,address,phone) SELECT id,name,address,phone FROM my_temp;
4- Drop the now-unneeded table my_temp
DROP TABLE my_temp;
My problem now
It still gives me the same rows:
INSERT INTO `my_table` VALUES (1, 'Albert', 'EGYPT', '202020');
INSERT INTO `my_table` VALUES (2, 'John', 'USA', '984731');
INSERT INTO `my_table` VALUES (3, 'Steve', 'Romabia', '202020');
INSERT INTO `my_table` VALUES (4, 'Albert', 'EGYPT', '343354');
because DISTINCT does not treat these rows as duplicates: they differ in id, name, and address.
So how can I adjust my approach so that it deletes a row whenever its phone is duplicated, regardless of whether id, name, and address differ?
Note
I have tried changing this part:
INSERT INTO my_temp(id,name,address,phone) SELECT DISTINCT phone FROM my_table;
but it inserts the following into the my_temp table:
INSERT INTO `my_table` VALUES (1, 'null', 'null', '202020');
INSERT INTO `my_table` VALUES (2, 'null', 'null', '984731');
INSERT INTO `my_table` VALUES (3, 'null', 'null', '343354');
so I won't be able to get the data back into my_table.
Recommended answer
I would do it like this:
Create a temporary table from your existing table:
CREATE TEMPORARY TABLE data_to_keep LIKE table_with_dupes_in_it;
Populate the temporary table with just the records you want:
INSERT INTO data_to_keep
SELECT DISTINCT * FROM table_with_dupes_in_it;
Empty the original table:
TRUNCATE TABLE table_with_dupes_in_it;
Return the data from the temporary table to the original table:
INSERT INTO table_with_dupes_in_it
SELECT * FROM data_to_keep;
Clean up:
DROP TEMPORARY TABLE data_to_keep;
Be advised that this can use a large amount of memory and/or storage if the table in question is big. For a big table I'd be inclined to use a real table instead of a temporary one, so as not to consume excessive memory on your DB server.
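For the big-table case, the same steps can be sketched with a regular table in place of the temporary one. The table names follow the answer; the name table_with_dupes_old and the RENAME TABLE swap are assumptions added here for illustration (the answer itself truncates and copies back), so treat this as one possible variation rather than the answer's method.

```sql
-- Stage the de-duplicated rows in a regular (non-temporary) table.
CREATE TABLE data_to_keep LIKE table_with_dupes_in_it;

INSERT INTO data_to_keep
SELECT DISTINCT * FROM table_with_dupes_in_it;

-- Assumption: swap the two tables instead of TRUNCATE + copy back.
-- table_with_dupes_old is a hypothetical name for the retired copy.
RENAME TABLE table_with_dupes_in_it TO table_with_dupes_old,
             data_to_keep TO table_with_dupes_in_it;

-- Only after checking the new contents:
DROP TABLE table_with_dupes_old;
```

Swapping avoids copying the rows a second time, and the old table stays available until it is dropped. Whether that suits a given schema (for example one with triggers or foreign keys on the table) would need to be checked.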
Edited to add:
If you're only worried about partial dupes (rows where only some of the data is identical to previously entered data), then you will want to use GROUP BY. With GROUP BY, you can make MySQL return only one row for each value of the given column instead of all of the rows that contain it.
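-- TABLE is a reserved word, so the placeholder uses the name from the steps above.
-- With ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5), this form is
-- rejected unless the other columns are functionally dependent on column_name.
SELECT *
FROM table_with_dupes_in_it
GROUP BY column_name;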
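Applied to the question's my_table, the temporary-table steps and the GROUP BY idea might be combined as in the sketch below. It assumes that the row to keep for each phone is the one with the lowest id (the question does not say which copy should survive), and it joins on MIN(id) rather than using SELECT * with GROUP BY, so that it should also run with ONLY_FULL_GROUP_BY enabled. The alias keep_ids is a hypothetical name.

```sql
-- Optional check: which phone values occur more than once?
SELECT phone, COUNT(*) AS occurrences
FROM my_table
GROUP BY phone
HAVING COUNT(*) > 1;

CREATE TEMPORARY TABLE data_to_keep LIKE my_table;

-- Keep one row per phone: the row with the lowest id (an assumption).
INSERT INTO data_to_keep
SELECT t.*
FROM my_table AS t
JOIN (
    SELECT MIN(id) AS id
    FROM my_table
    GROUP BY phone
) AS keep_ids ON t.id = keep_ids.id;

TRUNCATE TABLE my_table;

INSERT INTO my_table
SELECT * FROM data_to_keep;

DROP TEMPORARY TABLE data_to_keep;
```

With the four sample rows from the question, this keeps the rows with id 1, 2 and 4 and drops id 3. The ids are not renumbered, so the last row keeps id 4.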
You should also consider using UNIQUE indexes on the columns that must not hold duplicate data; this will prevent users from inserting duplicate data in the first place.
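A minimal sketch of that suggestion for the question's my_table follows. The index name uq_my_table_phone and the inserted row with id 5 are made up for illustration. The index can only be created once the existing duplicates are gone; otherwise MySQL reports a duplicate-entry error.

```sql
-- Run after the existing duplicates have been removed.
ALTER TABLE my_table
    ADD UNIQUE INDEX uq_my_table_phone (phone);

-- A plain INSERT of an already-present phone now fails with a duplicate-key error.
-- INSERT IGNORE skips the offending row with a warning instead:
INSERT IGNORE INTO my_table VALUES (5, 'Steve', 'Romabia', '202020');
```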