Removing duplicates with unique index
Question
I inserted rows from one table into another, believing I had created a unique index on fields A, B, C, D to prevent duplicates. However, I had somehow created only a normal index on them, so duplicates got inserted. It is a 20 million record table.
If I change my existing index from normal to unique, or simply add a new unique index on A, B, C, D, will the duplicates be removed, or will adding the index fail because duplicate records exist? I'd test it, but it is 30 mil records and I want neither to mess the table up nor to duplicate it.
Recommended answer
If you have duplicates in your table and you use
ALTER TABLE mytable ADD UNIQUE INDEX myindex (A, B, C, D);
the query will fail with Error 1062 (duplicate key).
But if you use IGNORE
ALTER IGNORE TABLE mytable ADD UNIQUE INDEX myindex (A, B, C, D);
the duplicates will be removed. But the documentation doesn't specify which row will be kept:
IGNORE is a MySQL extension to standard SQL. It controls how ALTER TABLE works if there are duplicates on unique keys in the new table or if warnings occur when strict mode is enabled. If IGNORE is not specified, the copy is aborted and rolled back if duplicate-key errors occur. If IGNORE is specified, only one row is used of rows with duplicates on a unique key. The other conflicting rows are deleted. Incorrect values are truncated to the closest matching acceptable value.

As of MySQL 5.7.4, the IGNORE clause for ALTER TABLE is removed and its use produces an error.
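Since the documentation does not say which row survives, it may help to look at the duplicate groups before changing anything. The query below is a minimal sketch that reuses the table and column names from this answer (mytable, A, B, C, D); the aliases copies and duplicate_groups are only illustrative.

```sql
-- List each (A, B, C, D) combination that occurs more than once,
-- together with the number of copies.
SELECT A, B, C, D, COUNT(*) AS copies
FROM mytable
GROUP BY A, B, C, D
HAVING COUNT(*) > 1;

-- Count how many such duplicate groups exist in total.
SELECT COUNT(*) AS duplicate_groups
FROM (
    SELECT A, B, C, D
    FROM mytable
    GROUP BY A, B, C, D
    HAVING COUNT(*) > 1
) AS dup;
```

On a table with tens of millions of rows these queries can take a while, although the existing normal index on A, B, C, D may help with the grouping.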
If your version is 5.7.4 or greater, you can:
- Copy the data into a temporary table (it doesn't need to be technically temporary).
- Truncate the original table.
- Create the UNIQUE INDEX.
- And copy the data back with INSERT IGNORE (which is still available).
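-- Copy every row, duplicates included, into a holding table.
CREATE TABLE tmp_data SELECT * FROM mytable;
-- Empty the original table so the unique index can be created.
TRUNCATE TABLE mytable;
ALTER TABLE mytable ADD UNIQUE INDEX myindex (A, B, C, D);
-- Copy the rows back; rows that duplicate (A, B, C, D) are skipped.
INSERT IGNORE INTO mytable SELECT * FROM tmp_data;
DROP TABLE tmp_data;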
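Before running the final DROP TABLE, it may be worth checking that the copy back behaved as expected. The following is a sketch, not part of the original answer; the aliases rows_before, rows_after and distinct_keys are hypothetical. It assumes none of A, B, C, D contain NULL, because a unique index allows several rows with NULL while DISTINCT treats NULLs as equal.

```sql
-- Rows held in the backup copy (duplicates included).
SELECT COUNT(*) AS rows_before FROM tmp_data;

-- Rows that made it back into the de-duplicated table.
SELECT COUNT(*) AS rows_after FROM mytable;

-- Distinct (A, B, C, D) combinations in the backup copy.
-- Under the no-NULL assumption, this should match rows_after.
SELECT COUNT(*) AS distinct_keys
FROM (SELECT DISTINCT A, B, C, D FROM tmp_data) AS t;
```

If rows_after and distinct_keys disagree, keeping tmp_data around leaves room to investigate before anything is lost.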
If you use the IGNORE modifier, errors that occur while executing the INSERT statement are ignored. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row is discarded and no error occurs. Ignored errors generate warnings instead.
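The behaviour described in the quote can be tried safely on a small scratch table first. The sketch below uses a hypothetical table demo and index demo_idx, which are not part of the original question.

```sql
-- Scratch table with the same kind of four-column unique index.
CREATE TABLE demo (
    A INT NOT NULL,
    B INT NOT NULL,
    C INT NOT NULL,
    D INT NOT NULL,
    UNIQUE INDEX demo_idx (A, B, C, D)
);

-- The second row duplicates the first on (A, B, C, D) and is discarded.
INSERT IGNORE INTO demo (A, B, C, D)
VALUES (1, 2, 3, 4), (1, 2, 3, 4), (5, 6, 7, 8);

-- The discarded row is reported as a warning (code 1062), not an error.
SHOW WARNINGS;

-- Two rows remain: (1, 2, 3, 4) and (5, 6, 7, 8).
SELECT * FROM demo;

DROP TABLE demo;
```

Running the same INSERT without IGNORE should abort the statement with the duplicate-key error instead.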
See also: INSERT ... SELECT Syntax and Comparison of the IGNORE Keyword and Strict SQL Mode