Delete duplicate rows from table with no unique key

Views: 158 | Published: 2017/7/20 23:59:11 | Tags: sql, postgresql, duplicates, duplicate-removal


Problem description

How do I delete duplicate rows in a Postgres 9 table? The rows are complete duplicates on every field, and there is no individual field that could be used as a unique key, so I can't just GROUP BY columns and use a NOT IN statement.

I'm looking for a single SQL statement, not a solution that requires me to create a temporary table and insert records into it. I know how to do that, but it requires more work to fit into my automated process.

Table definition:

jthinksearch=> \d releases_labels;
Unlogged table "discogs.releases_labels"
   Column   |  Type   | Modifiers
------------+---------+-----------
 label      | text    |
 release_id | integer |
 catno      | text    |
Indexes:
    "releases_labels_catno_idx" btree (catno)
    "releases_labels_name_idx" btree (label)
Foreign-key constraints:
    "foreign_did" FOREIGN KEY (release_id) REFERENCES release(id)

Sample data:

jthinksearch=> select * from releases_labels  where release_id=6155;
    label     | release_id |   catno
--------------+------------+------------
 Warp Records |       6155 | WAP 39 CDR
 Warp Records |       6155 | WAP 39 CDR

Solution

If you can afford to rewrite the whole table, this is probably the simplest approach:

WITH Deleted AS (
  DELETE FROM discogs.releases_labels
  RETURNING *
)
INSERT INTO discogs.releases_labels
SELECT DISTINCT * FROM Deleted
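
Before rewriting the table, it may be worth checking how many rows the rewrite would drop. The query below is a minimal sketch that compares the total row count with the count of distinct rows; the column aliases total_rows and distinct_rows and the subquery alias d are illustrative names, not part of the original answer.

```sql
-- Compare the number of rows with the number of distinct rows.
-- The difference is how many duplicate rows the rewrite would remove.
SELECT
  (SELECT count(*) FROM discogs.releases_labels) AS total_rows,
  (SELECT count(*)
     FROM (SELECT DISTINCT * FROM discogs.releases_labels) AS d) AS distinct_rows;
```

If the two counts are equal, the table has no fully duplicated rows and the rewrite is unnecessary.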

If you need to specifically target the duplicated records, you can make use of the internal ctid field, which uniquely identifies a row:

DELETE FROM discogs.releases_labels
WHERE ctid NOT IN (
  SELECT MIN(ctid)
  FROM discogs.releases_labels
  GROUP BY label, release_id, catno
)
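
To preview which rows are affected, a grouped count over the same three columns lists every duplicated combination. This is a sketch for inspection only and deletes nothing; the alias copies is an illustrative name.

```sql
-- List each (label, release_id, catno) combination that occurs more than once,
-- together with the number of copies currently in the table.
SELECT label, release_id, catno, count(*) AS copies
FROM discogs.releases_labels
GROUP BY label, release_id, catno
HAVING count(*) > 1;
```

For each combination listed, the DELETE above keeps one row and removes the other copies.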

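Be very careful with ctid: it is the physical location of the row version, and it changes when the row is updated or moved by VACUUM FULL, so it should not be stored or reused as a long-term row identifier. But you can rely on it staying the same within the scope of a single statement.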
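
Because ctid is only read and used inside one DELETE statement, the targeted delete can also be wrapped in a transaction so the result can be inspected before it becomes permanent. The following is a sketch of an alternative formulation, not the original answer's query: it assumes a PostgreSQL version with window functions, and the names numbered and rn are illustrative.

```sql
BEGIN;

-- Number the rows inside each group of identical values and delete
-- every row except the first one in each group.
DELETE FROM discogs.releases_labels
WHERE ctid IN (
  SELECT ctid
  FROM (
    SELECT ctid,
           row_number() OVER (
             PARTITION BY label, release_id, catno
             ORDER BY ctid
           ) AS rn
    FROM discogs.releases_labels
  ) AS numbered
  WHERE rn > 1
);

-- Sanity check: this should return no rows if the delete worked.
SELECT label, release_id, catno, count(*)
FROM discogs.releases_labels
GROUP BY label, release_id, catno
HAVING count(*) > 1;

COMMIT;  -- or ROLLBACK if the check shows something unexpected
```

The DELETE is still a single statement, so the ctid values it compares are consistent; the transaction only adds the option to back out.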
