SQL:删除重复项 [英] SQL: Remove duplicates
问题描述
如何从以下列方式设置的表中删除重复项?
How do I remove duplicates from a table that is set up in the following way?
unique_ID | worker_ID | date | type_ID
一个工作人员可以有多个与之关联的 type_ID,我想删除任何重复的类型.如果有重复,我想删除最新条目的类型.
A worker can have multiple type_ID's associated with them and I want to remove any duplicate types. If there is a duplicate, I want to remove the type with the most recent entry.
推荐答案
窗口函数 row_number() 的教科书候选:
A textbook candidate for the window function row_number():
;WITH x AS (
SELECT unique_ID
,row_number() OVER (PARTITION BY worker_ID,type_ID ORDER BY date) AS rn
FROM tbl
)
DELETE FROM tbl
FROM x
WHERE tbl.unique_ID = x.unique_ID
AND x.rn > 1
这也解决了 (worker_ID,type_ID)
上的一组欺骗共享相同 date
的情况.
请参阅简化的 关于 data.SE 的演示.
This also takes care of the situation where a set of dupes on (worker_ID,type_ID)
shares the same date
.
See the simplified demo on data.SE.
事实证明,这可以简化:在 SQL Server 中,您可以直接从 CTE 中删除:
Turns out, this can be simplified: In SQL Server you can delete from the CTE directly:
;WITH x AS (
SELECT unique_ID
,row_number() OVER (PARTITION BY worker_ID,type_ID ORDER BY date) AS rn
FROM tbl
)
DELETE x
WHERE rn > 1
这篇关于SQL:删除重复项的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!