SQL Server 2008:删除重复行 [英] SQL Server 2008: delete duplicate rows

查看:95
本文介绍了SQL Server 2008:删除重复行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的表格中有重复的行,如何根据单列的值删除?



例如

  uniqueid,col2,col3 ... 
1,john,simpson
2,sally,roberts
1,johnny,simpson

删除任何重复的uniqueIds
以获得

1,John,Simpson
2,Sally,Roberts
/ pre>

解决方案

您可以从cte中获取 DELETE

  WITH cte AS(SELECT *,ROW_NUMBER()OVER(PARTITION BY uniqueid ORDER BY col2)'RowRank'
FROM Table)
DELETE FROM cte
WHERE RowRank> 1

ROW_NUMBER()函数分配一个数字到每一行。 PARTITION BY 用于开始对该组中的每个项目进行编号,在这种情况下,每个值 uniqueid 将会开始编号为1,从那里上升。 ORDER BY 确定数字进入的顺序。由于每个 uniqueid 从1开始编号, code> ROW_NUMBER()大于1的重复 uniqueid



要了解 ROW_NUMBER()函数的工作原理,只需尝试一下:

 code> SELECT *,ROW_NUMBER()OVER(PARTITION BY uniqueid ORDER BY col2)'RowRank'
FROM Table
ORDER BY uniqueid

您可以调整 ROW_NUMBER()函数的逻辑,以调整要保留或删除的记录。例如,您可能希望以多个步骤执行此操作,首先删除名称相同但名称不同的记录,您可以将姓氏添加到 PARTITION BY

  WITH cte AS(SELECT *,ROW_NUMBER OVER(PARTITION BY uniqueid,col3 ORDER BY col2)'RowRank'
FROM Table)
DELETE FROM cte
WHERE RowRank> 1


I have duplicate rows in my table, how can I delete them based on a single column's value?

Eg

uniqueid, col2, col3 ...
1, john, simpson
2, sally, roberts
1, johnny, simpson

delete any duplicate uniqueIds
to get 

1, John, Simpson
2, Sally, Roberts

解决方案

You can DELETE from a cte:

WITH cte AS (SELECT *,ROW_NUMBER() OVER(PARTITION BY uniqueid ORDER BY col2)'RowRank'
             FROM Table)
DELETE FROM cte 
WHERE RowRank > 1

The ROW_NUMBER() function assigns a number to each row. PARTITION BY is used to start the numbering over for each item in that group, in this case each value of uniqueid will start numbering at 1 and go up from there. ORDER BY determines which order the numbers go in. Since each uniqueid gets numbered starting at 1, any record with a ROW_NUMBER() greater than 1 has a duplicate uniqueid

To get an understanding of how the ROW_NUMBER() function works, just try it out:

SELECT *,ROW_NUMBER() OVER(PARTITION BY uniqueid ORDER BY col2)'RowRank'
FROM Table
ORDER BY uniqueid

You can adjust the logic of the ROW_NUMBER() function to adjust which record you'll keep or remove.

For instance, perhaps you'd like to do this in multiple steps, first deleting records with the same last name but different first names, you could add last name to the PARTITION BY:

WITH cte AS (SELECT *,ROW_NUMBER() OVER(PARTITION BY uniqueid, col3 ORDER BY col2)'RowRank'
             FROM Table)
DELETE FROM cte 
WHERE RowRank > 1

这篇关于SQL Server 2008:删除重复行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆