删除表中的所有重复行（如果行重复一次，则应删除这两行） [英] delete all duplicate rows in a table (if a row is duplicated once, both rows should be deleted)

查看：167 发布时间：2019/6/13 2:32:27 SQL-server-2005 SQL SQL-Server

本文介绍了删除表中的所有重复行（如果行重复一次，则应删除这两行）的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我的情况是：

我有两个表table1和table2。

两者都是相同的，除了table2没有主键或约束。

我需要将数据从table2复制到table1。

table1不接受重复的行。

table1中的主键基于5列col1，col2，col3，col4，col5。

table2有一堆重复的行。

我可以使用以下方法获取table2中的重复行：

My scenario is this:

I have two tables table1 and table2.
Both of them are identical except that table2 has no primary key or constraints.

I need to copy data from table2 to table1.

table1 does not accept duplicate rows.

The primary key in table1 is based on 5 columns col1, col2, col3, col4, col5.

table2 has a bunch of duplicated rows.

I am able to get the duplicated rows in table2 using:

SELECT count(*) as 'Occurance'col1,col2,col3,col4, col5
FROM table2
GROUP BY col1,col2,col3,col4, col5
HAVING count(*) > 1

我可以使用下面的查询删除一个副本：

And I am able to delete one duplicate using the query below:

WITH cte
AS (SELECT col1,col2,col3,col4, col5,
row_number() OVER(PARTITION BY col1,col2,col3,col4, col5 ORDER BY col1) AS [rn]
  FROM table2
)
DELETE cte WHERE [rn] <> 1

但我想要的是删除所有重复项找到了。

要说清楚有两行：

col1 = 1，col2 = 2，col3 = 3，col4 = 4 ，col5 = 5，col6 = TEXAS

col1 = 1，col2 = 2，col3 = 3，col4 = 4，col5 = 5，col6 =纽约

所以上面发布的第一个查询会给我一个结果：

col1 = 1，col2 = 2，col3 = 3，col4 = 4，col5 = 5，Occurance = 2

第二个查询将删除以上任何一个（即TEXAS或NEW YORK）。

我想删除TEXAS和纽约。

But what I want is to delete all the duplicates which are found.

To make it clear there are two rows:
col1 = 1, col2 = 2, col3 = 3, col4 = 4, col5 = 5, col6 = TEXAS
col1 = 1, col2 = 2, col3 = 3, col4 = 4, col5 = 5, col6 = NEW YORK

So the first query posted above will give me a result as:
col1 = 1, col2 = 2, col3 = 3, col4 = 4, col5 = 5, Occurance = 2

The second query will delete any one of the above (i.e. either TEXAS or NEW YORK).
I want to delete both TEXAS and NEW YORK.

推荐答案

我不确定为什么在将行复制到Table1之前需要删除行。

我建议从表2中获取DISTINCT行，然后将它们插入表1中

I am not sure why you need to delete rows before you copy them to Table1.

What I would suggest is get the DISTINCT rows from Table2 and then insert them in Table1

INSERT INTO Table1
SELECT DISTINCT Column1, Column2, Column3 Column4, Column5 FROM Table2

检查这个 DEMO [ ^ ]

您可以使用 SQL INSERT INTO SELECT Statement [ ^ ] with group by

check this DEMO[^]
you can use SQL INSERT INTO SELECT Statement[^] with group by

如果我理解你的要求。

需要删除table1上的主键约束。

If I understand your requirements correctly.
The primary key constraint on table1 needs to be dropped.

--data setup for testing
declare @table2 table(col1 int, col2 int, col3 int, col4 int, col5 int, col6 varchar(10));
declare @table1 table(col1 int, col2 int, col3 int, col4 int, col5 int, col6 varchar(10));
--values for @table2
insert into @table2 values(1,1,1,1,1,'TEXAS');
insert into @table2 values(1,1,1,1,1,'NEW YORK');
insert into @table2 values(1,1,1,1,1,'IOWA');
insert into @table2 values(1,1,1,1,2,'TEXAS');
insert into @table2 values(1,1,1,1,2,'NEW YORK');
insert into @table2 values(1,1,1,1,2,'IOWA');
insert into @table2 values(1,1,1,2,1,'TEXAS');
insert into @table2 values(1,1,1,2,1,'NEW YORK');
insert into @table2 values(1,1,1,2,1,'IOWA');
insert into @table2 values(1,1,2,1,1,'TEXAS');
insert into @table2 values(1,1,2,1,1,'NEW YORK');
insert into @table2 values(1,1,2,1,1,'IOWA');
insert into @table2 values(1,2,1,1,1,'TEXAS');
insert into @table2 values(1,2,1,1,1,'NEW YORK');
insert into @table2 values(1,2,1,1,1,'IOWA');
insert into @table2 values(2,1,1,1,1,'TEXAS');
insert into @table2 values(2,1,1,1,1,'NEW YORK');
insert into @table2 values(2,1,1,1,1,'IOWA');
--values for @table1
insert into @table1 values(1,1,1,1,2,'TEXAS');
insert into @table1 values(1,2,1,1,1,'TEXAS');

--insert into @table1 from @table2 but not exist in @table1
insert into @table1
select b.col1, b.col2, b.col3, b.col4, b.col5, b.col6
from @table2 b
where not exists(
    select a.col1, a.col2, a.col3, a.col4, a.col5, a.col6
    from @table1 a
    where a.col1 = b.col1
    and a.col2 = b.col2
    and a.col3 = b.col3
    and a.col4 = b.col4
    and a.col5 = b.col5
    and a.col6 = b.col6
);

这个可以在插入之前和之后运行

This one can be run before and after the insert

--remove from @table2 where already in @table1
with ToRemove as(
    select b.col1, b.col2, b.col3, b.col4, b.col5, b.col6
    from @table2 b
    where exists(
        select a.col1, a.col2, a.col3, a.col4, a.col5, a.col6
        from @table1 a
        where a.col1 = b.col1
        and a.col2 = b.col2
        and a.col3 = b.col3
        and a.col4 = b.col4
        and a.col5 = b.col5
        and a.col6 = b.col6
    )
)
delete from ToRemove;

我不知道CTE可以这种方式用于删除:)感谢那一个。

我建议你研究一下数据库中的一对多关系，也许会改变数据的存储方式。

读一下< a href =http://www.databaseprimer.com/pages/relationship_1tox/> http://www.databaseprimer.com/pages/relationship_1tox/ [ ^ ]

这篇关于删除表中的所有重复行（如果行重复一次，则应删除这两行）的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

删除表中的所有重复行（如果行重复一次，则应删除这两行） [英] delete all duplicate rows in a table (if a row is duplicated once, both rows should be deleted)

问题描述

推荐答案

相关文章

其他开发语言最新文章

热门教程

热门工具

登录关闭

删除表中的所有重复行（如果行重复一次，则应删除这两行） [英] delete all duplicate rows in a table (if a row is duplicated once, both rows should be deleted)

问题描述

推荐答案

相关文章

其他开发语言最新文章

热门教程

热门工具

登录 关闭

登录关闭