选择重复记录 [英] Select duplicate records

查看：98 发布时间：2019/6/7 18:45:14 T-SQL

本文介绍了选择重复记录的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我的查询如下

声明@tbl表（id int identity （1,1），sname varchar（10），标记int）

插入@tbl值（'A'，10），（'B'，10），（'A'，10 ），（'B'，40），（'C'，10），（'D'，15），（'A'，9）

选择row_number（）结束（sname分区），标记顺序标记）sno，sname，来自@tbl的标记

我的结果是：

sno sname marks

1 A 9

1 A 10

2 A 10

1 B 10

1 B 40

1 C 10

1 D 15

但我只想要第二和第三行，因为这些是重复记录。我想要所有重复的记录而不管重复。

请建议

谢谢

我的尝试：

重复记录消除

解决方案

鉴于 sno 总是为1（因为你要按 sname 进行分区 AND 标记）您可以将此简化为
 选择 distinct 1 as sno，sname，标记来自 @ tbl 
关键点在于它是您引入的实际上导致问题的row_number（）。

但我怀疑你想要 sno 是有意义的（如果我错了，请纠正我。

如果你想要它显示的数量然后使用
 选择 row_number（） （ 分区 按 sname 订单 by sname，marks） as sno，sname，marks 
 FROM （选择 distinct 1 as sno，sname，标记来自 @tbl ）A 
会给你
 sno sname mark 
 1 A 9 
 2 A 10 
 1 B 10 
 2 B 40 
 1 C 10 
 1 D 15 
如果你想给每一行编号，那么完全删除分区
 选择 row_number（） over （ order by sname，marks） as sno，sname，marks 
 FROM （ select distinct sname，标记来自 @ tbl ）A 
< pre lang =text> sno sname mark
1 A 9
2 A 10
3 B 10
4 B 40
5 C 10
6 D 15

------------------ OP OPMENT后编辑---------------

要获得重复的行，有两种方法可以尝试：

1.要获取sname和标记以及重复次数，请尝试使用
 选择 sname，marks，max（sno）作为 numberOfDups 来自 
（选择 row_number（） over （ partition by sname，marks order by sname，marks） as sno，sname ，标记
 FROM @ tbl ）A 
 其中 sno> 1 
 group by sname ，标记
2.要获得具有重复项的完整项目列表，请尝试以下操作：
; with cte as 
（
 select row_number（） over （ partition by sname，mark order by sname，marks） as sno，sname，标记
 来自 @ tbl 
） 
 选择 * 
 来自 cte 
 inner join （选择 sname，marks，max （sno） as numberOfDup 来自 cte 其中 sno> 1 group by sname，marks）B cte.sname = B.sname AND cte.marks = B.marks 
关键是在第二次查看CTE的子查询上的INNER JOIN。这不仅仅是一个简单的给我行sno> 1 - 连接确保虽然我们没有看到非重复的行，但我们做看所有有重复的行 - 即A 10，其中sno - 1 AND 2，而不仅仅是2

------------ ------在另一个OP评论之后编辑---------------

介绍必须采用的其他专栏帐户 - 连接需要很长时间才能开启条件

测试数据：
 声明 @ tbl table （id int identity （ 1 ， 1 ） ，sname varchar （ 10 ），标记 int ，其他 int ）
 ins ert 进入 @ tbl 值（' A'， 10 ， 1 ），（' B'， 10 ， 1 ），（' A'， 10 ， 1 ），（ ' B'， 40 ， 1 ），（' C'， 10 ， 2 ），（' D'， 15 ， 2 ），（' A'， 9 ， 1 ），（' A'， 10 ， 2 ）
我的原始解决方案将返回
 1 A 10 2 3 
 2 A 10 1 3 
 3 A 10 1 3 
考虑到其他专栏。我们可以使用校验和 [ ^ ]组合所有相关的列，例如
; cte1 as 
（
 选择校验和（sname，marks，other） as sno，sname，marks，other from @ tbl 
）
 选择 sname，标记，其他
 来自 cte1 
 内部 join （ select 校验和（sname，marks，other） as sno 来自 @tbl group by 校验和（sname，marks，other） count（*）> 1 ） cte1.sno = A.sno 
结果
 A 10 1 
 A 10 1

 
声明@tbl表（id int identity（1,1） ，sname varchar（10），标记int）
插入@tbl值（'A'，10），（'B'，10），（'A'，10），（'B'，40） ，（'C'，10），（'D'，15），（'A'，9）
 select * from（
 select row_number（）over（sname分区，标记按标记排序） ）Sno，sname，来自@tbl的标记），其中sno = 1 
 
 
 
 
 
 
 
声明@tbl表（id int identity（1,1），sname varchar（10），标记int）
插入@tbl值（'A'，10），（'B'，10） ，（'A'，10），（'B'，40），（'C'，10），（'D'，15），（'A'，9）
 
; with cte as 
（
 select row_number（）over（sname分区，标记按标记排序）sno，sname，来自@tbl的标记）
 select * from cte where sno = 1

 
 
Hi,



I have a query below



declare @tbl table (id int identity(1,1),sname varchar(10),marks int)

insert into @tbl values('A',10),('B',10),('A',10),('B',40),('C',10),('D',15),('A',9)

select row_number()over(partition by sname,marks order by marks)sno,sname,marks from @tbl



my result is:

sno	sname	marks

1	A	9

1	A	10

2	A	10

1	B	10

1	B	40

1	C	10

1	D	15



But I want to only second and third rows as these are duplicate records. i want all duplicate records irrespective of repeatation.



Please suggest



Thanks



What I have tried:



duplicates records elimination
 解决方案 
Given that sno is always going to be 1 (because you are partitioning by sname AND marks) you could simplify this to 
select distinct 1 as sno,sname, marks from @tbl
The key point being that it is your introduction of the row_number() that is actually causing the issue.



But I suspect you want sno to be something meaningful (correct me if I'm wrong)



If you want it to show the number of the mark per name then use
select row_number() over (partition by sname order by sname,marks) as sno, sname, marks
FROM (select distinct 1 as sno,sname, marks from @tbl) A
which will give you 
sno	sname	mark
1	A	9
2	A	10
1	B	10
2	B	40
1	C	10
1	D	15
If you want to give each row a number then remove the partition altogether 
select row_number() over (order by sname,marks) as sno, sname, marks
FROM (select distinct sname, marks from @tbl) A
sno	sname	mark
1	A	9
2	A	10
3	B	10
4	B	40
5	C	10
6	D	15




------------------ EDIT AFTER OP COMMENT ---------------



To only get the lines that are duplicated there are two ways you could try:



1. To get the sname and marks and the number of duplications try this 
select sname, marks, max(sno) as numberOfDups from 
(select row_number() over (partition by sname, marks order by sname,marks) as sno, sname, marks
FROM  @tbl) A
where sno > 1
group by sname, marks




2. To get the full list of items that have duplicates try this:
;with cte as
(
	select row_number() over (partition by sname, marks order by sname,marks) as sno, sname, marks
	from @tbl
)
select * 
from cte 
inner join (select sname, marks, max(sno) as numberOfDup from cte where sno > 1 group by sname, marks) B on cte.sname = B.sname AND cte.marks = B.marks
The key is the INNER JOIN on a sub-query looking at the CTE a second time. It's not just a simple "give me the rows where sno > 1" - the join ensures that while we don't see the non-duplicated rows we do see all the rows where there has been a duplicate - i.e. A 10 where sno - 1 AND 2, not just 2



------------------ EDIT AFTER ANOTHER OP COMMENT ---------------



Introducing other columns which must be taken into account - the joins take "much time for the on conditions



Test data:
declare @tbl table (id int identity(1,1),sname varchar(10),marks int, other int)
insert into @tbl values('A',10,1),('B',10,1),('A',10,1),('B',40,1),('C',10,2),('D',15,2),('A',9,1), ('A',10,2)
My original solution would return 
1	A	10	2	3
2	A	10	1	3
3	A	10	1	3
Which is incorrect when taking into account the other column.



We can use Checksum [^] to "combine" all of the columns that are relevent e.g.

;with cte1 as 
(
	select checksum(sname, marks, other) as sno, sname, marks, other from @tbl
)
select sname, marks, other
from cte1
inner join (select checksum(sname, marks, other) as sno from @tbl group by checksum(sname, marks, other) having count(*) > 1) A on cte1.sno = A.sno
results
A	10	1
A	10	1

declare @tbl table (id int identity(1,1),sname varchar(10),marks int)
insert into @tbl values('A',10),('B',10),('A',10),('B',40),('C',10),('D',15),('A',9)
select * from (
select row_number()over(partition by sname,marks order by marks)sno,sname,marks from @tbl) as a where sno=1






declare @tbl table (id int identity(1,1),sname varchar(10),marks int)
insert into @tbl values('A',10),('B',10),('A',10),('B',40),('C',10),('D',15),('A',9)

;with cte as 
(
select row_number()over(partition by sname,marks order by marks)sno,sname,marks from @tbl)
select * from cte where sno=1


                        这篇关于选择重复记录的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！


                    
                        查看全文

选择重复记录 [英] Select duplicate records

问题描述

相关文章

其他开发语言最新文章

热门教程

热门工具

登录关闭

选择重复记录 [英] Select duplicate records

问题描述

相关文章

其他开发语言最新文章

热门教程

热门工具

登录 关闭

登录关闭