选择重复记录 [英] Select duplicate records
问题描述
我的查询如下
声明@tbl表(id int identity (1,1),sname varchar(10),标记int)
插入@tbl值('A',10),('B',10),('A',10 ),('B',40),('C',10),('D',15),('A',9)
选择row_number()结束(sname分区) ,标记顺序标记)sno,sname,来自@tbl的标记
我的结果是:
sno sname marks
1 A 9
1 A 10
2 A 10
1 B 10
1 B 40
1 C 10
1 D 15
但我只想要第二和第三行,因为这些是重复记录。我想要所有重复的记录而不管重复。
请建议
谢谢
我的尝试:
重复记录消除
鉴于sno
总是为1(因为你要按sname
进行分区 AND标记
)您可以将此简化为选择 distinct 1 as sno,sname,标记来自 @ tbl关键点在于它是您引入的实际上导致问题的row_number()。
但我怀疑你想要sno
是有意义的(如果我错了,请纠正我。
如果你想要它显示的数量然后使用选择 row_number() ( 分区 按 sname 订单 by sname,marks) as sno,sname,marks
FROM (选择 distinct 1 as sno,sname,标记来自 @tbl )A会给你
sno sname mark
1 A 9
2 A 10
1 B 10
2 B 40
1 C 10
1 D 15如果你想给每一行编号,那么完全删除分区
选择 row_number() over ( order by sname,marks) as sno,sname,marks
FROM ( select distinct sname,标记来自 @ tbl )A< pre lang =text> sno sname mark
1 A 9
2 A 10
3 B 10
4 B 40
5 C 10
6 D 15
------------------ OP OPMENT后编辑---------------
要获得重复的行,有两种方法可以尝试:
1.要获取sname和标记以及重复次数,请尝试使用选择 sname,marks,max(sno)作为 numberOfDups 来自
(< span class =code-keyword>选择 row_number()< span class =code-keyword> over ( partition by sname,marks order by sname,marks) as sno,sname ,标记
FROM @ tbl )A
其中 sno> 1
group by sname ,标记
2.要获得具有重复项的完整项目列表,请尝试以下操作:; < span class =code-keyword> with cte as
(
select row_number() over ( partition by sname,mark order by sname,marks) as sno,sname,标记
来自 @ tbl
)
选择 *
来自 cte
inner join (选择 sname,marks,max (sno) as numberOfDup 来自 cte 其中 sno> 1 group by sname,marks)B cte.sname = B.sname AND cte.marks = B.marks关键是在第二次查看CTE的子查询上的INNER JOIN。这不仅仅是一个简单的给我行sno> 1 - 连接确保虽然我们没有看到非重复的行,但我们做看所有有重复的行 - 即A 10,其中sno - 1 AND 2,而不仅仅是2
------------ ------在另一个OP评论之后编辑---------------
介绍必须采用的其他专栏帐户 - 连接需要很长时间才能开启条件
测试数据:声明 @ tbl table (id int identity ( 1 , 1 ) ,sname varchar ( 10 ),标记 int ,其他 int )
ins ert 进入 @ tbl 值(' A', 10 , 1 ),(' B', 10 , 1 ),(' < span class =code-string> A', 10 , 1 ),( ' B', 40 , 1 ),(' C', 10 , 2 ),(' D', 15 , 2 ),(' A', 9 , 1 ),(' A', 10 , 2 )我的原始解决方案将返回
1 A 10 2 3
2 A 10 1 3
3 A 10 1 3考虑到
其他$ c $时哪个不正确c>专栏。
我们可以使用校验和 [ ^ ]组合所有相关的列,例如
; cte1 as
(
< span class =code-keyword>选择校验和(sname,marks,other)< span class =code-keyword> as sno,sname,marks,other from @ tbl
)
选择 sname,标记,其他
来自 cte1
内部 join ( select 校验和(sname,marks,other) as sno 来自 @tbl group by 校验和(sname,marks,other) count(*)> 1 ) cte1.sno = A.sno结果
A 10 1
A 10 1
声明@tbl表(id int identity(1,1) ,sname varchar(10),标记int)
插入@tbl值('A',10),('B',10),('A',10),('B',40) ,('C',10),('D',15),('A',9)
select * from(
select row_number()over(sname分区,标记按标记排序) )Sno,sname,来自@tbl的标记),其中sno = 1
声明@tbl表(id int identity(1,1),sname varchar(10),标记int)
插入@tbl值('A',10),('B',10) ,('A',10),('B',40),('C',10),('D',15),('A',9)
; with cte as
(
select row_number()over(sname分区,标记按标记排序)sno,sname,来自@tbl的标记)
select * from cte where sno = 1
Hi,
I have a query below
declare @tbl table (id int identity(1,1),sname varchar(10),marks int)
insert into @tbl values('A',10),('B',10),('A',10),('B',40),('C',10),('D',15),('A',9)
select row_number()over(partition by sname,marks order by marks)sno,sname,marks from @tbl
my result is:
sno sname marks
1 A 9
1 A 10
2 A 10
1 B 10
1 B 40
1 C 10
1 D 15
But I want to only second and third rows as these are duplicate records. i want all duplicate records irrespective of repeatation.
Please suggest
Thanks
What I have tried:
duplicates records elimination
Given thatsno
is always going to be 1 (because you are partitioning bysname
ANDmarks
) you could simplify this toselect distinct 1 as sno,sname, marks from @tblThe key point being that it is your introduction of the row_number() that is actually causing the issue.
But I suspect you wantsno
to be something meaningful (correct me if I'm wrong)
If you want it to show the number of the mark per name then useselect row_number() over (partition by sname order by sname,marks) as sno, sname, marks FROM (select distinct 1 as sno,sname, marks from @tbl) Awhich will give you
sno sname mark 1 A 9 2 A 10 1 B 10 2 B 40 1 C 10 1 D 15If you want to give each row a number then remove the partition altogether
select row_number() over (order by sname,marks) as sno, sname, marks FROM (select distinct sname, marks from @tbl) Asno sname mark 1 A 9 2 A 10 3 B 10 4 B 40 5 C 10 6 D 15
------------------ EDIT AFTER OP COMMENT ---------------
To only get the lines that are duplicated there are two ways you could try:
1. To get the sname and marks and the number of duplications try thisselect sname, marks, max(sno) as numberOfDups from (select row_number() over (partition by sname, marks order by sname,marks) as sno, sname, marks FROM @tbl) A where sno > 1 group by sname, marks
2. To get the full list of items that have duplicates try this:;with cte as ( select row_number() over (partition by sname, marks order by sname,marks) as sno, sname, marks from @tbl ) select * from cte inner join (select sname, marks, max(sno) as numberOfDup from cte where sno > 1 group by sname, marks) B on cte.sname = B.sname AND cte.marks = B.marksThe key is the INNER JOIN on a sub-query looking at the CTE a second time. It's not just a simple "give me the rows where sno > 1" - the join ensures that while we don't see the non-duplicated rows we do see all the rows where there has been a duplicate - i.e. A 10 where sno - 1 AND 2, not just 2
------------------ EDIT AFTER ANOTHER OP COMMENT ---------------
Introducing other columns which must be taken into account - the joins take "much time for the on conditions
Test data:declare @tbl table (id int identity(1,1),sname varchar(10),marks int, other int) insert into @tbl values('A',10,1),('B',10,1),('A',10,1),('B',40,1),('C',10,2),('D',15,2),('A',9,1), ('A',10,2)My original solution would return
1 A 10 2 3 2 A 10 1 3 3 A 10 1 3Which is incorrect when taking into account the
other
column.
We can use Checksum [^] to "combine" all of the columns that are relevent e.g.
;with cte1 as ( select checksum(sname, marks, other) as sno, sname, marks, other from @tbl ) select sname, marks, other from cte1 inner join (select checksum(sname, marks, other) as sno from @tbl group by checksum(sname, marks, other) having count(*) > 1) A on cte1.sno = A.snoresults
A 10 1 A 10 1
declare @tbl table (id int identity(1,1),sname varchar(10),marks int) insert into @tbl values('A',10),('B',10),('A',10),('B',40),('C',10),('D',15),('A',9) select * from ( select row_number()over(partition by sname,marks order by marks)sno,sname,marks from @tbl) as a where sno=1
declare @tbl table (id int identity(1,1),sname varchar(10),marks int) insert into @tbl values('A',10),('B',10),('A',10),('B',40),('C',10),('D',15),('A',9) ;with cte as ( select row_number()over(partition by sname,marks order by marks)sno,sname,marks from @tbl) select * from cte where sno=1
这篇关于选择重复记录的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!