如何在SQL Server中选择顺序重复项 [英] How to select sequential duplicates in SQL Server
问题描述
我想从SQL Server表中选择重复的条目,但仅当id连续时.
I would like to select duplicate entries from a SQL Server table, but only if the id is consecutive.
我一直在尝试根据自己的需要修改此答案,但我无法使其正常工作.
I have been trying to twist this answer to my needs, but I can't get it to work.
上面的答案是针对Oracle的,但是我看到SQL Server也具有lead
和lag
函数.
The above answer is for Oracle, but I see that SQL Server also has lead
and lag
functions.
此外,我认为上面的答案在重复项旁边放置*
,但是我只想选择重复项.
Also, I think that the answer above puts a *
next to duplicates, but I only want to select the duplicates.
select
id, companyName,
case
when companyName in (prev, next)
then '*'
end match,
prev,
next
from
(select
id,
companyName,
lag(companyName, 1) over (order by id) prev,
lead(companyName, 1) over (order by id) next
from
companies)
order by
id;
示例:
因此,从该数据集中:
id companyName
-------------------
1 dogs ltd
2 cats ltd
3 pigs ltd
4 pigs ltd
5 cats ltd
6 cats ltd
7 dogs ltd
8 pigs ltd
我要选择:
id companyName
-------------------
3 pigs ltd
4 pigs ltd
5 cats ltd
6 cats ltd
更新
我时不时地对我得到的答案的数量和质量感到惊讶.这是那些时代之一.我没有判断一个答案比另一个答案更好的专业知识,所以我选择了SqlZim,因为这是我看到的第一个可行答案.但是很高兴看到不同的方法.尤其是仅在一个小时前,我想知道这是否可能?".
Update
Every now and again I am taken aback by the quantity and quality of answers I get on SO. This is one of those times. I don't have the level of expertise to judge one answer as being better than another, so I've gone for SqlZim as this was the first working answer I saw. But it's great to see the different approaches. Especially when only an hour ago I was wondering "is this even possible?".
推荐答案
这是一个间隙和孤岛样式的问题,但是我们在最里面的子查询中使用了id
和row_number()
,而不是使用两个row_numbers()
.后跟count() over()
以获得每个grp
的计数,最后返回带有cnt > 1
的计数.
This is a gaps and islands style problem, but instead of using two row_numbers()
, we use the id
and row_number()
in the innermost subquery. Followed by count() over()
to get the count per grp
, and finally return those with a cnt > 1
.
select id, companyname
from (
select
id
, companyName
, grp
, cnt = count(*) over (partition by companyname, grp)
from (
select *
, grp = id - row_number() over (partition by companyname order by id)
from
companies
) islands
) d
where cnt > 1
order by id
rextester演示: http://rextester.com/ACP73683
rextester demo: http://rextester.com/ACP73683
返回:
+----+-------------+
| id | companyname |
+----+-------------+
| 3 | pigs ltd |
| 4 | pigs ltd |
| 5 | cats ltd |
| 6 | cats ltd |
+----+-------------+
这篇关于如何在SQL Server中选择顺序重复项的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!