Find duplicate columns that have values that end with the same last four digits - SQL
Problem description
I need a query that searches all records in a table and returns only the ones where 2 or more values end with the same 4 digits. I don't have specific four digits to give you; I just need it to find everything where more than one value ends the same way. For example, the table has:
person_number------------name
==============================
1234567------------------john
00004567-----------------mark
9999999------------------mike
0009999------------------lidia
10101010-----------------alex
23232323-----------------matt
98748484-----------------steve
I want it to return only:
person_number------------name
==============================
1234567------------------john
00004567-----------------mark
9999999------------------mike
0009999------------------lidia
You see, I want it to return duplicates that end with the same last 4 digits, but I don't have specific digits to give you, so I can't use something like '%9999'. It also needs to be sorted properly so that matching rows are next to each other. I need to find duplicates and delete one of them, but I don't know which one until I actually see all the other values in the other columns and pick which one to delete manually.
Thanks!
Recommended answer
I just want to extend Sick's answer.
You say you would like to choose which one to eliminate. But you can also put a CASE expression in the ORDER BY clause to decide which row you want to eliminate.
In this case I order by "name", so you can delete all rows with rn > 1 and keep the first name.
SqlFiddleDemo
select "person_number", "name", rn, zero_count
from
(
    select "person_number",
           "name",
           substr("person_number", 1, 1) as first_char,
           -- number of rows sharing the same last four characters
           count(1) over (partition by substr("person_number", -4)) as Cnt,
           -- number of rows in the group that start with '0'
           SUM(case
                 when substr("person_number", 1, 1) = '0' then 1
                 else 0
               end) over (partition by substr("person_number", -4)) as zero_count,
           -- position of the row inside its group, ordered by name
           row_number() over (partition by substr("person_number", -4) order by "name") as rn
    from person
)
where Cnt > 1
  and zero_count > 0
order by substr("person_number", -4)
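The rn column can also drive the cleanup itself. The following is only a sketch, assuming Oracle syntax (ROWID and a negative SUBSTR position), the same person table, and that keeping the first row by "name" in each group is acceptable. It does not apply the zero_count filter, and if you still need to inspect the other columns by hand, run the SELECT first and adjust the ORDER BY before deleting anything.

```sql
-- Sketch only: assumes Oracle and the person table used in the query above.
-- Deletes every row except the first (ordered by "name") in each group of
-- person_number values that share the same last four characters.
DELETE FROM person
WHERE rowid IN (
    SELECT rid
    FROM (
        SELECT rowid AS rid,
               row_number() OVER (PARTITION BY substr("person_number", -4)
                                  ORDER BY "name") AS rn
        FROM person
    )
    WHERE rn > 1
);
```

Groups with a single row only ever get rn = 1, so values with no match are left untouched.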
I increased the data sample:
- It now includes a field zero_count that counts how many rows start with 0 in each group.
- A case where both rows have the same 4 characters at the end and also start with 0 (ZERO_COUNT = 2).
- A case where a row without a match also starts with 0.