Find duplicate columns that have values that end with the same last four digits - SQL


Problem description

I need a query that searches all records in a table and returns only the ones where two or more values end with the same four digits. I don't have specific digits to search for; I just need it to find every value whose ending is shared with at least one other value. For example, the table has:

person_number------------name
==============================
1234567------------------john 
00004567-----------------mark
9999999------------------mike
0009999------------------lidia
10101010-----------------alex
23232323-----------------matt
98748484-----------------steve

I want it to return only:

person_number------------name
==============================
1234567------------------john 
00004567-----------------mark
9999999------------------mike
0009999------------------lidia

I want it to return the duplicates that end with the same last four digits, but since I don't have specific digits I can't use something like '%9999'. The result also needs to be sorted so that matching rows sit next to each other. I need to find the duplicates and delete one row of each pair, but I won't know which one until I see the values in the other columns and pick the row to delete manually.

Thanks!

Recommended answer

I just want to extend Sick's answer.

You say you would like to choose which row to eliminate by hand. But you can also put a CASE expression in the ORDER BY clause of the ranking so that the row you want to eliminate is always ranked last in its group.
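As an illustration of the CASE idea, here is a minimal sketch that runs against the question's sample rows in an in-memory SQLite database through Python's sqlite3 module. The rule "numbers starting with '0' are the ones to discard" is only an assumption made for the example, SQLite stands in for whatever database you actually use, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    ("1234567", "john"),
    ("00004567", "mark"),
    ("9999999", "mike"),
    ("0009999", "lidia"),
    ("10101010", "alex"),
    ("23232323", "matt"),
    ("98748484", "steve"),
]

# Assumption for this sketch: a number starting with '0' is the copy to discard,
# so the CASE expression sorts those rows last inside each group.
QUERY = """
select person_number, name, rn
from (
  select person_number,
         name,
         count(*) over (partition by substr(person_number, -4)) as cnt,
         row_number() over (
           partition by substr(person_number, -4)
           order by case when substr(person_number, 1, 1) = '0' then 1 else 0 end,
                    name
         ) as rn
  from person
)
where cnt > 1
order by substr(person_number, -4), rn
"""


def ranked_duplicates():
    """Return (person_number, name, rn) for every row in a duplicate group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table person (person_number text, name text)")
    conn.executemany("insert into person values (?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in ranked_duplicates():
        print(row)
```

With this ordering the row that does not start with '0' gets rn = 1 in each group, so "mike" is ranked ahead of "lidia" even though "lidia" sorts first by name.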

In the query below I order by "name", so you can delete every row with rn > 1 and keep the first name in each group.

SqlFiddleDemo

-- Group rows by the last four characters of "person_number" and rank them by "name".
select "person_number", "name", rn, zero_count
from
(
  select "person_number",
         "name",
         -- number of rows sharing the same last four characters
         count(1) over (partition by substr("person_number", -4)) as cnt,
         -- number of rows in the group that start with '0'
         sum(case
               when substr("person_number", 1, 1) = '0' then 1
               else 0
             end) over (partition by substr("person_number", -4)) as zero_count,
         -- position inside the group; rn = 1 is the row to keep
         row_number() over (partition by substr("person_number", -4) order by "name") as rn
  from person
)
where cnt > 1
  and zero_count > 0
order by substr("person_number", -4), rn
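Once the rows have been reviewed, the "delete everything with rn > 1" step could look roughly like the sketch below. It is not part of the original answer: it uses SQLite through Python's sqlite3 module, identifies rows by SQLite's implicit rowid (an assumption; a real table would normally use its primary key), and leaves out the zero_count filter to keep the example short.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    ("1234567", "john"),
    ("00004567", "mark"),
    ("9999999", "mike"),
    ("0009999", "lidia"),
    ("10101010", "alex"),
    ("23232323", "matt"),
    ("98748484", "steve"),
]

# Delete every row ranked after the first one in its last-four-characters group.
# rowid is SQLite's implicit row identifier (assumption: no explicit key exists).
DELETE_SQL = """
delete from person
where rowid in (
  select rid
  from (
    select rowid as rid,
           row_number() over (
             partition by substr(person_number, -4)
             order by name
           ) as rn
    from person
  )
  where rn > 1
)
"""


def remaining_names_after_delete():
    """Load the sample rows, remove the rn > 1 rows, and return the surviving names."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table person (person_number text, name text)")
    conn.executemany("insert into person values (?, ?)", ROWS)
    conn.execute(DELETE_SQL)
    names = [r[0] for r in conn.execute("select name from person order by name")]
    conn.close()
    return names


if __name__ == "__main__":
    print(remaining_names_after_delete())
```

On a real database it is probably safer to run the delete inside a transaction and check the remaining rows before committing.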

I extended the sample data:

  • the query now includes a field zero_count that counts how many rows in each group start with 0
  • a case where both rows end with the same 4 characters and also start with 0 (ZERO_COUNT = 2)
  • a case where a row without a match also starts with 0
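The extended sample data lives in the fiddle and is not reproduced here, so the sketch below uses hypothetical rows (anna, bob, carl, dave, erin are made up for illustration) to show how cnt and zero_count behave in those cases. It again runs the query in SQLite through Python's sqlite3 module, which is an assumption about the environment rather than the setup used in the answer.

```python
import sqlite3

# Two rows from the question plus hypothetical rows covering the cases listed above.
ROWS = [
    ("1234567", "john"),    # group 4567: one of the two rows starts with 0
    ("00004567", "mark"),
    ("00001111", "anna"),   # group 1111: both rows start with 0
    ("00991111", "bob"),
    ("01237777", "carl"),   # starts with 0 but has no matching row
    ("5551212", "dave"),    # group 1212: no row starts with 0
    ("7771212", "erin"),
]

QUERY = """
select person_number, name, rn, zero_count
from (
  select person_number,
         name,
         count(*) over (partition by substr(person_number, -4)) as cnt,
         sum(case when substr(person_number, 1, 1) = '0' then 1 else 0 end)
           over (partition by substr(person_number, -4)) as zero_count,
         row_number() over (
           partition by substr(person_number, -4) order by name
         ) as rn
  from person
)
where cnt > 1
  and zero_count > 0
order by substr(person_number, -4), rn
"""


def duplicates_with_zero_count():
    """Return (person_number, name, rn, zero_count) for groups that pass both filters."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table person (person_number text, name text)")
    conn.executemany("insert into person values (?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in duplicates_with_zero_count():
        print(row)
```

In this data "carl" is dropped by cnt > 1 because nothing else ends in 7777, and the 1212 group is dropped by zero_count > 0 because neither row starts with 0.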