SQL:有什么方法可以找到重复项并将它们标记为带有大小写的新列 [英] SQL : Is there any way to find Duplicates and flag them as new column with case

查看：58 发布时间：2021/4/21 19:33:27 sql duplicates case impala hue

本文介绍了SQL:有什么方法可以找到重复项并将它们标记为带有大小写的新列的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在尝试在表中查找重复项，并想添加一个新的标志列.下面的示例数据:

I am trying to find duplicates in the table and wanted to add a new flag column. Sample data below :

Column A        Column B
1                888
1                999
1                777
1                777
2                444
2                444
3                555
4                222
-5               333
-5               672
-5               045

所需的输出:

Column A        Column B         Flag_output
1                888              DUPLICATE
1                999              DUPLICATE
1                777              NULL
1                777              NULL
2                444              NULL
2                444              NULL
3                555              NULL
4                222              NULL
-5               333              DUPLICATE
-5               672              DUPLICATE
-5               045              DUPLICATE

情况1 :当A列具有相同的值且B列具有不同的值时(例如A列中的值 1 )-应标记为DUPLICATE

case 1: When Column A has the same values with different values in Column B (e.g. value 1 in Column A) - should be marked as DUPLICATE

情况2 :当A列具有相同的值且B列中的多行具有相同的值时(例如，A列中的值 2 )-应该标记为NULL

case 2: When Column A has the same values with the same values in Column B in multiple rows(e.g. value 2 in column A) - should be marked as NULL

情况3 :当A列和B列具有唯一值时(例如A列中的值 3 和 4 )-也应为标记为NULL

case 3: When Column A and Column B has unique values (e.g. value 3 and 4 in Column A) - Also should be marked as NULL

任何帮助将不胜感激.

推荐答案

根据您的描述，我可以说出您的条件，例如当 a的 b 的最小值和最大值不同时，然后标记为'duplicate'.

Based on your description, I can phrase your conditions as when the minimum and maximum values of b are different for a, then label as 'duplicate'.

为此，请使用窗口功能:

For this, use window functions:

select t.*,
       (case when min(b) over (partition by a) <> max(b) over (partition by a)
             then 'duplicate'
        end) as flag_output
from t;

根据数据，您似乎想要:

Based on the data, you seem to want:

select t.*,
       (case when count(*) over (partition by a, b) = 1 and
                  count(*) over (partition by a) > 1
             then 'duplicate'
        end) as flag_output
from t;

也就是说，仅当 a 的值不止一个时，才标记单例值.

That is, to flag singleton values only when there is more than one value for a.

这篇关于SQL:有什么方法可以找到重复项并将它们标记为带有大小写的新列的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

SQL:有什么方法可以找到重复项并将它们标记为带有大小写的新列 [英] SQL : Is there any way to find Duplicates and flag them as new column with case

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

SQL:有什么方法可以找到重复项并将它们标记为带有大小写的新列 [英] SQL : Is there any way to find Duplicates and flag them as new column with case

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭