SQL过滤掉不太具体的行 [英] SQL filtering out less specific rows

查看:67
本文介绍了SQL过滤掉不太具体的行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的表格数据看起来像

 Col1  | Col2 | Col3
    1  |   2  | NULL
    1  |   2  | 3
    1  | NULL | NULL
    1  |   5  | NULL
    2  | NULL | NULL

我想编写一个查询,以便仅获得最具体的条目. IE.在上面的示例中,第1行是更具体的第3行,因为"Col1"的值在两者中都相同,但"Col2"中的值在第1行中更具体(不为null),类似地,第2行比第1行更具体.

I want to write a query, so that I get only the most specific entries. ie. in the above example row1 is more specific row3 as Value of "Col1" is same in both but Value in "Col2" is more specific( not null) in row1, similarly row2 is more specific than row1.

对于上述数据集,结果应类似于:

For the above dataset the result should look like:

Col1 | Col2 | Col3
  1  |  2   |  3
  1  |  5   | NULL
  2  | NULL | NULL

注意:列的数据类型可以是任何值.

NOTE: Datatype of column can be anything.

推荐答案

我假设列与查询中的列一样是有序的",因此您不会遇到col2为空且col3不为空:

I am assuming that the columns are "ordered" as they are in your query, so you don't have a case where col2 is null and col3 is not null:

select col1, col2, col3
from table t
where (col3 is not null) or
      (col3 is null and col2 is not null and
       not exists (select 1
                   from table t2
                   where t2.col1 = t.col1 and t2.col2 = t.col2 and t2.col3 is not null
                  )
      ) or
      (col2 is null and col1 is not null and
       not exists (select 1
                   from table t2
                   where t2.col1 = t.col1 and t2.col2 is not null
                  )
      );

其背后的逻辑是:

  1. 获取所有col3不为null的行.
  2. 采用所有col2不为空且col3中没有值的相似行.
  3. 采用所有col1不为空且在col2中没有值的相似行.
  1. Take all rows where col3 is not null.
  2. Take all rows where col2 is not null and there are no similar rows with a value in col3.
  3. Take all rows where col1 is not null and there are no similar rows with a value in col2.

在Oracle中,您可以更简单地做到这一点:

In Oracle, you can do this more simply:

select col1, col2, col3
from (select t.*,
             max(col3) over (partition by col1, col2) as maxcol3,
             max(col2) over (partition by col1) as maxcol2
      from table t
     ) t
where (col3 is not null) or
      (col2 is not null and maxcol3 is null) or
      (col1 is not null and maxcol2 is null);

编辑II: (明确定义为更具体".)

EDIT II: (With a clarified definition of "more specific".)

我认为这是逻辑的外推.它需要查看所有组合:

I think this is the extrapolation of the logic. It requires looking at all combinations:

select col1, col2, col3
from (select t.*,
             max(col3) over (partition by col1, col2) as maxcol3_12,
             max(col2) over (partition by col1, col3) as maxcol2_13,
             max(col1) over (partition by col2, col3) as maxcol1_23,
             max(col1) over (partition by col1) as maxcol1_2,
             max(col1) over (partition by col2) as maxcol1_3,
             max(col2) over (partition by col1) as maxcol2_1,
             max(col2) over (partition by col3) as maxcol2_3,
             max(col3) over (partition by col2) as maxcol3_1,
             max(col3) over (partition by col2) as maxcol3_2,
      from table t
     ) t
where (col1 is not null and col2 is not null and col3 is not null) or
      (col1 is not null and col2 is not null and maxcol3 is null) or
      (col1 is not null and col3 is not null and maxcol2 is null) or
      (col2 is not null and col1 is not null and maxcol3 is null) or
      (col2 is not null and col3 is not null and maxcol1 is null) or
      (col3 is not null and col1 is not null and maxcol2 is null) or
      (col3 is not null and col2 is not null and maxcol1 is null) or
      (col1 is not null and maxcol2 is null and maxcol3 is null) or
      (col2 is not null and maxcol1 is null and maxcol3 is null) or
      (col3 is not null and maxcol1 is null and maxcol2 is null);

第一个组合显示如果所有值都不为null,请保留此行".第二句话说:如果col1和col2不为null并且col3永远没有值,请保留此行".依此类推,直到最后一个说:保持此行的col3不为null,并且col1和col2永远没有值".

The first combination says "keep this row if all values are not null". The second says: "keep this row if col1 and col2 are not null and col3 never has a value". And so on to the last one that says: "keep this row is col3 is not null and col1 and col2 never have values".

这可能简化为:

where not ((col1 is null and maxcol1 is not null) or
           (col2 is null and maxcol2 is not null) or
           (col3 is null and maxcol3 is not null)
          );

这篇关于SQL过滤掉不太具体的行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆