SQL过滤掉不太具体的行 [英] SQL filtering out less specific rows
问题描述
我的表格数据看起来像
Col1 | Col2 | Col3
1 | 2 | NULL
1 | 2 | 3
1 | NULL | NULL
1 | 5 | NULL
2 | NULL | NULL
我想编写一个查询,以便仅获得最具体的条目. IE.在上面的示例中,第1行是更具体的第3行,因为"Col1"的值在两者中都相同,但"Col2"中的值在第1行中更具体(不为null),类似地,第2行比第1行更具体.
I want to write a query, so that I get only the most specific entries. ie. in the above example row1 is more specific row3 as Value of "Col1" is same in both but Value in "Col2" is more specific( not null) in row1, similarly row2 is more specific than row1.
对于上述数据集,结果应类似于:
For the above dataset the result should look like:
Col1 | Col2 | Col3
1 | 2 | 3
1 | 5 | NULL
2 | NULL | NULL
注意:列的数据类型可以是任何值.
NOTE: Datatype of column can be anything.
推荐答案
我假设列与查询中的列一样是有序的",因此您不会遇到col2
为空且col3
不为空:
I am assuming that the columns are "ordered" as they are in your query, so you don't have a case where col2
is null and col3
is not null:
select col1, col2, col3
from table t
where (col3 is not null) or
(col3 is null and col2 is not null and
not exists (select 1
from table t2
where t2.col1 = t.col1 and t2.col2 = t.col2 and t2.col3 is not null
)
) or
(col2 is null and col1 is not null and
not exists (select 1
from table t2
where t2.col1 = t.col1 and t2.col2 is not null
)
);
其背后的逻辑是:
- 获取所有
col3
不为null的行. - 采用所有
col2
不为空且col3
中没有值的相似行. - 采用所有
col1
不为空且在col2
中没有值的相似行.
- Take all rows where
col3
is not null. - Take all rows where
col2
is not null and there are no similar rows with a value incol3
. - Take all rows where
col1
is not null and there are no similar rows with a value incol2
.
在Oracle中,您可以更简单地做到这一点:
In Oracle, you can do this more simply:
select col1, col2, col3
from (select t.*,
max(col3) over (partition by col1, col2) as maxcol3,
max(col2) over (partition by col1) as maxcol2
from table t
) t
where (col3 is not null) or
(col2 is not null and maxcol3 is null) or
(col1 is not null and maxcol2 is null);
编辑II: (明确定义为更具体".)
EDIT II: (With a clarified definition of "more specific".)
我认为这是逻辑的外推.它需要查看所有组合:
I think this is the extrapolation of the logic. It requires looking at all combinations:
select col1, col2, col3
from (select t.*,
max(col3) over (partition by col1, col2) as maxcol3_12,
max(col2) over (partition by col1, col3) as maxcol2_13,
max(col1) over (partition by col2, col3) as maxcol1_23,
max(col1) over (partition by col1) as maxcol1_2,
max(col1) over (partition by col2) as maxcol1_3,
max(col2) over (partition by col1) as maxcol2_1,
max(col2) over (partition by col3) as maxcol2_3,
max(col3) over (partition by col2) as maxcol3_1,
max(col3) over (partition by col2) as maxcol3_2,
from table t
) t
where (col1 is not null and col2 is not null and col3 is not null) or
(col1 is not null and col2 is not null and maxcol3 is null) or
(col1 is not null and col3 is not null and maxcol2 is null) or
(col2 is not null and col1 is not null and maxcol3 is null) or
(col2 is not null and col3 is not null and maxcol1 is null) or
(col3 is not null and col1 is not null and maxcol2 is null) or
(col3 is not null and col2 is not null and maxcol1 is null) or
(col1 is not null and maxcol2 is null and maxcol3 is null) or
(col2 is not null and maxcol1 is null and maxcol3 is null) or
(col3 is not null and maxcol1 is null and maxcol2 is null);
第一个组合显示如果所有值都不为null,请保留此行".第二句话说:如果col1和col2不为null并且col3永远没有值,请保留此行".依此类推,直到最后一个说:保持此行的col3不为null,并且col1和col2永远没有值".
The first combination says "keep this row if all values are not null". The second says: "keep this row if col1 and col2 are not null and col3 never has a value". And so on to the last one that says: "keep this row is col3 is not null and col1 and col2 never have values".
这可能简化为:
where not ((col1 is null and maxcol1 is not null) or
(col2 is null and maxcol2 is not null) or
(col3 is null and maxcol3 is not null)
);
这篇关于SQL过滤掉不太具体的行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!