优先处理来自特定数据源的数据 [英] Give priority to data from Specific Datasource
问题描述
我的数据如下
ID Name Description DataSource Year
1 Apple Sweet & Tasty Source_A 2016
1 Apple Red and Sweet & Tasty Source_B 2015
2 Apple Delicious Source_A 2016
2 Apple Delicious and Red Source_C 2015
3 Apple Source_C 2013
3 Apple Green and Large Source_B 2016
就我而言,我想优先考虑source_B,因为它更可靠.因此,如果有来自Soure_B的数据,我想显示该行的特定ID,而忽略其他行.如果不存在来自source_B的数据,则仅要显示来自其他源的数据.另外,我只想显示一行最新数据.
In my case, I would like to give priority to source_B because it is more reliable. So if there is a data from Soure_B, I would like to display that row for the specific ID and ignore others. If data from source_B is not present, only then I want to display data from other sources. Also, I would like to display just one row of most recent data.
在上面的示例中,结果应类似于
In above example, result should look like
ID Name Description DataSource Year
1 Apple Red and Sweet & Tasty Source_B 2015
2 Apple Delicious Source_A 2016
3 Apple Green and Large Source_B 2016
推荐答案
您可以使用row_number + case设置优先级,如下所示:
You can do the priorities using row_number + case, like this:
select
id,
name,
description,
datasource,
year
from (
select
id,
name,
description,
datasource,
year,
row_number () over (
partition by ID
order by case when DataSource = 'Source_B' then 1 else 2 end,
Year desc
) as RN
from
table1
) X
where
RN = 1
分区by将为每个ID选择新的ID,而order by选择行将以哪个顺序获取数字.然后,外部选择过滤掉编号为1的行以外的行.
The partition by will choose new ID for each ID, and order by selects in which order the rows will get the number. The outer select then filters out others than rows with number 1.
您可以在 SQL小提琴
这篇关于优先处理来自特定数据源的数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!