SQL Query to Collapse Duplicate Values By Date Range


Problem description

I have a table with the following structure: ID, Month, Year, Value. It holds one entry per ID per month, and most months have the same value.

I would like to create a view for that table that collapses the same values like this: ID, Start Month, End Month, Start Year, End Year, Value, with one row per ID per value.

The catch is that if a value changes and then goes back to the original, the result should have two rows for that value.

So:

  • 100 1 2008 80
  • 100 2 2008 80
  • 100 3 2008 90
  • 100 4 2008 80

should produce

  • 100 1 2008 2 2008 80
  • 100 3 2008 3 2008 90
  • 100 4 2008 4 2008 80

The following query works for everything except this special case, where the value returns to the original.

select distinct id,
       min(month) keep (dense_rank first order by month)
           over (partition by id, value) startMonth,
       max(month) keep (dense_rank first order by month desc)
           over (partition by id, value) endMonth,
       value

The database is Oracle.

Solution

I got it to work as follows. It is heavy on analytic functions and is Oracle specific.

select distinct id, value,
  -- if start is null, this row is an end, so take the start from the row before
  decode(startMonth, null,
    lag(startMonth) over (partition by id, value order by nvl(startMonth, endMonth)),
    startMonth) startMonth,
  -- if end is null, this row is a start, so take the end from the row after
  decode(endMonth, null,
    lead(endMonth) over (partition by id, value order by nvl(startMonth, endMonth)),
    endMonth) endMonth
from (
  select id, value, startMonth, endMonth from (
    select id, value,
      -- start of an interval: the previous row for this id, value is not month - 1
      decode(month - 1, lag(month) over (partition by id, value order by month), null, month) startMonth,
      -- end of an interval: the next row for this id, value is not month + 1
      decode(month + 1, lead(month) over (partition by id, value order by month), null, month) endMonth
    from Tbl
  ) a
  where startMonth is not null or endMonth is not null -- keep only starts and ends
) b

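It might be possible to simplify some of the inner queries somewhat. Note that the query compares only month and never looks at year, so as written it assumes all rows for an ID fall within a single year.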

The inner query checks whether a month is the first or last month of an interval as follows: if month + 1 equals the next month (lead) for that grouping, then a consecutive next month exists, so this month is not the end month. Otherwise, it is the last month of the interval. The same idea, using month - 1 and the previous month (lag), is used to check for the first month.
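To make the boundary check concrete, here is a minimal sketch that runs it against the four sample rows from the question. The inline tbl data set and the use of CASE instead of decode are assumptions made for illustration; the logic is otherwise the same lag/lead comparison described above.

```sql
-- Sample rows from the question, supplied inline so the query is self-contained.
-- The name tbl and the column names are assumptions based on the question.
with tbl as (
  select 100 as id, 1 as month, 2008 as year, 80 as value from dual union all
  select 100, 2, 2008, 80 from dual union all
  select 100, 3, 2008, 90 from dual union all
  select 100, 4, 2008, 80 from dual
)
select id, month, value,
       -- start of a run: the previous row with the same id, value is not month - 1
       case when lag(month) over (partition by id, value order by month) = month - 1
            then null else month end as startMonth,
       -- end of a run: the next row with the same id, value is not month + 1
       case when lead(month) over (partition by id, value order by month) = month + 1
            then null else month end as endMonth
from tbl
order by id, month;
```

For the sample data, month 1 is flagged only as a start, month 2 only as an end, and months 3 and 4 are each both a start and an end, because each forms a run of length one.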

The outer query first filters out all rows that are neither start nor end months (where startMonth is not null or endMonth is not null). Each remaining row is then a start month, an end month, or both, depending on which of startMonth and endMonth is not null. If the row is a start month, the corresponding end month is the endMonth of the next row (lead) for that id, value; if it is an end month, the startMonth comes from the previous row (lag).
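Since the question also asks for start and end years, the following is a sketch of a different technique, not the answer's method: the row_number difference approach often called "gaps and islands". It assumes one row per ID per month, as the question states, and folds year and month into a single hypothetical month_idx column so that runs can cross a year boundary. The inline tbl data and all derived column names are assumptions for illustration.

```sql
-- Sample rows from the question; tbl and the column names are assumptions.
with tbl as (
  select 100 as id, 1 as month, 2008 as year, 80 as value from dual union all
  select 100, 2, 2008, 80 from dual union all
  select 100, 3, 2008, 90 from dual union all
  select 100, 4, 2008, 80 from dual
),
keyed as (
  -- month_idx counts months on one continuous scale: year * 12 + (month - 1)
  select id, value, year * 12 + month - 1 as month_idx
  from tbl
),
grouped as (
  -- within one id, value, consecutive months share the same month_idx - row_number
  select id, value, month_idx,
         month_idx - row_number() over (partition by id, value order by month_idx) as grp
  from keyed
)
select id,
       mod(min(month_idx), 12) + 1 as startMonth,
       trunc(min(month_idx) / 12)  as startYear,
       mod(max(month_idx), 12) + 1 as endMonth,
       trunc(max(month_idx) / 12)  as endYear,
       value
from grouped
group by id, value, grp
order by id, min(month_idx);
```

On the sample data this yields the three rows the question expects: months 1 to 2 of 2008 with value 80, month 3 with value 90, and month 4 with value 80. Grouping by grp is what keeps the two separate runs of 80 apart.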
