Adding zero-value records in a query using cumulative analytical functions
Problem description
Input and code:
with data as (
  select 1 id, 'A' name, 'fruit' r_group, '2007' year, '04' month, 5 sales from dual union all
  select 2 id, 'Z' name, 'fruit' r_group, '2007' year, '04' month, 99 sales from dual union all
  select 3 id, 'A' name, 'fruit' r_group, '2008' year, '05' month, 10 sales from dual union all
  select 4 id, 'B' name, 'vegetable' r_group, '2008' year, '07' month, 20 sales from dual
)
select year,
       month,
       r_group,
       sum(sales)   sales,
       sum(opening) opening,
       sum(closing) closing
from (
  select t.*,
         (sum(sales) over (partition by name, r_group
                           order by year, month
                           rows between unbounded preceding and current row
                          ) - sales) as opening,
         sum(sales) over (partition by name, r_group
                          order by year, month
                          rows between unbounded preceding and current row
                         ) as closing
  from data t
)
group by year, month, r_group
order by year, month
Output:
year | month | r_group | sales | opening | closing |
2007 | 04 | fruit | 104 | 0 | 104 |
2008 | 05 | fruit | 10 | 5 | 15 |
2008 | 07 | vegetable | 20 | 0 | 20 |
I would like the output to look like this:
year | month | r_group | sales | opening | closing |
2007 | 04 | fruit | 104 | 0 | 104 |
2008 | 05 | fruit | 10 | 104 | 114 |
2008 | 07 | vegetable | 20 | 0 | 20 |
I can achieve the desired output only by adding a zero-value record to the data for month = '05' and name = 'Z', like this:
with data as (
  select 1 id, 'A' name, 'fruit' r_group, '2007' year, '04' month, 5 sales from dual union all
  select 2 id, 'Z' name, 'fruit' r_group, '2007' year, '04' month, 99 sales from dual union all
  select 3 id, 'A' name, 'fruit' r_group, '2008' year, '05' month, 10 sales from dual union all
  select 4 id, 'Z' name, 'fruit' r_group, '2008' year, '05' month, 0 sales from dual union all
  select 5 id, 'B' name, 'vegetable' r_group, '2008' year, '07' month, 20 sales from dual
)
However, I would like to know whether this can be done as part of the select query, without editing the data itself.
Edit
The inner select statement will feed the detailed version into a database table: year, month, name, r_group, opening, closing. In other words, the result of this query will populate the table, and the aggregation done by the outer query will happen afterwards:
select t.*,
       (sum(sales) over (partition by name, r_group
                         order by year, month
                         rows between unbounded preceding and current row
                        ) - sales) as opening,
       sum(sales) over (partition by name, r_group
                        order by year, month
                        rows between unbounded preceding and current row
                       ) as closing
from data t
I will then aggregate that table with a third-party analytical tool, grouping on r_group only and leaving out the name. The year, month, name, r_group detail must still exist in the background.
Edit 2
In other words, I'm trying to add the missing data dynamically. For instance, if name = 'Z' exists in 2007-04 but not in 2008-05, the cumulative function breaks down once it reaches 2008: there is no 'Z' row in 2008-05 to carry its running total, so the 99 from 2007 drops out of the fruit opening and closing figures.
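To make the gap concrete, here is a minimal sketch in Python (not Oracle SQL) of what "dynamically adding missing data" would mean for the sample rows. The helper names `densify` and `opening_closing` are hypothetical, and the sketch assumes a zero row is wanted for every name in each period where its r_group has any sales; it only mirrors the per-name running total from the query above.

```python
from collections import defaultdict

# Sample rows from the question: (name, r_group, year, month, sales)
rows = [
    ("A", "fruit", "2007", "04", 5),
    ("Z", "fruit", "2007", "04", 99),
    ("A", "fruit", "2008", "05", 10),
    ("B", "vegetable", "2008", "07", 20),
]

def densify(rows):
    """Add a zero-sales row for each name missing from a period of its r_group."""
    periods = defaultdict(set)  # r_group -> {(year, month)}
    names = defaultdict(set)    # r_group -> {name}
    present = set()
    for name, grp, year, month, _sales in rows:
        periods[grp].add((year, month))
        names[grp].add(name)
        present.add((name, grp, year, month))
    filled = list(rows)
    for grp in periods:
        for name in names[grp]:
            for year, month in periods[grp]:
                if (name, grp, year, month) not in present:
                    filled.append((name, grp, year, month, 0))
    return filled

def opening_closing(rows):
    """Running totals per (name, r_group), summed per (year, month, r_group)."""
    by_name = defaultdict(list)
    for name, grp, year, month, sales in rows:
        by_name[(name, grp)].append((year, month, sales))
    totals = defaultdict(lambda: [0, 0, 0])  # sales, opening, closing
    for (_name, grp), items in by_name.items():
        items.sort()  # chronological: year, then month
        running = 0
        for year, month, sales in items:
            t = totals[(year, month, grp)]
            t[0] += sales
            t[1] += running      # opening = total before this row
            running += sales
            t[2] += running      # closing = total including this row
    return {key: tuple(value) for key, value in sorted(totals.items())}

result = opening_closing(densify(rows))
for key, (sales, opening, closing) in result.items():
    print(*key, sales, opening, closing)
```

Calling `opening_closing(rows)` without `densify` reproduces the unwanted 5 / 15 figures for 2008-05, because Z's 99 is never carried forward.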
Recommended answer
Group by R_GROUP, YEAR and MONTH first, then use the analytic query:
SELECT t.*,
SUM( sales ) OVER ( PARTITION BY r_group ORDER BY year, month ) - sales
AS opening,
SUM( sales ) OVER ( PARTITION BY r_group ORDER BY year, month ) AS closing
FROM (
SELECT r_group,
year,
month,
SUM( sales ) AS sales
FROM data
GROUP BY r_group, year, month
) t
ORDER BY year, month
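The same two steps can be traced by hand with a small Python sketch. It is an illustration of the logic on the question's sample rows, not a replacement for the query; the variable names are made up for the example.

```python
from collections import defaultdict

# Sample rows from the question: (name, r_group, year, month, sales)
rows = [
    ("A", "fruit", "2007", "04", 5),
    ("Z", "fruit", "2007", "04", 99),
    ("A", "fruit", "2008", "05", 10),
    ("B", "vegetable", "2008", "07", 20),
]

# Step 1: the inner GROUP BY r_group, year, month
grouped = defaultdict(int)
for _name, grp, year, month, sales in rows:
    grouped[(grp, year, month)] += sales

# Step 2: running total per r_group, in year/month order
running = defaultdict(int)
summary = []
for (grp, year, month), sales in sorted(grouped.items()):
    opening = running[grp]          # total before this period
    running[grp] += sales
    summary.append((year, month, grp, sales, opening, running[grp]))

summary.sort()  # final ORDER BY year, month
for line in summary:
    print(*line)
```

Because the name is aggregated away before the running total is taken, a name that is absent from a period no longer leaves a hole in the group's total.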
Update:
This will also include the name in the output:
SELECT t.*,
SUM( sales ) OVER ( PARTITION BY r_group, dt ) AS r_group_month_sales,
COALESCE(
SUM( sales ) OVER (
PARTITION BY r_group
ORDER BY dt
RANGE BETWEEN UNBOUNDED PRECEDING AND INTERVAL '1' MONTH PRECEDING
),
0
) AS opening,
SUM( sales ) OVER (
PARTITION BY r_group
ORDER BY dt
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS closing
FROM (
SELECT d.*,
TO_DATE( year || month, 'YYYYMM' ) AS dt
FROM data d
) t
ORDER BY dt
Output:
ID NAME R_GROUP   YEAR MONTH SALES DT         R_GROUP_MONTH_SALES OPENING CLOSING
-- ---- --------- ---- ----- ----- ---------- ------------------- ------- -------
 1 A    fruit     2007 04        5 2007-04-01                 104       0     104
 2 Z    fruit     2007 04       99 2007-04-01                 104       0     104
 3 A    fruit     2008 05       10 2008-05-01                  10     104     114
 4 B    vegetable 2008 07       20 2008-07-01                  20       0      20
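As a cross-check of the three window columns, the following Python sketch recomputes them row by row for the sample data. The function name `window_totals` is hypothetical. It relies on every dt being the first day of a month, so "at least one month earlier" is the same as "strictly earlier".

```python
from datetime import date

# Sample rows from the question: (id, name, r_group, year, month, sales)
rows = [
    (1, "A", "fruit", "2007", "04", 5),
    (2, "Z", "fruit", "2007", "04", 99),
    (3, "A", "fruit", "2008", "05", 10),
    (4, "B", "vegetable", "2008", "07", 20),
]

def window_totals(rows):
    """Return {id: (r_group_month_sales, opening, closing)} for each row."""
    dated = [(id_, grp, date(int(year), int(month), 1), sales)
             for id_, _name, grp, year, month, sales in rows]
    result = {}
    for id_, grp, dt, _sales in dated:
        # Rows in the same PARTITION BY r_group
        peers = [(d, s) for _, g, d, s in dated if g == grp]
        month_sales = sum(s for d, s in peers if d == dt)  # same r_group and dt
        opening = sum(s for d, s in peers if d < dt)       # earlier months only
        closing = sum(s for d, s in peers if d <= dt)      # up to and including dt
        result[id_] = (month_sales, opening, closing)
    return result

totals = window_totals(rows)
for id_, values in totals.items():
    print(id_, *values)
```

Rows 1 and 2 share the same month and group, so they carry identical totals, which is why taking MAX per year, month and r_group afterwards is enough to collapse them.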
You can then do whatever processing you want on the output of this query.
Perhaps something like this:
SELECT year,
month,
r_group,
MAX( r_group_month_sales ) AS sales,
MAX( opening ) AS opening,
MAX( closing ) AS closing,
YOUR_THIRD_PARTY_AGGREGATION_FUNCTION( column_names ) AS other
FROM (
-- insert the query above
)
GROUP BY year, month, r_group
ORDER BY year, month