Calculating multiple averages across different parts of the table?
Problem description
I have the following transactions table:
customer_id  purchase_date  product   category     department   quantity  store_id
1            2020-10-01     Kit Kat   Candy        Food         2         store_A
1            2020-10-01     Snickers  Candy        Food         1         store_A
1            2020-10-01     Snickers  Candy        Food         1         store_A
2            2020-10-01     Snickers  Candy        Food         2         store_A
2            2020-10-01     Baguette  Bread        Food         5         store_A
2            2020-10-01     iPhone    Cell phones  Electronics  2         store_A
3            2020-10-01     Sony PS5  Games        Electronics  1         store_A
I would like to calculate the average number of products purchased for each product in the table. I'm also looking to calculate averages for each category and each department, accounting for all products within the same category or department respectively. Care should be taken to divide by the number of unique customers, and to count only rows where the product quantity is greater than 0 (a quantity of 0 indicates a refund and should not be counted).
So basically, the output table would look like the one below:
...where store_id and average_level_type are partition columns.
Is there a way to achieve this in a single pass over the transactions table, or do I need to break my approach down into multiple steps?
Thanks!
Recommended answer
How about using "union all", as below:
select store_id, 'product' as average_level_type, product as id,
       sum(quantity) as total_quantity,
       count(distinct customer_id) as unique_customer_count,
       sum(quantity) / count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id, product
union all
select store_id, 'category' as average_level_type, category as id,
       sum(quantity) as total_quantity,
       count(distinct customer_id) as unique_customer_count,
       sum(quantity) / count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id, category
union all
select store_id, 'department' as average_level_type, department as id,
       sum(quantity) as total_quantity,
       count(distinct customer_id) as unique_customer_count,
       sum(quantity) / count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id, department;
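To check what the union all approach returns for the sample rows in the question, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite is only a stand-in for the actual engine, and the names ROWS, QUERY_TEMPLATE and compute_averages are made up for this illustration. The 1.0 * factor is there only because SQLite truncates integer division.

```python
import sqlite3

# Sample rows copied from the question's transactions table.
ROWS = [
    (1, "2020-10-01", "Kit Kat", "Candy", "Food", 2, "store_A"),
    (1, "2020-10-01", "Snickers", "Candy", "Food", 1, "store_A"),
    (1, "2020-10-01", "Snickers", "Candy", "Food", 1, "store_A"),
    (2, "2020-10-01", "Snickers", "Candy", "Food", 2, "store_A"),
    (2, "2020-10-01", "Baguette", "Bread", "Food", 5, "store_A"),
    (2, "2020-10-01", "iPhone", "Cell phones", "Electronics", 2, "store_A"),
    (3, "2020-10-01", "Sony PS5", "Games", "Electronics", 1, "store_A"),
]

# One branch of the UNION ALL; {level} is product, category or department.
# 1.0 * forces floating-point division in SQLite (an adaptation for this sketch).
QUERY_TEMPLATE = """
SELECT store_id, '{level}' AS average_level_type, {level} AS id,
       SUM(quantity) AS total_quantity,
       COUNT(DISTINCT customer_id) AS unique_customer_count,
       1.0 * SUM(quantity) / COUNT(DISTINCT customer_id) AS average
FROM transactions
WHERE quantity > 0
GROUP BY store_id, {level}
"""


def compute_averages(rows):
    """Load the rows into SQLite and run the three-branch UNION ALL query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions (customer_id INTEGER, purchase_date TEXT, "
        "product TEXT, category TEXT, department TEXT, quantity INTEGER, "
        "store_id TEXT)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    query = " UNION ALL ".join(
        QUERY_TEMPLATE.format(level=level)
        for level in ("product", "category", "department")
    )
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Map (average_level_type, id) to the computed average for easy lookup.
averages = {(row[1], row[2]): row[5] for row in compute_averages(ROWS)}
for key, value in sorted(averages.items()):
    print(key, value)
```

For the sample data, the Food department comes out at 5.5 (11 items over 2 distinct customers) and the Candy category at 3.0 (6 items over 2 distinct customers).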
If you want to avoid union all, you can use something like rollup() or group by with grouping sets() to achieve the same result, but the query becomes a little more complicated if you need the output in the exact format shown in the question.
EDIT: Below is how you can use grouping sets to get the same output:
select store_id,
       case when G_ID = 3 then 'product'
            when G_ID = 5 then 'category'
            when G_ID = 6 then 'department' end as average_level_type,
       case when G_ID = 3 then product
            when G_ID = 5 then category
            when G_ID = 6 then department end as id,
       total_quantity,
       unique_customer_count,
       average
from (
    -- The GROUPING__ID values 3, 5 and 6 assume the bit layout of Hive 2.3.0 and later;
    -- older releases number the grouping sets differently.
    select store_id, product, category, department,
           sum(quantity) as total_quantity,
           count(distinct customer_id) as unique_customer_count,
           sum(quantity) / count(distinct customer_id) as average,
           GROUPING__ID as G_ID
    from transactions
    where quantity > 0  -- skip refunds, same filter as the union all version
    group by store_id, product, category, department
    grouping sets ((store_id, product), (store_id, category), (store_id, department))
) Tab
order by average_level_type;
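The constants 3, 5 and 6 are not arbitrary: they are the GROUPING__ID bitmasks of the three grouping sets. As a rough sketch, assuming the convention used by Hive 2.3.0 and later (one bit per group by column, first column as the most significant bit, bit set to 1 when the column is not part of the grouping set), they can be derived as follows. The helper grouping_id is hypothetical and exists only for this illustration; on older Hive releases the values differ, so check them against your version.

```python
def grouping_id(group_by_columns, grouping_set):
    """Return the bitmask for one grouping set.

    Assumed convention (Hive 2.3.0 and later): the first GROUP BY column is
    the most significant bit, and a bit is 1 when that column is aggregated
    away, i.e. it is NOT a member of the grouping set.
    """
    gid = 0
    for column in group_by_columns:
        gid = (gid << 1) | (0 if column in grouping_set else 1)
    return gid


# Column order must match the GROUP BY clause of the query.
COLUMNS = ["store_id", "product", "category", "department"]

for level in ("product", "category", "department"):
    print(level, grouping_id(COLUMNS, {"store_id", level}))
```

This prints 3 for product, 5 for category and 6 for department, matching the case expressions in the query.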