How to calculate a non-inflated SUM from a denormalized table
Question
This builds on a previous question I asked. Suppose I have a denormalized table that looks something like this:
Apple_ID | Tree_ID | Orchard_ID | Tree_Height | ...other columns...
---------------------------------------------------------------------
1        | 1       | 1          | 12          | ...other values...
2        | 1       | 1          | 12          | ...other values...
3        | 1       | 1          | 12          | ...other values...
4        | 2       | 1          | 15          | ...other values...
5        | 2       | 1          | 15          | ...other values...
6        | 2       | 1          | 15          | ...other values...
7        | 2       | 1          | 15          | ...other values...
8        | 3       | 1          | 20          | ...other values...
9        | 3       | 1          | 20          | ...other values...
10       | 4       | 2          | 30          | ...other values...
11       | 5       | 2          | 10          | ...other values...
12       | 5       | 2          | 10          | ...other values...
13       | 5       | 2          | 10          | ...other values...
I want to calculate the sum of Tree_Height for each orchard, counting each tree once, so the result I want to get back is:
Orchard_ID | sum(Tree_Height)
-------------------------------
1          | 47
2          | 40
However, because of the denormalization, each tree's height is counted once per apple and the sum inflates to this:
Orchard_ID | sum(Tree_Height)
-------------------------------
1          | 136
2          | 60
The solution from the question I mentioned cannot be applied here, since we cannot sum() over a unique-row column. How can I write a simple query to get the intended result?
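For reference, the inflation is easy to reproduce. The following is only a minimal sketch: it assumes an in-memory SQLite database driven from Python's standard sqlite3 module and a table named data, neither of which is part of my actual setup, and it rebuilds the sample rows from the table above.

```python
import sqlite3

# (Tree_ID, Orchard_ID, Tree_Height, number of apple rows), read off the sample table.
TREES = [(1, 1, 12, 3), (2, 1, 15, 4), (3, 1, 20, 2), (4, 2, 30, 1), (5, 2, 10, 3)]


def build_rows():
    """Expand each tree into one row per apple, numbering apples from 1."""
    rows, apple_id = [], 0
    for tree_id, orchard_id, height, apples in TREES:
        for _ in range(apples):
            apple_id += 1
            rows.append((apple_id, tree_id, orchard_id, height))
    return rows


conn = sqlite3.connect(":memory:")
# The table name "data" is an assumption made for this sketch.
conn.execute(
    'CREATE TABLE data ("Apple_ID" INTEGER, "Tree_ID" INTEGER, '
    '"Orchard_ID" INTEGER, "Tree_Height" INTEGER)'
)
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", build_rows())

# Naive aggregate: every apple row contributes its tree's height again.
naive = conn.execute(
    'SELECT "Orchard_ID", SUM("Tree_Height") FROM data '
    'GROUP BY "Orchard_ID" ORDER BY "Orchard_ID"'
).fetchall()
print(naive)  # [(1, 136), (2, 60)]
```

Orchard 1 comes out as 3*12 + 4*15 + 2*20 = 136 instead of 12 + 15 + 20 = 47.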
Recommended answer
The easiest way to write this is with a CTE, but if your system doesn't support CTEs you can use a derived table. We use ROW_NUMBER(), partitioned by Orchard_ID and Tree_ID and ordered by Apple_ID, to number the rows of each tree; keeping only the row with rn = 1 gives us exactly one row per tree to sum:
SELECT "Orchard_ID", SUM("Tree_Height") AS Total_Height
FROM (
SELECT "Orchard_ID", "Tree_Height",
ROW_NUMBER() OVER (PARTITION BY "Orchard_ID", "Tree_ID" ORDER BY "Apple_ID") AS rn
FROM data
) d
WHERE rn = 1
GROUP BY "Orchard_ID"
Output:
Orchard_ID total_height
1 47
2 40
If you can use CTEs, this is how it would be written:
WITH CTE AS (
SELECT "Orchard_ID", "Tree_Height",
ROW_NUMBER() OVER (PARTITION BY "Orchard_ID", "Tree_ID" ORDER BY "Apple_ID") AS rn
FROM data
)
SELECT "Orchard_ID", SUM("Tree_Height") AS Total_Height
FROM CTE
WHERE rn = 1
GROUP BY "Orchard_ID"
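To check the query against the sample data from the question, here is a minimal sketch that runs the CTE version in an in-memory SQLite database through Python's standard sqlite3 module. The choice of SQLite, the helper name orchard_totals, and the added ORDER BY (only there to make the output order deterministic) are assumptions for illustration. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows from the question: (Apple_ID, Tree_ID, Orchard_ID, Tree_Height)
ROWS = [
    (1, 1, 1, 12), (2, 1, 1, 12), (3, 1, 1, 12),
    (4, 2, 1, 15), (5, 2, 1, 15), (6, 2, 1, 15), (7, 2, 1, 15),
    (8, 3, 1, 20), (9, 3, 1, 20),
    (10, 4, 2, 30),
    (11, 5, 2, 10), (12, 5, 2, 10), (13, 5, 2, 10),
]

# The CTE query from the answer, plus an ORDER BY for stable output.
QUERY = """
WITH CTE AS (
    SELECT "Orchard_ID", "Tree_Height",
           ROW_NUMBER() OVER (PARTITION BY "Orchard_ID", "Tree_ID"
                              ORDER BY "Apple_ID") AS rn
    FROM data
)
SELECT "Orchard_ID", SUM("Tree_Height") AS Total_Height
FROM CTE
WHERE rn = 1
GROUP BY "Orchard_ID"
ORDER BY "Orchard_ID"
"""


def orchard_totals(rows):
    """Load rows into a throwaway table and return one (orchard, total) pair per orchard."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'CREATE TABLE data ("Apple_ID" INTEGER, "Tree_ID" INTEGER, '
            '"Orchard_ID" INTEGER, "Tree_Height" INTEGER)'
        )
        conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


totals = orchard_totals(ROWS)
print(totals)  # [(1, 47), (2, 40)]
```

Because only the first apple row of each tree survives the rn = 1 filter, each height is added once no matter how many apples the tree has.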