使用PIVOT时无法在SQL Server 2012中获得正确的SUM值 [英] Not getting the correct SUM values in SQL Server 2012 when using a PIVOT

查看:67
本文介绍了使用PIVOT时无法在SQL Server 2012中获得正确的SUM值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试创建一个查询,该查询将遍历某些行,但将对某些列进行 SUM ,然后将它们分组在一起.我以前使用过 PIVOT 函数,但是当我的结果集包含相似的值时,我遇到了问题.

I am trying to create a query that will pivot over some rows but will SUM some columns and then group them together. I've used the PIVOT function before but I am running into issues when my result set contains similar values.

这是SQL Server 2012.

This is SQL Server 2012.

示例代码:

CREATE TABLE #Foo
(
     Store varchar(50), 
     Employee varchar(50), 
     Sold money, 
     Waste money, 
     Tmsp datetime
)

INSERT INTO #Foo 
VALUES
   ('Harrisburg', 'John', 20.00, 10.00, GETDATE()),
   ('Harrisburg', 'John', 20.00, 10.00, GETDATE()),
   ('Harrisburg', 'Jim', 20.00, 10.00, GETDATE()),
   ('Seattle', 'Jim', 20.00, 10.00, GETDATE()),
   ('Seattle', 'Alex', 20.00, 10.00, GETDATE())

SELECT 
    Store,
    SUM(Sold) TotalSold,
    SUM([John]) WastedByJohn,
    SUM([Jim]) WastedByJim,
    SUM([Alex]) WastedByAlex
FROM 
    #Foo
PIVOT
    (SUM(Waste)
     FOR Employee IN ([John], [Jim], [Alex])
    ) PVT
GROUP BY 
   Store

DROP TABLE #Foo

这将产生以下结果:

Store      | TotalSold | WastedByJohn | WastedByJim | WastedByAlex
Harrisburg | 20.00     | 20.00        | 10.00       | NULL
Seattle    | 20.00     | NULL         | 10.00       | 10.00

基于表格中的数据,Harrisburg的TotalSold是否为60.00,Seattle的TotalSold是否为40.00?

Shouldn't the TotalSold for Harrisburg be 60.00 and the TotalSold for Seattle be 40.00 based on the data in the table?

这让我很难理解,因为如果更改数据以使值不相同,我将得到正确的结果.

It gets harder for me to understand, because if I change the data so that the values aren't the same, I get the correct results.

INSERT INTO #Foo 
VALUES
    ('Harrisburg', 'John', 25.00, 10.00, GETDATE()),
    ('Harrisburg', 'John', 30.00, 10.00, GETDATE()),
    ('Harrisburg', 'Jim', 40.00, 10.00, GETDATE()),
    ('Seattle', 'Jim', 50.00, 10.00, GETDATE()),
    ('Seattle', 'Alex', 60.00, 10.00, GETDATE())

这组数据产生了预期的结果:

This set of data yields the expected result:

Store      | TotalSold | WastedByJohn | WastedByJim | WastedByAlex
Harrisburg | 95.00     | 20.00        | 10.00       | NULL
Seattle    | 110.00    | NULL         | 10.00       | 10.00

我四处张望,找不到答案,为什么PIVOT在聚合时会基于不同的值而有所不同.我觉得这里缺少一些基本知识,除非我碰巧遇到了SQL Server的一些不太可能的问题.

I looked around for a bit and couldn't find an answer as to why the PIVOT would be different based on distinct values when it comes to aggregation. I feel like there's something fundamental that I'm missing here, unless I just happened to come across some issue with SQL Server which is unlikely.

任何帮助将不胜感激.

谢谢!

推荐答案

以下查询应为您提供所需的内容:

The following query should give you what you want:

SELECT Store,
       TotalSold,
       [John] AS WastedByJohn,
       [Jim] AS WastedByJim,
       [Alex] AS WastedByAlex
FROM (SELECT Store, Employee, Waste,
             SUM(Sold) OVER (PARTITION BY Store) AS TotalSold
      FROM #Foo) src
PIVOT
    (SUM(Waste)
     FOR Employee IN ([John], [Jim], [Alex])
    ) PVT

要了解为什么会得到意外结果,请尝试在没有 GROUP BY 子句的情况下进行查询:

To understand why you get unexpected results, try your query without the GROUP BY clause:

SELECT Store, Sold, [John], [Jim], [Alex]
FROM 
    #Foo
PIVOT
    (SUM(Waste)
     FOR Employee IN ([John], [Jim], [Alex])
    ) PVT

输出:

Store       Sold    John    Jim     Alex
Harrisburg  20,00   20,00   10,00   NULL
Seattle     20,00   NULL    10,00   10,00

现在,使用第二版示例数据再试一次:

Now, try the same again with second version of sample data:

输出:

Store       Sold    John    Jim     Alex
Harrisburg  25,00   10,00   NULL    NULL
Harrisburg  30,00   10,00   NULL    NULL
Harrisburg  40,00   NULL    10,00   NULL
Seattle     50,00   NULL    10,00   NULL
Seattle     60,00   NULL    NULL    10,00

通过比较2个不同的结果集,您可以清楚地看到,对于不参与其中的每种列组合,即对于 Store 已售.

By comparing the 2 different result sets you can clearly see that PIVOT takes place for every combination of columns not participating in it, i.e. for every combination of Store, Sold.

在第一种情况下,只有 Harrisburg,20,00 Seattle,20,00 .这就是为什么在这种情况下只能得到两行的原因.在第二种情况下,您总共有3 + 2 = 5个组合.

In first case there is only Harrisburg,20,00 and Seattle,20,00. That's why you only get two rows in this case. In second case you have a total of 3 + 2 = 5 combinations.

您现在可以看到为什么 GROUP BY 仅在第二种情况下有效.

You can now see why GROUP BY works only in the second case.

这篇关于使用PIVOT时无法在SQL Server 2012中获得正确的SUM值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆