GROUP BY problem in conjunction with a SUM()


Problem description

Perhaps some kind soul could help me out with an SQL I have been trying all
day to get to work, where one column is just not summing up the way I want it
to.

I have the following data

GRP_SEQ ITEM_SEQ NR_ITEMS PERSONS_SEQ

1 A 2 aa
1 A 2 bb
1 A 2 cc
1 B 3 dd
1 B 3 ee
2 C 1 ff
2 C 1 gg

An item is part of a group. The items are sold in specified numbers and are
associated with persons.

The first result I want to achieve is a list of each item with the number of
items and the number of persons associated with the item:

select
GRP_SEQ
,ITEM_SEQ
,sum(ITEM_NR)/count(ITEM_SEQ)
,count(PERSONS_SEQ)
from
GRP
inner join ITEM on ITEM_GRPSEQ=GRP_SEQ
left outer join PERSONS on PERSONS_ITEMSEQ=ITEM_SEQ
where
ITEM_GRPSEQ=GRP_SEQ
group by ITEM_SEQ,GRP_SEQ
order by 2,1

The result is

1 A 2 3
1 B 3 2
2 C 1 2

This works fine.

The next SQL is the one I am having problems with. I want to group the items
by their group and return the total number of items sold for that group of
items and count the number of persons that used the items of that group.
What I have come up with is the following:

select
GRP_SEQ
,???
,count(PERSONS_SEQ)
from
GRP
inner join ITEM on ITEM_GRPSEQ=GRP_SEQ
left outer join PERSONS on PERSONS_ITEMSEQ=ITEM_SEQ
where
ITEM_GRPSEQ=GRP_SEQ
group by GRP_SEQ
order by 1

The result I am expecting to achieve is

1 5 5
2 1 2

but I have not been able to work out how I can sum the ITEM_NR only once
for each item. I could of course modify the SQL to

select
GRP_SEQ
,(select sum(i2.ITEM_NR) from ITEM i2 where i2.ITEM_GRPSEQ=GRP_SEQ)
,count(PERSONS_SEQ)
from
GRP
inner join ITEM i1 on i1.ITEM_GRPSEQ=GRP_SEQ
left outer join PERSONS on PERSONS_ITEMSEQ=i1.ITEM_SEQ
where
i1.ITEM_GRPSEQ=GRP_SEQ
group by GRP_SEQ
order by 1

but there must be some method to achieve the result without using the costly
subselect. Somehow I think there is even something in the first SQL that I
am overlooking that would allow me to replace the
"sum(ITEM_NR)/count(ITEM_SEQ)" with something more elegant.

Perhaps some kind soul could help me out ....

Regards

Rudolf Bargholz





Solution

Rudolf Bargholz wrote:

> Perhaps some kind soul could help me out with an SQL I have been trying
> all day to get to work, where one column is just not summing up the way I
> want it to.
>
> I have the following data
>
> GRP_SEQ ITEM_SEQ NR_ITEMS PERSONS_SEQ
>
> 1 A 2 aa
> 1 A 2 bb
> 1 A 2 cc
> 1 B 3 dd
> 1 B 3 ee
> 2 C 1 ff
> 2 C 1 gg
>
> An item is part of a group. The items are sold in specified numbers and
> are associated with persons.
>
> The first result I want to achieve is list each item and the number of
> items used by the number of persons associated with the item:

From your description of the query I don't see any reason to operate on
GRP_SEQ. Also, do you want to count the number of items and the number of
persons independently? Or do you want to calculate some sort of average,
i.e. 5 persons ordered (on average) 1.3 times item A?

> select
> GRP_SEQ
> ,ITEM_SEQ
> ,sum(ITEM_NR)/count(ITEM_SEQ)

I guess that NR_ITEMS is the same as ITEM_NR? If so, and if ITEM_SEQ can't
be NULL, then a simple AVG(ITEM_NR) would be fine here as well.

> ,count(PERSONS_SEQ)
> from
> GRP
> inner join ITEM on ITEM_GRPSEQ=GRP_SEQ
> left outer join PERSONS on PERSONS_ITEMSEQ=ITEM_SEQ
> where
> ITEM_GRPSEQ=GRP_SEQ
> group by ITEM_SEQ,GRP_SEQ
> order by 2,1

Your query doesn't match at all with the schema and sample data you provided
above. What are the ITEM and GROUP tables?
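
The AVG suggestion works because NR_ITEMS has the same value on every row of one item, so sum/count, AVG and MAX all give back that per-item value. The following is a minimal sketch using Python's standard sqlite3 module rather than DB2. It assumes the sample rows sit in one flat table; the table name `sample` and the result column aliases are made up for illustration. SQLite returns AVG as a float, and other database systems may type it differently.

```python
import sqlite3

# Flat copy of the sample rows from the question.
# "sample" is a hypothetical table name, not one used in the thread.
rows = [
    (1, "A", 2, "aa"), (1, "A", 2, "bb"), (1, "A", 2, "cc"),
    (1, "B", 3, "dd"), (1, "B", 3, "ee"),
    (2, "C", 1, "ff"), (2, "C", 1, "gg"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample "
    "(grp_seq INTEGER, item_seq TEXT, nr_items INTEGER, persons_seq TEXT)"
)
conn.executemany("INSERT INTO sample VALUES (?, ?, ?, ?)", rows)

# Three equivalent ways to recover the per-item NR_ITEMS value,
# next to the person count for each item.
per_item = conn.execute(
    """
    SELECT grp_seq,
           item_seq,
           SUM(nr_items) / COUNT(item_seq) AS via_sum_count,
           AVG(nr_items)                   AS via_avg,
           MAX(nr_items)                   AS via_max,
           COUNT(persons_seq)              AS persons
    FROM sample
    GROUP BY grp_seq, item_seq
    ORDER BY item_seq, grp_seq
    """
).fetchall()

for row in per_item:
    print(row)
```

All three expressions agree for every item here, which only holds as long as NR_ITEMS really is constant within an item.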

> The result is
>
> 1 A 2 3
> 1 B 3 2
> 2 C 1 2
>
> This works fine.
>
> The next SQL is the one I am having problems with. I want to group the
> items by their group and return the total number of items sold for that
> group of items and count the number of persons that used the items of that
> group. What I have come up with is the following:

Again, assuming that you want to have independent counts of persons and
items, how about this:

SELECT grp_seq, SUM(nr_items), COUNT(DISTINCT person_seq)
FROM ...
GROUP BY grp_seq

> select
> GRP_SEQ
> ,???
> ,count(PERSONS_SEQ)
> from
> GRP
> inner join ITEM on ITEM_GRPSEQ=GRP_SEQ
> left outer join PERSONS on PERSONS_ITEMSEQ=ITEM_SEQ
> where
> ITEM_GRPSEQ=GRP_SEQ
> group by GRP_SEQ
> order by 1
>
> The result I am expecting to achieve is
>
> 1 5 5
> 2 1 2

Why the "5" and "1" in the 2nd column?

> but I have not been able to work out how I can sum the ITEM_NR only once
> for each item. I could of course modify the SQL to
>
> select
> GRP_SEQ
> ,(select sum(i2.ITEM_NR) from ITEM i2 where i2.ITEM_GRPSEQ=GRP_SEQ)
> ,count(PERSONS_SEQ)
> from
> GRP
> inner join ITEM i1 on i1.ITEM_GRPSEQ=GRP_SEQ
> left outer join PERSONS on PERSONS_ITEMSEQ=i1.ITEM_SEQ
> where
> i1.ITEM_GRPSEQ=GRP_SEQ
> group by GRP_SEQ
> order by 1
>
> but there must be some method to achieve the result without using the
> costly subselect.

Why do you think that a subselect is "costly"? The idea of SQL is to tell
the DBMS "what" you want and the DBMS figures out the best way on "how" to
get there. So it would be a good idea to first have a look at the
optimized plans and also to do some performance measurements before saying
something is expensive or not.

Anyway, the biggest problem I have is that the semantics of your queries are
not clear at all...

--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany
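
The advice to look at the plan and measure before calling a subselect costly can be tried in miniature. The sketch below is only an illustration under stated assumptions: it uses Python's standard sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for DB2's own explain facilities, and the three tables hold just the columns that the quoted queries mention. A timing over seven rows says nothing about real cost; it only shows the procedure.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
# Minimal tables with only the columns the quoted queries use (an assumption).
conn.executescript(
    """
    CREATE TABLE GRP (GRP_SEQ INTEGER);
    CREATE TABLE ITEM (ITEM_SEQ TEXT, ITEM_GRPSEQ INTEGER, ITEM_NR INTEGER);
    CREATE TABLE PERSONS (PERSONS_SEQ TEXT, PERSONS_ITEMSEQ TEXT);
    INSERT INTO GRP VALUES (1), (2);
    INSERT INTO ITEM VALUES ('A', 1, 2), ('B', 1, 3), ('C', 2, 1);
    INSERT INTO PERSONS VALUES ('aa', 'A'), ('bb', 'A'), ('cc', 'A'),
                               ('dd', 'B'), ('ee', 'B'),
                               ('ff', 'C'), ('gg', 'C');
    """
)

# The subselect variant from the question, minus the redundant WHERE clause.
subselect_sql = """
    SELECT GRP_SEQ,
           (SELECT SUM(i2.ITEM_NR) FROM ITEM i2 WHERE i2.ITEM_GRPSEQ = GRP_SEQ),
           COUNT(PERSONS_SEQ)
    FROM GRP
    INNER JOIN ITEM i1 ON i1.ITEM_GRPSEQ = GRP_SEQ
    LEFT OUTER JOIN PERSONS ON PERSONS_ITEMSEQ = i1.ITEM_SEQ
    GROUP BY GRP_SEQ
    ORDER BY 1
"""

# Step 1: ask the engine how it intends to run the query.
plan = conn.execute("EXPLAIN QUERY PLAN " + subselect_sql).fetchall()
for step in plan:
    print(step[-1])  # last column holds the human-readable plan detail

# Step 2: measure an actual execution.
start = time.perf_counter()
result = conn.execute(subselect_sql).fetchall()
elapsed = time.perf_counter() - start
print(result, f"{elapsed * 1e6:.0f} microseconds")
```

The plan text differs between SQLite versions, so it is printed rather than compared against a fixed string.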


Hi Knut,

> Again, assuming that you want to have independent counts of persons and
> items, how about this:
>
> SELECT grp_seq, SUM(nr_items), COUNT(DISTINCT person_seq)
> FROM ...
> GROUP BY grp_seq

Here is the original data set:

GRP_SEQ ITEM_SEQ NR_ITEMS PERSONS_SEQ

1 A 2 aa
1 A 2 bb
1 A 2 cc
1 B 3 dd
1 B 3 ee
2 C 1 ff
2 C 1 gg

If I group by the GRP_SEQ, the sum of the NR_ITEMS for the GROUP 1 would be
2 + 2 + 2 + 3 + 3 = 12. The result I am looking for is 2 + 3 = 5.

There were two items (A and B) of the GROUP 1 sold. The first item (A) is
associated with three persons (aa, bb, and cc). The second item (B) is
associated with two persons (dd and ee).
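
The 12-versus-5 difference comes from the join itself: each item row is repeated once per associated person before SUM sees it. Here is a small sketch of that effect using Python's standard sqlite3 module. The two-table layout and the lowercase names `item` and `person` are assumptions chosen for illustration, not the poster's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical normalized layout of the sample data above.
conn.executescript(
    """
    CREATE TABLE item (item_seq TEXT, grp_seq INTEGER, nr_items INTEGER);
    CREATE TABLE person (person_seq TEXT, item_seq TEXT);
    INSERT INTO item VALUES ('A', 1, 2), ('B', 1, 3), ('C', 2, 1);
    INSERT INTO person VALUES ('aa', 'A'), ('bb', 'A'), ('cc', 'A'),
                              ('dd', 'B'), ('ee', 'B'),
                              ('ff', 'C'), ('gg', 'C');
    """
)

# Summing after the join counts NR_ITEMS once per person row.
joined_sum = conn.execute(
    """
    SELECT i.grp_seq, SUM(i.nr_items), COUNT(p.person_seq)
    FROM item i
    LEFT JOIN person p ON p.item_seq = i.item_seq
    GROUP BY i.grp_seq
    ORDER BY i.grp_seq
    """
).fetchall()

# Summing the item table alone counts each item exactly once.
items_only = conn.execute(
    "SELECT grp_seq, SUM(nr_items) FROM item GROUP BY grp_seq ORDER BY grp_seq"
).fetchall()

print(joined_sum)  # [(1, 12, 5), (2, 2, 2)]
print(items_only)  # [(1, 5), (2, 1)]
```

The person count from the joined query is already right; only the item sum is inflated, which is why the two aggregates have to be computed at different levels.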

>

> Why the "5" and "1" in the 2nd column?

Perhaps it was a little dumb having the columns NR_ITEMS and PERSONS add up
to the same number. A better example would be

GRP_SEQ ITEM_SEQ NR_ITEMS PERSONS_SEQ
1 A 2 aa
1 A 2 bb
1 A 2 cc
1 B 1 dd
1 B 1 ee
2 C 1 ff
2 C 1 gg

The number of items sold for the GROUP 1 is 2 + 1 = 3
The number of persons that travelled using GROUP 1 is 5
The number of items sold for the GROUP 2 is 1
The number of persons that travelled using GROUP 2 is 2

Does this make more sense?

Regards

Rudolf Bargholz


Hi Knut,

>> select
>> GRP_SEQ
>> ,(select sum(i2.ITEM_NR) from ITEM i2 where i2.ITEM_GRPSEQ=GRP_SEQ)
>> ,count(PERSONS_SEQ)
>> from
>> GRP
>> inner join ITEM i1 on i1.ITEM_GRPSEQ=GRP_SEQ
>> left outer join PERSONS on PERSONS_ITEMSEQ=i1.ITEM_SEQ
>> where
>> i1.ITEM_GRPSEQ=GRP_SEQ
>> group by GRP_SEQ
>> order by 1
>>
>> but there must be some method to achieve the result without using the
>> costly subselect.
>
> Why do you think that a subselect is "costly"? The idea of SQL is to tell
> the DBMS "what" you want and the DBMS figures out the best way on "how" to
> get there. So it would be a good idea to first have a look at the
> optimized plans and also to do some performance measurements before saying
> something is expensive or not.
>
> Anyway, the biggest problem I have is that the semantics of your queries
> are not clear at all...

The performance issue is just a guess of mine. I was always under the
impression that databases handled sets and the grouping of sets better than
subselects in columns. It is just a matter of experience for me
that wherever I use selects in columns of an SQL the performance drops, and
when I rely on set functions the SQLs speed up.

My example is very clear to me, having tried to find a solution to
this problem all day. To someone else, I now see it makes less sense.
Perhaps the following will help:

create table DB2ADMIN.GRP
(
    GRP_SEQ INTEGER not null,
    GRP_NAME VARCHAR(20) not null
);
insert into DB2ADMIN.GRP(GRP_SEQ, GRP_NAME)
values (1,'Hotel Adlon'),(2,'Marriot');

create table DB2ADMIN.ITEM
(
    ITEM_SEQ INTEGER not null,
    ITEM_GRPSEQ INTEGER not null,
    ITEM_NAME VARCHAR(20) not null,
    ITEM_NR INTEGER
);
insert into DB2ADMIN.ITEM(ITEM_SEQ, ITEM_GRPSEQ, ITEM_NAME, ITEM_NR)
values (1,1,'Triple',2),(2,1,'Double',1),(3,2,'Single',1);

create table DB2ADMIN.PERSON
(
    PERSON_SEQ INTEGER not null,
    PERSON_ITEMSEQ INTEGER not null,
    PERSON_NAME VARCHAR(20) not null
);
insert into DB2ADMIN.PERSON(PERSON_SEQ, PERSON_ITEMSEQ, PERSON_NAME)
values
(1,1,'Rudolf'),(2,1,'Sandra'),(3,1,'Alyssa'),(3,2,'Hans'),(4,2,'Diana'),(5,3,'Eduard');

"Hotel Adlon" has 3 sold rooms with a total of 5 persons
"Marriot" has 1 sold room with a total of 1 person

Note, there were two triple rooms sold to three persons.

The following SQL

SELECT grp_seq, SUM(nr_items), COUNT(DISTINCT person_seq)
FROM ...
GROUP BY grp_seq

will return the following incorrect result:

1, 8, 5
2, 1, 1

The Hotel Adlon sold eight rooms for five persons.

What I am looking for is

1, 3, 5
2, 1, 1

The Hotel Adlon sold three rooms for five persons.

Regards

Rudolf Bargholz
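
The replies shown here stop before a working query is posted, so what follows is only a sketch of one common approach, not the thread's accepted answer. The idea is to count persons per item first in a derived table, join that one-row-per-item result to ITEM, and only then aggregate per group, so each ITEM_NR is summed once. The sketch runs the posted schema in SQLite through Python's standard sqlite3 module. Dropping the DB2ADMIN qualifier and the alias names `g`, `i`, `p` and `pc` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema and rows as posted above, without the DB2ADMIN schema qualifier.
conn.executescript(
    """
    CREATE TABLE GRP (GRP_SEQ INTEGER NOT NULL, GRP_NAME VARCHAR(20) NOT NULL);
    CREATE TABLE ITEM (ITEM_SEQ INTEGER NOT NULL, ITEM_GRPSEQ INTEGER NOT NULL,
                       ITEM_NAME VARCHAR(20) NOT NULL, ITEM_NR INTEGER);
    CREATE TABLE PERSON (PERSON_SEQ INTEGER NOT NULL,
                         PERSON_ITEMSEQ INTEGER NOT NULL,
                         PERSON_NAME VARCHAR(20) NOT NULL);
    INSERT INTO GRP VALUES (1, 'Hotel Adlon'), (2, 'Marriot');
    INSERT INTO ITEM VALUES (1, 1, 'Triple', 2), (2, 1, 'Double', 1),
                            (3, 2, 'Single', 1);
    INSERT INTO PERSON VALUES (1, 1, 'Rudolf'), (2, 1, 'Sandra'),
                              (3, 1, 'Alyssa'), (3, 2, 'Hans'),
                              (4, 2, 'Diana'), (5, 3, 'Eduard');
    """
)

# Naive version: ITEM_NR is repeated once per joined person row.
naive = conn.execute(
    """
    SELECT g.GRP_SEQ, SUM(i.ITEM_NR), COUNT(p.PERSON_SEQ)
    FROM GRP g
    INNER JOIN ITEM i ON i.ITEM_GRPSEQ = g.GRP_SEQ
    LEFT OUTER JOIN PERSON p ON p.PERSON_ITEMSEQ = i.ITEM_SEQ
    GROUP BY g.GRP_SEQ
    ORDER BY g.GRP_SEQ
    """
).fetchall()

# Pre-aggregated version: persons are counted per item before the join,
# so every item contributes exactly one row to the outer SUM.
fixed = conn.execute(
    """
    SELECT g.GRP_SEQ,
           SUM(i.ITEM_NR),
           SUM(COALESCE(pc.persons, 0))
    FROM GRP g
    INNER JOIN ITEM i ON i.ITEM_GRPSEQ = g.GRP_SEQ
    LEFT OUTER JOIN (
        SELECT PERSON_ITEMSEQ, COUNT(*) AS persons
        FROM PERSON
        GROUP BY PERSON_ITEMSEQ
    ) pc ON pc.PERSON_ITEMSEQ = i.ITEM_SEQ
    GROUP BY g.GRP_SEQ
    ORDER BY g.GRP_SEQ
    """
).fetchall()

print(naive)  # [(1, 8, 5), (2, 1, 1)]
print(fixed)  # [(1, 3, 5), (2, 1, 1)]
```

The sketch counts person rows rather than distinct PERSON_SEQ values because the rows as posted use PERSON_SEQ 3 twice, so COUNT(DISTINCT PERSON_SEQ) would return 4 for the first group. Whether the derived table is cheaper than the correlated subselect depends on the optimizer and the data, so the access plans would still need to be compared on the real system.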



