MySQL GROUP BY returning incorrect result
Problem description
I have two tables (timesheet and task). Each contains an hours column: task holds the "allocated hours" and timesheet holds the "actual hours". I am trying to get the sum of each of these values. The timesheet table also contains an integer "staff_id", which corresponds to "assigned_to" in the task table.
The task table contains:
    task_id      INT(11)
    assigned_to  INT(11)
    date_start   DATE
    hrs          DECIMAL(10,0)
The timesheet table contains:
    timesheet_id INT
    name         VARCHAR(100)
    hours        DECIMAL(10,0)
    staff_id     INT(11)
My query looks like:
    SELECT
        timesheet.staff_id,
        task.assigned_to,
        SUM(task.hrs) AS assigned_hrs,
        timesheet.name,
        SUM(timesheet.hours) AS actual_hours
    FROM timesheet
    INNER JOIN task
        ON timesheet.staff_id = task.assigned_to
    GROUP BY timesheet.name
which will (incorrectly) result in:
staff_id | assigned_to | assigned_hrs | name        | actual_hours
---------|-------------|--------------|-------------|-------------
4        | 4           | 1364         | John Smith  | 52
2        | 2           | 80           | Jane Doe    | 14.5
6        | 6           | 454          | Test User 1 | 40
9        | 9           | 262          | Test User 2 | 4
The above has the shape I am trying to get, and the results look correct except that John Smith's assigned hours get doubled. I know it has to do with the "Grouping Pitfall" as described here:
http://wikido.isoftdata.com/index.php/The_GROUPing_pitfall
but I just go cross-eyed trying to figure this out. Can someone point me in the right direction?
(edit again) If I run a query just on the task table:
    SELECT
        task.assigned_to,
        SUM(task.hrs) AS allocated_hrs
    FROM task
    GROUP BY task.assigned_to
it (correctly) results in:
assigned_to | allocated_hrs
------------|--------------
4           | 682
7           | 378
2           | 40
6           | 227
9           | 262
You can see that in the joined query the assigned hours for user ID 4 (John Smith) have doubled (1364 instead of 682), and so have those for ID 6 (454 instead of 227).
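The inflation can be reproduced on a tiny data set. The following is a minimal sketch, assuming made-up rows and using Python's sqlite3 module as a stand-in for MySQL (the column types and sample values are illustrative, not the poster's data). When a staff member has two timesheet rows, the join pairs every task row with both of them, so both sums are counted too many times.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (
        task_id INTEGER PRIMARY KEY, assigned_to INTEGER, hrs REAL);
    CREATE TABLE timesheet (
        timesheet_id INTEGER PRIMARY KEY, name TEXT, hours REAL, staff_id INTEGER);
    -- Hypothetical rows: staff 4 has two tasks (10 + 20 hrs)
    -- and two timesheet entries (3.0 + 3.5 hrs).
    INSERT INTO task (assigned_to, hrs) VALUES (4, 10), (4, 20), (2, 40);
    INSERT INTO timesheet (name, hours, staff_id) VALUES
        ('John Smith', 3.0, 4), ('John Smith', 3.5, 4), ('Jane Doe', 19.5, 2);
""")

# Same shape as the query in the question: join first, aggregate afterwards.
joined = conn.execute("""
    SELECT timesheet.staff_id, SUM(task.hrs), SUM(timesheet.hours)
    FROM timesheet
    INNER JOIN task ON timesheet.staff_id = task.assigned_to
    GROUP BY timesheet.staff_id
    ORDER BY timesheet.staff_id
""").fetchall()

print(joined)  # [(2, 40.0, 19.5), (4, 60.0, 13.0)]
```

For staff 4 the true totals are 30 assigned hours and 6.5 actual hours, but the join produces 2 x 2 = 4 rows, so each total is doubled. Staff 2 has one row in each table and is unaffected.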
Running a query on just the timesheet table:
    SELECT
        timesheet.name,
        SUM(timesheet.hours) AS actual_hours
    FROM timesheet
    GROUP BY timesheet.name
correctly results in:
name       | Actual_hrs
-----------|-----------
Jane Doe   | 19.5
John Smith | 6.5
Test User1 | 4
Test User2 | 5
Running the query supplied by JoachimL results in:
staff_id | assigned_to | assigned_hrs | name        | actual_hours
---------|-------------|--------------|-------------|-------------
2        | 2           | 40           | Jane Doe    | 19.5
4        | 4           | 24           | John Smith  | 6.5
4        | 4           | 7            | John Smith  | 6.5
4        | 4           | 21           | John Smith  | 6.5
4        | 4           | 210          | John Smith  | 6.5
4        | 4           | 28           | John Smith  | 6.5
4        | 4           | 91           | John Smith  | 6.5
6        | 6           | 14           | Test User 1 | 8
6        | 6           | 91           | Test User 1 | 8
6        | 6           | 28           | Test User 1 | 8
6        | 6           | 3            | Test User 1 | 8
9        | 9           | 24           | Test User 2 | 1
9        | 9           | 91           | Test User 2 | 1
9        | 9           | 56           | Test User 2 | 1
Here's a fiddle http://sqlfiddle.com/#!2/ef680
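Solution
Aggregate the timesheet rows per staff member first, then join the task table to that result, so the timesheet total is computed before the join can repeat its rows:
    SELECT x.*
         , SUM(y.hrs) n
      FROM
         ( SELECT t.staff_id
                , t.name
                , SUM(t.hours) actual_hours
             FROM timesheet t
            GROUP
               BY t.staff_id
         ) x
      JOIN task y
        ON y.assigned_to = x.staff_id
     GROUP
        BY staff_id;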
http://sqlfiddle.com/#!2/ef680/14
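To check that pre-aggregating removes the inflation, the same idea can be run against a small made-up data set. This is a sketch under stated assumptions: SQLite (via Python's sqlite3) stands in for MySQL, the rows are hypothetical, and the GROUP BY clauses list every non-aggregated column, a small variation on the query above that should also be accepted when MySQL's ONLY_FULL_GROUP_BY mode is enabled.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (
        task_id INTEGER PRIMARY KEY, assigned_to INTEGER, hrs REAL);
    CREATE TABLE timesheet (
        timesheet_id INTEGER PRIMARY KEY, name TEXT, hours REAL, staff_id INTEGER);
    -- Hypothetical rows: staff 4 has two tasks and two timesheet entries.
    INSERT INTO task (assigned_to, hrs) VALUES (4, 10), (4, 20), (2, 40);
    INSERT INTO timesheet (name, hours, staff_id) VALUES
        ('John Smith', 3.0, 4), ('John Smith', 3.5, 4), ('Jane Doe', 19.5, 2);
""")

# Collapse timesheet to one row per staff member before joining,
# so each task row is matched exactly once.
fixed = conn.execute("""
    SELECT x.staff_id, x.name, x.actual_hours, SUM(y.hrs) AS assigned_hrs
    FROM (
        SELECT t.staff_id, t.name, SUM(t.hours) AS actual_hours
        FROM timesheet t
        GROUP BY t.staff_id, t.name
    ) x
    JOIN task y ON y.assigned_to = x.staff_id
    GROUP BY x.staff_id, x.name, x.actual_hours
    ORDER BY x.staff_id
""").fetchall()

print(fixed)  # [(2, 'Jane Doe', 19.5, 40.0), (4, 'John Smith', 6.5, 30.0)]
```

Because the derived table x has a single row per staff member, the outer SUM(y.hrs) adds each task row once, and actual_hours is carried through unchanged rather than summed again.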