MySQL GROUP BY returning incorrect result

Problem description

I have two tables (timesheet and task). Each contains an hours column ("allocated hours" in task, "actual hours" in timesheet), and I am trying to get the sum of each of these values. The timesheet table also contains an integer "staff_id" column, which corresponds to "assigned_to" in the task table.

The task table contains:

task_id INT(11)
assigned_to INT(11)
date_start DATE
hrs DECIMAL(10,0)

The timesheet table contains:

timesheet_id INT
name VARCHAR(100)
hours DECIMAL(10,0)
staff_id INT(11)

My query looks like:

    SELECT
        timesheet.staff_id,
        task.assigned_to,
        SUM(task.hrs) AS assigned_hrs,
        timesheet.name,
        SUM(timesheet.hours) AS actual_hours
    FROM timesheet
    INNER JOIN task
    ON timesheet.staff_id = task.assigned_to
    GROUP BY timesheet.name

which will (incorrectly) result in:

staff_id | assigned_to | assigned_hrs | name        | actual_hours |
---------|-------------|--------------|-------------|--------------|
4        | 4           | 1364         | John Smith  | 52           |
2        | 2           | 80           | Jane Doe    | 14.5         |
6        | 6           | 454          | Test User 1 | 40           |
9        | 9           | 262          | Test User 2 | 4            |

The above is the shape of result I am trying to get. However, all of the results are correct except that John Smith's assigned hours get doubled. I know it has to do with the "Grouping Pitfall" as described here:

http://wikido.isoftdata.com/index.php/The_GROUPing_pitfall

but I just go cross-eyed trying to figure this out. Can someone point me in the right direction?

(edit again) If I run a query just on the task table:

    SELECT
    task.assigned_to,
    SUM(task.hrs) AS allocated_hrs
    FROM task
    GROUP BY task.assigned_to

It (correctly) results in:

assigned_to | allocated_hrs |
----------------------------
4           |    682
7           |    378
2           |    40
6           |    227
9           |    262

Comparing this with the joined query, you can see that the total for user ID 4 (John Smith) has doubled there, and so has the total for ID 6.
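The doubling is consistent with join fan-out: every task row is repeated once for each timesheet row of the same staff member, so `SUM(task.hrs)` is multiplied by the number of timesheet rows (and `SUM(timesheet.hours)` by the number of task rows). The sketch below reproduces that effect. It is not the original data: the rows are made up, and it uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs anywhere.

```python
import sqlite3


def joined_totals():
    """Run the join-then-group query on a tiny, made-up data set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE timesheet (
            timesheet_id INTEGER PRIMARY KEY,
            name TEXT, hours INTEGER, staff_id INTEGER
        );
        CREATE TABLE task (
            task_id INTEGER PRIMARY KEY,
            assigned_to INTEGER, hrs INTEGER
        );
        -- Hypothetical rows: staff 4 has two timesheet rows and two tasks.
        INSERT INTO timesheet (name, hours, staff_id) VALUES
            ('John Smith', 3, 4), ('John Smith', 4, 4), ('Test User 2', 5, 9);
        INSERT INTO task (assigned_to, hrs) VALUES (4, 10), (4, 20), (9, 7);
    """)
    rows = conn.execute("""
        SELECT timesheet.staff_id,
               COUNT(*)             AS joined_rows,
               SUM(task.hrs)        AS assigned_hrs,
               SUM(timesheet.hours) AS actual_hours
        FROM timesheet
        INNER JOIN task ON timesheet.staff_id = task.assigned_to
        GROUP BY timesheet.staff_id
        ORDER BY timesheet.staff_id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in joined_totals():
        print(row)
```

For staff 4 the true totals are 30 assigned and 7 actual hours, but the join produces 2 x 2 = 4 rows, so both sums come out doubled (60 and 14). Staff 9 has one row on each side and is unaffected.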

Running a query on just the timesheet table:

    SELECT
    timesheet.name,
    SUM(timesheet.hours) AS actual_hours
    FROM timesheet
    GROUP BY timesheet.name

correctly results in:

    name    |  Actual_hrs
    -------------------------
    Jane Doe   | 19.5
    John Smith | 6.5
    Test User1 | 4
    Test User2 | 5

Running the query supplied by JoachimL results in:

    staff_id |  assigned_to |   assigned_hrs |  name |  actual_hours
    ----------------------------------------------------------------------
    2   2   40  Jane Doe    19.5
    4   4   24  John Smith  6.5
    4   4   7   John Smith  6.5
    4   4   21  John Smith  6.5
    4   4   210 John Smith  6.5
    4   4   28  John Smith  6.5
    4   4   91  John Smith  6.5
    6   6   14  Test User 1 8
    6   6   91  Test User 1 8
    6   6   28  Test User 1 8
    6   6   3   Test User 1 8
    9   9   24  Test User 2 1
    9   9   91  Test User 2 1
    9   9   56  Test User 2 1

Here's a fiddle http://sqlfiddle.com/#!2/ef680

Solution

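    -- Aggregate timesheet first so each staff member contributes exactly one row,
    -- then join the tasks and sum their hours. This avoids multiplying task hours
    -- by the number of timesheet rows.
    SELECT x.*
         , SUM(y.hrs) n                       -- total assigned hours per staff member
      FROM
         ( SELECT t.staff_id
                , t.name
                , SUM(t.hours) actual_hours   -- total actual hours per staff member
             FROM timesheet t
            GROUP
               BY t.staff_id
         ) x
      JOIN task y
        ON y.assigned_to = x.staff_id
     GROUP
        BY staff_id;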

http://sqlfiddle.com/#!2/ef680/14
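To check the idea outside the fiddle, here is a minimal sketch of the same pre-aggregation pattern on made-up rows, again using Python's `sqlite3` as a stand-in for MySQL. One deliberate difference from the query above: `name` is added to the `GROUP BY` clauses so the statement would also pass under MySQL's `ONLY_FULL_GROUP_BY` mode. That assumes each `staff_id` maps to exactly one `name`, which the question does not state.

```python
import sqlite3


def per_staff_totals():
    """Sum timesheet hours per staff member first, then join and sum tasks."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE timesheet (
            timesheet_id INTEGER PRIMARY KEY,
            name TEXT, hours INTEGER, staff_id INTEGER
        );
        CREATE TABLE task (
            task_id INTEGER PRIMARY KEY,
            assigned_to INTEGER, hrs INTEGER
        );
        -- Hypothetical rows, not the data from the question.
        INSERT INTO timesheet (name, hours, staff_id) VALUES
            ('John Smith', 3, 4), ('John Smith', 4, 4), ('Test User 2', 5, 9);
        INSERT INTO task (assigned_to, hrs) VALUES (4, 10), (4, 20), (9, 7);
    """)
    rows = conn.execute("""
        SELECT x.staff_id, x.name, x.actual_hours,
               SUM(y.hrs) AS assigned_hrs
        FROM (
            -- One row per staff member, so the join cannot repeat task rows.
            SELECT t.staff_id, t.name, SUM(t.hours) AS actual_hours
            FROM timesheet t
            GROUP BY t.staff_id, t.name
        ) x
        JOIN task y ON y.assigned_to = x.staff_id
        GROUP BY x.staff_id, x.name, x.actual_hours
        ORDER BY x.staff_id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in per_staff_totals():
        print(row)
```

With these rows, staff 4 comes out as 7 actual and 30 assigned hours, matching what each table gives when summed on its own. Note that the inner join still drops anyone who has tasks but no timesheet rows.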
