How do I pair rows together in MySQL?


Question

I'm working on a simple time tracking app.

I've created a table that logs the IN and OUT times of employees.

Here is an example of how my data currently looks:

E_ID | In_Out |      Date_Time
------------------------------------
  3  |   I    | 2012-08-19 15:41:52
  3  |   O    | 2012-08-19 17:30:22
  1  |   I    | 2012-08-19 18:51:11
  3  |   I    | 2012-08-19 18:55:52
  1  |   O    | 2012-08-19 20:41:52
  3  |   O    | 2012-08-19 21:50:30

I'm trying to create a query that pairs the IN and OUT times of an employee into one row, like this:

E_ID |       In_Time       |      Out_Time
------------------------------------------------
  3  | 2012-08-19 15:41:52 | 2012-08-19 17:30:22
  3  | 2012-08-19 18:55:52 | 2012-08-19 21:50:30
  1  | 2012-08-19 18:51:11 | 2012-08-19 20:41:52

I hope what I'm trying to achieve is clear. Basically, I want to generate a report that has both the in and out times merged into one row.

Any help with this would be greatly appreciated. Thanks in advance.

Recommended answer

There are three basic approaches I can think of: one uses a theta-JOIN, one uses a correlated subquery in the SELECT list, and one makes use of MySQL user variables.

theta-JOIN

One approach is to use a theta-JOIN. This is a generic SQL approach (no MySQL-specific syntax), so it can work with multiple RDBMSs.

N.B. With a large number of rows, this approach can create a very large intermediate result set, which can lead to poor performance.

SELECT o.e_id, MAX(i.date_time) AS in_time, o.date_time AS out_time    
  FROM e `o`
  LEFT
  JOIN e `i` ON i.e_id = o.e_id AND i.date_time < o.date_time AND i.in_out = 'I'
 WHERE o.in_out = 'O'
 GROUP BY o.e_id, o.date_time
 ORDER BY o.date_time

What this does is match every 'O' row for an employee with every earlier 'I' row for that employee, and then use the MAX aggregate to pick out the 'I' row with the closest date_time.

This works for perfectly paired data, but it could produce odd results for imperfect pairs (for example, two consecutive 'O' rows with no intermediate 'I' row will both get matched to the same 'I' row).
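To see that caveat in action, here is a minimal sketch that runs the same theta-JOIN against the sample rows from the question. It is an illustration under stated assumptions, not a MySQL test: it uses Python's built-in sqlite3 module with an in-memory table (which works here because the query is generic SQL), the helper name pair_rows is made up for this example, and the extra 'O' row at 22:10:00 is hypothetical data added to provoke the quirk.

```python
import sqlite3

# Sample rows from the question, loaded into a table named e (as in the query above).
ROWS = [
    (3, "I", "2012-08-19 15:41:52"),
    (3, "O", "2012-08-19 17:30:22"),
    (1, "I", "2012-08-19 18:51:11"),
    (3, "I", "2012-08-19 18:55:52"),
    (1, "O", "2012-08-19 20:41:52"),
    (3, "O", "2012-08-19 21:50:30"),
]

THETA_JOIN = """
SELECT o.e_id, MAX(i.date_time) AS in_time, o.date_time AS out_time
  FROM e o
  LEFT JOIN e i
    ON i.e_id = o.e_id AND i.date_time < o.date_time AND i.in_out = 'I'
 WHERE o.in_out = 'O'
 GROUP BY o.e_id, o.date_time
 ORDER BY o.date_time
"""


def pair_rows(rows):
    """Load rows into an in-memory SQLite table and run the theta-JOIN query."""
    conn = sqlite3.connect(":memory:")
    # Assumption: timestamps are stored as ISO-formatted text, so string
    # comparison orders them chronologically.
    conn.execute("CREATE TABLE e (e_id INTEGER, in_out TEXT, date_time TEXT)")
    conn.executemany("INSERT INTO e VALUES (?, ?, ?)", rows)
    result = conn.execute(THETA_JOIN).fetchall()
    conn.close()
    return result


# Perfectly paired data: one output row per 'O' row.
paired = pair_rows(ROWS)

# Hypothetical extra row: employee 3 logs a second 'O' with no 'I' in between.
quirky = pair_rows(ROWS + [(3, "O", "2012-08-19 22:10:00")])

for row in quirky:
    print(row)
```

With the extra row, the last two rows for employee 3 share the in_time 2012-08-19 18:55:52, which is the "both matched to the same 'I' row" behavior described above.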

Correlated subquery in the SELECT list

Another approach is to use a correlated subquery in the SELECT list. This can have sub-optimal performance, but it is sometimes workable, and occasionally it is the fastest way to return the specified result set. This approach works best when the outer query returns a limited number of rows.

 SELECT o.e_id
      , (SELECT MAX(i.date_time)
           FROM e `i`
          WHERE i.in_out = 'I'
            AND i.e_id = o.e_id
            AND i.date_time < o.date_time
        ) AS in_time
      , o.date_time AS out_time
   FROM e `o`
  WHERE o.in_out = 'O'
  ORDER BY o.date_time
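
As a quick check of how the subquery behaves when there is nothing to match, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained; the query itself is unchanged apart from dropping the backticks). The row for employee 2 is hypothetical data: an 'O' row with no earlier 'I' row.

```python
import sqlite3

SUBQUERY = """
SELECT o.e_id
     , (SELECT MAX(i.date_time)
          FROM e i
         WHERE i.in_out = 'I'
           AND i.e_id = o.e_id
           AND i.date_time < o.date_time
       ) AS in_time
     , o.date_time AS out_time
  FROM e o
 WHERE o.in_out = 'O'
 ORDER BY o.date_time
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE e (e_id INTEGER, in_out TEXT, date_time TEXT)")
conn.executemany(
    "INSERT INTO e VALUES (?, ?, ?)",
    [
        (3, "I", "2012-08-19 15:41:52"),
        (3, "O", "2012-08-19 17:30:22"),
        (1, "I", "2012-08-19 18:51:11"),
        (1, "O", "2012-08-19 20:41:52"),
        # Hypothetical row: employee 2 has an 'O' with no earlier 'I'.
        (2, "O", "2012-08-19 21:00:00"),
    ],
)
subquery_rows = conn.execute(SUBQUERY).fetchall()
conn.close()

for row in subquery_rows:
    print(row)
```

The unmatched 'O' row still appears in the output, with a NULL (None in Python) in_time, which is the same thing the LEFT JOIN in the theta-JOIN version would give for that row.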


User variables

Another approach is to make use of MySQL user variables. (This is a MySQL-specific approach, and is a workaround for the "missing" analytic functions.)

What this query does is order all of the rows by e_id, then by date_time, so we can process them in order. Whenever we encounter an 'O' (out) row, we use the date_time value from the most recent preceding 'I' row for the same employee as the in_time.

N.B.: This usage of MySQL user variables depends on MySQL performing operations in a specific order, that is, on a predictable execution plan. The use of inline views (or "derived tables", in MySQL parlance) gets us that predictable plan. But this behavior is subject to change in future releases of MySQL.

SELECT c.e_id
     , CAST(c.in_time AS DATETIME) AS in_time
     , c.out_time
  FROM (
         SELECT IF(@prev_e_id = d.e_id,@in_time,@in_time:=NULL) AS reset_in_time
              , @in_time := IF(d.in_out = 'I',d.date_time,@in_time) AS in_time
              , IF(d.in_out = 'O',d.date_time,NULL) AS out_time
              , @prev_e_id := d.e_id  AS e_id
           FROM (
                  SELECT e_id, date_time, in_out 
                    FROM e
                    JOIN (SELECT @prev_e_id := NULL, @in_time := NULL) f
                   ORDER BY e_id, date_time, in_out 
                 ) d
       ) c
 WHERE c.out_time IS NOT NULL
 ORDER BY c.out_time

This works for the set of data you have. It needs more thorough testing and tweaking to ensure you get the result set you want with quirky data, where the rows are not perfectly paired (e.g. two 'O' rows with no 'I' row between them, an 'I' row with no subsequent 'O' row, etc.)
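
For anyone who finds the user-variable query hard to follow, the row-by-row logic it relies on can be sketched in plain Python. This is only an emulation for illustration, not something MySQL runs: the function name pair_in_out and the local variables standing in for @prev_e_id and @in_time are made up for this example.

```python
def pair_in_out(rows):
    """Pair each 'O' row with the most recent preceding 'I' row of the same employee.

    rows: iterable of (e_id, in_out, date_time) tuples, where date_time is an
    ISO-formatted string so that string order matches chronological order.
    Returns a list of (e_id, in_time, out_time) tuples ordered by out_time.
    """
    pairs = []
    prev_e_id = None  # plays the role of @prev_e_id
    in_time = None    # plays the role of @in_time

    # Same ordering as the innermost derived table: e_id, date_time, in_out.
    for e_id, in_out, date_time in sorted(rows, key=lambda r: (r[0], r[2], r[1])):
        if e_id != prev_e_id:
            # New employee: forget the previous employee's 'I' time.
            in_time = None
        if in_out == "I":
            in_time = date_time
        elif in_out == "O":
            # Like the query, in_time is NOT cleared after an 'O', so two
            # consecutive 'O' rows reuse the same in_time.
            pairs.append((e_id, in_time, date_time))
        prev_e_id = e_id

    # Mirrors the outer ORDER BY c.out_time.
    return sorted(pairs, key=lambda p: p[2])


sample = [
    (3, "I", "2012-08-19 15:41:52"),
    (3, "O", "2012-08-19 17:30:22"),
    (1, "I", "2012-08-19 18:51:11"),
    (3, "I", "2012-08-19 18:55:52"),
    (1, "O", "2012-08-19 20:41:52"),
    (3, "O", "2012-08-19 21:50:30"),
]

for pair in pair_in_out(sample):
    print(pair)
```

Written this way, the quirky cases are easier to reason about: an 'I' row with no later 'O' row simply never produces output, and an 'O' row with no earlier 'I' row for that employee comes out with an empty in_time. If you want different behavior (say, clearing in_time after each 'O'), this is the spot where the query would need tweaking.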

