Time difference per person between consecutive rows

Problem description

I have some data which (broadly speaking) consists of the following fields:

Person  TaskID  Start_time                      End_time
Alpha   1       'Wed, 18 Oct 2017 10:10:03 GMT' 'Wed, 18 Oct 2017 10:10:36 GMT'
Alpha   2       'Wed, 18 Oct 2017 10:11:16 GMT' 'Wed, 18 Oct 2017 10:11:28 GMT'
Beta    1       'Wed, 18 Oct 2017 10:12:03 GMT' 'Wed, 18 Oct 2017 10:12:49 GMT'
Alpha   3       'Wed, 18 Oct 2017 10:12:03 GMT' 'Wed, 18 Oct 2017 10:13:13 GMT'
Gamma   1       'Fri, 27 Oct 2017 22:57:12 GMT' 'Sat, 28 Oct 2017 02:00:54 GMT'
Beta    2       'Wed, 18 Oct 2017 10:13:40 GMT' 'Wed, 18 Oct 2017 10:14:03 GMT'

For this data, my required output is something like:

Person  TaskID Time_between_attempts
Alpha   1      NULL      ['Wed, 18 Oct 2017 10:10:03 GMT' - NULL]
Alpha   2      0:00:40   ['Wed, 18 Oct 2017 10:11:16 GMT' -'Wed, 18 Oct 2017 10:10:36 GMT']
Beta    1      NULL      ['Wed, 18 Oct 2017 10:12:03 GMT' - NULL]
Alpha   3      0:00:35   ['Wed, 18 Oct 2017 10:12:03 GMT' -'Wed, 18 Oct 2017 10:11:28 GMT']
Gamma   1      NULL      ['Fri, 27 Oct 2017 22:57:12 GMT' - NULL]
Beta    2      0:00:51   ['Wed, 18 Oct 2017 10:13:40 GMT' -'Wed, 18 Oct 2017 10:12:49 GMT']

My requirements are as follows:

a. For a given person (Alpha, Beta or Gamma), the first occurrence of 'time_between_attempts' is zero/NULL; in the example I have shown it as NULL.

b. On the second and subsequent appearances of the same person, 'time_between_attempts' is non-NULL (or non-zero). It is calculated as the difference between the end time of that person's previous task and the start time of the current task.

I have the following question in this regard:

  1. How can I write a SQL script that produces the required output?

Please note that TaskID is written as an integer here only for simplicity. In the original data, TaskID is more complicated and consists of non-consecutive strings such as:

'q:1392763916495:441',
'q:1392763916495:436'

Any advice on this would be greatly appreciated.

Recommended answer

This answers the original version of the question.

You can use lag() and timestampdiff() for the calculation (window functions such as lag() require MySQL 8.0 or later). Assuming your values are real date/time or timestamp columns, you can easily calculate the gap in seconds:

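-- timestampdiff(unit, from, to) returns to - from, so the previous end_time comes first.
-- person_id is the column identifying the person (Person in the sample data).
select t.*,
       timestampdiff(second,
                     lag(end_time) over (partition by person_id order by start_time),
                     start_time
                    ) as seconds_between_attempts
from t;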
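
To check the lag() approach against the sample data without a MySQL server, the following sketch runs a comparable query in SQLite through Python's sqlite3 module. It is a stand-in under stated assumptions, not the answer's exact query: SQLite has no timestampdiff(), so the gap is the difference of two strftime('%s', ...) epoch values; the table name t, the column names person and task_id, and the ISO-8601 timestamp text are all chosen for illustration. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question. Assumption: the timestamps have already been
# rewritten as ISO-8601 text, which SQLite's date functions understand.
ROWS = [
    ("Alpha", "1", "2017-10-18 10:10:03", "2017-10-18 10:10:36"),
    ("Alpha", "2", "2017-10-18 10:11:16", "2017-10-18 10:11:28"),
    ("Beta",  "1", "2017-10-18 10:12:03", "2017-10-18 10:12:49"),
    ("Alpha", "3", "2017-10-18 10:12:03", "2017-10-18 10:13:13"),
    ("Gamma", "1", "2017-10-27 22:57:12", "2017-10-28 02:00:54"),
    ("Beta",  "2", "2017-10-18 10:13:40", "2017-10-18 10:14:03"),
]

# LAG(end_time) fetches the previous row's end_time within the same person,
# ordered by start_time. The first row per person has no predecessor, so LAG
# yields NULL and the subtraction yields NULL as well.
QUERY = """
SELECT person,
       task_id,
       CAST(strftime('%s', start_time) AS INTEGER)
         - CAST(strftime('%s', LAG(end_time) OVER (
               PARTITION BY person ORDER BY start_time)) AS INTEGER)
         AS seconds_between_attempts
FROM t
ORDER BY person, start_time
"""


def gaps_in_seconds(rows):
    """Load rows into an in-memory table and return (person, task_id, gap) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (person TEXT, task_id TEXT, start_time TEXT, end_time TEXT)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for person, task_id, gap in gaps_in_seconds(ROWS):
        print(person, task_id, gap)
```

Partitioning by person is what keeps one person's tasks from being compared with another's; the task identifier plays no part in the ordering, so non-consecutive string TaskIDs are not a problem.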

If the values are stored as strings, fix the data! In the meantime, you can use str_to_date() in the function.
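
One possible way to "fix the data" is a one-off conversion before loading. The sketch below assumes the strings follow the 'Wed, 18 Oct 2017 10:10:03 GMT' layout shown in the question and that every value is in GMT; the helper name to_iso is hypothetical. It uses Python's standard email.utils.parsedate_to_datetime, which parses this date layout.

```python
from email.utils import parsedate_to_datetime


def to_iso(rfc1123_text):
    """Convert e.g. 'Wed, 18 Oct 2017 10:10:03 GMT' to '2017-10-18 10:10:03'.

    Assumption: all inputs carry the GMT zone, as in the question's sample,
    so the formatted value can be stored as a plain datetime.
    """
    parsed = parsedate_to_datetime(rfc1123_text)
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(to_iso("Wed, 18 Oct 2017 10:10:03 GMT"))
```

The resulting text can be inserted into a datetime or timestamp column, after which the query no longer has to parse strings on every run.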

To get this as a time value:

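-- Same calculation, with the number of seconds added to a zero time value.
select t.*,
       (time(0) +
        interval timestampdiff(second,
                               lag(end_time) over (partition by person_id order by start_time),
                               start_time
                              ) second
       ) as time_between_attempts
from t;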
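
If the formatting is done in application code rather than in SQL, a gap in seconds can be rendered in the H:MM:SS style used in the question's expected output. This is a small sketch only; the function name format_gap and the choice to print 'NULL' for a missing gap are assumptions made for illustration.

```python
from datetime import timedelta


def format_gap(seconds):
    """Render a gap in seconds as H:MM:SS, or 'NULL' when there is no previous task."""
    if seconds is None:
        return "NULL"
    # str() of a non-negative timedelta under one day has the form H:MM:SS.
    return str(timedelta(seconds=seconds))


if __name__ == "__main__":
    for gap in (None, 40, 35, 51):
        print(format_gap(gap))
```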
