Finding Gaps in Timestamps for Multiple Users in PostgreSQL

Question

I am working with a dataset containing Check-In and Check-Out times for multiple office rooms over the last 5 years. One of the projects I was asked to work on was calculating the amount of time each room is busy and vacant over various time ranges (daily, weekly, monthly, etc.) assuming normal operational hours (8am to 5pm). A sample of the dataset for two days looks like this:

room_id         start_dt                end_dt
Room: Room 3    2019-05-04 09:00:00     2019-05-04 11:30:00
Room: Room 3    2019-05-04 11:30:00     2019-05-04 12:15:00
Room: Room 3    2019-05-04 12:30:00     2019-05-04 13:00:00
Room: Room 3    2019-05-05 09:00:00     2019-05-05 13:00:00
Room: Room 4    2019-05-04 08:00:00     2019-05-04 09:00:00
Room: Room 4    2019-05-04 09:00:00     2019-05-04 11:00:00
Room: Room 4    2019-05-04 14:00:00     2019-05-04 16:00:00
Room: Room 4    2019-05-05 08:30:00     2019-05-05 09:30:00

I borrowed and modified some code written in a previous StackOverflow post by @Branko Dimitrijevic (full post: SQL Query to show gaps between multiple date ranges) to try and handle multiple different rooms. Below is my modified code with two instances of room_id in the SELECT clause for debugging purposes:

SELECT t1.room_id, t2.room_id, end_dt, start_dt, start_dt - end_dt as gap_dur
FROM
    (
        SELECT DISTINCT room_id, start_dt, ROW_NUMBER() OVER (ORDER BY start_dt) RN
        FROM my_table T1
        WHERE
            NOT EXISTS (
                SELECT *
                FROM my_table T2
                WHERE (T1.start_dt > T2.start_dt and t1.room_id = t2.room_id)
                    AND (T1.start_dt < T2.end_dt and t1.room_id = t2.room_id)
            )
    ) T1
    JOIN (
        SELECT DISTINCT room_id, end_dt, ROW_NUMBER() OVER (ORDER BY end_dt) RN
        FROM my_table T1
        WHERE
            NOT EXISTS (
                SELECT *
                FROM my_table T2
                WHERE (T1.end_dt > T2.start_dt and t1.room_id = t2.room_id)
                    AND (T1.end_dt < T2.end_dt and t1.room_id = t2.room_id)
            )
    ) T2
    ON T1.RN - 1 = T2.RN
WHERE
    end_dt < start_dt

This is the output I receive:

room_id         room_id         end_dt                  start_dt                gap_dur
Room: Exam 4    Room: Exam 4    2019-05-04 16:00:00     2019-05-05 08:30:00     16:30:00
Room: Exam 4    Room: Exam 3    2019-05-04 13:00:00     2019-05-04 14:00:00     01:00:00
Room: Exam 3    Room: Exam 3    2019-05-04 12:15:00     2019-05-04 12:30:00     00:15:00

However, the query mixes up rows from different rooms, and I don't know how to implement workday constraints, such as finding the gap between 8am and the first scheduled event. Below is the ideal output, or at least a data format that I could use to calculate the statistics I need with some simple GROUP BY scripts:

room_id         end_dt                  start_dt                gap_dur
Room: Exam 3    2019-05-04 08:00:00     2019-05-04 09:00:00     01:00:00
Room: Exam 3    2019-05-04 12:15:00     2019-05-04 12:30:00     00:15:00
Room: Exam 3    2019-05-04 13:00:00     2019-05-04 17:00:00     04:00:00
Room: Exam 3    2019-05-05 08:00:00     2019-05-05 09:00:00     01:00:00
Room: Exam 3    2019-05-05 13:00:00     2019-05-05 17:00:00     04:00:00
Room: Exam 4    2019-05-04 11:00:00     2019-05-04 14:00:00     03:00:00
Room: Exam 4    2019-05-04 16:00:00     2019-05-04 17:00:00     01:00:00
Room: Exam 4    2019-05-05 08:00:00     2019-05-05 08:30:00     00:30:00
Room: Exam 4    2019-05-05 09:30:00     2019-05-05 17:00:00     07:30:00

Any help on this would be greatly appreciated, and I am happy to provide additional information if it helps!

Recommended answer

One of the projects I was asked to work on was calculating the amount of time each room is busy and vacant over various time ranges (daily, weekly, monthly, etc.) assuming normal operational hours (8am to 5pm).

Based on your sample data, two assumptions seem reasonable:

  • The "busy" periods do not overlap.
  • The "busy" periods each fall within a single day.

If these are not true, I would suggest that you ask a NEW question with appropriate explanation and sample data.
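Both assumptions can be checked against the data before relying on the totals. The following is a rough sketch, assuming the bookings live in a table named rooms (the name used in the query in this answer; the question calls it my_table) with the room_id, start_dt and end_dt columns from the question:

```sql
-- Assumption 1: bookings for the same room never overlap.
-- Every row returned is a pair of overlapping bookings.
select a.room_id,
       a.start_dt as a_start, a.end_dt as a_end,
       b.start_dt as b_start, b.end_dt as b_end
from rooms a
join rooms b
  on  a.room_id  = b.room_id
  and a.start_dt < b.end_dt    -- a starts before b ends
  and b.start_dt < a.end_dt    -- b starts before a ends
  and (a.start_dt, a.end_dt) < (b.start_dt, b.end_dt);  -- report each pair once, skip self-matches

-- Assumption 2: every booking starts and ends on the same calendar day.
-- Every row returned is a booking that crosses midnight.
select room_id, start_dt, end_dt
from rooms
where start_dt::date <> end_dt::date;
```

If both queries return no rows, the assumptions hold for the data as loaded. Back-to-back bookings (one ends exactly when the next starts) are not counted as overlaps, and exact duplicate rows are not reported by the first query.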

The calculation is then pretty simple for a given day:

select r.room_id,
       date_trunc('day', r.start_dt) as busy_day,
       -- busy time: each booking clipped to the 08:00-17:00 window
       sum( least(extract(epoch from r.end_dt), v.epoch2) -
            greatest(extract(epoch from r.start_dt), v.epoch1)
          ) as busy_seconds,
       -- free time: length of the window minus the busy time
       (max(v.epoch2 - v.epoch1) -
        sum( least(extract(epoch from r.end_dt), v.epoch2) -
             greatest(extract(epoch from r.start_dt), v.epoch1)
           )
       ) as free_seconds
from rooms r cross join lateral  -- lateral lets the subquery reference r.start_dt
     (values (extract(epoch from date_trunc('day', r.start_dt) + interval '8 hour'),
              extract(epoch from date_trunc('day', r.start_dt) + interval '17 hour')
             )
     ) v(epoch1, epoch2)
group by r.room_id, date_trunc('day', r.start_dt)
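If the individual gaps are needed in the row format shown in the question, rather than daily totals, window functions partitioned by room and day are one way to get them. This is only a sketch under the same two assumptions (no overlaps, no bookings crossing midnight) plus the assumption that every booking lies within 08:00 to 17:00; the table name rooms and the CTE name bounds are illustrative:

```sql
with bounds as (
    select room_id,
           start_dt,
           end_dt,
           date_trunc('day', start_dt) + interval '8 hours'  as day_open,
           date_trunc('day', start_dt) + interval '17 hours' as day_close,
           lag(end_dt)    over w as prev_end,    -- end of the previous booking, same room and day
           lead(start_dt) over w as next_start   -- start of the next booking, same room and day
    from rooms
    window w as (partition by room_id, date_trunc('day', start_dt)
                 order by start_dt)
)
-- gap before each booking; the first booking of the day is measured from 08:00
select room_id,
       coalesce(prev_end, day_open)            as end_dt,
       start_dt,
       start_dt - coalesce(prev_end, day_open) as gap_dur
from bounds
where start_dt > coalesce(prev_end, day_open)
union all
-- gap between the last booking of the day and 17:00
select room_id,
       end_dt,
       day_close,
       day_close - end_dt
from bounds
where next_start is null
  and end_dt < day_close
order by room_id, end_dt;
```

Partitioning by room_id keeps one room's bookings from being paired with another's. Days on which a room has no bookings at all produce no rows here, so a fully vacant day would have to be added separately, for example from a calendar table. Summing gap_dur grouped by room_id and date_trunc('week', end_dt) or date_trunc('month', end_dt) then gives the vacant time per week or month.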
