Finding simultaneous events in a database between times

Question

I have a database that stores phone call records. Each record has a start time and an end time. I want to find the maximum number of phone calls happening simultaneously, in order to know whether we have exceeded the number of phone lines available in our phone bank. How could I go about solving this problem?

Recommended answer

Disclaimer: I'm writing my answer based on the following (excellent) post:

http://sqlmag.com/t-sql/calculating-concurrent-sessions-part-3 (Parts 1 and 2 are also recommended)

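The first thing to understand about this problem is that most of the solutions currently found on the internet have one of two basic issues:

  • The result is not the correct answer (for example, if range A overlaps with B and C, but B does not overlap with C, they are counted as 3 overlapping ranges).
  • The way of computing it is very inefficient (because it is O(n^2) and/or it loops over every second in the period).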

The common performance problem in solutions like the one proposed by Unreasons is that they are quadratic: for each call you need to check all the other calls to see whether they overlap.
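To make both issues concrete, here is a minimal Python sketch of the naive pairwise check. The function name and the sample tuples (seconds after midnight) are illustrative assumptions, not code from any of the answers mentioned. Call A overlaps B and C, but B ends before C starts, so counting "calls that overlap A" reports 3 even though at most 2 calls are ever active at the same time.

```python
def naive_overlap_count(calls):
    """For each call, count the calls overlapping it (itself included)
    and return the largest count.

    This is O(n^2), and it over-counts: the calls that overlap a given
    call do not necessarily overlap each other.
    """
    best = 0
    for start_a, end_a in calls:
        count = sum(
            1
            for start_b, end_b in calls
            if start_b < end_a and start_a < end_b  # intervals intersect
        )
        best = max(best, count)
    return best


# Times are seconds after midnight (illustrative data).
calls = [
    (130, 324),  # A: overlaps both B and C
    (139, 155),  # B: ends before C starts
    (177, 244),  # C
]

naive_result = naive_overlap_count(calls)  # 3, although only 2 calls are ever simultaneous
```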

There is a common algorithmic solution that needs only a single linear pass over the ordered events: list all the "events" (call start and call end) ordered by date, add 1 for a start and subtract 1 for a hang-up, and remember the maximum. That can be implemented easily with a cursor (the solution proposed by Hafhor seems to work that way), but cursors are not the most efficient way to solve problems.
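The event-counting idea described above can be sketched outside the database in a few lines of Python. This is only an illustration under stated assumptions: the function name is made up, times are given as seconds after midnight, and a call that ends at exactly the moment another starts is treated as not overlapping it. The sort costs O(n log n); the pass that follows it is linear.

```python
def max_concurrent(calls):
    """Return the peak number of simultaneous calls.

    calls: iterable of (start, end) pairs with comparable values.
    """
    events = []
    for start, end in calls:
        events.append((start, +1))  # a call begins
        events.append((end, -1))    # a call hangs up
    # Tuples sort by time first, then by delta, so at equal timestamps
    # a hang-up (-1) is processed before a new call (+1).
    events.sort()

    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# The four sample calls used in the explanation section, as seconds after midnight.
calls = [(130, 324), (139, 155), (177, 244), (252, 292)]
peak = max_concurrent(calls)  # 2
```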

The referenced article has excellent examples, different solutions, and a performance comparison of them. The proposed solution is:

WITH C1 AS
(
  SELECT starttime AS ts, +1 AS TYPE,
    ROW_NUMBER() OVER(ORDER BY starttime) AS start_ordinal
  FROM Calls

  UNION ALL

  SELECT endtime, -1, NULL
  FROM Calls
),
C2 AS
(
  SELECT *,
    ROW_NUMBER() OVER(  ORDER BY ts, TYPE) AS start_or_end_ordinal
  FROM C1
)
SELECT MAX(2 * start_ordinal - start_or_end_ordinal) AS mx
FROM C2
WHERE TYPE = 1


Explanation

Suppose this set of data:

+-------------------------+-------------------------+
|        starttime        |         endtime         |
+-------------------------+-------------------------+
| 2009-01-01 00:02:10.000 | 2009-01-01 00:05:24.000 |
| 2009-01-01 00:02:19.000 | 2009-01-01 00:02:35.000 |
| 2009-01-01 00:02:57.000 | 2009-01-01 00:04:04.000 |
| 2009-01-01 00:04:12.000 | 2009-01-01 00:04:52.000 |
+-------------------------+-------------------------+

This is a way to implement the same idea with a query, adding 1 for each call start and subtracting 1 for each call end.

  SELECT starttime AS ts, +1 AS TYPE,
    ROW_NUMBER() OVER(ORDER BY starttime) AS start_ordinal
  FROM Calls

This part of the C1 CTE takes the start time of each call and numbers it:

+-------------------------+------+---------------+
|           ts            | TYPE | start_ordinal |
+-------------------------+------+---------------+
| 2009-01-01 00:02:10.000 |    1 |             1 |
| 2009-01-01 00:02:19.000 |    1 |             2 |
| 2009-01-01 00:02:57.000 |    1 |             3 |
| 2009-01-01 00:04:12.000 |    1 |             4 |
+-------------------------+------+---------------+

Now this code

  SELECT endtime, -1, NULL
  FROM Calls

will generate all the "endtimes" without row numbering:

+-------------------------+----+------+
|         endtime         |    |      |
+-------------------------+----+------+
| 2009-01-01 00:02:35.000 | -1 | NULL |
| 2009-01-01 00:04:04.000 | -1 | NULL |
| 2009-01-01 00:04:52.000 | -1 | NULL |
| 2009-01-01 00:05:24.000 | -1 | NULL |
+-------------------------+----+------+

Now, applying the UNION ALL to get the full C1 CTE definition, you will have both result sets combined:

+-------------------------+------+---------------+
|           ts            | TYPE | start_ordinal |
+-------------------------+------+---------------+
| 2009-01-01 00:02:10.000 |    1 |             1 |
| 2009-01-01 00:02:19.000 |    1 |             2 |
| 2009-01-01 00:02:57.000 |    1 |             3 |
| 2009-01-01 00:04:12.000 |    1 |             4 |
| 2009-01-01 00:02:35.000 | -1   |     NULL      |
| 2009-01-01 00:04:04.000 | -1   |     NULL      |
| 2009-01-01 00:04:52.000 | -1   |     NULL      |
| 2009-01-01 00:05:24.000 | -1   |     NULL      |
+-------------------------+------+---------------+

C2 is computed by sorting C1 and numbering it with a new column:

C2 AS
(
  SELECT *,
    ROW_NUMBER() OVER(  ORDER BY ts, TYPE) AS start_or_end_ordinal
  FROM C1
)

+-------------------------+------+-------+--------------+
|           ts            | TYPE | start | start_or_end |
+-------------------------+------+-------+--------------+
| 2009-01-01 00:02:10.000 |    1 | 1     |            1 |
| 2009-01-01 00:02:19.000 |    1 | 2     |            2 |
| 2009-01-01 00:02:35.000 |   -1 | NULL  |            3 |
| 2009-01-01 00:02:57.000 |    1 | 3     |            4 |
| 2009-01-01 00:04:04.000 |   -1 | NULL  |            5 |
| 2009-01-01 00:04:12.000 |    1 | 4     |            6 |
| 2009-01-01 00:04:52.000 |   -1 | NULL  |            7 |
| 2009-01-01 00:05:24.000 |   -1 | NULL  |            8 |
+-------------------------+------+-------+--------------+

And here is where the magic occurs: at any time, the result of #start - #end is the number of concurrent calls at that moment.

For each row with TYPE = 1 (a start event) we have the #start value in the 3rd column (start_ordinal), and we also have #start + #end in the 4th column (start_or_end_ordinal):

#start_or_end = #start + #end

#end = (#start_or_end - #start)

#start - #end = #start - (#start_or_end - #start)

#start - #end = 2 * #start - #start_or_end
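As a sanity check of this identity, the following Python sketch rebuilds the two ordinals for the sample calls and compares the formula with a plain running count. It is only an illustration: times are seconds after midnight, the variable names mirror the column names, and start times are assumed to be distinct so that they can be used as dictionary keys.

```python
calls = [(130, 324), (139, 155), (177, 244), (252, 292)]  # seconds after midnight

# start_ordinal: rank of each start time among all start times (assumes distinct starts).
start_ordinal = {
    start: rank
    for rank, start in enumerate(sorted(s for s, _ in calls), start=1)
}

# Equivalent of C2: every event ordered by (ts, type), numbered from 1.
events = sorted([(s, +1) for s, _ in calls] + [(e, -1) for _, e in calls])

running = 0
checked = []  # (value from the formula, running count) for each start event
for start_or_end_ordinal, (ts, event_type) in enumerate(events, start=1):
    running += event_type
    if event_type == 1:
        by_formula = 2 * start_ordinal[ts] - start_or_end_ordinal
        checked.append((by_formula, running))

formula_matches = all(f == r for f, r in checked)
mx = max(f for f, _ in checked)
```

For the sample data, every start row gives the same value from the formula as from the running count, and the maximum over those rows is the peak concurrency.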

So in SQL:

SELECT MAX(2 * start_ordinal - start_or_end_ordinal) AS mx
FROM C2
WHERE TYPE = 1

In this case, with the proposed set of calls, the result is 2.
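To try the query without a SQL Server instance, one option is to run it against an in-memory SQLite database from Python. This is a sketch under assumptions: it needs a SQLite build with window-function support (3.25 or later), timestamps are stored as ISO-style text so that they sort correctly, and the table layout is the minimal one implied by the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Calls (starttime TEXT, endtime TEXT)")
conn.executemany(
    "INSERT INTO Calls (starttime, endtime) VALUES (?, ?)",
    [
        ("2009-01-01 00:02:10.000", "2009-01-01 00:05:24.000"),
        ("2009-01-01 00:02:19.000", "2009-01-01 00:02:35.000"),
        ("2009-01-01 00:02:57.000", "2009-01-01 00:04:04.000"),
        ("2009-01-01 00:04:12.000", "2009-01-01 00:04:52.000"),
    ],
)

# Same query as in the answer, unchanged apart from whitespace.
QUERY = """
WITH C1 AS
(
  SELECT starttime AS ts, +1 AS TYPE,
    ROW_NUMBER() OVER(ORDER BY starttime) AS start_ordinal
  FROM Calls
  UNION ALL
  SELECT endtime, -1, NULL
  FROM Calls
),
C2 AS
(
  SELECT *,
    ROW_NUMBER() OVER(ORDER BY ts, TYPE) AS start_or_end_ordinal
  FROM C1
)
SELECT MAX(2 * start_ordinal - start_or_end_ordinal) AS mx
FROM C2
WHERE TYPE = 1
"""

mx = conn.execute(QUERY).fetchone()[0]  # 2 for the sample calls
```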

The referenced article includes a small improvement that produces a result grouped by, for example, a service, a "phone company" or a "phone central". The same idea can also be used to group by time slot, for example to get the maximum concurrency hour by hour for a given day.
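One way the grouped variant could look is to partition both row numberings by the grouping column. The sketch below is an assumption, not the article's exact query: the `center` column and its sample values are hypothetical, and it runs on in-memory SQLite (3.25 or later) from Python. It covers only the per-group case, not the per-time-slot one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `center` is a hypothetical grouping column (for example, a phone central).
conn.execute("CREATE TABLE Calls (center TEXT, starttime TEXT, endtime TEXT)")
conn.executemany(
    "INSERT INTO Calls (center, starttime, endtime) VALUES (?, ?, ?)",
    [
        ("A", "2009-01-01 00:02:10", "2009-01-01 00:05:24"),
        ("A", "2009-01-01 00:02:19", "2009-01-01 00:02:35"),
        ("A", "2009-01-01 00:02:57", "2009-01-01 00:04:04"),
        ("A", "2009-01-01 00:04:12", "2009-01-01 00:04:52"),
        ("B", "2009-01-01 00:10:00", "2009-01-01 00:20:00"),
        ("B", "2009-01-01 00:12:00", "2009-01-01 00:18:00"),
        ("B", "2009-01-01 00:15:00", "2009-01-01 00:16:00"),
    ],
)

# Both ordinals restart for every center, so the formula holds inside each group.
QUERY = """
WITH C1 AS
(
  SELECT center, starttime AS ts, +1 AS TYPE,
    ROW_NUMBER() OVER(PARTITION BY center ORDER BY starttime) AS start_ordinal
  FROM Calls
  UNION ALL
  SELECT center, endtime, -1, NULL
  FROM Calls
),
C2 AS
(
  SELECT *,
    ROW_NUMBER() OVER(PARTITION BY center ORDER BY ts, TYPE) AS start_or_end_ordinal
  FROM C1
)
SELECT center, MAX(2 * start_ordinal - start_or_end_ordinal) AS mx
FROM C2
WHERE TYPE = 1
GROUP BY center
ORDER BY center
"""

per_center = conn.execute(QUERY).fetchall()  # one (center, mx) row per group
```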
