如何有效地对联接中的行进行预过滤? [英] How to efficiently pre-filter rows in conjunction with a join?
问题描述
我需要计算表中已过滤行之间的时间戳.
I need to calculate timestamps between filtered rows in a table.
我正在使用基于另一个问题的有用答案的联接:
I am using a join based on a helpful answer from another question:
StackOverflow:连续两行中的时间戳之间的差异在单个表中
StackOverflow: difference between timestamps in two consecutive rows in single table
我的问题是我的表混合了来自多个不同对象的数据,我希望在执行连接之前我需要先对"object_id = blah"进行过滤,因为如果我不进行预过滤,则连接会有所不同来自不相关的object_id的时间戳(因为它只是逐行连续查找).
My problem is that my table has mixed data from multiple different objects and I expect that I need to first filter on "object_id=blah" before performing the join, because if I don't pre-filter then the join will difference the timestamps from unrelated object_id's (because it's just looking consecutively, row by row).
我需要联接根据上一次出现的object_id = blah进行时间戳差异,而不仅仅是前一行.
I need the join to do the timestamp differences based on the previous occurrence of object_id=blah, not merely the immediately preceeding row.
寻找如何最有效地解决此查询. 提前致谢! = D
Looking for how to solve this query most efficiently. Thanks in advance! =D
我已经尝试在JOIN之后添加WHERE object_id = blah,并且行数正确,但是时间戳差异仍然仅基于每个时间戳差异中的前一行.
I've already tried adding a WHERE object_id=blah after the JOIN, and I got the right number of rows, but the timestamp diffs were still just based on the prior row in each timestamp difference.
我真的希望WHERE可以在JOIN发生之前应用于源表,但是显然这不是这样的. :-(
I was really hoping the WHERE would apply to the source table before the JOIN occurred, but clearly that isn't how this works. :-(
SELECT
t1.scanid, t1.event_id, t1.objectect_id, t1.object_timestamp, t1.object_counter,
TIMEDIFF(t1.object_timestamp, t2.object_timestamp) AS diff
FROM event_data AS t1
LEFT JOIN event_data AS t2 ON ( t2.scanid = t1.scanid - 1);
--+---------------+--------------+------+-----+----------------------+----------------+
--| Field | Type | Null | Key | Default | Extra |
--+---------------+--------------+------+-----+----------------------+----------------+
--| scanid | int(11) | NO | PRI | NULL | auto_increment |
--| event_id | int(12) | NO | | NULL | |
--| objectect_id | int(11) | NO | | NULL | |
--| obj_timestamp | timestamp(3) | NO | | CURRENT_TIMESTAMP(3) | |
--| obj_counter | int(11) | YES | | -1 | |
--+---------------+--------------+------+-----+----------------------+----------------+
在WHERE object_id = 2的情况下:
And with the WHERE object_id=2:
SELECT
t1.scanid, t1.event_id, t1.objectect_id, t1.object_timestamp, t1.object_counter,
TIMEDIFF(t1.object_timestamp, t2.object_timestamp) AS diff
FROM event_data AS t1
LEFT JOIN event_data AS t2 ON ( t2.scanid = t1.scanid - 1)
WHERE t1.object_id = 2;
这是原始对象数据:
+--------+----------+------------+-------------------------+-------------+
| scanid | event_id | object_id | obj_timestamp | obj_counter |
+--------+----------+------------+-------------------------+-------------+
| 1 | 1 | 2 | 2019-02-17 13:11:02.425 | 0 |
| 2 | 1 | 0 | 2019-02-17 13:11:08.227 | 0 |
| 3 | 1 | 0 | 2019-02-17 13:11:12.303 | 1 |
| 4 | 1 | 0 | 2019-02-17 13:11:31.383 | 2 |
| 5 | 1 | 0 | 2019-02-17 13:11:32.417 | 3 |
| 6 | 1 | 0 | 2019-02-17 13:11:33.451 | 4 |
| 7 | 1 | 0 | 2019-02-17 13:11:34.839 | 5 |
| 8 | 1 | 0 | 2019-02-17 13:11:35.868 | 6 |
| 9 | 1 | 0 | 2019-02-17 13:12:05.143 | 7 |
| 10 | 1 | 0 | 2019-02-17 13:13:08.733 | 8 |
| 11 | 1 | 0 | 2019-02-17 13:13:11.169 | 9 |
| 12 | 1 | 0 | 2019-02-17 13:13:22.239 | 10 |
| 13 | 1 | 0 | 2019-02-17 13:13:24.256 | 11 |
| 14 | 1 | 0 | 2019-02-17 13:13:26.875 | 12 |
| 15 | 1 | 0 | 2019-02-17 13:13:27.910 | 13 |
| 16 | 1 | 2 | 2019-02-17 13:16:24.326 | 1 |
| 17 | 1 | 2 | 2019-02-17 13:16:25.362 | 2 |
| 18 | 1 | 2 | 2019-02-17 13:19:48.318 | 3 |
| 19 | 1 | 2 | 2019-02-17 13:25:01.604 | 4 |
| 20 | 1 | 2 | 2019-02-17 13:30:17.024 | 5 |
| 21 | 1 | 0 | 2019-02-17 13:39:19.664 | 14 |
| 22 | 1 | 0 | 2019-02-17 13:39:20.696 | 15 |
| 23 | 1 | 2 | 2019-02-17 13:41:12.324 | 6 |
| 24 | 1 | 2 | 2019-02-17 13:41:13.349 | 7 |
| 25 | 1 | 0 | 2019-02-17 13:41:14.381 | 16 |
| 26 | 1 | 0 | 2019-02-17 13:41:17.436 | 17 |
| 27 | 1 | 2 | 2019-02-17 13:41:18.467 | 8 |
| 28 | 1 | 0 | 2019-02-17 13:41:20.503 | 18 |
| 29 | 1 | 0 | 2019-02-17 13:41:21.535 | 19 |
| 30 | 1 | 0 | 2019-02-17 13:41:22.563 | 20 |
| 31 | 1 | 2 | 2019-02-17 13:41:23.591 | 9 |
| 32 | 1 | 2 | 2019-02-17 13:41:24.619 | 10 |
+--------+----------+------------+-------------------------+-------------+
32 rows in set (0.00 sec)
实际输出(不包含WHERE):
Actual Output (without the WHERE):
+--------+----------+------------+-------------------------+-------------+--------------+
| scanid | event_id | object_id | obj_timestamp | obj_counter | diff |
+--------+----------+------------+-------------------------+-------------+--------------+
| 1 | 1 | 2 | 2019-02-17 13:11:02.425 | 0 | NULL |
| 2 | 1 | 0 | 2019-02-17 13:11:08.227 | 0 | 00:00:05.802 |
| 3 | 1 | 0 | 2019-02-17 13:11:12.303 | 1 | 00:00:04.076 |
| 4 | 1 | 0 | 2019-02-17 13:11:31.383 | 2 | 00:00:19.080 |
| 5 | 1 | 0 | 2019-02-17 13:11:32.417 | 3 | 00:00:01.034 |
| 6 | 1 | 0 | 2019-02-17 13:11:33.451 | 4 | 00:00:01.034 |
| 7 | 1 | 0 | 2019-02-17 13:11:34.839 | 5 | 00:00:01.388 |
| 8 | 1 | 0 | 2019-02-17 13:11:35.868 | 6 | 00:00:01.029 |
| 9 | 1 | 0 | 2019-02-17 13:12:05.143 | 7 | 00:00:29.275 |
| 10 | 1 | 0 | 2019-02-17 13:13:08.733 | 8 | 00:01:03.590 |
| 11 | 1 | 0 | 2019-02-17 13:13:11.169 | 9 | 00:00:02.436 |
| 12 | 1 | 0 | 2019-02-17 13:13:22.239 | 10 | 00:00:11.070 |
| 13 | 1 | 0 | 2019-02-17 13:13:24.256 | 11 | 00:00:02.017 |
| 14 | 1 | 0 | 2019-02-17 13:13:26.875 | 12 | 00:00:02.619 |
| 15 | 1 | 0 | 2019-02-17 13:13:27.910 | 13 | 00:00:01.035 |
| 16 | 1 | 2 | 2019-02-17 13:16:24.326 | 1 | 00:02:56.416 |
| 17 | 1 | 2 | 2019-02-17 13:16:25.362 | 2 | 00:00:01.036 |
| 18 | 1 | 2 | 2019-02-17 13:19:48.318 | 3 | 00:03:22.956 |
| 19 | 1 | 2 | 2019-02-17 13:25:01.604 | 4 | 00:05:13.286 |
| 20 | 1 | 2 | 2019-02-17 13:30:17.024 | 5 | 00:05:15.420 |
| 21 | 1 | 0 | 2019-02-17 13:39:19.664 | 14 | 00:09:02.640 |
| 22 | 1 | 0 | 2019-02-17 13:39:20.696 | 15 | 00:00:01.032 |
| 23 | 1 | 2 | 2019-02-17 13:41:12.324 | 6 | 00:01:51.628 |
| 24 | 1 | 2 | 2019-02-17 13:41:13.349 | 7 | 00:00:01.025 |
| 25 | 1 | 0 | 2019-02-17 13:41:14.381 | 16 | 00:00:01.032 |
| 26 | 1 | 0 | 2019-02-17 13:41:17.436 | 17 | 00:00:03.055 |
| 27 | 1 | 2 | 2019-02-17 13:41:18.467 | 8 | 00:00:01.031 |
| 28 | 1 | 0 | 2019-02-17 13:41:20.503 | 18 | 00:00:02.036 |
| 29 | 1 | 0 | 2019-02-17 13:41:21.535 | 19 | 00:00:01.032 |
| 30 | 1 | 0 | 2019-02-17 13:41:22.563 | 20 | 00:00:01.028 |
| 31 | 1 | 2 | 2019-02-17 13:41:23.591 | 9 | 00:00:01.028 |
| 32 | 1 | 2 | 2019-02-17 13:41:24.619 | 10 | 00:00:01.028 |
+--------+----------+------------+-------------------------+-------------+--------------+
32 rows in set (0.01 sec)
在WHERE object_id = 2的情况下:
And with the WHERE object_id=2:
+--------+----------+------------+-------------------------+-------------+--------------+
| scanid | event_id | object_id | obj_timestamp | obj_counter | diff |
+--------+----------+------------+-------------------------+-------------+--------------+
| 1 | 1 | 2 | 2019-02-17 13:11:02.425 | 0 | NULL |
| 16 | 1 | 2 | 2019-02-17 13:16:24.326 | 1 | 00:02:56.416 |
| 17 | 1 | 2 | 2019-02-17 13:16:25.362 | 2 | 00:00:01.036 |
| 18 | 1 | 2 | 2019-02-17 13:19:48.318 | 3 | 00:03:22.956 |
| 19 | 1 | 2 | 2019-02-17 13:25:01.604 | 4 | 00:05:13.286 |
| 20 | 1 | 2 | 2019-02-17 13:30:17.024 | 5 | 00:05:15.420 |
| 23 | 1 | 2 | 2019-02-17 13:41:12.324 | 6 | 00:01:51.628 |
| 24 | 1 | 2 | 2019-02-17 13:41:13.349 | 7 | 00:00:01.025 |
| 27 | 1 | 2 | 2019-02-17 13:41:18.467 | 8 | 00:00:01.031 |
| 31 | 1 | 2 | 2019-02-17 13:41:23.591 | 9 | 00:00:01.028 |
| 32 | 1 | 2 | 2019-02-17 13:41:24.619 | 10 | 00:00:01.028 |
+--------+----------+------------+-------------------------+-------------+--------------+
11 rows in set (0.00 sec)
推荐答案
从查看您的数据来看,似乎适合表的JOIN
条件实际上是t2.obj_counter = t1.obj_counter - 1 AND t2.object_id = t1.object_id
;这样可以确保仅将与给定对象相关的时间戳进行比较.因此,您的查询将变得(根据您的示例数据):
From looking at your data, it seems the appropriate JOIN
condition for your table is actually t2.obj_counter = t1.obj_counter - 1 AND t2.object_id = t1.object_id
; this will ensure that only timestamps related to a given object are compared with each other. So your query becomes (based on your sample data):
SELECT
t1.scanid, t1.event_id, t1.object_id, t1.obj_timestamp, t1.obj_counter,
TIMEDIFF(t1.obj_timestamp, t2.obj_timestamp) AS diff
FROM event_data AS t1
LEFT JOIN event_data AS t2 ON t2.obj_counter = t1.obj_counter - 1 AND t2.object_id = t1.object_id
WHERE t1.object_id = 2
ORDER BY t1.obj_counter
输出:
scanid event_id object_id obj_timestamp obj_counter diff
1 1 2 2019-02-17 13:11:02 0 null
16 1 2 2019-02-17 13:16:24 1 00:05:22
17 1 2 2019-02-17 13:16:25 2 00:00:01
18 1 2 2019-02-17 13:19:48 3 00:03:23
19 1 2 2019-02-17 13:25:02 4 00:05:14
20 1 2 2019-02-17 13:30:17 5 00:05:15
23 1 2 2019-02-17 13:41:12 6 00:10:55
24 1 2 2019-02-17 13:41:13 7 00:00:01
27 1 2 2019-02-17 13:41:18 8 00:00:05
31 1 2 2019-02-17 13:41:24 9 00:00:06
32 1 2 2019-02-17 13:41:25 10 00:00:01
这篇关于如何有效地对联接中的行进行预过滤?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!