MySQL GROUP BY DateTime +/- 3 seconds

Problem description

Suppose I have a table with 3 columns:

  • id (PK, int)
  • timestamp (datetime)
  • title (text)

I have the following records:

1, 2010-01-01 15:00:00, Some Title
2, 2010-01-01 15:00:02, Some Title
3, 2010-01-02 15:00:00, Some Title

I need to GROUP BY records that are within 3 seconds of each other. For this table, rows 1 and 2 would be grouped together.

There is a similar question here: Mysql DateTime group by 15 mins

I also found this: http://www.artfulsoftware.com/infotree/queries.php#106

I don't know how to convert these methods into something that will work for seconds. The trouble with the method on the SO question is that it seems to me that it would only work for records falling within a bin of time that starts at a known point. For instance, if I were to get FLOOR() to work with seconds, at an interval of 5 seconds, a time of 15:00:04 would be grouped with 15:00:01, but not grouped with 15:00:06.
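To make the binning problem concrete, here is a minimal Python sketch of what FLOOR-style bucketing does with a 5-second width. It is not taken from the linked answers: the `fixed_bin` helper is a hypothetical name, and the timestamps are treated as UTC purely for illustration.

```python
from datetime import datetime, timezone


def fixed_bin(ts, width_seconds):
    """Return the index of the fixed-width bin containing ts (hypothetical helper)."""
    # Assumption: interpret the naive datetime as UTC so the result is deterministic.
    epoch_seconds = int(ts.replace(tzinfo=timezone.utc).timestamp())
    return epoch_seconds // width_seconds


t1 = datetime(2010, 1, 1, 15, 0, 1)
t4 = datetime(2010, 1, 1, 15, 0, 4)
t6 = datetime(2010, 1, 1, 15, 0, 6)

# 15:00:01 and 15:00:04 are 3 seconds apart and share a bin.
same_bin_1_4 = fixed_bin(t1, 5) == fixed_bin(t4, 5)
# 15:00:04 and 15:00:06 are only 2 seconds apart but straddle a bin boundary.
same_bin_4_6 = fixed_bin(t4, 5) == fixed_bin(t6, 5)

print(same_bin_1_4, same_bin_4_6)
```

The bin edges are fixed at multiples of the width, so two rows that are close together can still be split whenever an edge falls between them.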

Does this make sense? Please let me know if further clarification is needed.

EDIT: For the set of numbers {1, 2, 3, 4, 5, 6, 7, 50, 51, 60}, it seems it might be best to group them as {1, 2, 3, 4, 5, 6, 7}, {50, 51}, {60}, so that a row joins a group whenever it is within 3 seconds of the previous row. I know this changes things a bit; I'm sorry for being wishy-washy on this.
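The grouping rule described in the edit can be sketched in a few lines of Python. This only illustrates the intended result, not a MySQL solution; the function name `chain_groups` and the `max_gap` parameter are made up for the example, and a gap of exactly 3 is assumed to count as "within 3 seconds".

```python
def chain_groups(values, max_gap=3):
    """Group sorted values into chains; a value joins the current chain
    when it is at most max_gap after the previous value (hypothetical helper)."""
    groups = []
    for value in sorted(values):
        if groups and value - groups[-1][-1] <= max_gap:
            # Close enough to the previous value: extend the current chain.
            groups[-1].append(value)
        else:
            # Gap too large (or first value): start a new chain.
            groups.append([value])
    return groups


groups = chain_groups([1, 2, 3, 4, 5, 6, 7, 50, 51, 60])
print(groups)  # [[1, 2, 3, 4, 5, 6, 7], [50, 51], [60]]
```

Note that the first chain spans 6 seconds in total: membership depends only on the gap to the previous row, not on the distance from the start of the chain.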

I am trying to fuzzy-match logs from different servers. Server #1 may log an item, "Item #1", and Server #2 will log that same item, "Item #1", within a few seconds of Server #1. I need to run some aggregate functions over both log lines. Unfortunately, I only have the title to go on, due to the nature of the server software.

Solution

I'm using Tom H.'s excellent idea but doing it a little differently here:

Instead of finding all the rows that are the beginnings of chains, we can find all times that are the beginnings of chains, then go back and find the rows that match the times.

Query #1 here should tell you which times are the beginnings of chains by finding the times that have no earlier time within the preceding 3 seconds:

-- Table is a placeholder for the real table name (TABLE is a reserved word in MySQL).
SELECT DISTINCT a.Timestamp
FROM Table a
LEFT JOIN Table b
ON (b.Timestamp >= a.Timestamp - INTERVAL 3 SECOND
    AND b.Timestamp < a.Timestamp)
WHERE b.Timestamp IS NULL

And then for each row, we can find the largest chain-starting timestamp that is less than or equal to our timestamp with Query #2:

SELECT Table.id, MAX(StartOfChains.Timestamp) AS ChainStartTime
FROM Table
JOIN ([query #1]) StartOfChains
ON Table.Timestamp >= StartOfChains.Timestamp
GROUP BY Table.id

Once we have that, we can GROUP BY it as you wanted.

SELECT COUNT(*) -- or whatever aggregate you need
FROM Table
JOIN ([query #2]) GroupingQuery
ON Table.id = GroupingQuery.id
GROUP BY GroupingQuery.ChainStartTime
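As a rough way to check the three queries together, the sketch below nests them into one statement and runs it against the number set from the question. It is only an illustration under assumptions: it uses SQLite through Python's `sqlite3` module rather than MySQL, it stores times as integer seconds (so `INTERVAL 3 SECOND` becomes a plain `- 3`), and the table name `log_entries` and column name `ts` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: integer seconds stand in for the DATETIME column.
conn.execute(
    "CREATE TABLE log_entries (id INTEGER PRIMARY KEY, ts INTEGER, title TEXT)"
)
seconds = [1, 2, 3, 4, 5, 6, 7, 50, 51, 60]
conn.executemany(
    "INSERT INTO log_entries (id, ts, title) VALUES (?, ?, ?)",
    [(i + 1, s, "Some Title") for i, s in enumerate(seconds)],
)

query = """
SELECT g.chain_start, COUNT(*) AS row_count
FROM log_entries
JOIN (
    -- Query #2: for each row, the latest chain start at or before it
    SELECT log_entries.id, MAX(s.ts) AS chain_start
    FROM log_entries
    JOIN (
        -- Query #1: times with no earlier time in the preceding 3 seconds
        SELECT DISTINCT a.ts
        FROM log_entries a
        LEFT JOIN log_entries b
          ON b.ts >= a.ts - 3 AND b.ts < a.ts
        WHERE b.ts IS NULL
    ) s ON log_entries.ts >= s.ts
    GROUP BY log_entries.id
) g ON log_entries.id = g.id
GROUP BY g.chain_start
ORDER BY g.chain_start
"""

result = conn.execute(query).fetchall()
print(result)  # [(1, 7), (50, 2), (60, 1)]
```

Each output row is a chain start plus the number of rows in that chain, which matches the {1..7}, {50, 51}, {60} grouping asked for in the question. In MySQL the same nesting should apply with the `INTERVAL 3 SECOND` form shown above, though that variant was not run here.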

I'm not entirely sure this is distinct enough from Tom H's answer to be posted separately, but it sounded like you were having trouble with implementation, and I was thinking about it, so I thought I'd post again. Good luck!
