Count the number of rows in 30 day bins

Problem description

Each row in my table has a datetime stamp, and I wish to query the database, starting from now, to count how many rows fall in the last 30 days, the 30 days before that, and so on, until the 30-day bins reach back to the start of the table.

I have successfully carried out this query by using Python and making several queries. But I'm almost certain that it can be done in one single MySQL query.

Solution

No stored procedures, temporary tables, only one query, and an efficient execution plan given an index on the date column:

select

  subdate(
    '2012-12-31',
    floor(dateDiff('2012-12-31', dateStampColumn) / 30) * 30 + 30 - 1
  ) as "period starting",

  subdate(
    '2012-12-31',
    floor(dateDiff('2012-12-31', dateStampColumn) / 30) * 30
  ) as "period ending",

  count(*)

from
  YOURTABLE
group by floor(dateDiff('2012-12-31', dateStampColumn) / 30);

It should be pretty obvious what is happening here, except for this incantation:

floor(dateDiff('2012-12-31', dateStampColumn) / 30)

That expression appears several times, and it evaluates to how many whole 30-day periods ago dateStampColumn falls. dateDiff returns the difference in days; dividing by 30 converts that to 30-day periods, and floor() rounds it down to an integer. Once we have this number, we can GROUP BY it, and a bit of math translates it back into the starting and ending dates of the period.
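
The same arithmetic can be checked outside the database. Below is a minimal Python sketch, not part of the original answer, that mirrors the query's bin math. The function names (bin_index, bin_bounds, count_bins) and the BIN_DAYS constant are made up for illustration. It assumes dateDiff compares only the date parts of its arguments, which is how MySQL's DATEDIFF behaves.

```python
from collections import Counter
from datetime import date, datetime, timedelta

BIN_DAYS = 30  # hypothetical constant; matches the 30 used in the query


def bin_index(reference: date, stamp: datetime) -> int:
    """Mirror floor(dateDiff(reference, stamp) / 30)."""
    days_ago = (reference - stamp.date()).days  # time of day is ignored
    return days_ago // BIN_DAYS  # floor division, like floor()


def bin_bounds(reference: date, index: int) -> tuple:
    """Mirror the two subdate() expressions: inclusive start and end dates."""
    start = reference - timedelta(days=index * BIN_DAYS + BIN_DAYS - 1)
    end = reference - timedelta(days=index * BIN_DAYS)
    return start, end


def count_bins(reference: date, stamps) -> list:
    """Group timestamps by bin index, like GROUP BY plus count(*)."""
    counts = Counter(bin_index(reference, s) for s in stamps)
    return [bin_bounds(reference, i) + (n,) for i, n in sorted(counts.items())]


if __name__ == "__main__":
    reference = date(2012, 12, 31)
    stamps = [
        datetime(2012, 12, 31, 2),  # 0 days ago  -> bin 0
        datetime(2012, 12, 2, 2),   # 29 days ago -> bin 0
        datetime(2012, 12, 1, 2),   # 30 days ago -> bin 1
        datetime(2012, 11, 1, 2),   # 60 days ago -> bin 2
    ]
    for start, end, n in count_bins(reference, stamps):
        print(start, end, n)
```

A row dated exactly 30 days before the reference date lands in bin 1, not bin 0. That is why each period covers 30 calendar days with both endpoints included.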

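Replace '2012-12-31' with now() if you prefer (or with curdate(), if you want the period columns to come back as plain dates without a time part). Here's some sample data: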

CREATE TABLE YOURTABLE
    (`Id` int, `dateStampColumn` datetime);

INSERT INTO YOURTABLE
    (`Id`, `dateStampColumn`)
VALUES
    (1, '2012-10-15 02:00:00'),
    (1, '2012-10-17 02:00:00'),
    (1, '2012-10-30 02:00:00'),
    (1, '2012-10-31 02:00:00'),
    (1, '2012-11-01 02:00:00'),
    (1, '2012-11-02 02:00:00'),
    (1, '2012-11-18 02:00:00'),
    (1, '2012-11-19 02:00:00'),
    (1, '2012-11-21 02:00:00'),
    (1, '2012-11-25 02:00:00'),
    (1, '2012-11-25 02:00:00'),
    (1, '2012-11-26 02:00:00'),
    (1, '2012-11-26 02:00:00'),
    (1, '2012-11-24 02:00:00'),
    (1, '2012-11-23 02:00:00'),
    (1, '2012-11-28 02:00:00'),
    (1, '2012-11-29 02:00:00'),
    (1, '2012-11-30 02:00:00'),
    (1, '2012-12-01 02:00:00'),
    (1, '2012-12-02 02:00:00'),
    (1, '2012-12-15 02:00:00'),
    (1, '2012-12-17 02:00:00'),
    (1, '2012-12-18 02:00:00'),
    (1, '2012-12-19 02:00:00'),
    (1, '2012-12-21 02:00:00'),
    (1, '2012-12-25 02:00:00'),
    (1, '2012-12-25 02:00:00'),
    (1, '2012-12-26 02:00:00'),
    (1, '2012-12-26 02:00:00'),
    (1, '2012-12-24 02:00:00'),
    (1, '2012-12-23 02:00:00'),
    (1, '2012-12-31 02:00:00'),
    (1, '2012-12-30 02:00:00'),
    (1, '2012-12-28 02:00:00'),
    (1, '2012-12-28 02:00:00'),
    (1, '2012-12-30 02:00:00');

And the result:

period starting     period ending   count(*)
2012-12-02          2012-12-31      17
2012-11-02          2012-12-01      14
2012-10-03          2012-11-01      5

Period endpoints are inclusive.

Play with this in SQL Fiddle.

There's a bit of potential goofiness in that any 30 day period with zero matching rows will not be included in the result. If you could join this against a table of periods, that could be eliminated. However, MySQL doesn't have anything like PostgreSQL's generate_series(), so you'd have to deal with it in your application or try this clever hack.
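
If you go the application route, filling in the empty periods is a small loop. The Python sketch below is one way to do it. It assumes the query has been changed to also return the bin index (the floor(...) expression) as an extra column, and that the results have been collected into a dict mapping that index to its count. The function name fill_empty_bins and that dict shape are made up for illustration.

```python
from datetime import date, timedelta

BIN_DAYS = 30  # hypothetical constant; matches the 30 used in the query


def fill_empty_bins(reference: date, counts_by_bin: dict) -> list:
    """Return (start, end, count) for every bin from 0 to the oldest bin seen.

    counts_by_bin maps bin index -> row count. Bins the query did not
    return are reported with a count of 0. Negative indexes (rows dated
    after the reference date) are ignored in this sketch.
    """
    if not counts_by_bin:
        return []
    rows = []
    for index in range(max(counts_by_bin) + 1):
        start = reference - timedelta(days=index * BIN_DAYS + BIN_DAYS - 1)
        end = reference - timedelta(days=index * BIN_DAYS)
        rows.append((start, end, counts_by_bin.get(index, 0)))
    return rows


if __name__ == "__main__":
    # Suppose the middle period had no rows and so was missing from the result.
    for row in fill_empty_bins(date(2012, 12, 31), {0: 17, 2: 5}):
        print(*row)
```

This keeps the SQL as it is and costs only one pass over the handful of periods, so it is usually simpler than building a table of periods to join against.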
