自满足日期条件以来对行进行计数 [英] Count rows since a date condition is met

查看:91
本文介绍了自满足日期条件以来对行进行计数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个带有以下列的R data.table

I have a R data.table with following columns

Date    Time    X   Y   
5/9/2016    12:00:00 AM 30  125 
5/9/2016    12:30:00 AM 27  127 
5/9/2016    1:00:00 AM  30  133 
5/9/2016    1:30:00 AM  25  115 
5/9/2016    2:00:00 AM  26  116 
5/9/2016    3:00:00 AM  34  128 
5/9/2016    3:30:00 AM  34  120 
5/9/2016    4:00:00 AM  26  133 
5/9/2016    5:00:00 AM  35  119 
5/9/2016    5:30:00 AM  35  126 
5/9/2016    6:00:00 AM  35  118 
5/9/2016    6:30:00 AM  28  120 
5/9/2016    7:00:00 AM  29  123 
5/9/2016    7:30:00 AM  30  125 
5/9/2016    8:30:00 AM  35  116 
5/9/2016    9:00:00 AM  32  123 
5/9/2016    9:30:00 AM  33  135 
5/9/2016    10:00:00 AM 32  127 
5/9/2016    10:30:00 AM 26  122 
5/9/2016    11:00:00 AM 28  122 
5/9/2016    11:30:00 AM 31  129 
5/9/2016    12:00:00 PM 29  127 
5/9/2016    12:30:00 PM 26  120 
5/9/2016    1:00:00 PM  32  126 
5/9/2016    1:30:00 PM  26  117 
5/9/2016    2:30:00 PM  32  133 
5/9/2016    3:00:00 PM  31  120 
5/9/2016    3:30:00 PM  35  129 
5/9/2016    4:00:00 PM  30  121 
5/9/2016    4:30:00 PM  35  132 
5/9/2016    5:00:00 PM  27  129 
5/9/2016    5:30:00 PM  27  122 
5/9/2016    6:00:00 PM  28  121 
5/9/2016    6:30:00 PM  35  127 
5/9/2016    7:00:00 PM  34  132 
5/9/2016    7:30:00 PM  25  133 
5/9/2016    8:00:00 PM  34  119 
5/9/2016    8:30:00 PM  33  125 
5/9/2016    9:00:00 PM  29  120 
5/9/2016    9:30:00 PM  34  125 
5/9/2016    10:00:00 PM 31  118 
5/9/2016    10:30:00 PM 33  118 
5/9/2016    11:00:00 PM 28  115 
5/9/2016    11:30:00 PM 34  126 
5/10/2016   12:00:00 AM 34  131 
5/10/2016   12:30:00 AM 25  130 
5/10/2016   1:00:00 AM  28  129 
5/10/2016   1:30:00 AM  26  132

我想要下面的输出-

Date    Time    X   Y   Rowcount@6hours
    5/9/2016    12:00:00 AM 30  125 0
    5/9/2016    12:30:00 AM 27  127 1
    5/9/2016    1:00:00 AM  30  133 2
    5/9/2016    1:30:00 AM  25  115 3
    5/9/2016    2:00:00 AM  26  116 4
    5/9/2016    3:00:00 AM  34  128 5
    5/9/2016    3:30:00 AM  34  120 6
    5/9/2016    4:00:00 AM  26  133 7
    5/9/2016    5:00:00 AM  35  119 8
    5/9/2016    5:30:00 AM  35  126 9
    5/9/2016    6:00:00 AM  35  118 10
    5/9/2016    6:30:00 AM  28  120 10
    5/9/2016    7:00:00 AM  29  123 10
    5/9/2016    7:30:00 AM  30  125 10
    5/9/2016    8:30:00 AM  35  116 9
    5/9/2016    9:00:00 AM  32  123 10
    5/9/2016    9:30:00 AM  33  135 10
    5/9/2016    10:00:00 AM 32  127 10
    5/9/2016    10:30:00 AM 26  122 10
    5/9/2016    11:00:00 AM 28  122 11
    5/9/2016    11:30:00 AM 31  129 11
    5/9/2016    12:00:00 PM 29  127 11
    5/9/2016    12:30:00 PM 26  120 11
    5/9/2016    1:00:00 PM  32  126 11
    5/9/2016    1:30:00 PM  26  117 11
    5/9/2016    2:30:00 PM  32  133 11
    5/9/2016    3:00:00 PM  31  120 11
    5/9/2016    3:30:00 PM  35  129 11
    5/9/2016    4:00:00 PM  30  121 11
    5/9/2016    4:30:00 PM  35  132 11
    5/9/2016    5:00:00 PM  27  129 11
    5/9/2016    5:30:00 PM  27  122 11
    5/9/2016    6:00:00 PM  28  121 11
    5/9/2016    6:30:00 PM  35  127 11
    5/9/2016    7:00:00 PM  34  132 11
    5/9/2016    7:30:00 PM  25  133 11
    5/9/2016    8:00:00 PM  34  119 11
    5/9/2016    8:30:00 PM  33  125 11
    5/9/2016    9:00:00 PM  29  120 12
    5/9/2016    9:30:00 PM  34  125 12
    5/9/2016    10:00:00 PM 31  118 12
    5/9/2016    10:30:00 PM 33  118 12
    5/9/2016    11:00:00 PM 28  115 12
    5/9/2016    11:30:00 PM 34  126 12
    5/10/2016   12:00:00 AM 34  131 12
    5/10/2016   12:30:00 AM 25  130 12
    5/10/2016   1:00:00 AM  28  129 12
    5/10/2016   1:30:00 AM  26  132 12

所以我的目标是计算我在最近n个小时中获得的行(例如n = 6)并将其附加到每一行(我想使用该值来确定 width 参数c $ c> rollmean 函数)。这里的数据粒度为30分钟,但它可能会更改为秒级(也可能不均匀!),并且可能会丢失行

So my objective is to count number of rows that I have got in last n hours (in example n=6) and append it to each row (I want to use that value to determine the width parameter in rollmean functions). Here data granularity is at 30 minutes but it can change up to seconds level (and can be nonuniform too!) and there could be missing rows

我尝试了 window.zoo xts ,但无法找到解决方案。我在另一个链接中发布了相同的问题,并且该响应在一天之内有效,但并非跨日有效计算R数据帧中两个时间戳之间的行数

I tried with window.zoo and xts but could not arrive at a solution. I posted the same question in another link and the response there works within a single day but not across days Count number of rows between two timestamps in R dataframe

推荐答案

使用:

# convert to a 'data.table' (if necessary)
# add a datetime & rowid variable
DT <- as.data.table(df)[, `:=` (DateTime = as.POSIXct(paste(Date, Time), format = '%m/%d/%Y %I:%M:%S%p'),
                                rid = .I)][]

# calculate the count with a rolling join
DT[, count6hr := DT[.(DateTime = DateTime - 6*60*60, rid = rid)
                    , on = 'DateTime'
                    , roll = -Inf
                    , mult = 'last'
                    , i.rid - rid]][]

它给出:


         Date       Time  X   Y            DateTime rid count6hr
 1:  5/9/2016 12:00:00AM 30 125 2016-05-09 00:00:00   1        0
 2:  5/9/2016 12:30:00AM 27 127 2016-05-09 00:30:00   2        1
 3:  5/9/2016  1:00:00AM 30 133 2016-05-09 01:00:00   3        2
 4:  5/9/2016  1:30:00AM 25 115 2016-05-09 01:30:00   4        3
 5:  5/9/2016  2:00:00AM 26 116 2016-05-09 02:00:00   5        4
 6:  5/9/2016  3:00:00AM 34 128 2016-05-09 03:00:00   6        5
 7:  5/9/2016  3:30:00AM 34 120 2016-05-09 03:30:00   7        6
 8:  5/9/2016  4:00:00AM 26 133 2016-05-09 04:00:00   8        7
 9:  5/9/2016  5:00:00AM 35 119 2016-05-09 05:00:00   9        8
10:  5/9/2016  5:30:00AM 35 126 2016-05-09 05:30:00  10        9
11:  5/9/2016  6:00:00AM 35 118 2016-05-09 06:00:00  11       10
12:  5/9/2016  6:30:00AM 28 120 2016-05-09 06:30:00  12       10
13:  5/9/2016  7:00:00AM 29 123 2016-05-09 07:00:00  13       10
14:  5/9/2016  7:30:00AM 30 125 2016-05-09 07:30:00  14       10
15:  5/9/2016  8:30:00AM 35 116 2016-05-09 08:30:00  15        9
16:  5/9/2016  9:00:00AM 32 123 2016-05-09 09:00:00  16       10
17:  5/9/2016  9:30:00AM 33 135 2016-05-09 09:30:00  17       10
18:  5/9/2016 10:00:00AM 32 127 2016-05-09 10:00:00  18       10
19:  5/9/2016 10:30:00AM 26 122 2016-05-09 10:30:00  19       10
20:  5/9/2016 11:00:00AM 28 122 2016-05-09 11:00:00  20       11
21:  5/9/2016 11:30:00AM 31 129 2016-05-09 11:30:00  21       11
22:  5/9/2016 12:00:00PM 29 127 2016-05-09 12:00:00  22       11
23:  5/9/2016 12:30:00PM 26 120 2016-05-09 12:30:00  23       11
24:  5/9/2016  1:00:00PM 32 126 2016-05-09 13:00:00  24       11
25:  5/9/2016  1:30:00PM 26 117 2016-05-09 13:30:00  25       11
26:  5/9/2016  2:30:00PM 32 133 2016-05-09 14:30:00  26       11
27:  5/9/2016  3:00:00PM 31 120 2016-05-09 15:00:00  27       11
28:  5/9/2016  3:30:00PM 35 129 2016-05-09 15:30:00  28       11
29:  5/9/2016  4:00:00PM 30 121 2016-05-09 16:00:00  29       11
30:  5/9/2016  4:30:00PM 35 132 2016-05-09 16:30:00  30       11
31:  5/9/2016  5:00:00PM 27 129 2016-05-09 17:00:00  31       11
32:  5/9/2016  5:30:00PM 27 122 2016-05-09 17:30:00  32       11
33:  5/9/2016  6:00:00PM 28 121 2016-05-09 18:00:00  33       11
34:  5/9/2016  6:30:00PM 35 127 2016-05-09 18:30:00  34       11
35:  5/9/2016  7:00:00PM 34 132 2016-05-09 19:00:00  35       11
36:  5/9/2016  7:30:00PM 25 133 2016-05-09 19:30:00  36       11
37:  5/9/2016  8:00:00PM 34 119 2016-05-09 20:00:00  37       11
38:  5/9/2016  8:30:00PM 33 125 2016-05-09 20:30:00  38       12
39:  5/9/2016  9:00:00PM 29 120 2016-05-09 21:00:00  39       12
40:  5/9/2016  9:30:00PM 34 125 2016-05-09 21:30:00  40       12
41:  5/9/2016 10:00:00PM 31 118 2016-05-09 22:00:00  41       12
42:  5/9/2016 10:30:00PM 33 118 2016-05-09 22:30:00  42       12
43:  5/9/2016 11:00:00PM 28 115 2016-05-09 23:00:00  43       12
44:  5/9/2016 11:30:00PM 34 126 2016-05-09 23:30:00  44       12
45: 5/10/2016 12:00:00AM 34 131 2016-05-10 00:00:00  45       12
46: 5/10/2016 12:30:00AM 25 130 2016-05-10 00:30:00  46       12
47: 5/10/2016  1:00:00AM 28 129 2016-05-10 01:00:00  47       12
48: 5/10/2016  1:30:00AM 26 132 2016-05-10 01:30:00  48       12


这篇关于自满足日期条件以来对行进行计数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆