R // Sum based on date range


Problem description


Let's say I have a data frame as follows (only the first 3 columns), in which sum is, for example, the revenue generated by customer user on day date:

user    date    sum sum(previous5days)
A   2013-01-01  10  0
A   2013-01-02  20  10
A   2013-01-03  10  30
A   2013-01-05  5   40
A   2013-01-06  6   45
A   2013-01-08  7   21
A   2013-01-09  4   22
A   2013-01-10  0   22
B   2013-01-06  1   0
B   2013-01-07  1   1

Now I want to calculate column 4 [sum(previous5days)]: for each row, the aggregated revenue for that customer (user) over the 5 days preceding that row's date (the date itself is not included).

How can I do this without a loop? Looping is not an option because the data set is rather large.

Many thanks in advance!

Solution

Using data.table, you can leverage keys:

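library(data.table)
DT <- data.table(<yourdata>)   # placeholder: supply your own data; date is assumed to be of class Date
# Key on (user, date) so the lookup below is a keyed join rather than a scan.
setkey(DT, user, date)

# For each (user, date) group, look up that user's rows on the 5 preceding days
# and sum them; days without a matching row come back as NA and are dropped.
DT[, sumSum := DT[ .(.BY[[1]], .d+(-5:-1) )][, sum(sum, na.rm=TRUE)] , by=list(user, .d=date)]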
DT
#      user       date sum sum.previous5days. sumSum
#   1:    A 2013-01-01  10                  0      0
#   2:    A 2013-01-02  20                 10     10
#   3:    A 2013-01-03  10                 30     30
#   4:    A 2013-01-05   5                 40     40
#   5:    A 2013-01-06   6                 45     45
#   6:    A 2013-01-08   7                 21     21
#   7:    A 2013-01-09   4                 22     18   <~~~ Discrepancy
#   8:    A 2013-01-10   0                 22     22
#   9:    B 2013-01-06   1                  0      0
#  10:    B 2013-01-07   1                  1      1
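
The flagged row can be checked by hand. For user A on 2013-01-09, the window covers 2013-01-04 through 2013-01-08, which contains the rows for 01-05, 01-06 and 01-08, so the sum is 5 + 6 + 7 = 18. That suggests the 22 in the question's table is the value that is off. The base R sketch below is only a slow reference check on the sample data, not a replacement for the keyed join above; it assumes date is stored as a Date, and the names df and prev5 are made up for illustration.

```r
# Sample data from the question (first three columns only).
df <- data.frame(
  user = c(rep("A", 8), rep("B", 2)),
  date = as.Date(c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-05",
                   "2013-01-06", "2013-01-08", "2013-01-09", "2013-01-10",
                   "2013-01-06", "2013-01-07")),
  sum  = c(10, 20, 10, 5, 6, 7, 4, 0, 1, 1),
  stringsAsFactors = FALSE
)

# For each row i: total revenue of the same user on the 5 days before
# row i's date, i.e. dates in [date - 5, date - 1]; the date itself is excluded.
prev5 <- vapply(seq_len(nrow(df)), function(i) {
  in_window <- df$user == df$user[i] &
    df$date >= df$date[i] - 5 &
    df$date <= df$date[i] - 1
  sum(df$sum[in_window])
}, numeric(1))

print(cbind(df, prev5))
```

On the sample data, prev5 agrees with the sumSum column in every row, including the 18 for row 7.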
