How to speed up cumulative sum within group?

Problem description

I have the following data frame:

id<-c(1,1,1,1,1,3,3,3,3)
spent<-c(10,20,30,40,50,60,70,80,90)
date<-c("11-11-07","11-11-07","23-11-07","12-12-08","17-12-08","11-11-07","23-11-07","23-11-07","16-01-08")
df<-data.frame(id,date,spent)
df$date2<-as.Date(as.character(df$date), format = "%d-%m-%y")


 id     date spent      date2
1  1 11-11-07    10 2007-11-11
2  1 11-11-07    20 2007-11-11
3  1 23-11-07    30 2007-11-23
4  1 12-12-08    40 2008-12-12
5  1 17-12-08    50 2008-12-17
6  3 11-11-07    60 2007-11-11
7  3 23-11-07    70 2007-11-23
8  3 23-11-07    80 2007-11-23
9  3 16-01-08    90 2008-01-16

I need to calculate the running sum spent by each id per day and include it in the data frame as follows:

  id     date spent      date2 sum.spent
1  1 11-11-07    10 2007-11-11        10
2  1 11-11-07    20 2007-11-11        30
3  1 23-11-07    30 2007-11-23        30
4  1 12-12-08    40 2008-12-12        40
5  1 17-12-08    50 2008-12-17        50
6  3 11-11-07    60 2007-11-11        60
7  3 23-11-07    70 2007-11-23        70
8  3 23-11-07    80 2007-11-23       150
9  3 16-01-08    90 2008-01-16        90

The following script works well (except for the first row, which is not a big deal):

df$spent2 <- NA
for (a in 2:9) {
  if (df[a, 1] == df[a - 1, 1] && df[a, 4] == df[a - 1, 4]) {
    df[a, 5] <- df[a, 3] + df[a - 1, 3]  # same id and date as the previous row
  } else {
    df[a, 5] <- df[a, 3]
  }
}

However, since my actual dataset has around 1.5 million rows, the above script takes around 5 days to run. Can you suggest a more efficient way to write this code and achieve the same objective?

Recommended answer

data.table is pretty fast, especially for such large datasets. This should run pretty quickly for 1.5 million records.

library(data.table)
df <- data.table(df)
df <- df[, sum.spent:=cumsum(spent), by = list(id, date2)]
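
If data.table is not available, the same per-group running total can be sketched in base R with ave(), which applies a function within each combination of grouping variables and returns the results in the original row order. This is an alternative to the answer's data.table call, not part of it; the sample data is rebuilt from the question, and the column name sum.spent is taken from the desired output.

```r
# Rebuild the sample data from the question
id    <- c(1, 1, 1, 1, 1, 3, 3, 3, 3)
spent <- c(10, 20, 30, 40, 50, 60, 70, 80, 90)
date  <- c("11-11-07", "11-11-07", "23-11-07", "12-12-08", "17-12-08",
           "11-11-07", "23-11-07", "23-11-07", "16-01-08")
df <- data.frame(id, date, spent)
df$date2 <- as.Date(df$date, format = "%d-%m-%y")

# Running sum of spent within each (id, date2) group, with no explicit loop;
# ave() keeps the rows in their original order
df$sum.spent <- ave(df$spent, df$id, df$date2, FUN = cumsum)

print(df)
```

Like the data.table version, this computes a true running total, so a group with three or more rows on the same day accumulates all earlier rows, whereas the loop in the question only adds the immediately preceding row.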
