data.table: Aggregate by every two weeks


Question

Take the following data.table. It has dates and a column of numbers. I'd like to get the week of each date and then aggregate (sum) over every two weeks.

library(data.table)

Date <- as.Date(c("1980-01-01", "1980-01-02", "1981-01-05", "1981-01-05", "1982-01-08", "1982-01-15", "1980-01-16", "1980-01-17",
                  "1981-01-18", "1981-01-22", "1982-01-24", "1982-01-26"))
Runoff <- c(2, 1, 0.1, 3, 2, 5, 1.5, 0.5, 0.3, 2, 1.5, 4)
DT <- data.table(Date, Runoff)
DT

From the date, I can easily get the year and week.

DT[,c("Date_YrWeek") := paste(substr(Date,1,4), week(Date), sep="-")][]

What I'm struggling with is aggregating over every two weeks. I thought I could get the first date of each week and filter using those values, but that seems like a clumsy approach.

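# Name the aggregated column so the chained order() sorts the result's own column
DT[, .(first_date = min(Date)), by = .(Date_YrWeek)][order(first_date)]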

The final result would be the sum for every two weeks.

weeks    sum_value
1 and 2  ...
3 and 4  ...
5 and 6  ...

Does anyone have an efficient way to do this with data.table?

Recommended answer

1) Define the two-week periods as consecutive 14-day blocks starting from the minimum Date. The total Runoff for each such period is then:

DT[, .(sum_value = sum(Runoff)), 
     keyby = .(Date = 14 * (as.numeric(Date - min(Date)) %/% 14) + min(Date))]

This gives the following, where the Date column is the first day of each two-week period.

         Date sum_value
1: 1980-01-01       3.0
2: 1980-01-15       2.0
3: 1980-12-30       3.1
4: 1981-01-13       2.3
5: 1981-12-29       2.0
6: 1982-01-12       6.5
7: 1982-01-26       4.0
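
The binning arithmetic in (1) can be checked on a few dates in isolation. The sketch below wraps the same expression in a helper; the name `period_start` and its `width` and `origin` arguments are hypothetical, introduced here for illustration only, and are not part of data.table or the original answer.

```r
# Hypothetical helper (not part of data.table): first day of the
# fixed-width period that contains each date, counted from `origin`.
period_start <- function(date, width = 14, origin = min(date)) {
  days_since_origin <- as.numeric(date - origin)  # whole days elapsed since origin
  origin + width * (days_since_origin %/% width)  # round down to a multiple of `width`
}

d <- as.Date(c("1980-01-01", "1981-01-05", "1982-01-26"))
data.frame(date = d, period_start = period_start(d))
```

For 1981-01-05, for example, 370 days have elapsed since 1980-01-01, 370 %/% 14 is 26, and 26 * 14 = 364 days after the origin is 1980-12-30, which matches row 3 of the output above.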

2) If you prefer the text labels shown in the question for the first column, then:

DT[, .(sum_value = sum(Runoff)), 
    keyby = .(two_week = as.numeric(Date - min(Date)) %/% 14)][
    , .(weeks = paste(2*two_week + 1, "and", 2*two_week + 2), sum_value)]

giving:

         weeks sum_value
1:     1 and 2       3.0
2:     3 and 4       2.0
3:   53 and 54       3.1
4:   55 and 56       2.3
5: 105 and 106       2.0
6: 107 and 108       6.5
7: 109 and 110       4.0
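
In (2) the week numbers keep counting up from the minimum Date, so later years get labels such as "53 and 54". If the intent is instead to pair calendar weeks within each year, one possible variant is sketched below. This is an assumption about what the question wants, not part of the original answer; it relies on `year()` and `week()` from data.table, and the column names `year` and `pair` are chosen here for illustration.

```r
library(data.table)

Date <- as.Date(c("1980-01-01", "1980-01-02", "1981-01-05", "1981-01-05",
                  "1982-01-08", "1982-01-15", "1980-01-16", "1980-01-17",
                  "1981-01-18", "1981-01-22", "1982-01-24", "1982-01-26"))
Runoff <- c(2, 1, 0.1, 3, 2, 5, 1.5, 0.5, 0.3, 2, 1.5, 4)
DT <- data.table(Date, Runoff)

# Assumption: "weeks 1 and 2" means calendar weeks within each year,
# as numbered by data.table::week(), not 14-day blocks from min(Date).
res <- DT[, .(sum_value = sum(Runoff)),
          keyby = .(year = year(Date), pair = (week(Date) - 1L) %/% 2L)][
          , .(year, weeks = paste(2L * pair + 1L, "and", 2L * pair + 2L), sum_value)]
res
```

With this grouping, a week 53 at the end of a year would be labelled "53 and 54" even though no week 54 exists, and the periods no longer have a uniform 14-day length across year boundaries, so approach (1) remains the cleaner choice when evenly sized periods matter.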

Update: Revised and added (2).
