Aggregate Weekly Data in R
Question
I am sure this is straightforward, but I just can't seem to get it to work. I have a data frame that represents daily totals. I simply want to sum the totals by week, retaining a zero if a week is not represented. What is the best approach in R? In case it matters, I read the data in from a CSV and converted the date column to Date class once in R.
Here is the structure of my data frame p1:
'data.frame': 407 obs. of 2 variables:
$ date:Class 'Date' num [1:407] 14335 14336 14337 14340 14341 ...
$ amt : num 45 150 165 165 45 45 150 150 15 165 ...
and the first few rows:
> head(p1)
date amt
1 2009-04-01 45
2 2009-04-02 150
3 2009-04-03 165
4 2009-04-06 165
5 2009-04-07 45
6 2009-04-08 45
Many thanks in advance.
One note: I saw one previous post but couldn't get it to work
Recommended answer
Here is a solution that reads in the data, aggregates it by week, and then fills in missing weeks with zero, all in three lines of code. read.zoo reads the data in, assuming a header and a comma field separator. It converts the first column to Date class and then maps each date to the Friday on or after it. The nextfri function that does this transformation is taken from the zoo-quickref vignette in the zoo package. (If you want the week to end on a different day, replace each 5 in nextfri with another day number.) The read.zoo command also aggregates all points that have the same index; since every date has been moved to the Friday that ends its week, all points in the same week now share that Friday as their index. The next command creates a zero-width zoo object that has the weeks from the first to the last and merges it with the output of the read, using fill = 0 so that the filled-in weeks get that value.
Lines <- "date,amt
2009-04-01,45
2009-04-02,150
2009-04-03,165
2009-04-13,165
2009-04-14,45
2009-04-15,45"
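library(zoo)
# Map each date to the Friday on or after it (from the zoo-quickref vignette).
nextfri <- function(x) 7 * ceiling(as.numeric(x - 5 + 4)/7) + as.Date(5 - 4)
# Read the data, convert to Date, shift to the week-ending Friday, and sum within each week.
z <- read.zoo(textConnection(Lines), header = TRUE, sep = ",",
              FUN = as.Date, FUN2 = nextfri, aggregate = sum)
# Merge with a zero-width series of every Friday in range; missing weeks become 0.
merge(z, zoo(, seq(min(time(z)), max(time(z)), 7)), fill = 0)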
We used textConnection(Lines) above to make the example self-contained, so that you can copy and paste it straight into your session; in practice, textConnection(Lines) would be replaced with the name of your file, e.g. "myfile.csv".
For the input above the output would be the following zoo object:
2009-04-03 2009-04-10 2009-04-17
360 0 255
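The Friday dates in this output come from the date arithmetic in nextfri. As a rough check of that arithmetic, here is a base R sketch that does not need zoo. It is a restatement for illustration, not the vignette's code: the function name next_friday and the explicit origin argument are assumptions of this sketch. It relies on the fact that 1970-01-01 (day 0) was a Thursday, so Fridays are the days whose number leaves remainder 1 when divided by 7.

```r
# Hypothetical restatement of nextfri in base R (no zoo required).
# Day numbers count from 1970-01-01, a Thursday, so Fridays are 1, 8, 15, ...
next_friday <- function(x) {
  days <- as.numeric(x)
  as.Date(7 * ceiling((days - 1) / 7) + 1, origin = "1970-01-01")
}

d <- as.Date(c("2009-04-01", "2009-04-03", "2009-04-06", "2009-04-13"))
fridays <- next_friday(d)

# Each input date next to the Friday that ends its week.
print(data.frame(date = d, week_ending = fridays))

# wday is 0 for Sunday through 6 for Saturday, so 5 means Friday.
print(as.POSIXlt(fridays)$wday)
```

A date that is already a Friday, such as 2009-04-03, maps to itself, which is why the three April 1 to 3 rows in the example all land in the week ending 2009-04-03.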
There are three vignettes that come with the zoo package that you might want to read.
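Since the question starts from a data frame that is already in memory rather than from a CSV file, the same three steps (map each date to its week-ending Friday, sum within each week, zero-fill the missing weeks) can also be sketched in base R alone. This is an alternative to the zoo approach above, not part of it. The p1 below is a small stand-in built from the sample rows in the answer, and week_end, weekly, all_weeks, and result are names chosen for this sketch.

```r
# Small stand-in for the asker's data frame p1 (same columns: date, amt).
p1 <- data.frame(
  date = as.Date(c("2009-04-01", "2009-04-02", "2009-04-03",
                   "2009-04-13", "2009-04-14", "2009-04-15")),
  amt  = c(45, 150, 165, 165, 45, 45)
)

# Map each date to the Friday on or after it (same arithmetic as nextfri).
week_end <- function(x) {
  as.Date(7 * ceiling((as.numeric(x) - 1) / 7) + 1, origin = "1970-01-01")
}
p1$week <- week_end(p1$date)

# Sum the daily totals within each week.
weekly <- aggregate(amt ~ week, data = p1, FUN = sum)

# Build every week from the first to the last, left-join, and zero-fill the gaps.
all_weeks <- data.frame(week = seq(min(weekly$week), max(weekly$week), by = "week"))
result <- merge(all_weeks, weekly, by = "week", all.x = TRUE)
result$amt[is.na(result$amt)] <- 0
print(result)
```

For the sample rows this gives the same weekly totals as the zoo output above (360, 0, and 255 for the weeks ending 2009-04-03, 2009-04-10, and 2009-04-17), but as a plain data frame rather than a zoo object.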