Summing rows by month in R
Question
I have a data frame that has a date column, an hour column and a series of other numerical columns. Each row in the data frame is 1 hour of 1 day for an entire year.
The data frame looks like this:
Date Hour Melbourne Southern Flagstaff
1 2009-05-01 0 0 5 17
2 2009-05-01 2 0 2 1
3 2009-05-01 1 0 11 0
4 2009-05-01 3 0 3 8
5 2009-05-01 4 0 1 0
6 2009-05-01 5 0 49 79
7 2009-05-01 6 0 425 610
The hours are out of order because this is a subset of another data frame.
I would like to sum the values in the numerical columns by month and possibly by day. Does anyone know how I can do this?
Recommended answer
I created the example data frame with:
data <- read.table( text=" Date Hour Melbourne Southern Flagstaff
1 2009-05-01 0 0 5 17
2 2009-05-01 2 0 2 1
3 2009-05-01 1 0 11 0
4 2009-05-01 3 0 3 8
5 2009-05-01 4 0 1 0
6 2009-05-01 5 0 49 79
7 2009-05-01 6 0 425 610",
header=TRUE,stringsAsFactors=FALSE)
You can do the summation with the function aggregate:
byday <- aggregate(cbind(Melbourne,Southern,Flagstaff)~Date,
data=data,FUN=sum)
library(lubridate)
bymonth <- aggregate(cbind(Melbourne,Southern,Flagstaff)~month(Date),
data=data,FUN=sum)
Look at ?aggregate to understand the function better. Starting with the last argument (because that makes the explanation easier), the arguments do the following:
- FUN is the function that should be used for the aggregation. I use sum to add up the values, but it could also be mean, max, or a function you wrote yourself.
- data indicates the data frame that I want to aggregate.
- The first argument tells the function what exactly I want to aggregate. On the left side of the ~, I indicate the variables I want to aggregate. If there is more than one, they are combined with cbind. On the right-hand side is the variable by which the data should be split. Putting Date there means that aggregate will sum up the variables for each distinct value of Date.
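To illustrate the FUN argument, here is a minimal sketch that swaps sum for max and for a user-written function. The data frame counts and the helper spread are hypothetical names made up for this illustration; they are not part of the question's data.

```r
# Small hypothetical data frame in the same shape as the question's data
counts <- data.frame(
  Date      = c("2009-05-01", "2009-05-01", "2009-05-02", "2009-05-02"),
  Hour      = c(0, 1, 0, 1),
  Southern  = c(5, 11, 2, 8),
  Flagstaff = c(17, 0, 1, 3)
)

# Daily maximum instead of daily sum
daily_max <- aggregate(cbind(Southern, Flagstaff) ~ Date,
                       data = counts, FUN = max)

# A user-written function works as well: here, max minus min per day
spread <- function(x) max(x) - min(x)
daily_spread <- aggregate(cbind(Southern, Flagstaff) ~ Date,
                          data = counts, FUN = spread)

print(daily_max)
print(daily_spread)
```

Only the FUN argument changes between the two calls; the formula and the data argument stay the same.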
For the aggregation by month, I used the function month from the package lubridate. It does what one expects: it returns a numeric value indicating the month for a given date. You may first need to install the package with install.packages("lubridate").
If you prefer not to use lubridate, you could do the following instead:
data <- transform(data,month=as.numeric(format(as.Date(Date),"%m")))
bymonth <- aggregate(cbind(Melbourne,Southern,Flagstaff)~month,
data=data,FUN=sum)
Here I added a new column to data that contains the month and then aggregated by that column.
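One caveat with a plain month number: if the data ever span more than one calendar year, rows from the same month in different years are added together. A possible variation, sketched below on the assumption that this is not wanted, groups by a combined year-month key instead. The data frame visits and the column yearmonth are hypothetical names used only for this illustration.

```r
# Hypothetical data with the same month (May) appearing in two different years
visits <- data.frame(
  Date      = c("2009-05-01", "2009-05-02", "2009-06-01", "2010-05-01"),
  Melbourne = c(1, 2, 3, 4)
)

# Build a "YYYY-MM" key so that May 2009 and May 2010 stay separate
visits$yearmonth <- format(as.Date(visits$Date), "%Y-%m")

by_yearmonth <- aggregate(Melbourne ~ yearmonth, data = visits, FUN = sum)
print(by_yearmonth)
```

The key is a character string, so the result is sorted in chronological order as long as the "%Y-%m" format is used.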