Summing rows by month in R

Question

So I have a data frame that has a date column, an hour column and a series of other numerical columns. Each row in the data frame is 1 hour of 1 day for an entire year.

The data frame looks like this:

        Date Hour Melbourne Southern Flagstaff
1 2009-05-01    0         0        5        17
2 2009-05-01    2         0        2         1
3 2009-05-01    1         0       11         0
4 2009-05-01    3         0        3         8
5 2009-05-01    4         0        1         0
6 2009-05-01    5         0       49        79
7 2009-05-01    6         0      425       610

The hours are out of order because this is subsetted from another data frame.

I would like to sum the values in the numerical columns by month and possibly by day. Does anyone know how I can do this?

Recommended answer

I created the sample data with:

data <- read.table( text="   Date    Hour    Melbourne   Southern    Flagstaff
                       1   2009-05-01  0   0   5   17
                       2   2009-05-01  2   0   2   1
                       3   2009-05-01  1   0   11  0
                       4   2009-05-01  3   0   3   8
                       5   2009-05-01  4   0   1   0
                       6   2009-05-01  5   0   49  79
                       7   2009-05-01  6   0   425 610",
                    header=TRUE,stringsAsFactors=FALSE)

You can do the summation with the function aggregate:

byday <- aggregate(cbind(Melbourne,Southern,Flagstaff)~Date,
             data=data,FUN=sum)
library(lubridate)
bymonth <- aggregate(cbind(Melbourne,Southern,Flagstaff)~month(Date),
             data=data,FUN=sum)
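
As a quick sanity check on the by-day call, here is a minimal sketch that rebuilds the seven sample rows and compares the aggregated totals with plain column sums. The names sample_rows, by_day and totals are mine, chosen for illustration. It uses base R only, so lubridate is not needed for this part.

```r
# Same sample rows as above, rebuilt here so the block runs on its own.
# The header has one field fewer than the data lines, so read.table
# treats the first column as row names.
sample_rows <- read.table(text = "Date Hour Melbourne Southern Flagstaff
1 2009-05-01 0 0 5 17
2 2009-05-01 2 0 2 1
3 2009-05-01 1 0 11 0
4 2009-05-01 3 0 3 8
5 2009-05-01 4 0 1 0
6 2009-05-01 5 0 49 79
7 2009-05-01 6 0 425 610",
  header = TRUE, stringsAsFactors = FALSE)

# Sum the three numeric columns for each distinct Date.
by_day <- aggregate(cbind(Melbourne, Southern, Flagstaff) ~ Date,
                    data = sample_rows, FUN = sum)

# All seven rows share one date, so the result has a single row whose
# totals should match the plain column sums.
totals <- colSums(sample_rows[, c("Melbourne", "Southern", "Flagstaff")])

print(by_day)
print(totals)
```

The hours being out of order does not matter here, because sum does not depend on row order.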

Look at ?aggregate to understand the function better. Starting with the last argument (because that makes explaining easier) the arguments do the following:

  • FUN is the function that should be used for the aggregation. I use sum to sum up the values, but it could also be mean, max, or some function you wrote yourself.
  • data indicates the data frame that I want to aggregate.
  • The first argument is a formula that tells the function what exactly I want to aggregate. On the left side of the ~, I indicate the variables I want to aggregate. If there is more than one, they are combined with cbind. On the right side is the variable by which the data should be split. Putting Date there means that aggregate will sum up the variables for each distinct value of Date.
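
To make the role of each argument concrete, here is a small sketch on a made-up two-day data frame. The data frame counts, its values, and the helper spread are hypothetical and only serve as illustration. It keeps the formula fixed and swaps FUN.

```r
# Hypothetical two-day sample; the values are made up for illustration.
counts <- data.frame(
  Date      = c("2009-05-01", "2009-05-01", "2009-05-02", "2009-05-02"),
  Hour      = c(0, 1, 0, 1),
  Southern  = c(5, 11, 2, 4),
  Flagstaff = c(17, 0, 1, 9),
  stringsAsFactors = FALSE
)

# Left of ~: the variables to aggregate, combined with cbind.
# Right of ~: the variable that splits the data into groups.
by_day_sum <- aggregate(cbind(Southern, Flagstaff) ~ Date,
                        data = counts, FUN = sum)

# Swapping FUN changes the summary without touching the formula.
by_day_mean <- aggregate(cbind(Southern, Flagstaff) ~ Date,
                         data = counts, FUN = mean)

# A user-written function works too; this one returns max minus min.
spread <- function(x) max(x) - min(x)
by_day_spread <- aggregate(cbind(Southern, Flagstaff) ~ Date,
                           data = counts, FUN = spread)

print(by_day_sum)
print(by_day_mean)
print(by_day_spread)
```

Each result has one row per distinct Date and one column per variable named inside cbind.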

For the aggregation by month, I used the function month from the package lubridate. It does what one expects: it returns a numeric value indicating the month for a given date. Maybe you first need to install the package by install.packages("lubridate").

If you prefer not to use lubridate, you could do the following instead:

data <- transform(data,month=as.numeric(format(as.Date(Date),"%m")))
bymonth <- aggregate(cbind(Melbourne,Southern,Flagstaff)~month,
                     data=data,FUN=sum)

Here I added a new column to data that contains the month and then aggregated by that column.
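
One caveat, offered as a suggestion rather than as part of the original answer: a numeric month merges the same calendar month from different years. The question says the data covers a single year, so this may not matter, but if the data ever spanned more than twelve months, a year-month key would keep the groups apart. A minimal sketch, with a hypothetical data frame counts and made-up values:

```r
# Hypothetical sample spanning two years; the values are made up.
counts <- data.frame(
  Date     = c("2009-05-01", "2009-05-15", "2010-05-01", "2010-06-01"),
  Southern = c(5, 11, 2, 4),
  stringsAsFactors = FALSE
)

# Grouping key that keeps the year, e.g. "2009-05".
counts$yearmonth <- format(as.Date(counts$Date), "%Y-%m")
by_yearmonth <- aggregate(Southern ~ yearmonth, data = counts, FUN = sum)

# For comparison: the numeric month alone pools May 2009 with May 2010.
counts$month <- as.numeric(format(as.Date(counts$Date), "%m"))
by_month <- aggregate(Southern ~ month, data = counts, FUN = sum)

print(by_yearmonth)
print(by_month)
```

Here by_yearmonth has three rows (2009-05, 2010-05, 2010-06), while by_month has only two because both Mays are pooled.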
