Summing rows by month in R


Question

I have a data frame with a date column, an hour column, and a series of other numeric columns. Each row holds one hour of one day, and the data cover an entire year.

The data frame looks like this:

          Date  Hour  Melbourne  Southern  Flagstaff
1   2009-05-01     0          0         5         17
2   2009-05-01     2          0         2          1
3   2009-05-01     1          0        11          0
4   2009-05-01     3          0         3          8
5   2009-05-01     4          0         1          0
6   2009-05-01     5          0        49         79
7   2009-05-01     6          0       425        610

The hours are out of order because this is a subset of another data frame.

I would like to sum the values in the numeric columns by month, and possibly by day. Does anyone know how I can do this?

Recommended answer

I created the data set as follows:

data <- read.table( text="   Date    Hour    Melbourne   Southern    Flagstaff
                       1   2009-05-01  0   0   5   17
                       2   2009-05-01  2   0   2   1
                       3   2009-05-01  1   0   11  0
                       4   2009-05-01  3   0   3   8
                       5   2009-05-01  4   0   1   0
                       6   2009-05-01  5   0   49  79
                       7   2009-05-01  6   0   425 610",
                    header=TRUE,stringsAsFactors=FALSE)

You can use the aggregate function to compute the sums:

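# Daily totals: sum each numeric column for every distinct Date
byday <- aggregate(cbind(Melbourne,Southern,Flagstaff)~Date,
             data=data,FUN=sum)
# Monthly totals: month() from lubridate extracts the month number
library(lubridate)
bymonth <- aggregate(cbind(Melbourne,Southern,Flagstaff)~month(Date),
             data=data,FUN=sum)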

See ?aggregate to understand the function better. Starting with the last argument (because that makes the explanation easier), the arguments do the following:

  • FUN is the function used for the aggregation. I use sum to add up the values, but it could also be mean, max, or a function you wrote yourself.
  • data indicates the data frame that I want to aggregate.
  • The first argument tells the function what exactly I want to aggregate. On the left-hand side of the ~, I list the variables I want to aggregate. If there is more than one, they are combined with cbind. On the right-hand side is the variable by which the data should be split. Putting Date there means that aggregate will sum the variables for each distinct value of Date.
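As a minimal sketch of the points above, the same call can be run with a different FUN, with a user-defined function, or with the . shorthand on the left-hand side of the formula. The names sample_data, spread, daily_max, daily_spread and daily_total are made up for this illustration; the values are the seven rows shown in the question.

```r
# Rebuild the seven sample rows from the question (base R only).
sample_data <- data.frame(
  Date      = rep("2009-05-01", 7),
  Hour      = c(0, 2, 1, 3, 4, 5, 6),
  Melbourne = rep(0, 7),
  Southern  = c(5, 2, 11, 3, 1, 49, 425),
  Flagstaff = c(17, 1, 0, 8, 0, 79, 610),
  stringsAsFactors = FALSE
)

# Same formula as in the answer, but FUN = max gives the daily peak.
daily_max <- aggregate(cbind(Melbourne, Southern, Flagstaff) ~ Date,
                       data = sample_data, FUN = max)

# A user-defined function works as well (spread is a hypothetical helper).
spread <- function(x) max(x) - min(x)
daily_spread <- aggregate(cbind(Melbourne, Southern, Flagstaff) ~ Date,
                          data = sample_data, FUN = spread)

# "." on the left-hand side means "every column not used on the right".
# Hour is dropped first so that it is not summed along with the counts.
counts_only <- sample_data[, names(sample_data) != "Hour"]
daily_total <- aggregate(. ~ Date, data = counts_only, FUN = sum)

print(daily_total)
```

The . form may be convenient when there are many numeric columns, since they do not have to be listed inside cbind one by one.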

For the aggregation by month, I used the function month from the lubridate package. It does what one would expect: it returns a numeric value indicating the month of a given date. You may first need to install the package with install.packages("lubridate").

If you prefer not to use lubridate, you could do the following instead:

data <- transform(data,month=as.numeric(format(as.Date(Date),"%m")))
bymonth <- aggregate(cbind(Melbourne,Southern,Flagstaff)~month,
                     data=data,FUN=sum)

Here I added a new column to data that contains the month, and then aggregated by that column.
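Grouping by the month number alone pools the same calendar month from different years into one row. The question's data cover a single year, so that is probably fine there. If the data ever contained the same month twice, one option is to group by a year-month key instead. The sketch below uses only base R; the data frame multi_year and its values are invented for illustration and do not come from the question.

```r
# Hypothetical rows: May appears in both 2009 and 2010 (values invented).
multi_year <- data.frame(
  Date      = c("2009-05-01", "2009-05-02", "2010-05-01", "2010-06-01"),
  Southern  = c(5, 2, 11, 3),
  Flagstaff = c(17, 1, 0, 8),
  stringsAsFactors = FALSE
)

# "%Y-%m" keeps the year, so 2009-05 and 2010-05 stay separate groups.
multi_year$yearmonth <- format(as.Date(multi_year$Date), "%Y-%m")

# Sum the numeric columns within each year-month group.
by_yearmonth <- aggregate(cbind(Southern, Flagstaff) ~ yearmonth,
                          data = multi_year, FUN = sum)

print(by_yearmonth)
```

Because the key is a zero-padded string such as 2009-05, the groups also sort chronologically.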
