How to calculate average of a variable by hour in R


Question

I'm having trouble when trying to calculate the average temperature by hour.

I have a data frame with date, time (hh:mm:ss p.m./a.m.) and temperature. What I need is to extract the mean temperature by hour in order to plot the daily variation of temperature.

I'm new to R, but I made an attempt with what I know: I converted the hours to numbers, extracted the first two characters, and then calculated the mean, but it didn't work very well. Moreover, I have so many files to analyze that it would be much better to have something more automated and cleaner than the "solution" I found.

I believe there must be a better way to calculate averages by hour in R, so I've been looking for the answer in other posts here. Unfortunately, I couldn't find a clear answer about extracting statistics from time data.

My data looks like this:

          date     hour temperature
1   28/12/2013 13:03:01      41.572
2   28/12/2013 13:08:01      46.059
3   28/12/2013 13:13:01       48.55
4   28/12/2013 13:18:01      49.546
5   28/12/2013 13:23:01      49.546
6   28/12/2013 13:28:01      49.546
7   28/12/2013 13:33:01      50.044
8   28/12/2013 13:38:01      50.542
9   28/12/2013 13:43:01      50.542
10  28/12/2013 13:48:01       51.04
11  28/12/2013 13:53:01      51.538
12  28/12/2013 13:58:01      51.538
13  28/12/2013 14:03:01      50.542
14  28/12/2013 14:08:01       51.04
15  28/12/2013 14:13:01       51.04
16  28/12/2013 14:18:01      52.534
17  28/12/2013 14:23:01      53.031
18  28/12/2013 14:28:01      53.031
19  28/12/2013 14:33:01      53.031
20  28/12/2013 14:38:01      51.538
21  28/12/2013 14:43:01      53.031
22  28/12/2013 14:48:01      53.529
etc. (24 hours of data)

I would like R to calculate the average per hour (ignoring differences in minutes or seconds, just by hour).

Any suggestions? Thank you very much in advance!

Regards, Maria

Recommended answer

It is always easier to help if sample data and the expected output are given in the question.

Solution with the data.table package

library(data.table)
data <- fread('temp.csv', sep = ',')  # assuming your data is in temp.csv
# if the data was loaded as a plain data frame instead, convert it to a data.table
data <- as.data.table(data)
> str(data)
Classes ‘data.table’ and 'data.frame':  12 obs. of  3 variables:
$ date       : chr  "28/12/2013" "28/12/2013" "28/12/2013" "28/12/2013" ...
$ hour       : chr  "13:03:01" "13:08:01" "13:13:01" "13:18:01" ...
$ temperature: num  41.6 46.1 48.5 49.5 49.5 ...
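
The `str()` output above shows that `date` and `hour` are both read as character vectors. One way to make the hour of day explicit is to parse the two columns into a single date-time first. The following is a minimal sketch in base R, assuming the day/month/year and 24-hour formats shown in the question's sample; the `datetime` and `hour_of_day` column names and the UTC time zone are illustrative choices, not part of the original answer.

```r
# Two rows copied from the question's sample data
df <- data.frame(
  date = c("28/12/2013", "28/12/2013"),
  hour = c("13:03:01", "14:08:01"),
  temperature = c(41.572, 51.04),
  stringsAsFactors = FALSE
)

# Combine date and time, then parse them as one date-time value.
# tz = "UTC" is an assumption; use the time zone the data was recorded in.
df$datetime <- as.POSIXct(paste(df$date, df$hour),
                          format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

# Extract the hour of day (0-23) as an integer
df$hour_of_day <- as.integer(format(df$datetime, "%H"))

str(df)
```

If the raw files store times with a.m./p.m. rather than in 24-hour form, the format string would need adjusting, for example with `%I` and `%p`, whose matching depends on the locale.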

> data
      date     hour    temperature      avg
1: 27/12/2013 13:00:00       42.99 35.78455
2: 27/12/2013 14:00:00       65.97 35.78455
3: 27/12/2013 15:00:00       63.57 35.78455 

> data[, list(avg = mean(temperature)), by = hour]  # groups by the full time string in the hour column
    hour   avg
1: 13:00:00 42.99
2: 14:00:00 65.97
3: 15:00:00 63.57
> data[, list(avg = mean(temperature)), by = "date,hour"]  # groups by date, then by the hour column
        date     hour   avg
1: 27/12/2013 13:00:00 42.99
2: 27/12/2013 14:00:00 65.97
3: 27/12/2013 15:00:00 63.57

> data[, list(avg = mean(temperature)), by = list(date, hour = hour(as.POSIXct(hour, format = "%H:%M:%S")))]  # groups by date and hour of day, ignoring minutes and seconds
     date     hour    avg
1: 27/12/2013    1 29.530
2: 27/12/2013    4 65.970
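
Since the question mentions many files, the last grouping can be wrapped in a small function and applied to each file. The sketch below assumes every file has the same `date`, `hour` and `temperature` columns in the 24-hour format of the question's sample; `hourly_means`, `hour_of_day` and the temporary CSV file are hypothetical names used only for illustration.

```r
library(data.table)

# Hypothetical helper: read one file and return the mean temperature
# per date and hour of day.
hourly_means <- function(path) {
  # Force date and hour to be read as plain character columns
  dt <- fread(path, colClasses = list(character = c("date", "hour")))
  # data.table::hour() is spelled out because the table also has a column
  # named "hour". tz = "UTC" is an assumption and only affects parsing.
  dt[, hour_of_day := data.table::hour(
    as.POSIXct(hour, format = "%H:%M:%S", tz = "UTC"))]
  dt[, .(avg = mean(temperature)), by = .(date, hour_of_day)]
}

# Write a few rows from the question's sample to a temporary file
sample_rows <- data.table(
  date = "28/12/2013",
  hour = c("13:03:01", "13:08:01", "13:13:01",
           "14:03:01", "14:08:01", "14:13:01"),
  temperature = c(41.572, 46.059, 48.55, 50.542, 51.04, 51.04)
)
tmp <- tempfile(fileext = ".csv")
fwrite(sample_rows, tmp)

# Apply the helper to every file and stack the results.
# In practice the vector could come from list.files(pattern = "\\.csv$").
files <- c(tmp)
result <- rbindlist(lapply(files, hourly_means))
print(result)
```

The result has one row per date and hour of day, which is the shape needed to plot the daily variation of temperature.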
