How to calculate average values for large datasets

Views: 313 | Posted: 2020/5/28 20:31:27 | Tags: r time-series average plyr

Problem description

I am working with a dataset that has temperature readings once an hour, 24 hours a day, for 100+ years. I want to get an average temperature for each day to reduce the size of my dataset. The columns look like this:

    YR MO DA HR MN TEMP
  1943  6 19 10  0   73
  1943  6 19 11  0   72
  1943  6 19 12  0   76
  1943  6 19 13  0   78
  1943  6 19 14  0   81
  1943  6 19 15  0   85
  1943  6 19 16  0   85
  1943  6 19 17  0   86
  1943  6 19 18  0   86
  1943  6 19 19  0   87

and so on, for 600,000+ data points.

How can I run a nested function to calculate the daily average temperature so that I preserve YR, MO, DA, and TEMP? Once I have this, I want to be able to look at long-term averages and calculate, say, the average temperature for the month of January across 30 years. How do I do this?

Recommended answer

In one step you could do this:

 meanTbl <- with(datfrm, tapply(TEMP, ISOdate(YR, MO, DA), mean) )

This gives you a date-time formatted index as well as the values. If you want just the date as a character string, without the trailing time:

 meanTbl <- with(datfrm, tapply(TEMP, as.Date(ISOdate(YR, MO, DA)), mean) )
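
Since the question asks to keep YR, MO, and DA as separate columns, one alternative is base R's aggregate(), which returns a data frame instead of a named vector. The following is a minimal sketch, not part of the original answer; the sample values in datfrm are made up for illustration and only mimic the layout shown in the question.

```r
# Small made-up sample in the same layout as the question's data
# (values are illustrative only, not taken from the real dataset)
datfrm <- data.frame(
  YR   = c(1943, 1943, 1943, 1943),
  MO   = c(6, 6, 6, 6),
  DA   = c(19, 19, 20, 20),
  HR   = c(10, 11, 10, 11),
  MN   = c(0, 0, 0, 0),
  TEMP = c(73, 72, 80, 82)
)

# One row per (YR, MO, DA) combination, with TEMP replaced by that day's mean
dailyMeans <- aggregate(TEMP ~ YR + MO + DA, data = datfrm, FUN = mean)
print(dailyMeans)
```

The result has the columns YR, MO, DA, and TEMP, so it can be filtered or grouped again by year or month. With many years of data the rows may not come out in chronological order; they can be re-sorted with order(dailyMeans$YR, dailyMeans$MO, dailyMeans$DA).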

The monthly averages could be done with:

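 # meanTbl is a named vector (its names are the dates), not a data frame, so take the month from the names
 monMeans <- tapply(meanTbl, format(as.Date(names(meanTbl)), "%m"), mean)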
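
For the 30-year January average mentioned in the question, one possible approach is to filter a table of daily means by month and year range and then average what remains. This is only a sketch under assumptions: dailyMeans is a hypothetical data frame with YR, MO, DA, and TEMP columns, the values are made up, and the window 1951 to 1980 is an arbitrary example.

```r
# Made-up daily means (illustrative values only) in the YR/MO/DA/TEMP layout
dailyMeans <- data.frame(
  YR   = c(1950, 1951, 1951, 1980, 1980, 1981),
  MO   = c(1, 1, 2, 1, 1, 1),
  DA   = c(15, 15, 15, 15, 16, 15),
  TEMP = c(30, 32, 40, 34, 36, 50)
)

# Hypothetical 30-year window; adjust to the period of interest
startYr <- 1951
endYr   <- 1980

# Keep only January rows whose year falls inside the window, then average them
inWindow <- dailyMeans$MO == 1 & dailyMeans$YR >= startYr & dailyMeans$YR <= endYr
janMean  <- mean(dailyMeans$TEMP[inWindow])
print(janMean)
```

Averaging daily means gives every day equal weight. If some days have fewer hourly readings than others, this can differ slightly from averaging the raw hourly values directly.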
