Aggregate 5 minute data to hourly sums with NA's


Question

My problem is as follows: I have a time series of 5-minute precipitation data like this:

            Datum mm
1 2004-04-08 00:05:00 NA
2 2004-04-08 00:10:00 NA
3 2004-04-08 00:15:00 NA
4 2004-04-08 00:20:00 NA
5 2004-04-08 00:25:00 NA
6 2004-04-08 00:30:00 NA

with the following structure:

'data.frame':   1098144 obs. of  2 variables:
$ Datum: POSIXlt, format: "2004-04-08 00:05:00" "2004-04-08 00:10:00"   "2004-04-08 00:15:00" "2004-04-08 00:20:00" ...
$ mm   : num  NA NA NA NA NA NA NA NA NA NA ...

As you can see, the time series begins with a long run of NAs. Precipitation is measured further down, although those measurements are interspersed with occasional single NAs caused by malfunctions of the measuring station.

What I am trying to achieve is to sum the measured precipitation into hourly totals, ignoring the NAs.

This is what I have tried so far:

sums <- aggregate(precip["mm"], 
               list(cut(precip$Datum, "1 hour")), sum)

Even though the timestamps are correctly aggregated to hours, all sums are 0 or NA. A sum is not computed even for hours that contain no NA at all.

One more thing to consider:

Hourly precipitation sums in meteorology always describe the accumulated total up to a given hour: the value reported for 0:00 is the sum from 23:00 of the previous day until 0:00. So I always need to sum up the preceding hour.

Reproducible example

set.seed(1120)
s <- as.POSIXlt("2004-03-08 23:00:00")
r <- seq(s, s+1e4, "30 min")
precip <- data.frame(Datum=r, mm=sample(c(1:5,NA), 6, T))

            Datum mm
2004-03-08 23:00:00  4
2004-03-08 23:30:00  1
2004-03-09 00:00:00  2
2004-03-09 00:30:00  4
2004-03-09 01:00:00  1
2004-03-09 01:30:00  4

With the example above, the result I am looking for is:

            Datum mm
2004-03-09 00:00:00 5
2004-03-09 01:00:00 6
2004-03-09 02:00:00 5

Recommended answer

Try adding na.rm=TRUE:

aggregate(precip['mm'], list(cut(precip$Datum, "1 hour")), sum, na.rm=TRUE)
#               Group.1 mm
# 1 2004-04-08 00:00:00 26
# 2 2004-04-08 01:00:00 35
# 3 2004-04-08 02:00:00 25
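
To see why na.rm matters, here is a minimal sketch of how sum treats missing values. It also shows one side effect worth knowing: with na.rm=TRUE an hour that contains only NAs sums to 0, which looks the same as a dry hour. The helper sum_or_na is a hypothetical name introduced here for illustration, not part of the original answer; it returns NA for such hours instead.

```r
x <- c(1, NA, 3)

plain_sum   <- sum(x)                # NA: a single NA makes the whole sum NA
cleaned_sum <- sum(x, na.rm = TRUE)  # 4: NAs are dropped before summing

# With na.rm = TRUE, a group made up entirely of NAs sums to 0.
all_missing <- sum(c(NA_real_, NA_real_), na.rm = TRUE)

# Hypothetical helper: keep NA when every value in the group is missing,
# otherwise sum the values that are present.
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}
```

If the distinction matters for your data, sum_or_na could be passed to aggregate in place of sum (without the extra na.rm argument).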

Reproducible example

set.seed(1120)
s <- as.POSIXlt("2004-04-08 00:05:00")
r <- seq(s, s+1e4, "5 min")
precip <- data.frame(Datum=r, mm=sample(c(1:5,NA), 34, T))

Addendum

To your second question: if you would like a measurement that falls exactly on the hour to be counted with the preceding hour, add right=TRUE:

aggregate(precip['mm'], list(cut(precip$Datum, "1 hour", right=TRUE)), sum, na.rm=TRUE)

Further explanation

Here is a more detailed example to show how the solution works:

p <- c("2004-04-07 23:48:20", "2004-04-08 00:00:00", "2004-04-08 00:03:20")
ptime <- as.POSIXlt(p)
#[1] "2004-04-07 23:48:20 EDT" "2004-04-08 00:00:00 EDT" "2004-04-08 00:03:20 EDT"

We have three timestamps to separate into groups. If we use cut without any extra arguments, the second entry, "2004-04-08 00:00:00 EDT", is grouped with the third entry under the hour "00:00":

cut(ptime, "1 hour")
#[1] 2004-04-07 23:00:00 2004-04-08 00:00:00 2004-04-08 00:00:00

But if we add the argument right=TRUE, it is grouped with the "23:00" hour instead:

cut(ptime, "1 hour", right=TRUE)
#[1] 2004-04-07 23:00:00 2004-04-07 23:00:00 2004-04-08 00:00:00

In this way we can specify how edge cases are handled.
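
The same boundary rule can be seen on plain numbers. The sketch below expresses the three timestamps as minutes relative to midnight (rounded values, with illustrative variable names) and cuts them at the hour boundaries. Note that cut on numbers defaults to right=TRUE, whereas cut on date-times defaults to right=FALSE, so the argument is spelled out in both calls.

```r
# Minutes relative to midnight for the three timestamps above:
# 23:48:20 -> -11.67, 00:00:00 -> 0, 00:03:20 -> 3.33 (rounded here).
mins <- c(-11.67, 0, 3.33)
breaks <- c(-60, 0, 60)  # hour boundaries at 23:00, 00:00, 01:00

# right = FALSE: intervals are [a, b), so 0 opens the second interval.
left_closed <- as.integer(cut(mins, breaks, right = FALSE))

# right = TRUE: intervals are (a, b], so 0 closes the first interval.
right_closed <- as.integer(cut(mins, breaks, right = TRUE))
```

Here left_closed puts the midnight value in group 2 together with 00:03:20, while right_closed puts it in group 1 together with 23:48:20.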

Edit

With your new data, the original solution produces the desired sums (each group is labelled with the start of its hour):

aggregate(precip['mm'], list(cut(precip$Datum, "1 hour")), sum, na.rm=TRUE)
              Group.1 mm
1 2004-03-08 23:00:00  5
2 2004-03-09 00:00:00  6
3 2004-03-09 01:00:00  5
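
The question's desired output labels each sum with the end of the hour (00:00, 01:00, 02:00) rather than the start. A minimal sketch of one way to get those labels, assuming the grouping [start, start + 1 h) used in the question's example: truncate each timestamp to its hour and add 3600 seconds. The values are typed in from the printed table so the result does not depend on the random number generator; the time zone "UTC" and the names hour_start and hour_end are assumptions made for this illustration.

```r
# Data copied by hand from the question's printed example.
precip <- data.frame(
  Datum = seq(as.POSIXct("2004-03-08 23:00:00", tz = "UTC"),
              by = "30 min", length.out = 6),
  mm = c(4, 1, 2, 4, 1, 4)
)

# Start of the hour each timestamp falls in ...
hour_start <- as.POSIXct(trunc(precip$Datum, units = "hours"))
# ... shifted by one hour so the label marks the end of the interval.
hour_end <- hour_start + 3600

# Sum per hour, grouped by the end-of-hour label.
sums <- aggregate(precip["mm"], list(Datum = hour_end), sum, na.rm = TRUE)
```

The resulting sums are 5, 6 and 5, labelled 2004-03-09 00:00:00, 01:00:00 and 02:00:00, matching the table in the question.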
