R: how to vapply across rows for an xts object?


Question

I have the following xts object.

x <- structure(c(30440.5, 30441, 30441.5, 30441.5, 30441, 30439.5, 30440.5, 30441,
                 30441.5, NA, NA, 30439.5, NA, NA, NA, 30441.5, 30441, NA), .indexTZ = "",
               class = c("xts", "zoo"), .indexCLASS = c("POSIXct", "POSIXt"), 
               tclass = c("POSIXct", "POSIXt"), tzone = "", 
               index = structure(c(1519866931.1185, 1519866931.1255, 1519866931.1255, 
                                   1519866931.1905, 1519866931.1905, 1519866931.1915), 
                                 tzone = "", tclass = c("POSIXct", "POSIXt")), 
               .indexFormat = "%Y-%m-%d %H:%M:%OS",
               .Dim = c(6L, 3L), .Dimnames = list(NULL, c("x", "y", "z")))
#                              x        y        z
# 2018-03-01 09:15:31.118  30440.5  30440.5       NA
# 2018-03-01 09:15:31.125  30441.0  30441.0       NA
# 2018-03-01 09:15:31.125  30441.5  30441.5       NA
# 2018-03-01 09:15:31.190  30441.5       NA  30441.5
# 2018-03-01 09:15:31.190  30441.0       NA  30441.0
# 2018-03-01 09:15:31.191  30439.5  30439.5       NA

How can I write a vapply call that computes the mean across each row with mean(..., na.rm = TRUE) and returns a single column like this?

                               w       
2018-03-01 09:15:31.118  30440.5
2018-03-01 09:15:31.125  30441.0 
2018-03-01 09:15:31.125  30441.5
2018-03-01 09:15:31.190  30441.5 
2018-03-01 09:15:31.190  30441.0 
2018-03-01 09:15:31.191  30439.5

I just can't get it to work.

I have noticed that many answers recommend against vapply in favor of other functions. However, according to this answer, vapply is actually the fastest. So which apply function is best here?

Recommended answer

I would not use vapply if you want the mean of the columns for each row. I would use rowMeans, and note that you have to convert the result back to xts.

(xmean <- xts(rowMeans(x, na.rm = TRUE), index(x)))
#                        [,1]
# 2018-02-28 19:15:31 30440.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30439.5
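
The question asked for a single column named w, while the result above has an unnamed column. A minimal sketch of one way to get the name in place, assuming the xts package is installed: the object is rebuilt here with xts() rather than structure(), the index is created in UTC (an assumption, since the original has an empty tzone), and idx, vals and w are illustrative names that do not appear in the original answer.

```r
library(xts)

# Rebuild the object from the question (timestamps in UTC by assumption).
idx <- as.POSIXct(c(1519866931.1185, 1519866931.1255, 1519866931.1255,
                    1519866931.1905, 1519866931.1905, 1519866931.1915),
                  origin = "1970-01-01", tz = "UTC")
vals <- cbind(x = c(30440.5, 30441, 30441.5, 30441.5, 30441, 30439.5),
              y = c(30440.5, 30441, 30441.5, NA, NA, 30439.5),
              z = c(NA, NA, NA, 30441.5, 30441, NA))
x <- xts(vals, order.by = idx)

# cbind(w = ...) turns the row means into a one-column matrix named "w",
# so the resulting xts object carries that column name.
w <- xts(cbind(w = rowMeans(x, na.rm = TRUE)), order.by = index(x))
colnames(w)
```

How the timestamps print depends on the session's time zone, which is why the output above shows a different date than the question does.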

And I would use apply for a generic function that doesn't have a specialized implementation. Note that you will need to transpose the result if the function returns more than one value.

(xmin <- as.xts(apply(x, 1, min, na.rm = TRUE), dateFormat = "POSIXct"))
#                        [,1]
# 2018-02-28 19:15:31 30440.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30439.5
(xrange <- as.xts(t(apply(x, 1, range, na.rm = TRUE)), dateFormat = "POSIXct"))
#                        [,1]    [,2]
# 2018-02-28 19:15:31 30440.5 30440.5
# 2018-02-28 19:15:31 30441.0 30441.0
# 2018-02-28 19:15:31 30441.5 30441.5
# 2018-02-28 19:15:31 30441.5 30441.5
# 2018-02-28 19:15:31 30441.0 30441.0
# 2018-02-28 19:15:31 30439.5 30439.5
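
The need for t() follows from how apply() assembles its result: when the function returns k values per row, apply(..., 1, ...) gives a matrix with k rows and one column per input row. A small sketch with base R only, using a plain matrix m as a stand-in for the core data of the question's object (m, r and r_t are illustrative names):

```r
# Plain matrix standing in for the core data of the xts object in the question.
m <- cbind(x = c(30440.5, 30441, 30441.5, 30441.5, 30441, 30439.5),
           y = c(30440.5, 30441, 30441.5, NA, NA, 30439.5),
           z = c(NA, NA, NA, 30441.5, 30441, NA))

# range() returns two values per row, so apply() stacks them column by column:
# the result has 2 rows and one column per input row.
r <- apply(m, 1, range, na.rm = TRUE)
dim(r)    # 2 6

# Transposing restores one row per observation, ready to be paired with the index.
r_t <- t(r)
colnames(r_t) <- c("min", "max")
dim(r_t)  # 6 2
```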

To address the comment of "why not use vapply()", here are some benchmarks (using the data from the code review Q/A the OP linked to):

set.seed(21)
xz <- xts(replicate(6, sample(c(1:100), 1000, rep = TRUE)),
          order.by = Sys.Date() + 1:1000)
xrowmean <- function(x) { xts(rowMeans(x, na.rm = TRUE), index(x)) }
xapply <- function(x) { as.xts(apply(x, 1, mean, na.rm = TRUE), dateFormat = "POSIXct") }
xvapply <- function(x) { xts(vapply(seq_len(nrow(x)), function(i) {
    mean(x[i,], na.rm = TRUE) }, FUN.VALUE = numeric(1)), index(x)) }

library(microbenchmark)
microbenchmark(xrowmean(xz), xapply(xz), xvapply(xz))
# Unit: microseconds
#          expr       min         lq       mean     median         uq       max neval
#  xrowmean(xz)   169.496   188.8505   207.1931   204.2455   219.4945   285.329   100
#    xapply(xz) 33477.542 34203.3260 35698.0503 35076.4655 36821.1320 43910.353   100
#   xvapply(xz) 32709.238 35010.1920 37514.7557 35884.3585 37972.7085 84409.961   100

So, why not use vapply()? It doesn't add much in the way of performance benefit. It's quite a bit more verbose than the apply() version, and it's not clear there's much benefit to the safety of the 'pre-specified return value' if you have control over the type of object and the function being called. That said, you're not going to do any harm by using vapply(). I simply prefer apply() for this case.
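
For readers who still want the vapply() form, here is a minimal sketch of what the 'pre-specified return value' does in practice. It uses base R only, with a plain matrix m standing in for the core data of the question's object; m, row_mean and bad are illustrative names. To get an xts object back, the vector would be wrapped as in the xvapply() function above, i.e. xts(row_mean, index(x)).

```r
# Plain matrix standing in for the core data of the xts object in the question.
m <- cbind(x = c(30440.5, 30441, 30441.5, 30441.5, 30441, 30439.5),
           y = c(30440.5, 30441, 30441.5, NA, NA, 30439.5),
           z = c(NA, NA, NA, 30441.5, 30441, NA))

# One mean per row. FUN.VALUE = numeric(1) makes vapply() stop with an error
# if any call returns something other than a single numeric value.
row_mean <- vapply(seq_len(nrow(m)),
                   function(i) mean(m[i, ], na.rm = TRUE),
                   FUN.VALUE = numeric(1))

# Same values as the specialized function.
all.equal(row_mean, unname(rowMeans(m, na.rm = TRUE)))

# A function that returns two values per row violates FUN.VALUE, so vapply()
# raises an error, which is caught here and replaced by a marker string.
bad <- tryCatch(
  vapply(seq_len(nrow(m)),
         function(i) range(m[i, ], na.rm = TRUE),
         FUN.VALUE = numeric(1)),
  error = function(e) "error")
```

With apply(), the second call would silently return a 2 x 6 matrix instead, which is the kind of surprise the FUN.VALUE check guards against.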
