R: how to vapply across rows for xts object?
Problem description
I have the following xts object:
x <- structure(c(30440.5, 30441, 30441.5, 30441.5, 30441, 30439.5, 30440.5, 30441,
30441.5, NA, NA, 30439.5, NA, NA, NA, 30441.5, 30441, NA), .indexTZ = "",
class = c("xts", "zoo"), .indexCLASS = c("POSIXct", "POSIXt"),
tclass = c("POSIXct", "POSIXt"), tzone = "",
index = structure(c(1519866931.1185, 1519866931.1255, 1519866931.1255,
1519866931.1905, 1519866931.1905, 1519866931.1915),
tzone = "", tclass = c("POSIXct", "POSIXt")),
.indexFormat = "%Y-%m-%d %H:%M:%OS",
.Dim = c(6L, 3L), .Dimnames = list(NULL, c("x", "y", "z")))
# x y z
# 2018-03-01 09:15:31.118 30440.5 30440.5 NA
# 2018-03-01 09:15:31.125 30441.0 30441.0 NA
# 2018-03-01 09:15:31.125 30441.5 30441.5 NA
# 2018-03-01 09:15:31.190 30441.5 NA 30441.5
# 2018-03-01 09:15:31.190 30441.0 NA 30441.0
# 2018-03-01 09:15:31.191 30439.5 30439.5 NA
How can I write a vapply call that computes the mean across each row with mean(..., na.rm = TRUE), so that it returns a single column like this?
w
2018-03-01 09:15:31.118 30440.5
2018-03-01 09:15:31.125 30441.0
2018-03-01 09:15:31.125 30441.5
2018-03-01 09:15:31.190 30441.5
2018-03-01 09:15:31.190 30441.0
2018-03-01 09:15:31.191 30439.5
I just can't get it to work.
I notice that many answers recommend against vapply in favor of other functions. However, according to this answer, vapply is actually the fastest. So which apply function is best here?
Recommended answer
I would not use vapply if you want the mean of the columns for each row. I would use rowMeans; note that you have to convert the result back to xts.
(xmean <- xts(rowMeans(x, na.rm = TRUE), index(x)))
# [,1]
# 2018-02-28 19:15:31 30440.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30439.5
I would use apply for a generic function that has no specialized implementation. Note that you need to transpose the result if the function returns more than one value, because apply returns one column per row of the input.
(xmin <- as.xts(apply(x, 1, min, na.rm = TRUE), dateFormat = "POSIXct"))
# [,1]
# 2018-02-28 19:15:31 30440.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30439.5
(xrange <- as.xts(t(apply(x, 1, range, na.rm = TRUE)), dateFormat = "POSIXct"))
# [,1] [,2]
# 2018-02-28 19:15:31 30440.5 30440.5
# 2018-02-28 19:15:31 30441.0 30441.0
# 2018-02-28 19:15:31 30441.5 30441.5
# 2018-02-28 19:15:31 30441.5 30441.5
# 2018-02-28 19:15:31 30441.0 30441.0
# 2018-02-28 19:15:31 30439.5 30439.5
To address the comment asking "why not use vapply()?", here are some benchmarks (using the data from the code review Q/A the OP linked to):
library(xts)
set.seed(21)
xz <- xts(replicate(6, sample(1:100, 1000, replace = TRUE)),
          order.by = Sys.Date() + 1:1000)
xrowmean <- function(x) { xts(rowMeans(x, na.rm = TRUE), index(x)) }
xapply <- function(x) { as.xts(apply(x, 1, mean, na.rm = TRUE), dateFormat = "POSIXct") }
xvapply <- function(x) { xts(vapply(seq_len(nrow(x)), function(i) {
mean(x[i,], na.rm = TRUE) }, FUN.VALUE = numeric(1)), index(x)) }
library(microbenchmark)
microbenchmark(xrowmean(xz), xapply(xz), xvapply(xz))
# Unit: microseconds
# expr min lq mean median uq max neval
# xrowmean(xz) 169.496 188.8505 207.1931 204.2455 219.4945 285.329 100
# xapply(xz) 33477.542 34203.3260 35698.0503 35076.4655 36821.1320 43910.353 100
# xvapply(xz) 32709.238 35010.1920 37514.7557 35884.3585 37972.7085 84409.961 100
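For completeness, here is a minimal sketch of the row-wise vapply form the question asks for, with the result column named w. The data and the names idx, m, and row_means are made up for illustration and are not the question's object. coredata() is used so the function works on a plain numeric matrix; no speed claim is intended.

```r
library(xts)

# Small illustrative data (hypothetical values, not the question's object)
idx <- as.POSIXct("2018-03-01 09:15:31", tz = "UTC") + c(0.1, 0.2, 0.3)
x <- xts(cbind(x = c(1, 2, 3), y = c(1, NA, 3), z = c(NA, 4, NA)),
         order.by = idx)

m <- coredata(x)  # plain numeric matrix, extracted once

# One mean per row; FUN.VALUE = numeric(1) makes vapply fail loudly
# if any call returns something other than a single number
row_means <- vapply(seq_len(nrow(m)),
                    function(i) mean(m[i, ], na.rm = TRUE),
                    FUN.VALUE = numeric(1))

# Convert back to xts and name the single column
w <- xts(row_means, order.by = index(x))
colnames(w) <- "w"
w
```

The values should match rowMeans(x, na.rm = TRUE) on the same data, which remains the simpler choice when only the mean is needed.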
So, why not use vapply()? It doesn't add much in the way of performance benefit. It's quite a bit more verbose than the apply() version, and it's not clear there's much benefit to the safety of the "pre-specified return value" if you have control over the type of object and the function being called. That said, you're not going to do any harm by using vapply(). I simply prefer apply() for this case.
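To make the "pre-specified return value" point concrete, the following sketch contrasts the two functions when the row function unexpectedly returns two values. The matrix m is made-up illustration data. apply() quietly returns a differently shaped result, while vapply() stops with an error, which is captured here with tryCatch().

```r
# Hypothetical 2 x 3 matrix with one missing value
m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 2)

# apply() accepts a function that returns two values per row and
# silently hands back a matrix (one column per input row), not a vector
shape <- dim(apply(m, 1, range, na.rm = TRUE))
shape

# vapply() with FUN.VALUE = numeric(1) refuses the same function;
# the error message is captured as a string instead of aborting
res <- tryCatch(
  vapply(seq_len(nrow(m)),
         function(i) range(m[i, ], na.rm = TRUE),
         FUN.VALUE = numeric(1)),
  error = function(e) conditionMessage(e)
)
res
```

Whether that check is worth the extra verbosity depends on how much control you have over the input and the function, as noted above.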