R: how to vapply across rows for xts object?
Question
I have the following xts object.
x <- structure(c(30440.5, 30441, 30441.5, 30441.5, 30441, 30439.5, 30440.5, 30441,
30441.5, NA, NA, 30439.5, NA, NA, NA, 30441.5, 30441, NA), .indexTZ = "",
class = c("xts", "zoo"), .indexCLASS = c("POSIXct", "POSIXt"),
tclass = c("POSIXct", "POSIXt"), tzone = "",
index = structure(c(1519866931.1185, 1519866931.1255, 1519866931.1255,
1519866931.1905, 1519866931.1905, 1519866931.1915),
tzone = "", tclass = c("POSIXct", "POSIXt")),
.indexFormat = "%Y-%m-%d %H:%M:%OS",
.Dim = c(6L, 3L), .Dimnames = list(NULL, c("x", "y", "z")))
# x y z
# 2018-03-01 09:15:31.118 30440.5 30440.5 NA
# 2018-03-01 09:15:31.125 30441.0 30441.0 NA
# 2018-03-01 09:15:31.125 30441.5 30441.5 NA
# 2018-03-01 09:15:31.190 30441.5 NA 30441.5
# 2018-03-01 09:15:31.190 30441.0 NA 30441.0
# 2018-03-01 09:15:31.191 30439.5 30439.5 NA
How can I write the vapply call to obtain the mean across rows with mean(..., na.rm = TRUE), such that it returns a single column like this?
w
2018-03-01 09:15:31.118 30440.5
2018-03-01 09:15:31.125 30441.0
2018-03-01 09:15:31.125 30441.5
2018-03-01 09:15:31.190 30441.5
2018-03-01 09:15:31.190 30441.0
2018-03-01 09:15:31.191 30439.5
I just can't get it to work.
I notice that many answers recommend not using vapply and using other functions instead. However, according to this answer, vapply is actually the fastest. So which apply function is best here?
Recommended answer
If you want the mean of the columns for each row, I would not use vapply. I would use rowMeans, and note that you have to convert the result back to xts.
(xmean <- xts(rowMeans(x, na.rm = TRUE), index(x)))
# [,1]
# 2018-02-28 19:15:31 30440.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30439.5
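As a minimal sketch of the rowMeans approach that also produces the single column named w requested in the question: the object below is rebuilt with the xts() constructor from the question's values, and the UTC time zone and the colnames<- step are assumptions added for illustration, not part of the original answer.

```r
library(xts)

# Rebuild the question's data; tz = "UTC" is an assumption for reproducibility
idx <- as.POSIXct("2018-03-01 09:15:31", tz = "UTC") +
  c(0.118, 0.125, 0.125, 0.190, 0.190, 0.191)
m <- cbind(x = c(30440.5, 30441, 30441.5, 30441.5, 30441, 30439.5),
           y = c(30440.5, 30441, 30441.5, NA, NA, 30439.5),
           z = c(NA, NA, NA, 30441.5, 30441, NA))
x <- xts(m, order.by = idx)

# rowMeans() returns a plain numeric vector, so re-attach the time index
w <- xts(rowMeans(x, na.rm = TRUE), order.by = index(x))
colnames(w) <- "w"  # name the single column as in the desired output
```

Because rowMeans drops the xts class, the index has to be supplied again through order.by; the column name is then set separately.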
And I would use apply for a generic function that doesn't have a specialized implementation. Note that you will need to transpose the result if the function returns more than one value.
(xmin <- as.xts(apply(x, 1, min, na.rm = TRUE), dateFormat = "POSIXct"))
# [,1]
# 2018-02-28 19:15:31 30440.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30439.5
(xrange <- as.xts(t(apply(x, 1, range, na.rm = TRUE)), dateFormat = "POSIXct"))
# [,1] [,2]
# 2018-02-28 19:15:31 30440.5 30440.5
# 2018-02-28 19:15:31 30441.0 30441.0
# 2018-02-28 19:15:31 30441.5 30441.5
# 2018-02-28 19:15:31 30441.5 30441.5
# 2018-02-28 19:15:31 30441.0 30441.0
# 2018-02-28 19:15:31 30439.5 30439.5
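The need to transpose can be seen on a small example. This is a sketch on made-up toy data (the values, dates, and the "min"/"max" column names are assumptions for illustration); it works on coredata() and rebuilds the xts object with the original index rather than going through as.xts().

```r
library(xts)

# Toy data: three rows, two columns, with some missing values
x <- xts(cbind(a = c(1, 4, NA), b = c(3, NA, 5)),
         order.by = as.Date("2018-03-01") + 0:2)

# apply() over rows puts each row's result in a COLUMN of the output,
# so a function returning two values yields a 2 x nrow(x) matrix
r <- apply(coredata(x), 1, range, na.rm = TRUE)
dim(r)

# Transpose so rows line up with the time index again
xrange <- xts(t(r), order.by = index(x))
colnames(xrange) <- c("min", "max")
```

After t(), each row of xrange corresponds to one timestamp of x, with the row minimum and maximum in separate columns.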
To address the comment of "why not use vapply()", here are some benchmarks (using the data from the code review Q/A the OP linked to):
library(xts)
library(microbenchmark)

set.seed(21)
xz <- xts(replicate(6, sample(1:100, 1000, replace = TRUE)),
          order.by = Sys.Date() + 1:1000)
# Specialized row-mean implementation
xrowmean <- function(x) { xts(rowMeans(x, na.rm = TRUE), index(x)) }
# Generic apply() over rows
xapply <- function(x) { as.xts(apply(x, 1, mean, na.rm = TRUE), dateFormat = "POSIXct") }
# vapply() over row indices with a pre-specified numeric(1) return value
xvapply <- function(x) {
  xts(vapply(seq_len(nrow(x)), function(i) mean(x[i, ], na.rm = TRUE),
             FUN.VALUE = numeric(1)), index(x))
}
microbenchmark(xrowmean(xz), xapply(xz), xvapply(xz))
# Unit: microseconds
# expr min lq mean median uq max neval
# xrowmean(xz) 169.496 188.8505 207.1931 204.2455 219.4945 285.329 100
# xapply(xz) 33477.542 34203.3260 35698.0503 35076.4655 36821.1320 43910.353 100
# xvapply(xz) 32709.238 35010.1920 37514.7557 35884.3585 37972.7085 84409.961 100
So, why not use vapply()? It doesn't add much in the way of performance benefit. It's quite a bit more verbose than the apply() version, and it's not clear there's much benefit to the safety of the 'pre-specified return value' if you have control over the type of object and the function being called. That said, you're not going to do any harm by using vapply(). I simply prefer apply() for this case.
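To illustrate what the 'pre-specified return value' buys you, here is a small sketch on a plain numeric matrix standing in for the coredata() of an xts object (the toy values are an assumption for illustration). The first call matches FUN.VALUE and agrees with rowMeans; the second returns two values per row and is rejected by vapply.

```r
# Toy matrix standing in for coredata(x) of an xts object
m <- matrix(c(1, 4, NA, 3, NA, 5), nrow = 3,
            dimnames = list(NULL, c("a", "b")))

# Each call returns exactly one number, matching FUN.VALUE = numeric(1)
row_means <- vapply(seq_len(nrow(m)),
                    function(i) mean(m[i, ], na.rm = TRUE),
                    FUN.VALUE = numeric(1))
same <- isTRUE(all.equal(row_means, unname(rowMeans(m, na.rm = TRUE))))

# range() returns two numbers per row, so vapply() signals an error
# instead of silently returning a differently shaped result
res <- tryCatch(
  vapply(seq_len(nrow(m)),
         function(i) range(m[i, ], na.rm = TRUE),
         FUN.VALUE = numeric(1)),
  error = function(e) e
)
```

With apply() the second call would quietly return a 2 x 3 matrix, which is the kind of surprise the FUN.VALUE check guards against when the called function is not under your control.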