Why is R dplyr::mutate inconsistent with custom functions
Question
This question is a "why", not a "how". In the following code I'm trying to understand why dplyr::mutate evaluates one custom function (f()) with the entire vector but not the other custom function (g()). What exactly is mutate doing?
library(dplyr)

# For reference: a single call with the whole vector of means
set.seed(1); sum(rnorm(100, c(0, 10, 100)))

f <- function(m) {
  set.seed(1)
  sum(rnorm(100, mean = m))
}
g <- function(m) sin(m)

df <- data.frame(a = c(0, 10, 100))

y1 <- mutate(df, asq = a^2, fout = f(a), gout = g(a))
y2 <- rowwise(df) %>%
  mutate(asq = a^2, fout = f(a), gout = g(a))
y3 <- group_by(df, a) %>%
  summarize(asq = a^2, fout = f(a), gout = g(a))
For all three columns, asq, fout, and gout, evaluation is rowwise in y2 and y3 and the results are identical. However, y1$fout is 3640.889 for all three rows, which is the result of evaluating sum(rnorm(100, c(0, 10, 100))). So the function f() is evaluating the entire vector for each row.
A closely related question, "mutate/transform in R dplyr (Pass custom function)", has been asked elsewhere, but the "why" was not explained.
Recommended answer
sin and ^ are vectorized, so they natively operate on each individual value, rather than on the whole vector of values, and return one result per element. f is not vectorized: sum() collapses the whole vector into a single value, which mutate then recycles across all rows. But you can do f = Vectorize(f) and it will operate on each individual value as well.
y1 <- mutate(df, asq=a^2, fout=f(a), gout=g(a))
y1
    a   asq     fout       gout
1   0     0 3640.889  0.0000000
2  10   100 3640.889 -0.5440211
3 100 10000 3640.889 -0.5063656

f = Vectorize(f)
y1a <- mutate(df, asq=a^2, fout=f(a), gout=g(a))
y1a
    a   asq        fout       gout
1   0     0    10.88874  0.0000000
2  10   100  1010.88874 -0.5440211
3 100 10000 10010.88874 -0.5063656
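As an alternative to Vectorize, the per-element call can be written out explicitly inside mutate. This is a sketch rather than part of the original answer: it assumes dplyr is installed, redefines the unvectorized f from the question, and uses the illustrative name y1b.

```r
library(dplyr)

# Unvectorized f() from the question: returns a single number
f <- function(m) {
  set.seed(1)
  sum(rnorm(100, mean = m))
}

df <- data.frame(a = c(0, 10, 100))

# vapply() calls f() once per element of a and checks that
# each call returns exactly one numeric value
y1b <- mutate(df, fout = vapply(a, f, numeric(1)))
y1b$fout
```

Compared with Vectorize, vapply keeps f itself unchanged and fails loudly if a call ever returns something other than a single number.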
Some additional info on vectorization here, here, and here.