Grouped correlation with dplyr (works only on console)
Question
I'm trying to use dplyr to calculate grouped correlations, but something is clearly wrong, since the code below gives the expected result only in the console:
require(dplyr)
set.seed(123)
xx = data.frame(group = rep(1:4, 100), a = rnorm(400) , b = rnorm(400))
gp = group_by(xx, group)
summarize(gp, cor(a, b))
  group   cor(a, b)
1     1 -0.02073084
2     2  0.12803353
3     3  0.06236264
4     4 -0.06181904
If I run the same code in RStudio, I get:
   cor(a, b)
1 0.02739193
What is going on?
Recommended answer
What you are seeing comes from having both plyr and dplyr loaded at the same time. Both packages export a summarize function, so whichever package was attached last masks the other unless you state explicitly which one you want. For the example data, this means:
require(dplyr)
set.seed(123)
xx = data.frame(group = rep(1:4, 100), a = rnorm(400) , b = rnorm(400))
Using dplyr, as intended:
gp = group_by(xx, group)
dplyr::summarize(gp, cor(a, b))
#Source: local data frame [4 x 2]
#
#  group   cor(a, b)
#1     1 -0.02073084
#2     2  0.12803353
#3     3  0.06236264
#4     4 -0.06181904
Or using plyr:
gp = group_by(xx, group)
plyr::summarize(gp, cor(a, b))
#   cor(a, b)
#1 0.02739193
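The single value in the plyr output is consistent with a correlation computed over all 400 rows, because plyr::summarize does not know about dplyr's grouping. A minimal base-R sketch to check this, with no packages needed (the names overall and per_group are illustrative and not part of the original answer):

```r
# Same example data as above.
set.seed(123)
xx <- data.frame(group = rep(1:4, 100), a = rnorm(400), b = rnorm(400))

# Correlation over all rows, ignoring the group column.
# This is what a summarize() unaware of dplyr groups effectively computes.
overall <- cor(xx$a, xx$b)

# Per-group correlations in base R, for comparison with dplyr::summarize().
per_group <- sapply(split(xx, xx$group), function(d) cor(d$a, d$b))

print(overall)
print(per_group)
```

If overall matches the one-row result and per_group matches the four-row result, the difference between the two runs is explained by grouping being ignored rather than by the data.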
So either avoid loading both packages, or name the package explicitly with package::function.
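To find out which package a bare summarize call resolves to in a given session, something along these lines may help. It is only a sketch: it assumes dplyr is installed, and the names owner, res and the column name r are chosen here for illustration.

```r
library(dplyr)

# Which namespace does the bare name `summarize` resolve to right now?
# "dplyr" means dplyr's version is found first; "plyr" means plyr masks it.
owner <- environmentName(environment(summarize))
print(owner)

# Attached packages in lookup order; earlier entries mask later ones.
print(search())

# Calling through the namespace does not depend on attach order.
set.seed(123)
xx <- data.frame(group = rep(1:4, 100), a = rnorm(400), b = rnorm(400))
res <- dplyr::summarize(dplyr::group_by(xx, group), r = cor(a, b))
print(res)
```

Running the first two statements in both the console and RStudio should show whether plyr is attached in one session but not the other.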