Grouped correlation with dplyr (works only on console)


Problem description

I'm trying to use dplyr to calculate grouped correlations, but something is clearly wrong since the code below works only in the console:

require(dplyr)
set.seed(123)
xx = data.frame(group = rep(1:4, 100), a = rnorm(400) , b = rnorm(400))
gp = group_by(xx, group)
summarize(gp, cor(a, b))

  group   cor(a, b)
1     1 -0.02073084
2     2  0.12803353
3     3  0.06236264
4     4 -0.06181904

If I use the same code in RStudio, I get:

   cor(a, b)
1 0.02739193

What's happening?

Solution

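What you are seeing is caused by having both plyr and dplyr loaded at the same time. Both packages export a function named summarize, so the bare name resolves to whichever package was attached last. If that is plyr (presumably the case in your RStudio session), its summarize ignores the grouping created by group_by and returns a single correlation over all 400 rows. You avoid the conflict by stating explicitly which package you want to use. For the example data, this means: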

require(dplyr)
set.seed(123)
xx = data.frame(group = rep(1:4, 100), a = rnorm(400) , b = rnorm(400))

Using dplyr as intended:

gp = group_by(xx, group)
dplyr::summarize(gp, cor(a, b))
#Source: local data frame [4 x 2]
#
#  group   cor(a, b)
#1     1 -0.02073084
#2     2  0.12803353
#3     3  0.06236264
#4     4 -0.06181904

Or using plyr, which ignores the grouping and returns a single overall correlation:

gp = group_by(xx, group)
plyr::summarize(gp, cor(a, b))
#   cor(a, b)
#1 0.02739193

So either avoid loading both packages or specify the package by using package::function.
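
To make that advice concrete, here is a minimal sketch that first checks which package the bare name summarize currently resolves to and then runs the grouped correlation with namespace-qualified calls. The variable names summarize_owner and res and the output column name r are illustrative choices, not part of the original question.

```r
library(dplyr)

set.seed(123)
xx <- data.frame(group = rep(1:4, 100), a = rnorm(400), b = rnorm(400))

# Which package owns the bare name `summarize` in this session?
# "dplyr" means dplyr's version is the one being called;
# "plyr" means plyr was attached later and is masking it.
summarize_owner <- environmentName(environment(summarize))
print(summarize_owner)

# Namespace-qualified calls give the grouped result no matter
# which other packages happen to be attached.
res <- dplyr::summarize(dplyr::group_by(xx, group), r = cor(a, b))
print(res)
```

If summarize_owner comes back as "plyr", the unqualified call in the question would produce the single-row result. Base R's conflicts(detail = TRUE) can also be used to list every name that is masked on the search path.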
