Why does summarize or mutate not work with group_by when I load `plyr` after `dplyr`?

Question

Note: The title of this question has been edited to make it the canonical question for issues that arise when plyr functions mask their dplyr counterparts. The rest of the question remains unchanged.

Suppose I have the following data:

dfx <- data.frame(
  group = c(rep('A', 8), rep('B', 15), rep('C', 6)),
  sex = sample(c("M", "F"), size = 29, replace = TRUE),
  age = runif(n = 29, min = 18, max = 54)
)

With the good old plyr I can create a little table summarizing my data with the following code:

require(plyr)
ddply(dfx, .(group, sex), summarize,
      mean = round(mean(age), 2),
      sd = round(sd(age), 2))

The output looks like this:

  group sex  mean    sd
1     A   F 49.68  5.68
2     A   M 32.21  6.27
3     B   F 31.87  9.80
4     B   M 37.54  9.73
5     C   F 40.61 15.21
6     C   M 36.33 11.33

I'm trying to move my code to dplyr and the %>% operator. My code takes the data frame, groups it by group and sex, and then summarises it. That is:

dfx %>% group_by(group, sex) %>% 
  summarise(mean = round(mean(age), 2), sd = round(sd(age), 2))

But my output is:

  mean   sd
1 35.56 9.92

What am I doing wrong?

Recommended answer

The problem here is that you are loading dplyr first and then plyr, so plyr's summarise function masks dplyr's summarise. plyr's version knows nothing about the grouping set by group_by, so it summarises the whole data frame as a single group. When the masking happens, you get this warning:

library(plyr)
    Loading required package: plyr
------------------------------------------------------------------------------------------
You have loaded plyr after dplyr - this is likely to cause problems.
If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
library(plyr); library(dplyr)
------------------------------------------------------------------------------------------

Attaching package: ‘plyr’

The following objects are masked from ‘package:dplyr’:

    arrange, desc, failwith, id, mutate, summarise, summarize
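
If the warning has scrolled out of view, the masking can also be checked directly. The following is a minimal sketch, assuming both packages are installed and attached in the problematic order (dplyr first, then plyr); it asks R which package a bare summarise currently resolves to:

```r
library(dplyr)
library(plyr)  # attached second, so its exports mask dplyr's

# Name of the namespace that owns the `summarise` found on the search path.
# With the load order above this should report plyr rather than dplyr.
environmentName(environment(summarise))

# Every attached package that defines `summarise`, in search-path order;
# the first entry is the one a bare call will use.
find("summarise")
```

The same check works for mutate, arrange, and the other names listed in the masking message.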

So in order for your code to work, either detach plyr with detach(package:plyr), or restart R and load plyr first and then dplyr (or load only dplyr):

library(dplyr)
dfx %>% group_by(group, sex) %>% 
  summarise(mean = round(mean(age), 2), sd = round(sd(age), 2))
Source: local data frame [6 x 4]
Groups: group

  group sex  mean    sd
1     A   F 41.51  8.24
2     A   M 32.23 11.85
3     B   F 38.79 11.93
4     B   M 31.00  7.92
5     C   F 24.97  7.46
6     C   M 36.17  9.11

Or you can explicitly call dplyr's summarise in your code, so the right function is called no matter the order in which the packages were loaded:

dfx %>% group_by(group, sex) %>% 
  dplyr::summarise(mean = round(mean(age), 2), sd = round(sd(age), 2))
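
If prefixing every call is too noisy, one possible variation (not part of the original answer) is to pin the dplyr versions once at the top of the script. This is a sketch under the assumption that the pipeline runs at the top level of a script or session, where the global environment is searched before any attached package; the seed value is arbitrary and only makes the sample data reproducible:

```r
library(dplyr)
library(plyr)  # deliberately the "wrong" order, to show the workaround

# Pin the dplyr versions in the global environment so that bare calls
# made at top level find these before plyr's masking versions.
summarise <- dplyr::summarise
mutate    <- dplyr::mutate

set.seed(1)  # arbitrary seed, for reproducibility only
dfx <- data.frame(
  group = c(rep("A", 8), rep("B", 15), rep("C", 6)),
  sex   = sample(c("M", "F"), size = 29, replace = TRUE),
  age   = runif(n = 29, min = 18, max = 54)
)

# The bare summarise now respects the grouping again.
res <- dfx %>%
  group_by(group, sex) %>%
  summarise(mean = round(mean(age), 2), sd = round(sd(age), 2))
res
```

Code inside other functions or packages is not affected by these assignments, so the explicit dplyr:: prefix remains the more robust choice for anything reusable.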
