How to replicate a ddply behavior that uses a custom function with dplyr?

Question

I'm trying to replace all my plyr calls with dplyr. There are still a few snags and one of them is with the group_by function. I imagine it acts the same way as the second ddply argument and does a split, apply and combine based on the grouping variables I list. But that doesn't appear to be the case. Here is a rather trivial example.

Let's define a silly function

mm <- function(x) return(x[1:5, ])

Now we can split the iris dataset by species like so and apply this function to each piece.

library(plyr)
ddply(iris, .(Species), mm)

This works as intended. However, when I try the same with dplyr, it doesn't work as expected.

library(dplyr)
iris %>% group_by(Species) %>% mm

What am I doing wrong?

Recommended answer

As shown in ?do, you can refer to a group with . in your expression. The following will replicate your ddply output:

iris %>% group_by(Species) %>% do(.[1:5, ])

# Source: local data frame [15 x 5]
# Groups: Species
#
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 1           5.1         3.5          1.4         0.2     setosa
# 2           4.9         3.0          1.4         0.2     setosa
# 3           4.7         3.2          1.3         0.2     setosa
# 4           4.6         3.1          1.5         0.2     setosa
# 5           5.0         3.6          1.4         0.2     setosa
# 6           7.0         3.2          4.7         1.4 versicolor
# 7           6.4         3.2          4.5         1.5 versicolor
# 8           6.9         3.1          4.9         1.5 versicolor
# 9           5.5         2.3          4.0         1.3 versicolor
# 10          6.5         2.8          4.6         1.5 versicolor
# 11          6.3         3.3          6.0         2.5  virginica
# 12          5.8         2.7          5.1         1.9  virginica
# 13          7.1         3.0          5.9         2.1  virginica
# 14          6.3         2.9          5.6         1.8  virginica
# 15          6.5         3.0          5.8         2.2  virginica
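
To see why the plain pipe from the question behaves differently, here is a minimal sketch comparing the two calls. It assumes only that dplyr is installed; the names whole and per_group are illustrative and not from the original post. Piping a grouped data frame into mm hands it the whole table at once, so the row index is applied a single time, whereas do() evaluates its expression once per group.

```r
library(dplyr)

mm <- function(x) return(x[1:5, ])

# Piping the grouped data frame straight into mm() passes the entire table,
# so x[1:5, ] selects the first five rows overall, not five rows per group.
whole <- iris %>% group_by(Species) %>% mm()
nrow(whole)

# do() runs the expression once for each group, with . bound to that group's rows.
per_group <- iris %>% group_by(Species) %>% do(.[1:5, ])
nrow(per_group)
```

In other words, group_by() only records the grouping; it is the verb that follows (do(), summarise(), and so on) that decides whether to work group by group.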

More generally, to apply a custom function to groups with dplyr, you can do something like the following (thanks @docendodiscimus):

iris %>% group_by(Species) %>% do(mm(.))
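
In newer dplyr releases do() is marked as superseded, so the same idea can also be sketched with group_modify() or, for this particular task, slice_head(). This is not part of the original answer: it assumes dplyr 1.0.0 or later, and the names res_modify and res_slice are made up for illustration.

```r
library(dplyr)

mm <- function(x) return(x[1:5, ])

# group_modify() calls the function once per group. Inside the formula, .x is
# the group's rows without the grouping column, which is added back afterwards.
res_modify <- iris %>%
  group_by(Species) %>%
  group_modify(~ mm(.x)) %>%
  ungroup()

# If all you need is the first n rows of each group, slice_head() does it
# without a custom function.
res_slice <- iris %>%
  group_by(Species) %>%
  slice_head(n = 5) %>%
  ungroup()
```

Both results hold five rows per species. One difference to be aware of: group_modify() puts the grouping column first, while slice_head() keeps the original column order.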
