How to replicate a ddply behavior that uses a custom function with dplyr?

Problem description

I'm trying to replace all my plyr calls with dplyr. There are still a few snags and one of them is with the group_by function. I imagine it acts the same way as the second ddply argument and does a split, apply and combine based on the grouping variables I list. But that doesn't appear to be the case. Here is a rather trivial example.

Let's define a silly function

mm <- function(x) return(x[1:5, ])

Now we can split the iris dataset by species like so and apply this function to each piece.

ddply(iris, .(Species), mm)

This works as intended. However, when I try the same with dplyr, it doesn't work as expected.

iris %>% group_by(Species) %>% mm

What am I doing wrong?

Solution

As shown in ?do, you can refer to a group with . in your expression. The following will replicate your ddply output:

iris %>% group_by(Species) %>% do(.[1:5, ])

# Source: local data frame [15 x 5]
# Groups: Species
#
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 1           5.1         3.5          1.4         0.2     setosa
# 2           4.9         3.0          1.4         0.2     setosa
# 3           4.7         3.2          1.3         0.2     setosa
# 4           4.6         3.1          1.5         0.2     setosa
# 5           5.0         3.6          1.4         0.2     setosa
# 6           7.0         3.2          4.7         1.4 versicolor
# 7           6.4         3.2          4.5         1.5 versicolor
# 8           6.9         3.1          4.9         1.5 versicolor
# 9           5.5         2.3          4.0         1.3 versicolor
# 10          6.5         2.8          4.6         1.5 versicolor
# 11          6.3         3.3          6.0         2.5  virginica
# 12          5.8         2.7          5.1         1.9  virginica
# 13          7.1         3.0          5.9         2.1  virginica
# 14          6.3         2.9          5.6         1.8  virginica
# 15          6.5         3.0          5.8         2.2  virginica
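
To see why the plain pipe in the question behaves differently, here is a minimal sketch comparing the two calls. It assumes dplyr is installed and reuses the mm helper from the question; the names grouped, whole and per_group are only illustrative. Piping a grouped table straight into mm calls the function once on the whole table, so the row index 1:5 ignores the groups, whereas do() evaluates its expression once per group.

```r
library(dplyr)

# Helper from the question: keep the first five rows of a data frame.
mm <- function(x) x[1:5, ]

grouped <- iris %>% group_by(Species)

# Called once on the whole grouped table: rows 1 to 5 overall.
whole <- grouped %>% mm()
nrow(whole)      # 5

# Called once per group: rows 1 to 5 of each species.
per_group <- grouped %>% do(.[1:5, ])
nrow(per_group)  # 15
```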

More generally, to apply a custom function to groups with dplyr, you can do something like the following (thanks @docendodiscimus):

iris %>% group_by(Species) %>% do(mm(.))
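
As a hedged side note, not part of the original answer: do() is marked as superseded in dplyr 1.0.0 and later, and group_modify() covers the same split, apply and combine pattern. The sketch below assumes dplyr >= 1.0.0 is installed; res_do, res_gm and res_slice are illustrative names.

```r
library(dplyr)

# Helper from the question: keep the first five rows of a data frame.
mm <- function(x) x[1:5, ]

# The approach from the answer: `.` is the current group's rows.
res_do <- iris %>% group_by(Species) %>% do(mm(.))

# group_modify() calls the function once per group. `.x` holds the
# group's rows without the grouping column, which dplyr adds back.
res_gm <- iris %>% group_by(Species) %>% group_modify(~ mm(.x))

# For this particular task, slice_head() needs no helper at all.
res_slice <- iris %>% group_by(Species) %>% slice_head(n = 5)
```

All three results contain the first five rows of each species. group_modify() places the Species column first, while do() and slice_head() keep the original column order.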
