dplyr: How to apply do() on result of group_by?

Question

I'd like to use dplyr to group a table by one column, then apply a function to the set of values in the second column of each group.

For instance, in the code example below, I'd like to return all of the 2-item combinations of foods eaten by each person. I cannot figure out how to supply the function with the right column (foods) inside do().

library(dplyr)

person = c( 'Grace', 'Grace', 'Grace', 'Rob', 'Rob', 'Rob' )
foods   = c( 'apple', 'banana', 'cucumber', 'spaghetti', 'cucumber', 'banana' )
eaten  = data.frame(person, foods)

by_person = group_by(eaten, person)

# How to do this?
do( by_person, combn( x = foods, m = 2 ) )

Note that the example code in ?do fails on my machine:

mods <- do(carriers, failwith(NULL, lm), formula = ArrDelay ~ date)

Recommended answer

Let us define eaten like this:

eaten <- data.frame(person, foods, stringsAsFactors = FALSE)

1) Then try this:

eaten %.% group_by(person) %.% do(function(x) combn(x$foods, m = 2))

giving:

[[1]]
     [,1]     [,2]       [,3]      
[1,] "apple"  "apple"    "banana"  
[2,] "banana" "cucumber" "cucumber"

[[2]]
     [,1]        [,2]        [,3]      
[1,] "spaghetti" "spaghetti" "cucumber"
[2,] "cucumber"  "banana"    "banana"  
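
The %.% operator above comes from an early dplyr release and was later replaced by %>%. As a rough sketch only, assuming a more recent dplyr in which summarise() accepts list columns, the same per-person combinations could be collected like this (the names pairs and pairs_by_person are made up for illustration and are not part of the original answer):

```r
library(dplyr)

person <- c("Grace", "Grace", "Grace", "Rob", "Rob", "Rob")
foods  <- c("apple", "banana", "cucumber", "spaghetti", "cucumber", "banana")
eaten  <- data.frame(person, foods, stringsAsFactors = FALSE)

# One row per person; wrapping combn() in list() stores each
# matrix of 2-item combinations in a list column (assumed name: pairs)
pairs_by_person <- eaten %>%
  group_by(person) %>%
  summarise(pairs = list(combn(foods, m = 2)))

# Name the list by person so it prints as a named list of matrices
setNames(pairs_by_person$pairs, pairs_by_person$person)
```

This keeps the grouping key next to each result, which the unnamed list returned by approach 1 does not.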

2) To get close to what @Hadley describes in the comments without waiting for a future version of dplyr, try this, where do2 is a helper function defined separately (the original answer links to it):

library(gsubfn)
eaten %.% group_by(person) %.% fn$do2(~ combn(.$foods, m = 2))

giving:

$Grace
     [,1]     [,2]       [,3]      
[1,] "apple"  "apple"    "banana"  
[2,] "banana" "cucumber" "cucumber"

$Rob
     [,1]        [,2]        [,3]      
[1,] "spaghetti" "spaghetti" "cucumber"
[2,] "cucumber"  "banana"    "banana"  
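
Since do2 is not part of dplyr and its definition is not reproduced here, a similar named list can be sketched with base R alone. This is not the answerer's method, just one way to get the same shape of result with split() and lapply(); the name pairs is an assumption made for illustration:

```r
person <- c("Grace", "Grace", "Grace", "Rob", "Rob", "Rob")
foods  <- c("apple", "banana", "cucumber", "spaghetti", "cucumber", "banana")
eaten  <- data.frame(person, foods, stringsAsFactors = FALSE)

# split() yields one character vector of foods per person, named by person;
# lapply() then applies combn(m = 2) to each vector
pairs <- lapply(split(eaten$foods, eaten$person), combn, m = 2)

pairs$Grace  # 2-item combinations for Grace
pairs$Rob    # 2-item combinations for Rob
```

The list names come from the grouping column, so each matrix can be looked up by person.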

Note: The last line of the question, which gives the code from the help file, also fails for me. This variation of it works for me: do(jan, lm, formula = ArrDelay ~ date).
