Pass a grouped data frame to your own function in dplyr

Question

I am trying to move from plyr to dplyr. However, I still can't figure out how to call my own functions in a chained dplyr pipeline.

I have a data frame with a factorised ID variable and an order variable. I want to split the frame by the ID, sort each piece by the order variable, and add a sequence in a new column.

My plyr function looks like this:

f <- function(x) cbind(x[order(x$order_variable), ], Experience = 0:(nrow(x)-1))
data <- ddply(data, .(ID_variable), f)

In dplyr I thought this should look something like this:

f <- function(x) cbind(x[order(x$order_variable), ], Experience = 0:(nrow(x)-1))
data <- data %>% group_by(ID_variable) %>% f

Can anyone tell me how to modify my dplyr call so that it passes the data to my own function and gives the same result as my plyr version?

If I use the dplyr formula as described here, it DOES pass an object to f. However, while plyr passes a number of separate tables (one per value of the ID variable), dplyr does not pass one table per group. It passes the ENTIRE table (as a dplyr object in which the groups are only annotated). So when I cbind the Experience variable, the counter runs from 0 over the length of the entire table instead of restarting within each group.
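The behaviour described above can be checked with a small sketch. The sample data and the helper `count_rows` are hypothetical and only reuse the column names from the question; the point is that a function at the end of the pipe is called once and receives all rows, with the grouping stored as metadata.

```r
library(dplyr)

# Hypothetical sample data using the column names from the question
data <- data.frame(
  ID_variable    = factor(c("a", "a", "a", "b", "b")),
  order_variable = c(3, 1, 2, 2, 1)
)

# Hypothetical helper: reports how many rows the function receives
count_rows <- function(x) nrow(x)

# The function is called once, with every row of the grouped table
rows_seen <- data %>%
  group_by(ID_variable) %>%
  count_rows()

# The grouping itself is only recorded on the object
grouped <- data %>% group_by(ID_variable)
is_grouped_df(grouped)  # TRUE
n_groups(grouped)       # number of distinct IDs
```

Here `rows_seen` equals the total row count of `data`, not the size of any single group, which is why the counter in f runs over the whole table.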

I have found a way to get the same result in dplyr using this approach:

data <- data %>%
    group_by(ID_variable) %>%
    arrange(ID_variable,order_variable) %>% 
    mutate(Experience = 0:(n()-1))
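A possible variant of the pipeline above, shown on hypothetical sample data: `row_number() - 1L` yields the same 0-based counter, and `arrange(..., .by_group = TRUE)` sorts within the existing groups. One reason to consider it, as an assumption about edge cases rather than something the question ran into: `0:(n() - 1)` evaluates to `c(0, -1)` when a group has zero rows, whereas `row_number()` simply returns an empty vector.

```r
library(dplyr)

# Hypothetical sample data using the column names from the question
data <- data.frame(
  ID_variable    = factor(c("b", "a", "a", "b", "a")),
  order_variable = c(2, 3, 1, 1, 2)
)

res <- data %>%
  group_by(ID_variable) %>%
  # Sort by ID first, then by order_variable inside each ID
  arrange(order_variable, .by_group = TRUE) %>%
  # mutate() is evaluated per group, so the counter restarts at 0 for each ID
  mutate(Experience = row_number() - 1L) %>%
  ungroup()
```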

However, I would still like to learn how to pass the groups, split into separate tables, to my own functions in dplyr.

Recommended answer

For those who get here from Google: let's say you wrote your own print function.

printFunction <- function(dat) print(dat)
df <- data.frame(a = 1:6, b = 1:2)

Calling it as asked here:

df %>% 
    group_by(b) %>% 
    printFunction(.)

prints the entire data. To get dplyr to print one table per group, you should use do:

df %>% 
    group_by(b) %>% 
    do(printFunction(.))
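Applied to the question itself, the same idea might look like the sketch below. The sample data is hypothetical; f is the function from the question. In recent dplyr versions (0.8.0 and later) `group_modify()` covers the same use case, and `do()` is marked as superseded from dplyr 1.0.0 on, so both forms are shown.

```r
library(dplyr)

# Hypothetical sample data using the column names from the question
data <- data.frame(
  ID_variable    = factor(c("a", "a", "a", "b", "b")),
  order_variable = c(3, 1, 2, 2, 1)
)

# The function from the question: sort one group's rows, add a 0-based counter
f <- function(x) cbind(x[order(x$order_variable), ], Experience = 0:(nrow(x) - 1))

# do() calls f once per group; `.` is the data frame holding that group's rows
res_do <- data %>%
  group_by(ID_variable) %>%
  do(f(.)) %>%
  ungroup()

# group_modify() also calls f once per group; `.x` holds the group's rows
# without the grouping column, which is added back to the result
res_gm <- data %>%
  group_by(ID_variable) %>%
  group_modify(~ f(.x)) %>%
  ungroup()
```

Both results have the counter restarting at 0 within each ID, which matches what the plyr call with `ddply` produces.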
