dplyr group_by and mutate, how to access the data frame?

Question

When using dplyr's group_by and mutate, if I understand correctly, the data frame is split into sub-data-frames according to the group_by argument. For example, with the following code:

 library(dplyr)
 set.seed(7)
 df <- data.frame(x=runif(10),let=rep(letters[1:5],each=2))
 df %>% group_by(let) %>% mutate(mean.by.letter = mean(x))

mean() is applied in turn to column x of each of the 5 sub-data-frames, one for each letter from a to e.

So you can manipulate the columns of the sub-data-frames, but can you access the sub-data-frames themselves? To my surprise, if I try:

 set.seed(7)
 data <- data.frame(x=runif(10),let=rep(letters[1:5],each=2))
 data %>% group_by(let) %>% mutate(mean.by.letter = mean(.$x))

The result is different. From this, one can infer that "." does not stand for each sub-data-frame in turn but for the whole "data" data frame (group_by has no effect on what "." refers to).
The reason I ask is that I want to apply a stat function that takes a data frame as its argument to each of these sub-data-frames. Thanks!
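
The inference above can be checked directly. The following is a minimal sketch, assuming the magrittr pipe that dplyr re-exports; the names with_dot and same_as_overall are introduced here for illustration only. It compares the column produced with .$x against the overall mean of x.

```r
library(dplyr)

set.seed(7)
data <- data.frame(x = runif(10), let = rep(letters[1:5], each = 2))

# Inside mutate(), the pipe placeholder `.` is the whole grouped data
# frame that was piped in, so .$x is the full column of 10 values.
with_dot <- data %>%
  group_by(let) %>%
  mutate(mean.by.letter = mean(.$x))

# Every row therefore carries the overall mean, not a per-letter mean.
same_as_overall <- isTRUE(all.equal(with_dot$mean.by.letter,
                                    rep(mean(data$x), nrow(data))))
same_as_overall
```

If same_as_overall is TRUE, "." was bound once to the full data frame rather than to each group in turn.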

Recommended answer

We can do this inside do(), where "." refers to the data frame of the current group:

data %>%
    group_by(let) %>%
    do(mutate(., mean.by.letter = mean(.$x)))
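
As a possible alternative, do() has been superseded since dplyr 1.0.0, and group_modify() (available from dplyr 0.8.1) covers the same use case of passing each group's data frame to a function. The sketch below is not part of the original answer; stat_fun is a hypothetical stand-in for the "stat function that takes a data frame" mentioned in the question, and by_group and reference are illustrative names.

```r
library(dplyr)

set.seed(7)
data <- data.frame(x = runif(10), let = rep(letters[1:5], each = 2))

# Hypothetical stand-in for a stat function that takes a data frame.
stat_fun <- function(d) {
  data.frame(n = nrow(d), mean.x = mean(d$x))
}

# group_modify() calls the function once per group; .x is that group's
# sub-data-frame (without the grouping column), and the function must
# return a data frame. The group key is added back to the result.
by_group <- data %>%
  group_by(let) %>%
  group_modify(~ stat_fun(.x)) %>%
  ungroup()

# Reference: per-letter means computed the ordinary way.
reference <- data %>%
  group_by(let) %>%
  summarise(mean.x = mean(x))

all.equal(by_group$mean.x, reference$mean.x)
```

Here by_group has one row per letter, with the columns returned by stat_fun placed next to the grouping column let.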
