dplyr group_by and mutate: how to access the data frame

Problem description

When using dplyr's "group_by" and "mutate", if I understand correctly, the data frame is split into sub-data-frames according to the group_by argument. For example, with the following code:

 set.seed(7)
 df <- data.frame(x=runif(10),let=rep(letters[1:5],each=2))
 df %>% group_by(let) %>% mutate(mean.by.letter = mean(x))

mean() is applied in turn to column x of each of the 5 sub-dfs, one for each letter from a to e.

So you can manipulate the columns of the sub-dfs, but can you access the sub-dfs themselves? To my surprise, if I try:

 set.seed(7)
 data <- data.frame(x=runif(10),let=rep(letters[1:5],each=2))
 data %>% group_by(let) %>% mutate(mean.by.letter = mean(.$x))

the result is different. From this result, one can infer that "." does not stand for each sub-df in turn but for the whole "data" data frame (the group_by call doesn't change what "." refers to).
The reason I ask is that I want to apply a stat function that takes a data frame as its argument to each of these sub-dfs. Thanks!

Solution

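We can do this within do(). Inside do(), "." refers to the current group's sub-data-frame, whereas inside mutate() it is still the whole data frame supplied by the pipe:

data %>%
    group_by(let) %>%
    do(mutate(., mean.by.letter = mean(.$x)))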
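As a rough sketch of the use case from the question (a stat function that takes a data frame), the following compares what "." means in mutate() and in do(). The function range_width is a made-up placeholder, not something from the question, and the group_modify() variant assumes a dplyr version that provides it (0.8.1 or later, as far as I know); do() is marked as superseded in recent dplyr releases, so that variant may be the more future-proof one.

```r
library(dplyr)

set.seed(7)
data <- data.frame(x = runif(10), let = rep(letters[1:5], each = 2))

# Hypothetical stat function that takes a whole data frame (illustration only).
range_width <- function(d) max(d$x) - min(d$x)

# Inside mutate(), "." is the whole piped-in data frame,
# so every row receives the overall mean of x.
overall <- data %>%
  group_by(let) %>%
  mutate(m = mean(.$x))

# Inside do(), "." is the current group's sub-data-frame,
# so it can be passed to a function that expects a data frame.
per_group <- data %>%
  group_by(let) %>%
  do(mutate(., m = mean(.$x), width = range_width(.)))

# Alternative, assuming group_modify() is available:
# each group's rows are passed to the formula as .x (without the grouping column).
per_group2 <- data %>%
  group_by(let) %>%
  group_modify(~ mutate(.x, m = mean(.x$x), width = range_width(.x)))
```

Here overall$m holds the same value in every row, while per_group$m and per_group2$m hold one mean per letter; the two grouped results differ only in column order.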
