dplyr returns global mean for each group, instead of each group's mean

Views: 323 Published: 2017/7/13 22:03:51 Tags: r dplyr

Problem description

Can someone explain what I am doing wrong here:

library(dplyr)
temp <- data.frame(a = c(1, 2, 3, 1, 2, 3, 1, 2, 3), b = c(1, 2, 3, 1, 2, 3, 1, 2, 3))
temp %>% group_by(temp[, 1]) %>% summarise(n = n(), mean = mean(temp[, 2], na.rm = TRUE))

# A tibble: 3 × 3
  `temp[, 1]`     n  mean
        <dbl> <int> <dbl>
1           1     3     2
2           2     3     2
3           3     3     2

I expected the means to be:

1  1
2  2
3  3

Instead, the mean appears to be the global mean: the sum of all values in column 2 divided by the number of rows, 18/9 = 2.

How do I get the mean to be what I expected?

Solution

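Your problem is that you are calculating the mean of temp[, 2] rather than the mean of the column within each group. The expression mean(temp[, 2], na.rm = TRUE) does not depend on the grouping context at all: temp[, 2] always refers to the entire second column of the original data frame, so every group receives the same value. Refer to the column by its bare name instead: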

> temp %>% group_by(temp[, 1]) %>% summarise(n = n(), mean = mean(b, na.rm = TRUE))
# A tibble: 3 × 3
  `temp[, 1]`     n  mean
        <dbl> <int> <dbl>
1           1     3     1
2           2     3     2
3           3     3     3
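
To see the difference side by side, the two expressions can be placed in the same summarise call. This is a minimal sketch, assuming dplyr is installed; the column names global_mean and group_mean are illustrative and do not come from the original answer.

```r
library(dplyr)

temp <- data.frame(a = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                   b = c(1, 2, 3, 1, 2, 3, 1, 2, 3))

res <- temp %>%
  group_by(a) %>%
  summarise(
    n = n(),
    # temp is not a column, so it is found in the calling environment:
    # temp[, 2] is the full 9-row column for every group
    global_mean = mean(temp[, 2], na.rm = TRUE),
    # b is a column, so it is evaluated on the current group's rows only
    group_mean = mean(b, na.rm = TRUE)
  )

print(res)
```

The global_mean column repeats the same value in every row, while group_mean varies with the group, which is the behaviour the question expected.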

Furthermore, it is more common to use the column name in group_by as well:

> temp %>% group_by(b) %>% summarise(n = n(), mean = mean(b, na.rm = TRUE))
# A tibble: 3 × 3
      b     n  mean
  <dbl> <int> <dbl>
1     1     3     1
2     2     3     2
3     3     3     3
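
The answer groups by b; the question grouped by the first column, temp[, 1], which is a. The sketch below groups by a by name and cross-checks the result against base R's tapply. It assumes dplyr is installed, and the names by_name and check are illustrative. Because a and b hold identical values in this data set, either grouping column yields the same means.

```r
library(dplyr)

temp <- data.frame(a = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                   b = c(1, 2, 3, 1, 2, 3, 1, 2, 3))

# Group by the bare column name: the result column is called a,
# not `temp[, 1]`
by_name <- temp %>%
  group_by(a) %>%
  summarise(n = n(), mean = mean(b, na.rm = TRUE))

# Independent cross-check with base R: mean of b within each level of a
check <- tapply(temp$b, temp$a, mean)

print(by_name)
print(check)
```

Grouping by name also keeps the output header readable, which matters once the result is joined or plotted.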
