dplyr: mean of a group count

Views: 153 | Posted: 2017/7/13 22:12:47 | Tags: r, aggregate, dplyr


Problem description

I am trying to find the mean length of a variable over a dataframe using dplyr:
x <- data %>% group_by(Date, `% Bucket`) %>% summarise(count = n())

         Date  % Bucket count
       (date)    (fctr) (int)
1  2015-01-05       <=1  1566
2  2015-01-05    (1-25]   421
3  2015-01-05   (25-50]   461
4  2015-01-05   (50-75]   485
5  2015-01-05  (75-100]   662
6  2015-01-05 (100-150]  1693
7  2015-01-05      >150 12359
8  2015-01-13       <=1  1608
9  2015-01-13    (1-25]   441
10 2015-01-13   (25-50]   425
How can I aggregate to find the average count for each % Bucket over the year with dplyr?
In base R:
x <- as.data.frame(x)
aggregate(count ~ `% Bucket`, data = x, FUN=mean)

   % Bucket      count
1       <=1  2609.5294
2    (1-25]   449.0000
3   (25-50]   528.7059
4   (50-75]   593.2157
5  (75-100]   763.0000
6 (100-150]  1758.6667
7      >150 12457.9216
The aggregate function takes the counts that dplyr found for each bucket above, sums them, and divides by the number of rows that contain that % Bucket value, which gives the answer above. How can I accomplish this with dplyr? This is not just about solving the problem but about understanding how the dplyr package would be used in such a scenario.

Another example of this type of thing would be to summarise the n() of each group_by variable and also list the minimum "count" of that variable across the 52 weeks.

I am struggling because dplyr seems to be built to find the mean of a value in a column, but here I am counting the number of rows for each value of a variable in a column and trying to find the mean, min, max, etc. of those counts.
Solution
We can use dplyr methods: regroup the counted data by `% Bucket` alone and take the mean of the count column.
library(dplyr)
x %>%
   group_by(`% Bucket`) %>%
   summarise(count = mean(count))
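
To make the two-step idea concrete (count first, then summarise the counts), here is a minimal sketch on made-up data. The toy `data` frame, its values, and the output column names (`mean_count`, `min_count`, `max_count`, `n_dates`) are assumptions for illustration only and are not taken from the question's real data. The `.groups` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical raw data: one row per observation, before any counting.
# Column names follow the question; the values are invented.
data <- data.frame(
  Date = as.Date(c("2015-01-05", "2015-01-05", "2015-01-05",
                   "2015-01-13", "2015-01-13", "2015-01-13",
                   "2015-01-13", "2015-01-13")),
  `% Bucket` = c("<=1", "<=1", "(1-25]",
                 "<=1", "<=1", "<=1", "<=1", "(1-25]"),
  check.names = FALSE,       # keep the non-syntactic name "% Bucket"
  stringsAsFactors = FALSE
)

# Step 1: count rows per Date and bucket (one count per group).
x <- data %>%
  group_by(Date, `% Bucket`) %>%
  summarise(count = n(), .groups = "drop")

# Step 2: regroup by bucket only and summarise the counts themselves.
result <- x %>%
  group_by(`% Bucket`) %>%
  summarise(
    mean_count = mean(count),  # average count per date
    min_count  = min(count),   # smallest count seen on any date
    max_count  = max(count),   # largest count seen on any date
    n_dates    = n()           # number of dates contributing
  )

print(result)
```

The same pattern should cover the "minimum count across the 52 weeks" case from the question: once the counts live in an ordinary column, any summary function can be applied to them after regrouping.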


                        