dplyr: mean of a group count
Problem description
I am trying to find the mean length of a variable over a dataframe using dplyr:
    x <- data %>%
      group_by(Date, `% Bucket`) %>%
      summarise(count = n())

              Date  % Bucket count
            (date)    (fctr) (int)
    1   2015-01-05       <=1  1566
    2   2015-01-05    (1-25]   421
    3   2015-01-05   (25-50]   461
    4   2015-01-05   (50-75]   485
    5   2015-01-05  (75-100]   662
    6   2015-01-05 (100-150]  1693
    7   2015-01-05      >150 12359
    8   2015-01-13       <=1  1608
    9   2015-01-13    (1-25]   441
    10  2015-01-13   (25-50]   425
How can I aggregate to find the average for each % Bucket over the year with dplyr?

In base R:

    x <- as.data.frame(x)
    aggregate(count ~ `% Bucket`, data = x, FUN = mean)

       % Bucket      count
    1       <=1  2609.5294
    2    (1-25]   449.0000
    3   (25-50]   528.7059
    4   (50-75]   593.2157
    5  (75-100]   763.0000
    6 (100-150]  1758.6667
    7      >150 12457.9216
The aggregate function takes the counts found by dplyr for each bucket above, sums them, and divides by the number of rows that contain that % Bucket value, which gives the answer above. How can I accomplish this with dplyr, though? This is not about completing the problem but about understanding how the dplyr package would be used in such a scenario.

Another example of this type of thing would be to summarise the n() of each group_by variable and also list the minimum "count" of that variable across the 52 weeks.

I am struggling because dplyr seems to be built to find the mean of a value in a column, but here I am counting the number of row occurrences for a given value in a column and trying to find the mean, min, max, etc. of that count.
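Solution
We can use dplyr methods. Since x already holds one count per Date and % Bucket, group by % Bucket alone and take the mean of count:

    library(dplyr)
    x %>%
      group_by(`% Bucket`) %>%
      summarise(count = mean(count))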
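The same two-stage pattern should extend to the min and max mentioned in the question: count rows per Date and bucket first, then summarise those counts per bucket. Below is a minimal sketch on a small made-up data frame. The data values and the names mean_count, min_count and max_count are illustrative assumptions, not taken from the original data, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Toy stand-in for `data`: one row per observation (values are made up).
data <- data.frame(
  Date = as.Date(c("2015-01-05", "2015-01-05", "2015-01-05",
                   "2015-01-13", "2015-01-13", "2015-01-13", "2015-01-13")),
  `% Bucket` = c("<=1", "<=1", ">150", "<=1", ">150", ">150", ">150"),
  check.names = FALSE  # keep the non-syntactic column name "% Bucket"
)

result <- data %>%
  # Stage 1: number of rows for each Date / bucket combination.
  group_by(Date, `% Bucket`) %>%
  summarise(count = n(), .groups = "drop") %>%
  # Stage 2: summarise those per-date counts within each bucket.
  group_by(`% Bucket`) %>%
  summarise(
    mean_count = mean(count),
    min_count  = min(count),
    max_count  = max(count)
  )

print(result)
```

Stage 1 reproduces the x from the question, so stage 2 is the accepted answer with two extra summary columns.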