How to calculate mean/median per group in a data frame in R
Question
I have a data frame recording in detail how much money each customer spent, like the following:
custid, value
1, 1
1, 3
1, 2
1, 5
1, 4
1, 1
2, 1
2, 10
3, 1
3, 2
3, 5
How can I calculate per-customer characteristics such as mean, max, min, median, and standard deviation, like the following? Should I use some apply function? If so, how?
custid, mean, max,min,median,std
1, ....
2,....
3,....
Recommended answer
To add to the alternatives, here's summaryBy from the "doBy" package, with which you can specify a list of functions to apply.
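The code in this answer refers to a data frame named mydf that is never defined on the page. A minimal sketch of how it could be built, assuming it simply holds the eleven rows from the question with the same column names:

```r
# Assumed reconstruction of the question's data.
# The name mydf is taken from the answer's code; the construction itself is a guess.
mydf <- data.frame(
  custid = c(1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 3),
  value  = c(1, 3, 2, 5, 4, 1, 1, 10, 1, 2, 5)
)
```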
library(doBy)
summaryBy(value ~ custid, data = mydf,
FUN = list(mean, max, min, median, sd))
# custid value.mean value.max value.min value.median value.sd
# 1 1 2.666667 5 1 2.5 1.632993
# 2 2 5.500000 10 1 5.5 6.363961
# 3 3 2.666667 5 1 2.0 2.081666
Of course, you can also stick with base R:
myFun <- function(x) {
  c(min = min(x), max = max(x),
    mean = mean(x), median = median(x),
    std = sd(x))
}
tapply(mydf$value, mydf$custid, myFun)
# $`1`
# min max mean median std
# 1.000000 5.000000 2.666667 2.500000 1.632993
#
# $`2`
# min max mean median std
# 1.000000 10.000000 5.500000 5.500000 6.363961
#
# $`3`
# min max mean median std
# 1.000000 5.000000 2.666667 2.000000 2.081666
# tapply() returns groups in sorted order, so sort the ids to keep rows aligned
cbind(custid = sort(unique(mydf$custid)),
      do.call(rbind, tapply(mydf$value, mydf$custid, myFun)))
# custid min max mean median std
# 1 1 1 5 2.666667 2.5 1.632993
# 2 2 1 10 5.500000 5.5 6.363961
# 3 3 1 5 2.666667 2.0 2.081666
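Another base R route, not part of the original answer, is aggregate(), which returns a data frame keyed by custid directly. The sketch below assumes mydf holds the question's data and reuses the same myFun. Because the function returns several values, aggregate() stores them as a single matrix column, so do.call(data.frame, ...) is used here to flatten it into ordinary columns.

```r
# Assumed reconstruction of the question's data
mydf <- data.frame(
  custid = c(1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 3),
  value  = c(1, 3, 2, 5, 4, 1, 1, 10, 1, 2, 5)
)

# Same summary function as in the answer
myFun <- function(x) {
  c(min = min(x), max = max(x),
    mean = mean(x), median = median(x),
    std = sd(x))
}

# One row per custid; the statistics come back as a matrix column named "value"
res <- aggregate(value ~ custid, data = mydf, FUN = myFun)

# Flatten the matrix column into separate columns (value.min, value.max, ...)
res <- do.call(data.frame, res)
print(res)
```

The result has one row per customer and columns custid, value.min, value.max, value.mean, value.median and value.std, so there is no need to bind the ids back on by hand.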