Group by in R: ddply with weighted.mean


Question

I am trying to compute a "group by"-style weighted mean in R. For a plain mean, the following code (using Hadley's plyr package) worked well:

ddply(mydf,.(period),mean)

If I use the same approach with weighted.mean, I get the error "'x' and 'w' must have the same length", which I do not understand, because the weighted.mean call works outside ddply.

weighted.mean(mydf$mycol,mydf$myweight) # works just fine
ddply(mydf,.(period),weighted.mean,mydf$mycol,mydf$myweight) # returns the error described above
ddply(mydf,.(period),weighted.mean(mydf$mycol,mydf$myweight)) # different code same story

I thought of writing a custom function instead of using weighted.mean and passing it to ddply, or even writing something new from scratch with subset. That seems like more work than should be necessary; there should be a smarter solution using what is already there.

Thanks in advance for any suggestions!

Recommended answer

Use an anonymous function:

> ddply(iris,"Species",function(X) data.frame(wmn=weighted.mean(X$Sepal.Length,
+                                                               X$Petal.Length),
+                                             mn=mean(X$Sepal.Length)))
     Species      wmn    mn
1     setosa 5.016963 5.006
2 versicolor 5.978075 5.936
3  virginica 6.641535 6.588

This computes the mean of Sepal.Length weighted by Petal.Length, as well as the unweighted mean, and returns both in one data frame per species.
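Applied to the question's data frame, the same pattern might look like the sketch below. Only the column names period, mycol and myweight come from the question; the values in mydf are made up for illustration. The original call fails because ddply passes each per-group piece as the first argument of the function and forwards the extra arguments unchanged, so weighted.mean receives a data frame piece as x and the full-length mydf$mycol as w, whose lengths will typically differ. Referring to the columns of the piece avoids this.

```r
library(plyr)

# Hypothetical stand-in for the asker's data (values invented for illustration)
mydf <- data.frame(
  period   = c(1, 1, 2, 2),
  mycol    = c(10, 20, 30, 40),
  myweight = c(1, 3, 1, 1)
)

# Anonymous function: 'piece' is the subset of mydf for one period,
# so both the values and the weights come from the same rows
res <- ddply(mydf, .(period), function(piece) {
  data.frame(wmn = weighted.mean(piece$mycol, piece$myweight))
})

# Equivalent shorthand using plyr's summarise, which evaluates
# the expression inside each per-group piece
res2 <- ddply(mydf, .(period), summarise,
              wmn = weighted.mean(mycol, myweight))

print(res)
print(res2)
```

Both calls return one row per period. The summarise form is shorter when only column-level summaries are needed, while the anonymous function is more flexible when the per-group computation needs several steps.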
