How to plot weighted means on a boxplot


Problem description

After considerable time searching for a solution and fiddling, I am reaching out for help with displaying weighted means on a boxplot. (I thought I had submitted this question to the ggplot2 mailing list, but that was over 4 hours ago and it has not surfaced there. Fearing I made a mistake in that post, I am posting here, as my question is quite urgent.)

I provide a toy example below.

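# packages used below: Hmisc for wtd.mean(), plyr for ddply()
# (Hmisc also exports a summarize(), so plyr is loaded last)
library(Hmisc)
library(plyr)
library(ggplot2)

#data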

value <- c(5, 7, 8, 6, 7, 9, 10, 6, 7, 10)
category <- c("one", "one", "one", "two", "two", "two",
              "three", "three", "three","three")
weight <- c(1, 1.2, 2, 3, 2.2, 2.5, 1.8, 1.9, 2.2, 1.5)
df <- data.frame(value, category, weight)

#unweighted means by category
ddply(df, .(category), summarize, mean=round(mean(value, na.rm=TRUE), 2))

  category mean
1      one 6.67
2    three 8.25
3      two 7.33

#weighted means by category
ddply(df, .(category), summarize, 
          wmean=round(wtd.mean(value, weight, na.rm=TRUE), 2))

  category wmean
1      one  7.00
2    three  8.08
3      two  7.26
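
As a quick sanity check on the figures above: a weighted mean is the sum of each value times its weight, divided by the total weight. The sketch below reproduces the result for category "one" in base R only. The names `x`, `w`, `by_hand` and `builtin` are illustrative and do not appear in the original post.

```r
# Values and weights for category "one", copied from the toy data
x <- c(5, 7, 8)
w <- c(1, 1.2, 2)

# Weighted mean by hand: sum(w * x) / sum(w) = 29.4 / 4.2
by_hand <- sum(x * w) / sum(w)

# Base R's weighted.mean() computes the same quantity
builtin <- weighted.mean(x, w)

print(round(by_hand, 2))
print(round(builtin, 2))
```

Both calls should agree with the 7.00 reported for "one" in the table above.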

#unweighted means added to boxplot (which works fine)
ggplot(df, aes(x = category, y = value, weight = weight)) + 
   geom_boxplot(width=0.6,  colour = I("#3366FF")) + 
   stat_summary(fun = "mean", geom = "point", shape = 23,  # use fun.y in ggplot2 < 3.3.0
                 size = 3, fill ="white") 

My question is, how can I display weighted means on the boxplot instead of unweighted means?

Solution

You can save the weighted means as a new data frame and then pass it to geom_point(). The argument inherit.aes=FALSE ensures that the points are drawn without inheriting the aesthetics set in the ggplot() call.

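library(Hmisc)    # wtd.mean()
library(plyr)     # ddply(); loaded after Hmisc so plyr's summarize() is used
library(ggplot2)

# one row per category holding its weighted mean
df.wm <- ddply(df, .(category), summarize,
               wmean = round(wtd.mean(value, weight, na.rm = TRUE), 2))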

ggplot(df, aes(x = category, y = value, weight = weight)) + 
  geom_boxplot(width=0.6,  colour = I("#3366FF")) + 
  geom_point(data=df.wm,aes(x=category,y=wmean),shape = 23, 
             size = 3, fill ="white",inherit.aes=FALSE)
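
If you would rather not depend on Hmisc and plyr, the same per-category table can be built with base R. This is only a sketch of one alternative, not the answer's original method: it swaps `wtd.mean()` for base R's `weighted.mean()` and `ddply()` for `split()` plus `sapply()`. The helper name `groups` is an assumption introduced for illustration.

```r
library(ggplot2)

# Toy data from the question
value <- c(5, 7, 8, 6, 7, 9, 10, 6, 7, 10)
category <- c("one", "one", "one", "two", "two", "two",
              "three", "three", "three", "three")
weight <- c(1, 1.2, 2, 3, 2.2, 2.5, 1.8, 1.9, 2.2, 1.5)
df <- data.frame(value, category, weight)

# Split the rows by category, then compute one weighted mean per group
groups <- split(df, df$category)
df.wm <- data.frame(
  category = names(groups),
  wmean = sapply(groups, function(d) {
    weighted.mean(d$value, d$weight, na.rm = TRUE)
  }),
  row.names = NULL
)

# Boxplot from the raw data; diamonds from the summary data frame
p <- ggplot(df, aes(x = category, y = value, weight = weight)) +
  geom_boxplot(width = 0.6, colour = "#3366FF") +
  geom_point(data = df.wm, aes(x = category, y = wmean),
             shape = 23, size = 3, fill = "white", inherit.aes = FALSE)
p
```

Rounded to two decimals, `df.wm$wmean` should match the weighted means listed in the question (7.00, 8.08 and 7.26 for one, three and two).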
