How to add a number of observations per group and use group mean in ggplot2 boxplot?
I am doing a basic boxplot where y = age and x = patient groups:
age <- ggplot(data, aes(factor(group2), age)) + ylim(15, 80)
age + geom_boxplot(fill = "grey80", colour = "#3366FF")
I was hoping you could help me out with a few things:
1) Is it possible to show the number of observations per group above each group's boxplot (but NOT on the x axis, where my group labels are) without having to do this in Paint :)? I have tried using:
age + annotate("text", x = "CON", y = 60, label = "25")
where CON is the first group and y = 60 is just above the boxplot for this group. However, the command didn't work. I assume this is because x is read as a continuous rather than a categorical variable.
2) Also, although there are plenty of questions about using the mean rather than the median for boxplots, I still haven't found code that works for me.
3) On the same matter, is there a way to include the group mean in the boxplot? Perhaps using
age + stat_summary(fun.y=mean, colour="red", geom="point")
which, however, only draws a dot where the mean lies. Or again using
age + annotate("text", x = "CON", y = 30, label = "30")
where CON is the first group and y = 30 is approximately the group's mean age.
Knowing how flexible and rich ggplot2 syntax is, I was hoping there is a more elegant way of using the real stats output rather than annotate.
Any suggestions/links would be much appreciated!
Thanks!!
Is this anything like what you're after? With stat_summary, as requested:
library(ggplot2)
# function for number of observations
give.n <- function(x) {
  # experiment with the multiplier to find the perfect position
  return(c(y = median(x) * 1.05, label = length(x)))
}
# function for mean labels
mean.n <- function(x) {
  # experiment with the multiplier to find the perfect position
  return(c(y = median(x) * 0.97, label = round(mean(x), 2)))
}
# plot (fun.data supplies both position and label, so no separate fun.y is needed)
ggplot(mtcars, aes(factor(cyl), mpg, label = rownames(mtcars))) +
  geom_boxplot(fill = "grey80", colour = "#3366FF") +
  stat_summary(fun.data = give.n, geom = "text") +
  stat_summary(fun.data = mean.n, geom = "text", colour = "red")
The black number is the number of observations; the red number is the mean value. joran's answer shows you how to put the numbers at the top of the boxes.
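For what it's worth, here is one rough way to get the counts above each boxplot with the same stat_summary pattern. This is only a sketch and not joran's code: n_at_top and its offset argument are names I made up, and the offset of 1 is an assumption that happens to suit the mpg scale.

```r
library(ggplot2)

# Hypothetical helper: label is the group size, placed a fixed offset
# above the group's largest value (so above whiskers and outliers too).
# The default offset of 1 is an assumption tuned to the mpg scale.
n_at_top <- function(x, offset = 1) {
  data.frame(y = max(x) + offset, label = length(x))
}

p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_boxplot(fill = "grey80", colour = "#3366FF") +
  stat_summary(fun.data = n_at_top, geom = "text")
```

Call print(p) to draw it. Using max(x) rather than a multiple of the median keeps the label clear of the box regardless of how spread out a group is.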
hat-tip: https://stackoverflow.com/a/3483657/1036500
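Point 2 of the question (drawing the box's middle line at the mean rather than the median) is not covered by the code above. One possible approach, as far as I know, is to hand stat_summary a function that returns the five box statistics and draw them with geom = "boxplot". The sketch below is an assumption-laden example: mean_box is a made-up name, and the whiskers simply run to the minimum and maximum, so there is no outlier rule as in geom_boxplot.

```r
library(ggplot2)

# Hypothetical helper: five box statistics with the middle line set to
# the mean instead of the median.
# Assumption: whiskers extend to min and max; no 1.5 * IQR outlier rule.
mean_box <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  data.frame(ymin = min(x), lower = q[1], middle = mean(x),
             upper = q[2], ymax = max(x))
}

p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  stat_summary(fun.data = mean_box, geom = "boxplot",
               fill = "grey80", colour = "#3366FF")
```

Call print(p) to draw it. Note that for strongly skewed groups the mean can fall outside the quartiles, in which case the middle line would sit outside the box; it may be worth saying in the caption that the line is a mean.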