Use stat_summary to annotate plot with number of observations
Question
How can I use stat_summary to label a plot with n = x, where x is a variable? Here's an example of the desired output:
I can make the plot above with this rather inefficient code:
library(ggplot2)

# Number of observations in each cyl group, one entry per box
nlabels <- sapply(1:length(unique(mtcars$cyl)), function(i) as.vector(t(as.data.frame(table(mtcars$cyl))[,2][[i]])))
# One geom_text() per box, each placed by hand at that group's median
ggplot(mtcars, aes(factor(cyl), mpg, label=rownames(mtcars))) +
  geom_boxplot(fill = "grey80", colour = "#3366FF") +
  geom_text(aes(x = 1, y = median(mtcars$mpg[mtcars$cyl==sort(unique(mtcars$cyl))[1]]), label = paste0("n = ",nlabels[[1]]) )) +
  geom_text(aes(x = 2, y = median(mtcars$mpg[mtcars$cyl==sort(unique(mtcars$cyl))[2]]), label = paste0("n = ",nlabels[[2]]) )) +
  geom_text(aes(x = 3, y = median(mtcars$mpg[mtcars$cyl==sort(unique(mtcars$cyl))[3]]), label = paste0("n = ",nlabels[[3]]) ))
This is a follow-up to this question: How to add a number of observations per group and use group mean in ggplot2 boxplot? There I can use stat_summary to calculate and display the number of observations, but I haven't been able to find a way to include n = in the stat_summary output. stat_summary seems like it might be the most efficient way to do this kind of labelling, but other methods are welcome.
Recommended answer
You can write your own function to use inside stat_summary(). Here n_fun computes the y position as the median() of the values and adds a label made of "n = " and the number of observations. It is important to return a data.frame() rather than c(): paste0() produces a character string while the y value is numeric, and c() would coerce both to character. Then pass this function to stat_summary() through fun.data and set geom = "text". This ensures that, for each x value, the position and the label are computed only from that level's data.
library(ggplot2)

# Summary function: y is the label position (group median), label is the text to draw
n_fun <- function(x){
  return(data.frame(y = median(x), label = paste0("n = ",length(x))))
}
ggplot(mtcars, aes(factor(cyl), mpg, label=rownames(mtcars))) +
  geom_boxplot(fill = "grey80", colour = "#3366FF") +
  stat_summary(fun.data = n_fun, geom = "text")
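To see what the summary function hands back to the text geom, it can help to run it on each group by hand. The sketch below uses only base R and approximates what stat_summary() does for each x value; the names per_group and coerced are just illustrative, and the split()/lapply() call is a stand-in for ggplot2's internal grouping, not its actual implementation.

```r
# Same summary function as in the answer: numeric y plus a character label.
n_fun <- function(x) {
  data.frame(y = median(x), label = paste0("n = ", length(x)))
}

# Apply it to mpg within each cyl level (an approximation of the per-group
# call that stat_summary(fun.data = n_fun) makes).
per_group <- do.call(rbind, lapply(split(mtcars$mpg, mtcars$cyl), n_fun))
print(per_group)

# With c(), the numeric median is coerced to character alongside the label,
# which is why the answer returns a data.frame() instead.
coerced <- c(y = median(mtcars$mpg), label = "n = 32")
print(class(coerced))
print(class(per_group$y))
```

Each row of per_group corresponds to one box: y stays numeric so the label can be positioned on the plot, and label holds the text that is drawn.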