Use stat_summary to annotate a plot with the number of observations


Question

How can I use stat_summary to label a plot with n = x, where x is a variable? Here's an example of the desired output:

I can make the plot above with this rather inefficient code:

library(ggplot2)

# Number of observations for each level of cyl
nlabels <- sapply(1:length(unique(mtcars$cyl)), function(i) as.vector(t(as.data.frame(table(mtcars$cyl))[,2][[i]])))
ggplot(mtcars, aes(factor(cyl), mpg, label=rownames(mtcars))) +
  geom_boxplot(fill = "grey80", colour = "#3366FF") + 
  geom_text(aes(x = 1, y = median(mtcars$mpg[mtcars$cyl==sort(unique(mtcars$cyl))[1]]), label = paste0("n = ",nlabels[[1]]) )) +
  geom_text(aes(x = 2, y = median(mtcars$mpg[mtcars$cyl==sort(unique(mtcars$cyl))[2]]), label = paste0("n = ",nlabels[[2]]) )) +
  geom_text(aes(x = 3, y = median(mtcars$mpg[mtcars$cyl==sort(unique(mtcars$cyl))[3]]), label = paste0("n = ",nlabels[[3]]) )) 

This is a follow-up to this question: How to add a number of observations per group and use group mean in ggplot2 boxplot? There I can use stat_summary to calculate and display the number of observations, but I haven't been able to find a way to include n = in the stat_summary output. It seems like stat_summary might be the most efficient way to do this kind of labelling, but other methods are welcome.

Recommended answer

You can write your own function to use inside stat_summary(). Here n_fun computes the y position as the median() of the values and adds a label made of n = and the number of observations. It is important to use data.frame() instead of c(): paste0() produces a character string while the y value is numeric, and c() would coerce both to character. Then use this function in stat_summary() with geom = "text". This ensures that, for each x value, the position and label are computed only from that level's data.

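# Summary function for stat_summary(): returns a one-row data frame with the
# label position (group median) and the label text ("n = <group size>").
n_fun <- function(x) {
  data.frame(y = median(x), label = paste0("n = ", length(x)))
}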

ggplot(mtcars, aes(factor(cyl), mpg, label=rownames(mtcars))) +
  geom_boxplot(fill = "grey80", colour = "#3366FF") + 
  stat_summary(fun.data = n_fun, geom = "text")
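
The point about data.frame() versus c() can be checked in base R without drawing anything. The sketch below repeats n_fun and then applies it to each cyl group by hand. The names as_vector, as_frame and per_group are made up for illustration, and the split()/lapply() step only approximates what stat_summary() does per x value. It is not the actual ggplot2 internals.

```r
# Same helper as in the answer: one row with a numeric y and a text label.
n_fun <- function(x) {
  data.frame(y = median(x), label = paste0("n = ", length(x)))
}

# c() coerces mixed values to a single type, so the numeric median becomes text.
as_vector <- c(y = median(mtcars$mpg), label = "n = 32")
class(as_vector)    # "character"

# data.frame() keeps one type per column, so y stays numeric.
as_frame <- n_fun(mtcars$mpg)
class(as_frame$y)   # "numeric"

# Hand-rolled approximation of the per-group summary:
# apply n_fun to the mpg values within each level of cyl.
per_group <- do.call(rbind, lapply(split(mtcars$mpg, mtcars$cyl), n_fun))
per_group
```

Each row of per_group corresponds to one box in the plot: y is where the text is placed and label is the text drawn there.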
