ggplot2: how to add sample numbers to density plot?

Question

I am trying to generate a (grouped) density plot labelled with sample sizes.

Sample data:

set.seed(100)
df <- data.frame(ab.class = c(rep("A", 200), rep("B", 200)),
                 val = c(rnorm(200, 0, 1), rnorm(200, 1, 1)))

The unlabelled density plot is generated as follows:

ggplot(df, aes(x = val, group = ab.class)) +
  geom_density(aes(fill = ab.class), alpha = 0.4)

What I want to do is add text labels somewhere near the peak of each density, showing the number of samples in each group. However, I cannot find the right combination of options to summarise the data in this way.

I tried to adapt the code suggested in this answer to a similar question on boxplots: https://stackoverflow.com/a/15720769/1836013

n_fun <- function(x){
  return(data.frame(y = max(x), label = paste0("n = ",length(x))))
}

ggplot(df, aes(x = val, group = ab.class)) +
  geom_density(aes(fill = ab.class), alpha = 0.4) +
  stat_summary(geom = "text", fun.data = n_fun)

However, this fails with Error: stat_summary requires the following missing aesthetics: y.

I also tried adding y = ..density.. within aes() for each of the geom_density() and stat_summary() layers, and in the ggplot() object itself... none of which solved the problem.

I know this could be achieved by manually adding labels for each group, but I was hoping for a solution that generalises, and e.g. allows the label colour to be set via aes() to match the densities.

Where am I going wrong?

Recommended answer

The y returned by fun.data is not the y aesthetic. stat_summary complains that it cannot find y, which must be mapped either globally in ggplot(df, aes(x = val, group = ab.class, y = ...)) or, if no global y mapping is available, in the layer itself via stat_summary(aes(y = ...)). fun.data then computes where to display the point/text/... at each x, based on the y values supplied through aes.

Even if you specify y through aes, you won't get the desired result, because stat_summary computes a separate summary at each x.
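
To illustrate the per-x behaviour, here is a minimal sketch that reuses the question's sample data and n_fun. The mapping y = val is an assumption made purely so that stat_summary has a y to summarise, and the names p_summary and summary_data are made up for this example. layer_data() shows what the stat actually computed:

```r
library(ggplot2)

set.seed(100)
df <- data.frame(ab.class = c(rep("A", 200), rep("B", 200)),
                 val = c(rnorm(200, 0, 1), rnorm(200, 1, 1)))

# Same helper as in the question: fun.data receives the y values of one summary group.
n_fun <- function(x) {
  data.frame(y = max(x), label = paste0("n = ", length(x)))
}

# y = val is supplied only to satisfy stat_summary's required aesthetic (an assumption).
p_summary <- ggplot(df, aes(x = val, y = val, group = ab.class)) +
  stat_summary(geom = "text", fun.data = n_fun)

# Inspect the computed layer: one row per distinct (group, x) combination.
summary_data <- layer_data(p_summary, 1)
nrow(summary_data)
table(summary_data$label)
```

Because val is continuous, every observation should end up in its own summary group, so the plot would show one "n = 1" label per data point rather than one label per density curve.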

However, you can add text at the desired positions with geom_text or annotate:

# save the plot as p
p <- ggplot(df, aes(x = val, group = ab.class)) +
    geom_density(aes(fill = ab.class), alpha = 0.4)

# build the data displayed on the plot.
p.data <- ggplot_build(p)$data[[1]]

# Note: column 'y' (the density) is what gets plotted; 'scaled' is the same
# curve rescaled to a maximum of 1, so its maximum marks the peak row of each group
p.text <- lapply(split(p.data, f = p.data$group), function(df){
    df[which.max(df$scaled), ]
})
p.text <- do.call(rbind, p.text)  # we can also get p.text with dplyr.

# now add the text layer to the plot
p + annotate('text', x = p.text$x, y = p.text$y,
             label = sprintf('n = %d', p.text$n), vjust = 0)
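
The question also asked for label colours that match the densities via aes(). One possible variation, not part of the original answer, is to build a small label data frame and pass it to geom_text. In the sketch below the peak is located with base R's density() rather than ggplot_build(); this is a swapped-in technique, so the positions are only approximately those drawn by geom_density. The names peak_labels and p_labelled are made up for this example.

```r
library(ggplot2)

set.seed(100)
df <- data.frame(ab.class = c(rep("A", 200), rep("B", 200)),
                 val = c(rnorm(200, 0, 1), rnorm(200, 1, 1)))

# One row per group: sample size plus the location and height of the density peak.
# Assumption: density() with its defaults is close enough to what geom_density draws.
peak_labels <- do.call(rbind, lapply(split(df, df$ab.class), function(g) {
  d <- density(g$val)
  i <- which.max(d$y)
  data.frame(ab.class = g$ab.class[1],
             x = d$x[i],
             y = d$y[i],
             label = paste0("n = ", nrow(g)))
}))

# Mapping colour to ab.class ties each label to its group through aes().
p_labelled <- ggplot(df, aes(x = val)) +
  geom_density(aes(fill = ab.class), alpha = 0.4) +
  geom_text(data = peak_labels,
            aes(x = x, y = y, label = label, colour = ab.class),
            vjust = 0, show.legend = FALSE)
p_labelled
```

Since the labels live in an ordinary data frame, the same code should carry over to any number of groups without listing them by hand.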
