ggplot2: how to add sample numbers to density plot?
Question
I am trying to generate a (grouped) density plot labelled with sample sizes.
Sample data:
set.seed(100)
df <- data.frame(ab.class = c(rep("A", 200), rep("B", 200)),
val = c(rnorm(200, 0, 1), rnorm(200, 1, 1)))
The unlabelled density plot is generated as follows:
ggplot(df, aes(x = val, group = ab.class)) +
geom_density(aes(fill = ab.class), alpha = 0.4)
What I want to do is add text labels somewhere near the peak of each density, showing the number of samples in each group. However, I cannot find the right combination of options to summarise the data in this way.
I tried to adapt the code suggested in this answer to a similar question on boxplots: https://stackoverflow.com/a/15720769/1836013
n_fun <- function(x){
return(data.frame(y = max(x), label = paste0("n = ",length(x))))
}
ggplot(df, aes(x = val, group = ab.class)) +
geom_density(aes(fill = ab.class), alpha = 0.4) +
stat_summary(geom = "text", fun.data = n_fun)
However, this fails with Error: stat_summary requires the following missing aesthetics: y.
I also tried adding y = ..density.. within aes() for each of the geom_density() and stat_summary() layers, and in the ggplot() object itself. None of these solved the problem.
I know this could be achieved by manually adding labels for each group, but I was hoping for a solution that generalises and, for example, allows the label colour to be set via aes() to match the densities.
Where am I going wrong?
Recommended answer
The y in the return value of fun.data is not the y aesthetic. stat_summary complains that it cannot find y, which should be specified in the global settings at ggplot(df, aes(x = val, group = ab.class, y = or, if no global y is available, in stat_summary(aes(y = . fun.data computes where to display the point/text/... at each x, based on the y supplied from the data through aes. (I am not sure whether I have made this clear; I am not a native English speaker.)
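As a minimal sketch of that point, here is the case in which the question's n_fun does work: a boxplot-style layout where y is mapped to the data and x is discrete, as in the linked boxplot answer. It reuses the sample data from the question; the name p_box and the vjust offset are only illustrative choices, not part of the original answer.

```r
library(ggplot2)

# Sample data from the question
set.seed(100)
df <- data.frame(ab.class = c(rep("A", 200), rep("B", 200)),
                 val = c(rnorm(200, 0, 1), rnorm(200, 1, 1)))

# fun.data receives the y values found at one x position and returns
# the y at which to draw the label, plus the label text
n_fun <- function(x) {
  data.frame(y = max(x), label = paste0("n = ", length(x)))
}

# y is mapped here, so stat_summary has values to hand to n_fun
# for each x (one call per class)
p_box <- ggplot(df, aes(x = ab.class, y = val)) +
  geom_boxplot(aes(fill = ab.class), alpha = 0.4) +
  stat_summary(geom = "text", fun.data = n_fun, vjust = -0.5)
p_box
```

Each class gets one label because each class occupies one x position. In the density plot, x is the continuous val and no y is mapped, so there is nothing comparable for stat_summary to summarise.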
Even if you specify y through aes, you won't get the desired result, because stat_summary computes a y at each x.
However, you can add text at the desired positions with geom_text or annotate:
# save the plot as p
p <- ggplot(df, aes(x = val, group = ab.class)) +
geom_density(aes(fill = ab.class), alpha = 0.4)
# build the data displayed on the plot.
p.data <- ggplot_build(p)$data[[1]]
# Column 'scaled' is the density rescaled to a maximum of 1 within each group,
# so its maximum marks the peak; extract that row for each group
p.text <- lapply(split(p.data, f = p.data$group), function(df){
df[which.max(df$scaled), ]
})
p.text <- do.call(rbind, p.text) # we can also get p.text with dplyr.
# now add the text layer to the plot
p + annotate('text', x = p.text$x, y = p.text$y,
label = sprintf('n = %d', p.text$n), vjust = 0)
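The question also asked for a version that generalises and lets the label colour follow the groups through aes(). One possible sketch, which is a different technique from the ggplot_build approach above, is to compute a small label data frame with base R's density() and pass it to geom_text. The names peak_labels and p_labelled are hypothetical, and the sketch assumes the default bandwidth and kernel of geom_density, so the labels should land close to the drawn peaks rather than exactly on them.

```r
library(ggplot2)

# Sample data from the question
set.seed(100)
df <- data.frame(ab.class = c(rep("A", 200), rep("B", 200)),
                 val = c(rnorm(200, 0, 1), rnorm(200, 1, 1)))

# One row per group: approximate peak position from base density()
# (assumed to use the same default bandwidth and kernel as geom_density)
# and the group size as the label text
peak_labels <- do.call(rbind, lapply(split(df, df$ab.class), function(g) {
  d <- density(g$val)
  i <- which.max(d$y)
  data.frame(ab.class = g$ab.class[1], x = d$x[i], y = d$y[i],
             label = paste0("n = ", nrow(g)))
}))

# The label layer has its own data, so colour can be mapped per group
p_labelled <- ggplot(df, aes(x = val)) +
  geom_density(aes(fill = ab.class), alpha = 0.4) +
  geom_text(data = peak_labels,
            aes(x = x, y = y, label = label, colour = ab.class),
            vjust = -0.5, show.legend = FALSE)
p_labelled
```

Because the labels live in an ordinary data frame, adding a third class to df needs no change to the plotting code.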