Overlay normal density curves in R using ggplot

Problem description

I'm trying to overlay normal density curves on my stacked histograms in R using ggplot. bsa holds numeric measurements recorded for two groups, treatment and control.

I have created stacked histograms for the two groups. I get an error with stat_function about the mapping needing to be a list of unevaluated mappings.

Any advice on how to do this would be appreciated.

ggplot(data=bsa, aes(x=bsa)) +geom_histogram(colours(distinct=TRUE)) + facet_grid(group~.) +
  stat_function(dnorm(x, mean(bsa$bsa),sd(bsa$bsa)))+
  ggtitle("Histogram of BSA amounts by group")  

Solution

Using stat_function(...) with facets is tricky. stat_function(...) takes an argument args=... which needs to be a named list of the extra arguments to the function (so in your case, mean and sd). The problem is that these cannot appear in aes(...) so you have to add the curves manually. Here is an example.
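
First, a minimal unfaceted sketch of the fun/args form, since that is the part the call in the question gets wrong: the function itself goes in fun, and its extra arguments go in args as a named list of plain numbers. The data frame one_group and the names m and s are made up for illustration. The sketch also assumes ggplot2 3.3.0 or later, where after_stat(density) is the current spelling of the ..density.. notation used in the answer's code. The faceted example from the answer follows it.

```r
library(ggplot2)

# Hypothetical single-group data, for illustration only
set.seed(1)
one_group <- data.frame(bsa = rnorm(100, mean = 4))

# Compute the parameters as plain numbers outside aes()
m <- mean(one_group$bsa)
s <- sd(one_group$bsa)

# Pass the function via `fun` and its extra arguments via `args`
p <- ggplot(one_group, aes(x = bsa)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 fill = "grey80", color = "grey30") +
  stat_function(fun = dnorm, args = list(mean = m, sd = s))

print(p)
```

With only one group this is all that is needed. The difficulty discussed here comes from needing a different mean and sd in each facet.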

set.seed(1)   # for reproducible example
df <- data.frame(bsa=rnorm(200, mean=rep(c(1,4),each=100)), 
                 group=rep(c("test","control"),each=100))
# calculate mean and sd by group
stats <- aggregate(bsa~group, df, function(x) c(mean=mean(x), sd=sd(x)))
stats <- data.frame(group=stats[,1],stats[,2])

library(ggplot2)
ggplot(df, aes(x=bsa)) +
  geom_histogram(aes(y=..density..,fill=group), color="grey30")+
  with(stats[stats$group=="control",],stat_function(data=df[df$group=="control",],fun=dnorm, args=list(mean=mean, sd=sd)))+
  with(stats[stats$group=="test",],stat_function(data=df[df$group=="test",],fun=dnorm, args=list(mean=mean, sd=sd)))+
  facet_grid(group~.)

This is rather ugly, so I usually just calculate the curves outside of ggplot and add them using geom_line(...).

x <- with(df, seq(min(bsa), max(bsa), len=100))
dfn <- do.call(rbind,lapply(1:nrow(stats), 
                            function(i) with(stats[i,],data.frame(group, x, y=dnorm(x,mean=mean,sd=sd)))))
ggplot(df, aes(x=bsa)) +
  geom_histogram(aes(y=..density..,fill=group), color="grey30")+
  geom_line(data=dfn, aes(x, y))+
  facet_grid(group~.)

This makes the ggplot code much more readable and produces pretty much the same thing.
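
If the separate stats table is not needed for anything else, the curve data can probably be built in one pass by splitting the data by group. This is a sketch of an alternative, not the code from the answer. It uses only base R, and rebuilds the same example data so that it runs on its own.

```r
# Same example data as in the answer
set.seed(1)
df <- data.frame(bsa = rnorm(200, mean = rep(c(1, 4), each = 100)),
                 group = rep(c("test", "control"), each = 100))

# Common x grid spanning the whole range of bsa
x <- seq(min(df$bsa), max(df$bsa), length.out = 100)

# One normal curve per group, using that group's own mean and sd
dfn <- do.call(rbind, lapply(split(df, df$group), function(d) {
  data.frame(group = d$group[1],
             x = x,
             y = dnorm(x, mean = mean(d$bsa), sd = sd(d$bsa)))
}))

head(dfn)
```

The resulting dfn has the same group, x and y columns, so it can be passed to geom_line(data=dfn, aes(x, y)) in the same way.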

Notice that if you wanted to overlay a kernel density estimate, rather than a normal curve, this would be a lot easier:

ggplot(df, aes(x=bsa)) +
  geom_histogram(aes(y=..density..,fill=group), color="grey30")+
  stat_density(geom="line")+
  facet_grid(group~.)
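
To see roughly what that line represents, the kernel density estimate can also be computed per group in base R with density(). This is only a sketch: as far as I know stat_density relies on the same estimator with its default bandwidth, but it evaluates it over a different x range, so the numbers should be close to the plotted line rather than identical. The names kde and kde_df are made up for illustration.

```r
# Same example data as in the answer
set.seed(1)
df <- data.frame(bsa = rnorm(200, mean = rep(c(1, 4), each = 100)),
                 group = rep(c("test", "control"), each = 100))

# One kernel density estimate per group (default Gaussian kernel)
kde <- lapply(split(df$bsa, df$group), density)

# Stack the estimates into one data frame with a group column
kde_df <- do.call(rbind, lapply(names(kde), function(g) {
  data.frame(group = g, x = kde[[g]]$x, y = kde[[g]]$y)
}))

head(kde_df)
```

Unlike the normal curve, this estimate makes no assumption about the shape of the distribution, which is why no mean or sd has to be supplied.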
