ggplot2: histogram with normal curve


Problem description




I've been trying to superimpose a normal curve over my histogram with ggplot2.

My code:

data <- read.csv (path...)

ggplot(data, aes(V2)) + 
  geom_histogram(alpha=0.3, fill='white', colour='black', binwidth=.04)

I tried several things:

+ stat_function(fun=dnorm)  

...didn't change anything.

+ stat_density(geom = "line", colour = "red")

...gave me a straight red line on the x-axis.

+ geom_density()  

doesn't work for me because I want to keep my frequency values on the y-axis, and want no density values.

Any suggestions?

Thanks in advance for any tips!

Solution found!

+geom_density(aes(y=0.045*..count..), colour="black", adjust=4)
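A note on the scaling in that line: the `count` variable computed by `stat_density` is the density multiplied by the number of observations, so multiplying it by the bin width (0.04 in the question, close to the 0.045 used above) puts the curve on the same scale as the histogram counts. Below is a minimal sketch of the same idea. The real CSV is not available, so the simulated `data` and its `V2` column are stand-in assumptions, and `after_stat()` (available in ggplot2 3.3.0 and later) is used in place of the older `..count..` notation. Keep in mind that this draws a smoothed kernel density estimate, not a fitted normal curve.

```r
library(ggplot2)

set.seed(1)
# Stand-in for the question's data (assumption: one numeric column named V2)
data <- data.frame(V2 = rnorm(500, mean = 1, sd = 0.2))
binwidth <- 0.04

p_kde <- ggplot(data, aes(V2)) +
  geom_histogram(alpha = 0.3, fill = "white", colour = "black",
                 binwidth = binwidth) +
  # count = density * n, so count * binwidth is on the histogram's count scale
  geom_density(aes(y = after_stat(count) * binwidth),
               colour = "black", adjust = 4)
p_kde
```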

Solution

This has been answered here and partially here.

If you want the y-axis to have frequency counts, then the normal curve needs to be scaled according to the number of observations and the binwidth.
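The reasoning behind that scaling can be checked numerically without any plotting. A bin of width `binwidth` centred at `x` is expected to hold roughly `n * binwidth * dnorm(x, mean, sd)` observations, because the density times the bin width approximates the probability of landing in that bin. The sketch below is only an illustration with simulated data and an arbitrary seed; it compares observed bin counts with that expectation.

```r
set.seed(1)
n        <- 1000
binwidth <- 2
height   <- rnorm(n, mean = 165, sd = 6.6)

# Bin the data without drawing anything
breaks <- seq(130, 200, by = binwidth)
h <- hist(height, breaks = breaks, plot = FALSE)

# Expected count per bin: density at the bin midpoint * n * binwidth
expected <- dnorm(h$mids, mean = 165, sd = 6.6) * n * binwidth

comparison <- data.frame(mid = h$mids,
                         observed = h$counts,
                         expected = round(expected, 1))
print(comparison)
```

The observed and expected columns should track each other closely, which is why the same factor is applied to the curve in the plot.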

# Simulate some data. Individuals' heights in cm.
n        <- 1000
mean     <- 165
sd       <- 6.6
binwidth <- 2
height <- rnorm(n, mean, sd)


qplot(height, geom = "histogram", breaks = seq(130, 200, binwidth), 
      colour = I("black"), fill = I("white"),
      xlab = "Height (cm)", ylab = "Count") +
  # Create normal curve, adjusting for number of observations and binwidth
  stat_function( 
    fun = function(x, mean, sd, n, bw){ 
      dnorm(x = x, mean = mean, sd = sd) * n * bw
    }, 
    args = list(mean = mean, sd = sd, n = n, bw = binwidth))
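Applied to the situation in the question, the same idea might look like the sketch below. It uses `ggplot()` rather than `qplot()`, which is deprecated as of ggplot2 3.4.0, and estimates the mean and standard deviation from the data instead of assuming them. The simulated `data` frame is a stand-in assumption for the CSV, which is not available. One likely reason the bare `stat_function(fun = dnorm)` appeared to do nothing is that `dnorm` defaults to mean 0 and sd 1 and never exceeds about 0.4, so on a count axis the curve is practically flat and may not even sit over the data.

```r
library(ggplot2)

set.seed(42)
# Stand-in for the question's data (assumption: one numeric column named V2)
data <- data.frame(V2 = rnorm(500, mean = 1, sd = 0.2))

binwidth <- 0.04
n <- nrow(data)
m <- mean(data$V2)   # estimated from the data
s <- sd(data$V2)

p <- ggplot(data, aes(V2)) +
  geom_histogram(alpha = 0.3, fill = "white", colour = "black",
                 binwidth = binwidth) +
  # Normal density rescaled to counts: density * n * binwidth
  stat_function(
    fun = function(x) dnorm(x, mean = m, sd = s) * n * binwidth,
    colour = "red"
  ) +
  ylab("Count")
p
```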

EDIT

Or, for a more flexible approach that allows for use of facets and draws upon an approach listed here, create a separate dataset containing the data for the normal curves and overlay these.

library(plyr)

dd <- data.frame(
  predicted = rnorm(720, mean = 2, sd = 2),
  state = rep(c("A", "B", "C"), each = 240)
) 

binwidth <- 0.5

grid <- with(dd, seq(min(predicted), max(predicted), length = 100))
normaldens <- ddply(dd, "state", function(df) {
  data.frame( 
    predicted = grid,
    normal_curve = dnorm(grid, mean(df$predicted), sd(df$predicted)) * length(df$predicted) * binwidth
  )
})

ggplot(dd, aes(predicted))  + 
  geom_histogram(breaks = seq(-3,10, binwidth), colour = "black", fill = "white") + 
  geom_line(aes(y = normal_curve), data = normaldens, colour = "red") +
  facet_wrap(~ state)
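If you would rather not depend on plyr, which is retired, the per-group curves can also be built with base R. The following is a sketch under the same assumptions as the example above (simulated data, three groups of 240); `split()` plus `lapply()` stands in for `ddply()`, and the seed is arbitrary.

```r
library(ggplot2)

set.seed(123)
dd <- data.frame(
  predicted = rnorm(720, mean = 2, sd = 2),
  state = rep(c("A", "B", "C"), each = 240)
)
binwidth <- 0.5
grid <- seq(min(dd$predicted), max(dd$predicted), length.out = 100)

# One scaled normal curve per state, stacked into a single data frame
normaldens <- do.call(rbind, lapply(split(dd, dd$state), function(df) {
  data.frame(
    state = df$state[1],
    predicted = grid,
    normal_curve = dnorm(grid, mean(df$predicted), sd(df$predicted)) *
      nrow(df) * binwidth
  )
}))

p_facets <- ggplot(dd, aes(predicted)) +
  geom_histogram(binwidth = binwidth, colour = "black", fill = "white") +
  geom_line(aes(y = normal_curve), data = normaldens, colour = "red") +
  facet_wrap(~ state)
p_facets
```

Because `normaldens` carries its own `state` column, `facet_wrap()` routes each curve to the matching panel.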
