Overlay histogram with empirical density and dnorm function


Question

I want to overlay a ggplot histogram (y-axis = counts) with both the empirical density curve and the normal density curve. I tried:

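library(ggplot2)
library(tibble)  # provides tibble()
set.seed(1234)
# 1000 draws from N(mean = 10, sd = 2.5), stored in a column named "value"
v <- tibble(value = rnorm(1000, 10, 2.5))
ggplot(v, aes(x = value)) +
        # histogram on the density scale (newer ggplot2 versions prefer after_stat(density))
        geom_histogram(aes(y = ..density..),
                       bins = 40, colour = "black", fill = "white") +
        # kernel density estimate of the sample
        geom_line(aes(y = ..density.., color = 'Empirical'), stat = 'density') +
        # theoretical normal density
        stat_function(fun = dnorm, aes(color = 'Normal'),
                         args = list(mean = 10, sd = 2.5)) +
        scale_colour_manual(name = "Colors", values = c("red", "blue"))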

But this puts density on the y-axis, and I want frequencies (counts) there instead.

My second attempt produced a plot with frequencies (counts) on the y-axis, but only with the empirical density curve.

library(ggplot2)
library(tibble)  # provides tibble()
set.seed(1234)
v <- tibble(value = rnorm(1000, 10, 2.5))
b  <- seq(0, 20, by = 0.5)  # bin boundaries, 0.5 wide
p1 <- ggplot(v, aes(x = value)) +
    geom_histogram(aes(y = ..count..),
                   breaks = b,
                   binwidth = 0.5,  # redundant: breaks already fixes the bin width
                   colour = "black",
                   fill = "white") +
    # rescale the density to counts: density * n * binwidth
    geom_line(aes(y = ..density.. * (1000 * 0.5),
                    color = 'Empirical'),
                    stat = 'density') +
    scale_colour_manual(name = "Colors", values = c("red", "blue"))

I could not manage to display a dnorm curve in the same plot as well. When I tried the following lines, for instance, the normal curve (blue line) ended up lying flat along the x-axis.

p2 <- p1 + stat_function(fun = dnorm, aes(color = 'Normal'),
                     args = list(mean = 10, sd = 2.5))
p2  

I assume that I have to scale the curve by the bin width (as I did for the empirical line), but I don't know how to do that.

I searched SO for this problem and found many similar questions. But all of them either addressed my first attempt (density on the y-axis), showed only an empirical overlay on a count axis (my second attempt), or used other (base) plotting commands that I am not familiar with.

Recommended answer

I rewrote my code following the link from @user20650 and applied the answer by @PatrickT to my problem.

library(ggplot2)
library(tibble)  # provides tibble()
# note: these names mask base::mean() and base::sd() as variables
n <- 1000
mean <- 10
sd <- 2.5
binwidth <- 0.5
set.seed(1234)
v <- tibble(value = rnorm(n, mean, sd))
b  <- seq(0, 20, by = binwidth)  # bin boundaries
ggplot(v, aes(x = value, mean = mean, sd = sd, binwidth = binwidth, n = n)) +
    geom_histogram(aes(y = ..count..),
           breaks = b,
           binwidth = binwidth,
           colour = "black",
           fill = "white") +
    # empirical density rescaled to counts
    geom_line(aes(y = ..density.. * n * binwidth, colour = "Empirical"),
           size = 1, stat = 'density') +
    # normal density rescaled to counts in the same way
    stat_function(fun = function(x)
           {dnorm(x, mean = mean, sd = sd) * n * binwidth},
           aes(colour = "Normal"), size = 1) +
    labs(x = "Score", y = "Frequency") +
    scale_colour_manual(name = "Line colors", values = c("red", "blue"))

The decisive change is in the stat_function call, which now scales the normal density by n and binwidth. A density integrates to 1, whereas the bin counts sum to n, so the expected count in a bin of width binwidth around x is roughly dnorm(x) * n * binwidth. I also had not known that one could pass parameters to aes().
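The scaling can also be checked numerically without ggplot2. The following is only a minimal sketch in base R: the names mu, sigma and expected are illustrative and not part of the answer above, and the breaks are taken from the data range (an assumption made here so that hist() accepts them for any sample).

```r
# Sketch: compare histogram counts with the normal density scaled by n * binwidth.
set.seed(1234)
n        <- 1000
mu       <- 10     # illustrative names that avoid masking mean() and sd()
sigma    <- 2.5
binwidth <- 0.5

x <- rnorm(n, mu, sigma)

# Bin boundaries covering the whole sample, spaced binwidth apart
breaks <- seq(floor(min(x)), ceiling(max(x)), by = binwidth)
h <- hist(x, breaks = breaks, plot = FALSE)

# Expected count per bin: density at the bin midpoint times n times binwidth
expected <- n * binwidth * dnorm(h$mids, mean = mu, sd = sigma)

# Observed and expected counts side by side
print(head(data.frame(mid = h$mids, observed = h$counts,
                      expected = round(expected, 1))))

# The unscaled density peaks at about 0.16, which is why an unscaled
# dnorm curve hugs the x-axis when the y-axis shows counts.
print(dnorm(mu, mean = mu, sd = sigma))
print(max(expected))
```

The observed counts fluctuate around the expected values, and both columns are on the same scale, which is what lets the rescaled curves sit on top of the count histogram.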
