"密度"曲线叠加在直方图上,其中纵轴是频率(又名计数)还是相对频率? [英] "Density" curve overlay on histogram where vertical axis is frequency (aka count) or relative frequency?
Is there a method to overlay something analogous to a density curve when the vertical axis is frequency or relative frequency? (Not an actual density function, since the area need not integrate to 1.) The following question is similar:
"ggplot2: histogram with normal curve", where the user self-answers with the idea of scaling by ..count.. inside geom_density(). However, this seems unusual.
The following code produces an overinflated "density" line.
library(ggplot2)

df1 <- data.frame(v = rnorm(164, mean = 9, sd = 1.5))
b1 <- seq(4.5, 12, by = 0.1)
hist.1a <- ggplot(df1, aes(v)) +
  stat_bin(aes(y = ..count..), color = "black", fill = "blue",
           breaks = b1) +
  # y = ..count.. makes geom_density() draw density * N, ignoring the bin width
  geom_density(aes(y = ..count..))
hist.1a
@joran's response/comment got me thinking about what the appropriate scaling factor would be. For posterity's sake, here's the result.
When Vertical Axis is Frequency (aka Count)
Thus, the scaling factor for a vertical axis measured in bin counts is N × bin width, because the expected count in a bin is approximately density × N × bin width.
In this case, with N = 164 and a bin width of 0.1, the aesthetic for y in the smoothed line should be:
y = ..density..*(164 * 0.1)
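As a rough sanity check of that factor, here is a minimal base-R sketch (no ggplot2 involved). It compares the area under a count histogram with the area under a kernel density estimate multiplied by N × bin width. The seed, the variable names, and the break range are assumptions made for illustration and are not part of the original post.

```r
# Base-R check of the scaling factor N * bin width (illustrative sketch).
set.seed(42)                          # assumption: fixed seed, any value works
v <- rnorm(164, mean = 9, sd = 1.5)   # same distribution as df1$v
bin_width <- 0.1
n_obs <- length(v)
scale_factor <- n_obs * bin_width     # 164 * 0.1

# Histogram in counts, with breaks wide enough to cover every observation.
breaks <- seq(floor(min(v)), ceiling(max(v)), by = bin_width)
h <- hist(v, breaks = breaks, plot = FALSE)
hist_area <- sum(h$counts * diff(h$breaks))

# Kernel density estimate rescaled to the count axis.
d <- density(v)
count_curve <- d$y * scale_factor

# Trapezoid-rule area under the rescaled curve.
curve_area <- sum(diff(d$x) * (head(count_curve, -1) + tail(count_curve, -1)) / 2)

print(c(scale_factor = scale_factor, hist_area = hist_area, curve_area = curve_area))
```

The histogram area equals N × bin width by construction, and the rescaled curve should come out very close to it, which is why the two layers line up on a count axis.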
Thus the following code produces a "density" line scaled for a histogram measured in frequency (aka count).
df1 <- data.frame(v = rnorm(164, mean = 9, sd = 1.5))
b1 <- seq(4.5, 12, by = 0.1)
hist.1a <- ggplot(df1, aes(x = v)) +
  geom_histogram(aes(y = ..count..), breaks = b1,
                 fill = "blue", color = "black") +
  geom_density(aes(y = ..density..*(164*0.1)))
hist.1a
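The hard-coded 164 and 0.1 can also be derived from the data and the breaks, which may be less error-prone if either changes. The following is only a sketch of that idea: the names n_obs, bin_width, scale_factor, and hist.1a_auto are hypothetical, and it assumes the breaks are equally spaced.

```r
library(ggplot2)

df1 <- data.frame(v = rnorm(164, mean = 9, sd = 1.5))
b1 <- seq(4.5, 12, by = 0.1)

# Derive both ingredients of the scaling factor from the objects themselves.
n_obs <- nrow(df1)             # N, here 164
bin_width <- diff(b1)[1]       # assumes equally spaced breaks, here 0.1
scale_factor <- n_obs * bin_width

# Same plot as hist.1a, with the factor computed instead of typed in.
hist.1a_auto <- ggplot(df1, aes(x = v)) +
  geom_histogram(aes(y = ..count..), breaks = b1,
                 fill = "blue", color = "black") +
  geom_density(aes(y = ..density.. * scale_factor))
hist.1a_auto
```

Recent ggplot2 releases (3.4.0 and later) warn that the ..density.. notation is deprecated in favour of after_stat(density); the sketch keeps the older notation only to match the code in this post.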
When Vertical Axis is Relative Frequency
Using the above, dividing the counts by N = 164 gives relative frequencies, so the curve only needs to be scaled by the bin width:
hist.1b <- ggplot(df1, aes(x = v)) +
  geom_histogram(aes(y = ..count../164), breaks = b1,
                 fill = "blue", color = "black") +
  geom_density(aes(y = ..density..*(0.1)))
hist.1b
When Vertical Axis is Density
No scaling is needed here, because both layers are already on the density scale.
hist.1c <- ggplot(df1, aes(x = v)) +
  geom_histogram(aes(y = ..density..), breaks = b1,
                 fill = "blue", color = "black") +
  geom_density(aes(y = ..density..))
hist.1c