How to interpret the different ggplot2 densities?

Views: 106 | Posted: 2020/11/14 0:38:21 | Tags: r, ggplot2

Question

I am confused about the meaning of the following variants of geom_density in ggplot.

Can someone please explain the difference between these four calls:

geom_density(aes_string(x=myvar))
geom_density(aes_string(x=myvar, y=..density..))
geom_density(aes_string(x=myvar, y=..scaled..))
geom_density(aes_string(x=myvar, y=..count../sum(..count..)))

My understanding is that:

geom_density alone produces a density whose area under the curve is 1
geom_density with ..density.. basically does the same... ?
..count../sum(..count..) normalizes the peak heights to be more like a normalized histogram, ensuring that all the heights sum to 1
..count.. by itself, without the denominator, just multiplies each bin by the number of items in it
..scaled.. rescales the density so that its maximum value is 1

I find ..scaled.. very counterintuitive and, if my interpretation of it is correct, have never seen it used, so I'd like to ignore it. I am mainly looking for clarification of the difference between geom_density and a kind of normalized density plot, which I am assuming requires the ..count../... argument. Thanks.

(Related: ggplot2 mapping variable to y and using stat="bin")

Recommended answer

The default y aesthetic for stat_density is ..density.., so a call to geom_density, which uses stat_density by default, plots y = ..density.. unless you say otherwise.
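To check the default for yourself, here is a minimal sketch. It assumes ggplot2 3.3.0 or later, where after_stat(density) is the current spelling of ..density.., and it uses a made-up data frame df with a column myvar. The area check uses a simple trapezoid rule; the result may fall slightly short of 1 because the curve is only evaluated over the range of the data.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(myvar = rnorm(200))  # hypothetical sample data

p_default  <- ggplot(df, aes(x = myvar)) + geom_density()
p_explicit <- ggplot(df, aes(x = myvar)) +
  geom_density(aes(y = after_stat(density)))

# layer_data() returns the data frame that the layer actually draws.
d_default  <- layer_data(p_default)
d_explicit <- layer_data(p_explicit)

# The default call and the explicit y = density call give the same curve.
same_curve <- isTRUE(all.equal(d_default$y, d_explicit$y))

# Trapezoid-rule area under the drawn curve; expected to be close to 1.
area <- sum(diff(d_default$x) *
            (head(d_default$y, -1) + tail(d_default$y, -1)) / 2)
```

If same_curve is TRUE, the first two calls in the question are interchangeable.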

Looking at the source code of stat_density, ..scaled.. is defined as

densdf$scaled <- densdf$y / max(densdf$y, na.rm = TRUE)

Feel free to ignore it if you wish.
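If you do want to see what ..scaled.. does, the following sketch (again assuming ggplot2 3.3.0 or later and a hypothetical data frame df) pulls the computed columns out of the layer and confirms that scaled is the density divided by its own maximum, so the peak sits at exactly 1.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(myvar = rnorm(200))  # hypothetical sample data

p <- ggplot(df, aes(x = myvar)) +
  geom_density(aes(y = after_stat(scaled)))

# The layer data keeps the computed variables alongside the mapped y.
d <- layer_data(p)

peak     <- max(d$scaled)               # height of the tallest point
rescaled <- d$density / max(d$density)  # the same rescaling done by hand
matches  <- isTRUE(all.equal(d$scaled, rescaled))
```

This is mostly useful when overlaying groups of very different sizes, where only the shape of each curve matters.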

Looking at the source code of stat_bin, the results are calculated like this

res <- within(results, {
    count[is.na(count)] <- 0
    density <- count / width / sum(abs(count), na.rm=TRUE)
    ncount <- count / max(abs(count), na.rm=TRUE)
    ndensity <- density / max(abs(density), na.rm=TRUE)
  })

So if you want to compare against geom_histogram (which uses stat = 'bin' by default), set y = ..density.. and it will calculate count / sum(count) for you, divided by the bin width, so that the bar areas sum to 1.
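As a rough check of that formula, this sketch recomputes the histogram density by hand from the layer data. The data frame df and the bin width of 0.5 are made up for illustration, and after_stat() assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(myvar = rnorm(200))  # hypothetical sample data

p <- ggplot(df, aes(x = myvar)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 0.5)

h <- layer_data(p)

# Rebuild density from the counts, following the stat_bin code above.
width   <- h$xmax - h$xmin
by_hand <- h$count / width / sum(h$count)
matches <- isTRUE(all.equal(h$density, by_hand))

# Because of the division by width, the bar areas (not the heights) sum to 1.
total_area <- sum(h$density * width)
```

Note that it is the bar heights times the widths that add up to 1, which is what makes the histogram comparable to a density curve.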

If you wanted to compare geom_density(aes(y = ..scaled..)) with stat_bin, you would use geom_histogram(aes(y = ..ndensity..)), since both are rescaled to a maximum of 1.
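A small sketch of that overlay, with a hypothetical data frame df and an arbitrary bin width (after_stat() assumes ggplot2 3.3.0 or later). Both layers should peak at 1, so they share a y axis without further work.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(myvar = rnorm(200))  # hypothetical sample data

p <- ggplot(df, aes(x = myvar)) +
  geom_histogram(aes(y = after_stat(ndensity)), binwidth = 0.5, alpha = 0.4) +
  geom_density(aes(y = after_stat(scaled)))

# Layer 1 is the histogram, layer 2 is the density curve.
hist_peak <- max(layer_data(p, 1)$ndensity)
dens_peak <- max(layer_data(p, 2)$scaled)
```

Printing p draws the two layers on top of each other.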

You could also get them on the same scale by using ..count.. in both; however, you would need to tune the adjust parameter of stat_density to get an appropriately detailed approximation of the curve.
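One caveat worth checking before relying on this: as far as I can tell, stat_density's count is the density times the number of observations, whereas a histogram bar's count also grows with the bin width, so the two only line up directly when the bin width is 1. The sketch below is one way to handle other bin widths, by multiplying the curve by the bin width. The names df and bw, and the values 0.5 and 0.75, are arbitrary choices for illustration; after_stat() assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(myvar = rnorm(200))  # hypothetical sample data
bw <- 0.5                             # hypothetical bin width

p <- ggplot(df, aes(x = myvar)) +
  geom_histogram(aes(y = after_stat(count)), binwidth = bw, alpha = 0.4) +
  # count * bw puts the curve on the histogram's scale;
  # adjust < 1 narrows the bandwidth for a more detailed curve.
  geom_density(aes(y = after_stat(count * bw)), adjust = 0.75)

h <- layer_data(p, 1)  # histogram layer
d <- layer_data(p, 2)  # density layer

# Total bar area is n * bw; the rescaled curve should enclose about the same.
hist_area  <- sum(h$count * bw)
curve_area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
```

Smaller values of adjust follow the bars more closely, and larger values give a smoother curve.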
