Normalizing y-axis in histograms in R ggplot to proportion by group

Views: 1159 Published: 2018/4/24 20:27:19 Tags: r ggplot2 histogram


Question


My question is very similar to "Normalizing y-axis in histograms in R ggplot to proportion", except that I have two groups of data of different sizes, and I would like each proportion to be relative to its own group size instead of the total size.

To make it clearer, let's say I have two sets of data in a data frame:

    library(ggplot2)
    dataA <- rnorm(100, 3, sd = 2)
    dataB <- rnorm(400, 5, sd = 3)
    all <- data.frame(dataset = c(rep('A', length(dataA)), rep('B', length(dataB))),
                      value = c(dataA, dataB))

I can plot the two distributions together with:

    ggplot(all, aes(x = value, fill = dataset)) +
      geom_histogram(alpha = 0.5, position = 'identity', binwidth = 0.5)

and instead of the frequency on the Y axis I can have the proportion with:

    ggplot(all, aes(x = value, fill = dataset)) +
      geom_histogram(aes(y = ..count../sum(..count..)),
                     alpha = 0.5, position = 'identity', binwidth = 0.5)

But this gives the proportion relative to the total data size (500 points here): is it possible to have it relative to each group size?

My goal here is to make it possible to visually compare the proportion of values in a given bin between A and B, independently of their respective sizes. Ideas which differ from my original one are also welcome!

Thanks!

Solution

Like this? [edited based on OP's comment]

    ggplot(all, aes(x = value, fill = dataset)) +
      geom_histogram(aes(y = 0.5*..density..),
                     alpha = 0.5, position = 'identity', binwidth = 0.5)

Using y=..density.. scales each group's histogram so that the area under it is 1, i.e. sum(binwidth*y)=1 within each group. As a result, y = binwidth*..density.. makes y the fraction of that group's observations falling in each bin. In your case, binwidth=0.5.
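
To check the claim that binwidth times density gives a per-group proportion, here is a minimal sketch rather than part of the original answer. It makes a few assumptions: ggplot2 3.3.0 or later, so that `after_stat()` can stand in for the older `..density..` notation; the computed variable `width` from `stat_bin`, used in place of the hard-coded 0.5; and an arbitrary `set.seed(1)`, added only for reproducibility. The names `p`, `built` and `group_totals` are illustrative.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
dataA <- rnorm(100, 3, sd = 2)
dataB <- rnorm(400, 5, sd = 3)
all <- data.frame(dataset = c(rep("A", length(dataA)), rep("B", length(dataB))),
                  value = c(dataA, dataB))

# density is computed separately for each fill group, so density * width
# is the share of that group's points falling in each bin
p <- ggplot(all, aes(x = value, fill = dataset)) +
  geom_histogram(aes(y = after_stat(density * width)),
                 alpha = 0.5, position = "identity", binwidth = 0.5)

# Pull out the computed bar heights and add them up per group;
# each total should be 1, up to floating-point error
built <- ggplot_build(p)$data[[1]]
group_totals <- tapply(built$y, built$group, sum)
print(group_totals)
```

Using `width` instead of the literal 0.5 means the scaling still holds if `binwidth` is changed later.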

IMO this is a little easier to interpret:

    ggplot(all, aes(x = value, fill = dataset)) +
      geom_histogram(aes(y = 0.5*..density..), binwidth = 0.5) +
      facet_wrap(~dataset, nrow = 2)
