Normalized bar heights in ggplot


Problem description

I am trying to compare two sets of count data with ggplot. The datasets have different lengths, and I am having trouble figuring out how to normalize the bar heights to the number of rows in each dataset. Please see the code examples below:

Example dataset

set.seed(47)
BG.restricted.hs = round(runif(100, min = 47, max = 1660380))
FG.hs = round(runif(1000, min = 0, max = 1820786))

dat = data.frame(x = c(BG.restricted.hs, FG.hs), 
             source = c(rep("BG", length(BG.restricted.hs)),
                        rep("FG", length(FG.hs))))
dat$bin = cut(dat$x, breaks = 200)

First attempt: no normalization. The bar heights differ greatly because the two datasets differ in size (100 rows versus 1000 rows)!

library(ggplot2)

ggplot(dat, aes(x = bin, fill = source)) +
    geom_bar(position = "identity", alpha = 0.2) +
    theme_bw() +
    scale_x_discrete(breaks = NULL)

Second attempt: normalization with the ..count.. computed variable

ggplot(dat, aes(x = bin, fill = source)) +
    geom_bar(aes(y = ..count../sum(..count..)), alpha = 0.5, position = "identity")

This produced a visually identical plot; only the overall y axis was rescaled. It seems that ..count.. does not take the labels in the "source" column into account, and despite hours of experimenting I cannot find a way to make it do so. Is this possible?

Recommended answer

I believe this should do it. Set source as the group in the ggplot call, so that the normalization is computed within each source rather than over all rows:

ggplot(dat, aes(x = bin, y = ..density.., group = source, fill = source)) +
    geom_bar(alpha = 0.5, position = 'identity')
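
As a rough check of why grouping matters, the sketch below rebuilds the question's data and compares the two normalizations in base R. Dividing by the grand total, as in the second attempt, puts both sources on one shared scale. Dividing within each source makes every source sum to 1. The plotting part is an assumption on my side, not part of the original answer: in recent ggplot2 releases geom_bar() defaults to stat = "count", whose computed variables are count and prop, so ..density.. may not be available there. The sketch therefore uses after_stat(prop), assuming ggplot2 3.3.0 or later. The names counts, global_share and within_share are made up for illustration.

```r
# Rebuild the question's data (same seed, so the same values and bins).
set.seed(47)
BG.restricted.hs <- round(runif(100, min = 47, max = 1660380))
FG.hs <- round(runif(1000, min = 0, max = 1820786))
dat <- data.frame(
  x = c(BG.restricted.hs, FG.hs),
  source = c(rep("BG", length(BG.restricted.hs)), rep("FG", length(FG.hs)))
)
dat$bin <- cut(dat$x, breaks = 200)

# Counts per bin and source: rows are bins, columns are sources.
counts <- table(dat$bin, dat$source)

# Second attempt: divide by the grand total (1100 rows),
# so BG and FG are both shrunk by the same factor.
global_share <- counts / sum(counts)

# Grouped normalization: divide each column by its own total.
within_share <- prop.table(counts, margin = 2)

print(colSums(global_share))  # roughly 0.09 for BG and 0.91 for FG
print(colSums(within_share))  # 1 for each source

# Assumption: ggplot2 >= 3.3.0 is installed, which provides after_stat().
if (requireNamespace("ggplot2", quietly = TRUE) &&
    packageVersion("ggplot2") >= "3.3.0") {
  library(ggplot2)
  # group = source makes stat_count compute prop within each source.
  p <- ggplot(dat, aes(x = bin, fill = source, group = source)) +
    geom_bar(aes(y = after_stat(prop)), alpha = 0.5, position = "identity") +
    theme_bw() +
    scale_x_discrete(breaks = NULL)
  print(p)
}
```

With the grouped version, the bar heights within each source add up to 1, so the BG and FG distributions can be compared on the same axis even though one has ten times as many rows.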
