ggplot barplot: How to display small positive numbers with a log-scaled y-axis


Problem description

Main issue: I want to display data ranging from 0 to 1.0 as upward bars (starting from 0), but I want the axis intervals to be log spaced rather than equally spaced.

I am trying to display the column labeled "mean" in the dataset below as a bar plot in ggplot. Because the numbers are very small, I would like to show the y-axis on a log scale rather than log transform the data itself. In other words, I want upright bars with y-axis labels of 0, 1e-8, 1e-6, 1e-4, 1e-2 and 1e-0 (i.e. from 0 to 1.0, with log-scaled intervals).

My attempt below does not work: the bars come out inverted.

> print(df)
        type         mean           sd           se snp
V7    outer 1.596946e-07 2.967432e-06 1.009740e-08   A
V8    outer 7.472417e-07 6.598652e-06 2.245349e-08   B
V9    outer 1.352327e-07 2.515771e-06 8.560512e-09   C
V10   outer 2.307726e-07 3.235821e-06 1.101065e-08   D
V11   outer 4.598375e-06 1.653457e-05 5.626284e-08   E
V12   outer 5.963164e-07 5.372226e-06 1.828028e-08   F
V71  middle 2.035414e-07 3.246161e-06 1.104584e-08   A
V81  middle 9.000131e-07 7.261463e-06 2.470886e-08   B
V91  middle 1.647716e-07 2.875840e-06 9.785733e-09   C
V101 middle 3.290817e-07 3.886779e-06 1.322569e-08   D
V111 middle 6.371170e-06 1.986268e-05 6.758752e-08   E
V121 middle 8.312429e-07 6.329386e-06 2.153725e-08   F
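For anyone who wants to reproduce the plots, the printed data can be rebuilt along these lines. This is only a sketch: the values are copied from the `print(df)` output above, and storing `snp` and `type` as factors (with `outer` as the first level of `type`) is an assumption, made because the code in the answer calls `as.numeric()` on both columns.

```r
# Rebuild the data frame shown in the print(df) output above.
# Assumption: snp and type are factors; the level order of type is a guess.
df <- data.frame(
  type = factor(rep(c("outer", "middle"), each = 6),
                levels = c("outer", "middle")),
  mean = c(1.596946e-07, 7.472417e-07, 1.352327e-07, 2.307726e-07,
           4.598375e-06, 5.963164e-07, 2.035414e-07, 9.000131e-07,
           1.647716e-07, 3.290817e-07, 6.371170e-06, 8.312429e-07),
  sd   = c(2.967432e-06, 6.598652e-06, 2.515771e-06, 3.235821e-06,
           1.653457e-05, 5.372226e-06, 3.246161e-06, 7.261463e-06,
           2.875840e-06, 3.886779e-06, 1.986268e-05, 6.329386e-06),
  se   = c(1.009740e-08, 2.245349e-08, 8.560512e-09, 1.101065e-08,
           5.626284e-08, 1.828028e-08, 1.104584e-08, 2.470886e-08,
           9.785733e-09, 1.322569e-08, 6.758752e-08, 2.153725e-08),
  snp  = factor(rep(LETTERS[1:6], times = 2)),
  row.names = c("V7", "V8", "V9", "V10", "V11", "V12",
                "V71", "V81", "V91", "V101", "V111", "V121")
)
print(df)
```

Every `mean - se` value is positive here, so the lower ends of the error bars remain valid on a log scale.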

The code below correctly generates the grouped bar plot with error bars:

ggplot(data=df, aes(x=snp,y=mean,fill=type))+
  geom_bar(stat="identity",position=position_dodge(),width=0.5) + 
  geom_errorbar(aes(ymin=mean-se, ymax=mean+se),width=.3, position=position_dodge(.45)) 

However, I want to make the y-axis log scaled and so I add in scale_y_log10() as follows:

 ggplot(data=df, aes(x=snp,y=mean,fill=type))+
  geom_bar(stat="identity",position=position_dodge(),width=0.5) + scale_y_log10() +
  geom_errorbar(aes(ymin=mean-se, ymax=mean+se),width=.3, position=position_dodge(.45)) 

Strangely, the bars now hang down from the top of the plot. I simply want them to go up as usual, and I don't know what I am doing wrong.

Thank you

Solution

Here's a bit of hacking to show what happens if you try to get bars that start at zero on a log scale. I've used geom_segment for illustration, so that I can create "bars" (wide line segments, actually) extending over arbitrary ranges. To make this work, I've also had to do all the dodging manually, which is why the x mapping looks weird.

In the example below, the scale goes from y=1e-20 to y=1. The y-axis intervals are log scaled, meaning that the physical distance from, say, 1e-20 to 1e-19 is the same as the physical distance from, say, 1e-8 to 1e-7, even though the magnitudes of those intervals differ by a factor of one trillion.
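The arithmetic behind that statement can be checked with a few lines of base R. This is just an illustrative sketch; the variable names are made up for the example.

```r
# Two intervals that each span one decade.
lo <- c(1e-20, 1e-19)
hi <- c(1e-8, 1e-7)

# Distance on the log10 axis: one unit for each interval.
log_dist_lo <- diff(log10(lo))
log_dist_hi <- diff(log10(hi))

# Ratio of the actual interval widths: roughly 1e12, i.e. one trillion.
width_ratio <- diff(hi) / diff(lo)

print(c(log_dist_lo, log_dist_hi))
print(width_ratio)
```

Both intervals take up the same length on the axis, while the quantities they cover differ by about twelve orders of magnitude.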

Bars that go down to zero can't be displayed, because zero on the log scale is an infinite distance below the bottom of the graph. We could get closer to zero by, for example, changing 1e-20 to 1e-100 in the code below. But that will just make the already-small physical distances between the data values even smaller and thus even harder to distinguish.
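A short base R sketch of why this happens, and of why the bars in the question hang downward. It assumes, as I understand ggplot2's behaviour, that bars are anchored at 0 in the transformed (log10) space, which corresponds to y = 1 on the original scale; the variable names are only for illustration.

```r
# Zero has no finite position on a log10 axis.
log10(0)  # -Inf

# Smallest and largest means from the question's data.
vals <- c(1.352327e-07, 6.371170e-06)
log_vals <- log10(vals)  # both negative, since the values are below 1

# Assumed bar baseline: 0 in transformed space, i.e. y = 1 on the data scale.
baseline_transformed <- 0
baseline_data <- 10^baseline_transformed

# A bar runs from the baseline to the value, so values below 1 point downward.
bar_direction <- ifelse(log_vals < baseline_transformed, "down", "up")
print(data.frame(vals, log_vals, bar_direction))
```

Under that assumption, every value below 1 produces a bar drawn downward from y = 1, which matches the inverted bars described in the question.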

The bars are also misleading in another way, because, as @hrbrmstr pointed out, our brains treat distance along the bar linearly, but the magnitude represented by each increment of distance along the bar changes by a factor of 10 about every few millimeters in the example below. The bars simply aren't encoding meaningful information about the data.

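# Requires library(ggplot2); snp and type must be factors for as.numeric() to work.
# The x mapping dodges the two types by hand: 0.15 to either side of each snp position.
ggplot(data=df, aes(x=as.numeric(snp) + 0.3*(as.numeric(type) - 1.5), 
                    y=mean, colour=type)) +
  geom_errorbar(aes(ymin=mean-se, ymax=mean+se), width=.3) +
  # Wide segments stand in for bars; recent ggplot2 versions prefer linewidth over size.
  geom_segment(aes(xend=as.numeric(snp) + 0.3*(as.numeric(type) - 1.5),
                   y=1e-20, yend=mean), size=5) +
  scale_y_log10(limits=c(1e-20, 1), breaks=10^(-100:0), expand=c(0,0)) +
  scale_x_continuous(breaks=1:6, labels=LETTERS[1:6])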
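To see what the manual dodging expression produces, here is a small base R sketch. It assumes `snp` and `type` are factors with six and two levels respectively; the level order of `type` is a guess and only affects which group lands on the left.

```r
# Hypothetical factors mirroring the columns of df.
snp  <- factor(rep(LETTERS[1:6], times = 2))
type <- factor(rep(c("outer", "middle"), each = 6),
               levels = c("outer", "middle"))

# as.numeric(type) is 1 or 2, so (as.numeric(type) - 1.5) is -0.5 or +0.5,
# which shifts each point 0.15 to the left or right of its snp position.
x_pos <- as.numeric(snp) + 0.3 * (as.numeric(type) - 1.5)

print(data.frame(snp, type, x_pos))
```

With two groups the offsets are symmetric around each integer x position, which is why `scale_x_continuous(breaks=1:6, labels=LETTERS[1:6])` can relabel the axis with the snp names.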

If you want to stick with a log scale, maybe plotting points would be a better approach:

# Share one dodge width between the error bars and the points so they line up.
pd <- position_dodge(.5)
ggplot(data=df, aes(x=snp, y=mean, fill=type)) +
  geom_errorbar(aes(ymin=mean-se, ymax=mean+se, colour=type), width=.3, position=pd) +
  geom_point(aes(colour=type), position=pd) +
  # Limits are chosen to bracket the data, including the error bars.
  scale_y_log10(limits=c(1e-7, 1e-5), breaks=10^(-10:0)) +
  annotation_logticks(sides="l")
