axis.break and ggplot2 or gap.plot? plot may be too complex

Question

I created a plot with ggplot2 showing milk protein content. I have two groups and four treatments, and I want to show the interaction between group and treatment using means and error bars. The protein content starts at 2.6%. Right now my y-axis starts there without a gap, but my supervisor wants an axis break. I tried axis.break() from the plotrix package, but nothing happened. I also tried to rebuild the graphic with gap.plot, without success; I must admit that I'm no R hero.

Here is the code for my figure:

library(ggplot2)

# D: data frame with columns treat, group, Prot (mean) and Prot_SD
Protein <- ggplot(data = D, aes(x = treat, y = Prot, group = group, shape = group)) +
  geom_line(aes(linetype = group), size = 1, position = position_dodge(0.2)) +
  geom_point(size = 3, position = position_dodge(0.2)) +
  geom_errorbar(aes(ymin = Prot - Prot_SD, ymax = Prot + Prot_SD), width = 0.2,
                position = position_dodge(0.2)) +
  scale_shape_discrete(name = 'group\n',
                       labels = c('1\n(n = 22,19,16,20)\n', '2\n(n = 15,12,14,12)')) +
  scale_linetype_discrete(name = "group\n",
                          labels = c('control\n(n = 22,19,16,20)\n',
                                     'free-contact\n(n = 15,12,14,12)')) +
  scale_x_discrete(labels = c('0', '1', '2', '3')) +
  labs(x = '\ntreatment', y = 'protein content (%)\n')

# Add the letter annotations above each treatment
ProtStar <- Protein + annotate("text", x = c(1, 2, 3, 4), y = c(3.25, 3.25, 3.25, 3.25),
                               label = c("Aa", "Aa", "Ab", "Ba"), size = 4)
plot(ProtStar)

Unfortunately I do not have enough reputation to post images, but you can probably see from the code that the graphic is complex.

It would be fantastic if you had any useful suggestions. Thanks a lot!

Recommended answer

TL;DR: at the bottom.

Consider the following figure:

ggplot(iris, aes(Species, Sepal.Length)) + geom_boxplot() + 
  theme_classic()

This is your basic plot. Now you have to consider the y-axis.

ggplot(iris, aes(Species, Sepal.Length)) + geom_boxplot() + 
  theme_classic() +
  scale_y_continuous(limits = c(0,NA), expand = c(0,0))

This is the least misleading way of emphasizing that the data have a zero floor, even if no actual points fall below a certain value. Percent milk protein is a good example: negative values are impossible and you want to emphasize that, even though no observations were near zero.

This also compresses the data into a smaller part of the y-axis, so differences between the observations look smaller. If that is something you want to emphasize, it can be good. But if the natural range of the data is narrow, including the zero (and the resulting empty space) is misleading. For example, if milk protein is always between 2.6% and 2.7%, then zero is not a true floor for the data; it is just as impossible as -50%.
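To put a rough number on that trade-off, here is a minimal sketch in base R that computes how much of a zero-based y-axis the data actually occupy. The iris figures come from the built-in dataset; the `narrow` vector is made up purely to mimic the 2.6% to 2.7% example and is not real milk data.

```r
# Share of a zero-based y-axis that the observed data occupy
y <- iris$Sepal.Length
data_span <- diff(range(y))     # distance between the smallest and largest value
axis_span <- max(y) - 0         # axis runs from 0 up to the maximum
share <- data_span / axis_span
round(100 * share, 1)           # 45.6 (percent of the axis height)

# Hypothetical narrow-range variable (made-up values, illustration only)
narrow <- c(2.60, 2.63, 2.67, 2.70)
narrow_share <- diff(range(narrow)) / max(narrow)
round(100 * narrow_share, 1)    # 3.7
```

With a narrow natural range, almost the entire panel would be empty space, which is the case where a zero-based axis becomes misleading.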

ggplot(iris, aes(Species, Sepal.Length)) + geom_boxplot() + 
  theme_classic() +
  scale_y_continuous(limits = c(0,NA), expand = c(0,0)) + 
  theme(axis.line.y = element_blank()) +
  annotate(geom = "segment", x = -Inf, xend = -Inf, y = -Inf, yend = Inf) 

There are many reasons not to include a broken y-axis. Many people regard a break placed inside the range of the data as unethical or misleading. But in this particular case the break sits at the outer limit, beyond the actual data. I think the rules can be bent a bit for that.

The first step is to remove the automatic y-axis line and draw it in "by hand" using annotate. Notice that the figure looks identical to the previous one. If your theme of choice uses a lot of different sizes, you're gonna have a bad time, since the hand-drawn line has to match them.

ggplot(iris, aes(Species, Sepal.Length)) + geom_boxplot() + 
  theme_classic() + 
  scale_y_continuous(limits = c(3.5,NA), expand = c(0,0), 
                     breaks = c(3.5, 4:7)) + 
  theme(axis.line.y = element_blank()) +
  annotate(geom = "segment", x = -Inf, xend = -Inf, y = -Inf, yend = Inf)

Now you can consider where the actual data begin and where a good spot for the break is. You have to check by hand, e.g. with min(iris$Sepal.Length), and consider where the tick marks will go. This is a personal judgment call.

I found that the lowest value was 4.3. I knew I wanted the break to sit below the minimum and to be about 0.5 units long. So I chose to put a tick mark at 3.5, and then one at each integer after that, with breaks = c(3.5, 4:7).
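The same numbers can also be derived from the data instead of being read off by eye. The helper below is only a sketch: `fake_zero_axis` and its `gap` argument are hypothetical names introduced here, not part of ggplot2, and it assumes integer tick spacing as in the iris example. It also prepares the labels that relabel the lowest tick as 0.

```r
# Derive limits, breaks and labels for a "fake zero" axis from the data.
# fake_zero_axis() and gap are illustrative names, not a ggplot2 API.
fake_zero_axis <- function(values, gap = 0.5) {
  lowest_tick  <- floor(min(values, na.rm = TRUE))  # lowest real tick, at or below the minimum
  highest_tick <- floor(max(values, na.rm = TRUE))  # highest integer tick within the data
  real_ticks   <- seq(lowest_tick, highest_tick, by = 1)
  fake_zero    <- lowest_tick - gap                 # position of the tick to be shown as 0
  list(
    limits = c(fake_zero, NA),          # NA lets ggplot2 pick the upper limit
    breaks = c(fake_zero, real_ticks),
    labels = c(0, real_ticks)           # the lowest tick is labelled 0
  )
}

ax <- fake_zero_axis(iris$Sepal.Length)
ax$breaks  # 3.5 4.0 5.0 6.0 7.0
ax$labels  # 0 4 5 6 7
```

The result can be passed on as scale_y_continuous(limits = ax$limits, breaks = ax$breaks, labels = ax$labels, expand = c(0, 0)). Choosing the gap is still the judgment call described above.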

ggplot(iris, aes(Species, Sepal.Length)) + geom_boxplot() + 
  theme_classic() + 
  scale_y_continuous(limits = c(3.5,NA), expand = c(0,0), 
                     breaks = c(3.5, 4:7), labels = c(0, 4:7)) + 
  theme(axis.line.y = element_blank()) +
  annotate(geom = "segment", x = -Inf, xend = -Inf, y = -Inf, yend = Inf)

Now we need to relabel the 3.5 tick as a fake zero with labels = c(0, 4:7).

ggplot(iris, aes(Species, Sepal.Length)) + geom_boxplot() + 
  theme_classic() + 
  scale_y_continuous(limits = c(3.5,NA), expand = c(0,0), 
                     breaks = c(3.5, 4:7), labels = c(0, 4:7)) + 
  theme(axis.line.y = element_blank()) +
  annotate(geom = "segment", x = -Inf, xend = -Inf, y = -Inf, yend = Inf) +
  annotate(geom = "segment", x = -Inf, xend = -Inf, y =  3.5, yend = 4,
           linetype = "dashed", color = "white")

Now we draw a white dashed line over the manually drawn axis line, going from our fake zero (y = 3.5) to the lowest true tick mark (y = 4).

Consider that the grammar of graphics is a mature philosophy; that is to say, each element has thoughtful reasoning behind it. This is finicky to do for good reasons, and you need to consider whether your own reasons carry enough weight on the other side.
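As a rough sketch of how the same steps might carry over to the plot in the question: the data frame below is made up to match the described layout (two groups, four treatments, means and SDs) and does not contain the asker's values. The fake-zero position of 2.4 and the tick positions are assumptions chosen so that every error bar stays inside the axis limits.

```r
library(ggplot2)

# Made-up summary data in the shape the question describes (NOT the asker's values)
D <- data.frame(
  treat   = factor(rep(0:3, times = 2)),
  group   = factor(rep(c("control", "free-contact"), each = 4)),
  Prot    = c(3.05, 3.10, 2.95, 3.00, 3.00, 3.05, 3.10, 2.90),
  Prot_SD = c(0.15, 0.12, 0.14, 0.13, 0.16, 0.15, 0.12, 0.14)
)

fake_zero  <- 2.4                     # assumed position of the tick relabelled as 0
real_ticks <- c(2.6, 2.8, 3.0, 3.2)   # assumed real tick positions

p <- ggplot(D, aes(x = treat, y = Prot, group = group, shape = group)) +
  geom_line(aes(linetype = group), position = position_dodge(0.2)) +
  geom_point(size = 3, position = position_dodge(0.2)) +
  geom_errorbar(aes(ymin = Prot - Prot_SD, ymax = Prot + Prot_SD),
                width = 0.2, position = position_dodge(0.2)) +
  theme_classic() +
  scale_y_continuous(limits = c(fake_zero, NA), expand = c(0, 0),
                     breaks = c(fake_zero, real_ticks),
                     labels = c("0", "2.6", "2.8", "3.0", "3.2")) +
  theme(axis.line.y = element_blank()) +
  # hand-drawn y-axis line
  annotate("segment", x = -Inf, xend = -Inf, y = -Inf, yend = Inf) +
  # white dashed overlay between the fake zero and the lowest real tick
  annotate("segment", x = -Inf, xend = -Inf, y = fake_zero, yend = min(real_ticks),
           linetype = "dashed", colour = "white") +
  labs(x = "treatment", y = "protein content (%)")
print(p)
```

Keep the fake zero below the lowest error-bar end: values outside the scale limits are dropped, so an error bar reaching below the lower limit would disappear from the plot.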
