ggplot - Add regression line on a boxplot with binned (non-continuous) x-axis

Problem description

I have a dataset with this structure:

df<- data.frame (VPD.mean=rnorm(100,mean=2,sd=0.8), treatment=c("ambient","elevated"), variable=rnorm(100,mean=50,sd=10))
df$group <- with(df, as.factor (ifelse (VPD.mean>0 & VPD.mean<=1,"0-1",ifelse (
  VPD.mean>1 & VPD.mean<=1.5,"1-1.5",ifelse (
    VPD.mean >1.5 & VPD.mean<2, "1.5-2",ifelse (
      VPD.mean >=2 & VPD.mean<2.5, "2-2.5",ifelse (
        VPD.mean >=2.5 & VPD.mean <3,"2.5-3", ifelse(
          VPD.mean >=3,">3", NA)  
      )))))))
df$group<- factor(df$group,levels=c("0-1","1-1.5","1.5-2" ,"2-2.5","2.5-3",">3"))

I created a boxplot using the groups created after binning VPD.mean, and therefore the x-axis is non-continuous (see graph below):

I would also like to add a regression line (smooth), and therefore I would have to use the continuous variable (VPD.mean) instead of the binned one (groups) as x-axis. The result is not nice, because the smooth line doesn't match the x-axis of the graphs. This is the code for the ggplot:

ggplot(df[!is.na(df$group),], aes(group,variable,fill=treatment)) + 
  geom_boxplot(outlier.size = 0) + geom_smooth(aes(x=VPD.mean)) 

What's the solution to plot the geom_smooth from a different x-axis on the same graph? Thanks

Solution

It is possible to do what you ask, but it is a stunningly bad idea.

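set.seed(1)  # for reproducible example
df <- data.frame(VPD.mean = rnorm(100, mean = 2, sd = 0.8),
                 treatment = c("ambient", "elevated"),
                 variable = rnorm(100, mean = 50, sd = 10))
# cut() bins VPD.mean in one step; values <= 0 fall outside the breaks and become NA
df$group <- cut(df$VPD.mean,
                breaks = c(0, seq(1, 3, by = 0.5), Inf),
                labels = c("0-1", "1-1.5", "1.5-2", "2-2.5", "2.5-3", ">3"))
library(ggplot2)
ggplot(df[!is.na(df$group), ]) +
  # boxplots are placed on the discrete axis, one position per factor level
  geom_boxplot(aes(x = factor(group), y = variable, fill = treatment),
               position = position_dodge(.7), width = .8) +
  # the smooth is fitted against the integer factor codes, not against VPD.mean
  geom_smooth(aes(x = as.integer(group), y = variable, color = treatment, fill = treatment),
              method = "loess")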

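This works, more or less, because ggplot uses the factor codes for the x-axis positions and the factor levels for the axis labels, and as.integer(group) returns the factor codes. Note that the smooth is therefore fitted against the bin index (1, 2, 3, ...), not against VPD.mean itself. If your bins are not all the same width (and they are not, in your case: "0-1" is twice as wide as the middle bins and ">3" is open-ended), the plot can be misleading.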
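To make the point about factor codes concrete, here is a minimal base-R sketch (no plotting) that lists each level next to its integer code and the width of the bin it stands for. The names `breaks`, `labels`, `codes` and `widths` are introduced here only for illustration; the breaks and labels are the ones from the answer's `cut()` call.

```r
set.seed(1)  # same seed as the example above
VPD.mean <- rnorm(100, mean = 2, sd = 0.8)

breaks <- c(0, seq(1, 3, by = 0.5), Inf)
labels <- c("0-1", "1-1.5", "1.5-2", "2-2.5", "2.5-3", ">3")
group  <- cut(VPD.mean, breaks = breaks, labels = labels)

# Factor codes: evenly spaced integers, one per level
codes <- seq_along(levels(group))

# Width of each bin on the original VPD.mean scale (the last one is open-ended)
widths <- diff(breaks)

# Evenly spaced codes next to unevenly sized bins
print(data.frame(level = levels(group), code = codes, width = widths))

# as.integer() on the factor gives the code of each observation's bin
print(head(data.frame(VPD.mean, group, code = as.integer(group))))
```

The codes advance by 1 from one bin to the next, while the underlying VPD.mean range advances by 1, then by 0.5, and finally by an unbounded amount. A smooth drawn against the codes treats those steps as equal, which is the distortion the answer warns about.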
