ggplot boxplots with scatterplot overlay (same variables)


Problem description

I'm an undergrad researcher and I've been teaching myself R over the past few months. I just started trying ggplot, and have run into some trouble. I've made a series of boxplots looking at the depth of fish at different acoustic receiver stations. I'd like to add a scatterplot that shows the depths of the receiver stations. This is what I have so far:

data    <- read.csv(".....MPS.csv", header = TRUE)
df      <- data.frame(f1 = factor(data$Tagging.location),
                      f2 = factor(data$Station), data$Detection.depth)
df2     <- data.frame(f2 = factor(data$Station), data$depth)
df$f1f2 <- interaction(df$f1, df$f2)

plot1   <- ggplot(aes(y = data$Detection.depth, x = f2, fill = f1), data = df) +
             geom_boxplot() +
             stat_summary(fun.data = give.n, geom = "text",
                          position = position_dodge(height = 0, width = 0.75), size = 3)
plot1 + xlab("MPS Station") + ylab("Depth(m)") +
  theme(legend.title = element_blank()) + scale_y_reverse() +
  coord_cartesian(ylim = c(150, -10))

plot2   <- ggplot(aes(y = data$depth, x = f2), data = df2) + geom_point()
plot2 + scale_y_reverse() + coord_cartesian(ylim = c(150, -10)) +
  xlab("MPS Station") + ylab("Depth (m)")

Unfortunately, since I'm a new user in this forum, I'm not allowed to upload images of these two plots. My x-axis is "Stations" (which has 12 options) and my y-axis is "Depth" (0-150 m). The boxplots are colour-coded by tagging site (which has 2 options). The depths are coming from two different columns in my spreadsheet, and they cannot be combined into one.

My goal is to combine those two plots by adding "plot2" (the station depth scatterplot) to the "plot1" boxplots (detection depths). Both look at the same variables (depth and station), and they must share the same y-axis scale.

I think I could figure out a messy workaround if I were using the R base program, but I would like to learn ggplot properly, if possible. Any help is greatly appreciated!

Solution

Update: I was confused by the language used in the original post, and wrote a slightly more complicated answer than necessary. Here is the cleaned up version.

Step 1: Setting up. Here, we make sure the depth values in both data frames have the same variable name (for readability).

df <- data.frame(f1=factor(data$Tagging.location), f2=factor(data$Station), depth=data$Detection.depth)

df2 <- data.frame(f2=factor(data$Station), depth=data$depth)
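If the station depth is repeated on every detection row, `df2` as built above will hold many identical points per station. A minimal sketch of one way to reduce it to one row per station, assuming the depth is constant within a station; the small `data` frame below is a hypothetical stand-in for the asker's spreadsheet, not their real data:

```r
# Hypothetical stand-in for the asker's spreadsheet: several detections per
# station, with the station depth repeated on every row.
data <- data.frame(
  Station         = c("A", "A", "A", "B", "B", "C"),
  Detection.depth = c(12, 30, 25, 60, 48, 90),
  depth           = c(40, 40, 40, 75, 75, 120)
)

# Same construction as in Step 1, then keep one row per station/depth pair.
df2 <- data.frame(f2 = factor(data$Station), depth = data$depth)
df2 <- unique(df2)

print(df2)
```

Overplotted duplicates would look the same on the chart, so this step is optional; it mainly keeps the point layer small and makes it easy to spot a station that unexpectedly has more than one depth.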

Step 2: Now you can plot this with the 'ggplot' function and split the data by using the `col=f1` argument. We'll plot the detection data separately, since that requires a boxplot, and then we'll plot the depths of the stations with colored points (assuming each station only has one depth). We specify the two different layers by passing the data to each 'geom' function, instead of specifying the data inside the main 'ggplot' call. It should look something like this:

ggplot() +
  geom_boxplot(data = df, aes(x = f2, y = depth, col = f1)) +
  geom_point(data = df2, aes(x = f2, y = depth), colour = "blue") +
  scale_y_reverse()

In this plot example, we use boxplots to represent the detection data and color those boxplots by the site label. The stations, however, we plot separately using a specific color of points, so we will be able to see them clearly in relation to the boxplots.

You should be able to adjust the plot from here to suit your needs.

I've created some dummy data and loaded it into the chart to show you what it would look like. Keep in mind that this is purely random data and doesn't really make sense.
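The original dummy data is not included in the post, so the following is only a sketch of how such a chart could be reproduced. The station names, site names, sample sizes, and depth values are all made up for illustration; only the column names and the plotting call follow the answer above.

```r
library(ggplot2)

set.seed(42)  # make the random draws repeatable

# Hypothetical station and tagging-site labels
stations <- paste0("S", 1:4)
sites    <- c("Site1", "Site2")

# Simulated detections: 20 per station/site combination
det <- expand.grid(Station = stations, Tagging.location = sites, rep = 1:20)
det$Detection.depth <- runif(nrow(det), min = 0, max = 100)

# Step 1: both data frames use the same names, f2 and depth
df <- data.frame(
  f1    = factor(det$Tagging.location),
  f2    = factor(det$Station),
  depth = det$Detection.depth
)

# One made-up depth per receiver station
df2 <- data.frame(
  f2    = factor(stations),
  depth = c(60, 90, 120, 140)
)

# Step 2: each layer gets its own data, and both share one reversed y-axis
p <- ggplot() +
  geom_boxplot(data = df, aes(x = f2, y = depth, col = f1)) +
  geom_point(data = df2, aes(x = f2, y = depth), colour = "blue", size = 3) +
  scale_y_reverse() +
  xlab("MPS Station") +
  ylab("Depth (m)")

print(p)
```

Because both layers map `f2` to x and `depth` to y, ggplot2 trains a single pair of scales across them, which is what keeps the points and boxplots on the same y-axis. Swapping `col = f1` for `fill = f1` should give filled boxes like those in the question, if that look is preferred.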
